Late last year, the Allen Institute for AI, the research institute founded by the late Microsoft cofounder Paul Allen, quietly open-sourced a large AI language model called Macaw. Unlike other language models that have captured the public’s attention recently (see OpenAI’s GPT-3), Macaw is fairly limited in what it can do: it only answers and generates questions. But the researchers behind Macaw claim that it can outperform GPT-3 on a set of questions, despite being an order of magnitude smaller.
Answering questions might not be the most exciting application of AI, but question-answering technologies are becoming increasingly valuable in the enterprise. Rising customer call and email volumes during the pandemic spurred businesses to turn to automated chat assistants; according to Statista, the size of the chatbot market will surpass $1.25 billion by 2025. Still, chatbots and other conversational AI technologies remain fairly rigid, bound by the questions that they were trained on.
Answering questions
Built on UnifiedQA, the Allen Institute’s previous attempt at a generalizable question-answering system, Macaw was fine-tuned on datasets containing thousands of yes/no questions, stories designed to test reading comprehension, explanations for questions, and school science and English exam questions. The largest version of the model, which is the version used in the demo and the one released as open source, contains 11 billion parameters, roughly one-sixteenth of GPT-3’s 175 billion.
Given a question, Macaw can produce an answer and an explanation. If given an answer, the model can generate a question (optionally a multiple-choice question) and an explanation. Finally, if given an explanation, Macaw can give a question and an answer.
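To make these input and output combinations concrete, here is a minimal Python sketch of how such requests could be written as text prompts. It assumes a slot-style format along the lines of the one described in Macaw’s public README, in which the requested outputs are listed first and the given fields follow with their values. The helper names (build_prompt, parse_output) and the sample strings are hypothetical and used only for illustration, and the sketch does not load or call the model itself.

```python
import re

# Slot names are an assumption taken from Macaw's public README,
# not from this article.
SLOTS = ("question", "answer", "explanation", "mcoptions")


def build_prompt(outputs, inputs):
    """Build a slot-style prompt (hypothetical helper).

    outputs: slot names the model should generate, e.g. ["answer"].
    inputs:  dict mapping slot names to the text that is supplied.
    """
    for name in list(outputs) + list(inputs):
        if name not in SLOTS:
            raise ValueError(f"unknown slot: {name}")
    # Requested slots come first, bare; supplied slots follow with values.
    parts = [f"${name}$" for name in outputs]
    parts += [f"${name}$ = {value}" for name, value in inputs.items()]
    return " ; ".join(parts)


def parse_output(text):
    """Split a slot-style output string back into a dict.

    Simplification: assumes slot values never contain " ; " themselves.
    """
    result = {}
    for chunk in text.split(" ; "):
        match = re.fullmatch(r"\$(\w+)\$ = (.*)", chunk.strip())
        if match:
            result[match.group(1)] = match.group(2)
    return result


# Question in, answer and explanation out.
qa_prompt = build_prompt(
    ["answer", "explanation"],
    {"question": "What color is a cloudy sky?"},
)

# Answer in, multiple-choice question and explanation out.
reverse_prompt = build_prompt(
    ["question", "mcoptions", "explanation"],
    {"answer": "gray"},
)
```

In practice, a prompt like qa_prompt would be tokenized and passed to the model, and parse_output would be applied to the decoded text. Treating every field as an interchangeable slot is what would allow a single model to cover all three directions described above.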
“Macaw was built by training Google’s T5 transformer model on roughly 300,000 questions and answers, gathered from several existing datasets that the natural-language community has created over the years,” the Allen Institute’s Peter Clark and Oyvind Tafjord, who were involved in Macaw’s development, told VentureBeat via email. “The Macaw models were trained on a Google cloud TPU (v3-8). The training leverages the pretraining already done by Google in their T5 model, thus avoiding a significant expense (both cost and environmental) in building Macaw. From T5, the additional fine-tuning we did for the largest model took 30 hours of TPU time.”