Timezone: »

 
Few-Shot Learning Evaluation in Natural Language Understanding
Subhabrata Mukherjee · Xiaodong Liu · Guoqing Zheng · Saghar Hosseini · Hao Cheng · Ge Yang · Christopher Meek · Ahmed Awadallah · Jianfeng Gao
Event URL: https://openreview.net/forum?id=VhIIQBm00VI »

Most recent progress in natural language understanding (NLU) has been driven, in part, by benchmarks such as GLUE, SuperGLUE, SQuAD, etc. In fact, many NLU models have now matched or exceeded "human-level" performance on many tasks in these benchmarks. Most of these benchmarks, however, give models access to relatively large amounts of labeled data for training. As such, the models are provided far more data than required by humans to achieve strong performance. That has motivated a line of work that focuses on improving few-shot learning performance of NLU models. However, there is a lack of standardized evaluation benchmarks for few-shot NLU resulting in different experimental settings in different papers.To help accelerate this line of work, we introduce CLUES, a benchmark for evaluating the few-shot learning capabilities of NLU models. We demonstrate that while recent models reach human performance when they have access to large amounts of labeled data, there is a huge gap in performance in the few-shot setting for most tasks. We also demonstrate differences between alternative model families and adaptation techniques in the few shot setting. Finally, we discuss several principles and choices in designing the experimental settings for evaluating the true few-shot learning performance and suggest a unified standardized approach to few-shot learning evaluation. We aim to encourage research on NLU models that can generalize to new tasks with a small number of examples. Code and data for CLUES are available at https://github.com/microsoft/CLUES.

Author Information

Subhabrata Mukherjee (Microsoft Research)

Principal Researcher at Microsoft Research leading cross-org initiative for [Efficient AI at Scale]. Our focus is on efficient learning of massive neural networks for both model (e.g., neural architecture search, model compression, sparse and modular learning) and data efficiency (e.g., zero-shot and few-shot learning, semi-supervised learning). We develop state-of-the-art computationally efficient models and techniques to enable AI practitioners, researchers and engineers to use large-scale models in practice. Our technologies have been deployed in several enterprise scenarios including Turing, Bing and Microsoft 365. Honors: 2022 MIT Technology Review Innovators under 35 Semi-finalist (listed in 100 innovators under 35 world-wide) for work on Efficient AI.

Xiaodong Liu (Microsoft)
Guoqing Zheng (Carnegie Mellon University)
Saghar Hosseini (Microsoft Research)
Hao Cheng (Microsoft)
Ge Yang (Microsoft Research)
Christopher Meek (Microsoft Research)
Ahmed Awadallah (MICROSOFT RESEARCH)

I am passionate about using AI and Machine Learning to create intelligent user experiences that connect people to information. I lead a research and incubation team in Microsoft Research Technologies. Our work at the Language and Information Technologies team is focused on creating language understanding and user modeling technologies to enable intelligent experiences in multiple products. Our work has been shipped in several products such as Bing, Cortana, Office 365, and Dynamics 365. I have hands-on experience building and shipping state-of-the-art ML/AI algorithms. I also have experience building and managing world-class teams of scientists and engineers. My research interests are at the intersection of machine learning, language understanding, and information retrieval. A key part of my work involves using Machine Learning to model large-scale text and user behavior data with applications to intelligent assistants, search, user modeling, quality evaluation, recommendation and personalization. I received my Ph.D. from the department of Computer Science and Engineering at the University of Michigan Ann Arbor. I Invented, published, and patented new approaches in language understanding, information retrieval and machine learning. I published 60+ peer-reviewed papers in these areas and I am an inventor on 20+ (granted and pending) patents.

Jianfeng Gao (Microsoft Research, Redmond, WA)

More from the Same Authors