Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective
Link to paper The full paper is available here. You can also find the paper on PapersWithCode here. Abstract Proposed a new paradigm for zero-shot learners that is format agnostic Zero-shot learning aims to train a model on a given task such that it can address new learning tasks without additional training Converted zero-shot learning into multiple-choice tasks Added generalization ability to models and reduced number of parameters Achieved state-of-the-art performance on several benchmarks Model has 235M parameters, substantially smaller than state-of-the-art models Paper Content Introduction Remarkable advances in large-scale language models have improved a variety of tasks Zero-Shot Learning (ZSL) aims to predict labels on datasets from novel domains Most solutions use the prompt tuning framework Existing frameworks have a large number of parameters and require manual processing Proposed Unified Multiple Choice model (UniMC) has advantages of parameter updating and deployment Option-mask tokens are used to predict “yes” or “no” before each option Option MLM and Option Prediction methods are used to output desired options Performance of UniMC outperforms state-of-the-art baselines with a smaller model size Related work NLP tasks have diverse formats due to the emergence of datasets Recent research shows the need to unify formats T0 builds an application to map datasets into target templates FLAN groups datasets into 12 task clusters and designs 10 instruction templates Label-based tasks need to be unified, so MC formats are developed Label information Label semantic is an important information source for few-shot tasks....