arxiv-summary: AI-summarized AI papers

Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective

Link to paper: The full paper is available here. You can also find the paper on PapersWithCode here.
Abstract: Proposes a new paradigm for zero-shot learners that is format agnostic. Zero-shot learning aims to train a model on a given task such that it can address new learning tasks without additional training. The paper converts zero-shot learning into multiple-choice tasks. This adds generalization ability to models and reduces the number of parameters. The approach achieves state-of-the-art performance on several benchmarks. The model has 235M parameters, substantially fewer than state-of-the-art models.
Paper Content
Introduction: Remarkable advances in large-scale language models have improved a variety of tasks. Zero-Shot Learning (ZSL) aims to predict labels on datasets from novel domains. Most solutions use the prompt tuning framework. Existing frameworks have a large number of parameters and require manual processing. The proposed Unified Multiple Choice model (UniMC) has advantages in parameter updating and deployment. Option-mask tokens are placed before each option to predict "yes" or "no". Option MLM and Option Prediction methods are used to output the desired options. UniMC outperforms state-of-the-art baselines with a smaller model size.
Related work: NLP tasks have diverse formats due to the emergence of datasets. Recent research shows the need to unify formats. T0 builds an application to map datasets into target templates. FLAN groups datasets into 12 task clusters and designs 10 instruction templates. Label-based tasks need to be unified, so MC formats are developed.
Label information: Label semantic is an important information source for few-shot tasks....
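The multiple-choice conversion described in the UniMC summary can be illustrated with a small sketch. It only shows the idea of placing an option-mask token before each option and then picking the option whose mask position scores highest for "yes". The token string `[O-MASK]`, the `[SEP]` separator, the input layout, and both function names are assumptions made for illustration; the scores would come from a trained model, which is not included here.

```python
from typing import List, Sequence

# Hypothetical token strings; the summary does not specify the exact vocabulary.
O_MASK = "[O-MASK]"
SEP = "[SEP]"


def build_mc_input(options: Sequence[str], question: str, passage: str) -> str:
    """Lay out a label-based task as one multiple-choice input string.

    Each option is preceded by an option-mask token, so a model can
    predict "yes" or "no" at that position.
    """
    option_part = " ".join(f"{O_MASK} {opt}" for opt in options)
    return f"{option_part} {SEP} {question} {SEP} {passage}"


def pick_option(options: Sequence[str], yes_scores: List[float]) -> str:
    """Return the option whose mask position has the highest "yes" score."""
    if len(options) != len(yes_scores):
        raise ValueError("need exactly one score per option")
    best = max(range(len(options)), key=lambda i: yes_scores[i])
    return options[best]
```

With this layout, a sentiment task and a topic-classification task differ only in the option strings, which is what makes the format task agnostic.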

Deep Differentiable Logic Gate Networks

Link to paper: The full paper is available here. You can also find the paper on PapersWithCode here.
Abstract: Research has focused on developing efficient neural network architectures. The paper explores logic gate networks for machine learning tasks. These networks comprise logic gates such as "AND" and "XOR". Logic gate networks are difficult to learn because they are non-differentiable. The paper proposes differentiable logic gate networks to allow for effective training. The discretized logic gate networks achieve fast inference speeds.
Paper Content
Introduction: Neural networks have been researched to make computations faster and more efficient....
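One common way to make a logic gate differentiable is to relax its inputs from {0, 1} to real values in [0, 1] and to replace the hard choice of gate with a softmax-weighted mixture. The sketch below illustrates that idea with three gates only; the gate set, the function names, and the use of a softmax over learnable logits are assumptions for illustration and are not taken from the summary above.

```python
import math
from typing import Callable, Dict, List

# Real-valued relaxations: each agrees with the Boolean gate on inputs in {0, 1}.
GATES: Dict[str, Callable[[float, float], float]] = {
    "AND": lambda a, b: a * b,
    "OR": lambda a, b: a + b - a * b,
    "XOR": lambda a, b: a + b - 2 * a * b,
}


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_gate(a: float, b: float, logits: List[float]) -> float:
    """Training-time node: a softmax-weighted mixture of relaxed gates."""
    weights = softmax(logits)
    return sum(w * gate(a, b) for w, gate in zip(weights, GATES.values()))


def discretize(logits: List[float]) -> str:
    """Inference-time node: keep only the gate with the largest logit."""
    names = list(GATES)
    return names[max(range(len(logits)), key=lambda i: logits[i])]
```

Because `soft_gate` is a smooth function of the logits, gradients can flow through it during training; after `discretize`, each node is a single hard gate, which is what makes inference cheap.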

Utilizing supervised models to infer consensus labels and their quality from data with multiple annotators

Link to paper: The full paper is available here. You can also find the paper on PapersWithCode here.
Abstract: Labels for real-world data are often given by multiple annotators. CROWDLAB is an approach to estimate a consensus label, a confidence score, and a rating for each annotator. CROWDLAB is based on simple weighted ensembling and utilizes a classifier model trained on the features of the examples. CROWDLAB provides better estimates than alternative algorithms in evaluations on real-world data....
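The weighted-ensembling idea in the CROWDLAB summary can be sketched for a single example as follows. This is a minimal illustration, not the CROWDLAB algorithm itself: the function name `consensus`, the default weights of 1.0, and the one-hot treatment of annotator votes are all assumptions, and the summary does not say how the real weights are estimated.

```python
from typing import Dict, List, Optional, Tuple


def consensus(
    model_probs: List[float],
    annotator_labels: Dict[str, int],
    model_weight: float = 1.0,
    annotator_weights: Optional[Dict[str, float]] = None,
) -> Tuple[int, float]:
    """Combine classifier probabilities with annotator votes for one example.

    Returns (consensus_label, confidence), where confidence is the
    ensemble probability assigned to the consensus label.
    """
    if annotator_weights is None:
        # Assumption: every annotator is trusted equally.
        annotator_weights = {name: 1.0 for name in annotator_labels}

    # Start from the classifier's weighted probabilities.
    scores = [model_weight * p for p in model_probs]
    total = model_weight

    # Each annotator adds their weight to the class they chose.
    for name, label in annotator_labels.items():
        weight = annotator_weights[name]
        scores[label] += weight
        total += weight

    probs = [s / total for s in scores]
    label = max(range(len(probs)), key=lambda k: probs[k])
    return label, probs[label]
```

For example, with classifier probabilities `[0.2, 0.8]` and two annotators who disagree, the classifier breaks the tie in favor of class 1.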

Point Transformer V2: Grouped Vector Attention and Partition-based Pooling

Link to paper: The full paper is available here. You can also find the paper on PapersWithCode here.
Abstract: The Point Transformer V2 model is proposed to overcome limitations of previous work. Grouped vector attention is proposed, which is more effective than the previous version. A grouped weight encoding layer and a position encoding multiplier are proposed. Lightweight partition-based pooling methods are designed for better spatial alignment and efficient sampling. The model achieves state-of-the-art performance on 3D point cloud understanding benchmarks.
Paper Content
Introduction: Point Transformer (PTv1) introduces self-attention networks to 3D point cloud understanding. Point Transformer V2 (PTv2) improves upon PTv1 with novel designs: grouped vector attention with improved position encoding, an efficient partition-based pooling scheme, and an improved position encoding scheme that makes better use of point cloud coordinates.
Related works: Image transformers use scaled dot-product self-attention and multi-head self-attention theory from NLP. Point cloud understanding methods can be projection-based, voxel-based, or point-based. Point cloud transformers use attention to process point clouds. Point Transformer [1] performs local attention between each point and its adjacent points. Point Transformer V2 is proposed to improve the effectiveness and efficiency of Point Transformer.
Point Transformer V2
Problem formulation and background: A 3D point cloud scene contains points with positions and features. The goal of point cloud semantic segmentation is to predict a class label for each point. The goal of scene classification is to predict a class label for each scene. Local attention: attention works within subset of points, i....
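Partition-based pooling, as mentioned in the Point Transformer V2 summary, can be illustrated with a uniform grid: points that fall in the same grid cell are merged into one pooled point. The sketch below is one plausible reading under stated assumptions: the function name `grid_pool`, the cubic cells of side `cell_size`, averaging positions, and taking a per-channel maximum of features are all illustrative choices that the summary does not specify.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]


def grid_pool(
    points: Sequence[Point],
    features: Sequence[Sequence[float]],
    cell_size: float,
) -> Tuple[List[Point], List[List[float]]]:
    """Pool points that share a grid cell into a single point.

    Positions are averaged and features are max-pooled per channel
    (both are assumptions made for this sketch).
    """
    # Partition: map every point index to the integer cell it falls in.
    cells: Dict[Tuple[int, int, int], List[int]] = defaultdict(list)
    for idx, (x, y, z) in enumerate(points):
        key = (
            math.floor(x / cell_size),
            math.floor(y / cell_size),
            math.floor(z / cell_size),
        )
        cells[key].append(idx)

    pooled_points: List[Point] = []
    pooled_features: List[List[float]] = []
    for key in sorted(cells):  # sorted only to make the output order deterministic
        members = cells[key]
        n = len(members)
        mean = tuple(sum(points[i][d] for i in members) / n for d in range(3))
        pooled_points.append((mean[0], mean[1], mean[2]))
        channels = len(features[members[0]])
        pooled_features.append(
            [max(features[i][c] for i in members) for c in range(channels)]
        )
    return pooled_points, pooled_features
```

Because the partition is non-overlapping, every input point contributes to exactly one pooled point, and no neighbor search is needed to build the groups.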

Named Entity Recognition in Twitter: A Dataset and Analysis on Short-Term Temporal Shifts

Link to paper: The full paper is available here. You can also find the paper on PapersWithCode here.
Abstract: Recent progress in language model pre-training has improved Named Entity Recognition (NER). NER has mainly been tested in well-formatted documents. Social media adds complexity due to its noisy and dynamic nature. A new NER dataset, TweetNER7, was constructed for Twitter. Language model baselines were provided and an analysis was performed. Three temporal aspects were analyzed: short-term degradation, fine-tuning strategies, and self-labeling....

GNM: A General Navigation Model to Drive Any Robot

Link to paper: The full paper is available here. You can also find the paper on PapersWithCode here.
Abstract: Learning provides a powerful tool for vision-based navigation. Combining data from multiple sources can train more powerful navigation models. This paper studies how a general goal-conditioned model can be trained on data from multiple robots. 60 hours of navigation trajectories from 6 robots were used to train the model. The model was deployed on a range of new robots, including an underactuated quadrotor. Training on diverse data leads to robustness against degradation in sensing and actuation.
Paper Content
I....

Binding Language Models in Symbolic Languages

Link to paper: The full paper is available here. You can also find the paper on PapersWithCode here.
Abstract: End-to-end neural approaches lack interpretability and robustness. Binder is a training-free neural-symbolic framework that maps task input to a program. A unified API of language model functionalities is used to extend grammar coverage. GPT-3 Codex is used as the language model. Few in-context exemplar annotations are used. Binder achieves state-of-the-art results on the WikiTableQuestions and TabFact datasets. No training is required; it only uses dozens of annotations as in-context exemplars.
Paper Content
Introduction: Performance on natural language processing tasks is dominated by neural end-to-end systems. Symbolic approaches produce explicit intermediate representations. Symbolic approaches are interpretable and robust. Their coverage is limited by the grammar of the symbolic language. Neural-symbolic approaches combine neural modules and symbolic languages. Neural-symbolic approaches require human design and large training data. BINDER is a training-free neural-symbolic framework that maps task inputs to an executable program. BINDER requires few annotations and is more interpretable, scalable, and robust than end-to-end approaches.
Approach
Binder framework: The BINDER framework is used to solve NLP tasks. A BINDER program is generated from natural language input and optional context. The output answer is derived by executing the BINDER program with an interpreter.
Binder parsing: The input natural language is parsed into a BINDER program. A BINDER program is an expression in a symbolic language that includes API calls. An API call is a function that accepts a question and the context to be queried. The output of an API call is the answer to the question. The output is represented as a variable compatible with the symbolic language grammar.
Binder execution: Program Z is executed by a BINDER interpreter to derive the answer A....

GLM-130B: An Open Bilingual Pre-trained Model

Link to paper: The full paper is available here. You can also find the paper on PapersWithCode here.
Abstract: GLM-130B is a bilingual (English and Chinese) pre-trained language model with 130 billion parameters. It is an attempt to open-source a 100B-scale model at least as good as GPT-3. The paper introduces the training process of GLM-130B, including design choices, training strategies, and engineering efforts. GLM-130B outperforms GPT-3 175B and ERNIE TITAN 3....

SHINE-Mapping: Large-Scale 3D Mapping Using Sparse Hierarchical Implicit Neural Representations

Link to paper: The full paper is available here. You can also find the paper on PapersWithCode here.
Abstract: Traditional mapping methods have difficulty balancing memory consumption and accuracy. This paper proposes a 3D LiDAR-based mapping method using an octree-based hierarchical structure. The features are optimized with 3D measurements and a binary cross entropy loss. The mapping system is designed to prevent catastrophic forgetting. Experiments show that the proposed method is more accurate, complete, and memory-efficient than current methods....

Is Reinforcement Learning (Not) for Natural Language Processing?: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization

Link to paper: The full paper is available here. You can also find the paper on PapersWithCode here.
Abstract: Introduces an open-source modular library, RL4LMs, for optimizing language generators with RL. Presents the GRUE benchmark, a set of 6 language generation tasks supervised by reward functions. Introduces NLPO, an easy-to-use, performant RL algorithm. RL techniques are better than supervised methods at aligning LMs to human preferences. NLPO exhibits greater stability and performance than previous policy gradient methods....