arxiv-summary: AI-summarized AI papers

H-AES: Towards Automated Essay Scoring for Hindi

Link to paper
The full paper is available here. You can also find the paper on PapersWithCode here.

Abstract
Natural Language Processing (NLP) has been used for Automated Essay Scoring (AES) in the English language. AES in Hindi and other low-resource languages has not been explored. This study reproduces and compares state-of-the-art methods for AES in the Hindi domain. Classical feature-based Machine Learning (ML) and advanced end-to-end models, including LSTM Networks and Fine-Tuned Transformer Architecture, are employed....

Large Language Models Are State-of-the-Art Evaluators of Translation Quality

Link to paper
The full paper is available here. You can also find the paper on PapersWithCode here.

Abstract
- GEMBA is a GPT-based metric for assessing translation quality
- It works with and without a reference translation
- Four prompt variants were compared in two modes
- Seven versions of GPT models were investigated, including ChatGPT
- GPT-3.5 and larger models are needed for the method to work
- Results from WMT22’s Metrics shared task show state-of-the-art accuracy
- Results are valid for three language pairs
- Code and prompt templates used for experiments are publicly released

Paper Content

Introduction
- LLMs can be used for multilingual Q&A
- LLMs can be used to translate text between languages
- LLMs can differentiate good from bad translations
- GPT can be used for automated assessment of translation quality
- GPT can be used for system-level evaluation of translation quality

Prompt variants
- Four distinct prompt types are experimented with: two scoring tasks and two classification tasks
- Two scoring tasks: one based on direct assessment, one based on scalar quality metrics
- Two classification tasks: one based on one-to-five stars ranking, one based on five discrete quality classes
- Two modes for each prompt type: one with access to a human reference, one without

Scoring process
- GEMBA-DA, GEMBA-SQM, and GEMBA-stars output scores in the range of 0-100, 1-5, and 0-4 respectively....
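
The two modes described under "Prompt variants" (with and without a human reference) can be illustrated with a small prompt builder for a direct-assessment style scoring task. This is only a sketch: the prompt wording, the function names `build_da_prompt` and `parse_score`, and the clipping to 0-100 are assumptions made for illustration, not the templates released with the paper.

```python
import re
from typing import Optional


def build_da_prompt(source_lang: str, target_lang: str, source: str,
                    hypothesis: str, reference: Optional[str] = None) -> str:
    """Build a hypothetical direct-assessment prompt.

    The wording is illustrative only; it is not the paper's template.
    Passing `reference` switches to the reference-based mode.
    """
    with_ref = reference is not None
    instruction = (
        f"Score the following translation from {source_lang} to {target_lang} "
        + ("with respect to the human reference " if with_ref else "")
        + "on a continuous scale from 0 to 100."
    )
    lines = [instruction, f'{source_lang} source: "{source}"']
    if with_ref:
        lines.append(f'{target_lang} human reference: "{reference}"')
    lines.append(f'{target_lang} translation: "{hypothesis}"')
    lines.append("Score:")
    return "\n".join(lines)


def parse_score(answer: str) -> Optional[float]:
    """Extract the first number from a model answer and clip it to 0-100.

    Returns None when the answer contains no number (assumed handling).
    """
    match = re.search(r"-?\d+(?:\.\d+)?", answer)
    if match is None:
        return None
    return max(0.0, min(100.0, float(match.group())))


if __name__ == "__main__":
    print(build_da_prompt("German", "English", "Guten Morgen", "Good morning",
                          reference="Good morning"))
```

The same builder covers the reference-free mode by omitting `reference`, so one function serves both modes of a single prompt type.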

An Algorithm and Complexity Results for Causal Unit Selection

Link to paper
The full paper is available here. You can also find the paper on PapersWithCode here.

Abstract
- Unit selection problem aims to identify objects that exhibit desired behavior when subjected to stimuli
- Existing work focuses on bounding a specific class of objective functions
- Proposed algorithm for finding optimal units given a broad class of causal objective functions and a fully specified structural causal model
- Unit selection under this class of objective functions is $\text{NP}^\text{PP}$-complete
- Treewidth-based complexity bounds on proposed algorithm

Paper Content

Introduction
- Theory of causality based on two parallel hierarchies: information and reasoning
- Three levels of reasoning: associational, interventional and counterfactual
- Knowledge encoded as associational, causal and functional models
- Unit selection problem: selecting customers to target with an encouragement offer
- Four types of customers: responders, always-takers, always-deniers, contrarians
- Benefit function to score customers and identify most promising ones
- Contrast with classical loss functions
- Structured units: decisions, policies, people, situations, regions, activities
- Fully specified SCM to obtain point values for any causal objective function
- Computational problem of finding units that optimize causal objective functions
- Exact algorithm to solve unit optimization problem: Reverse-MAP
- Complexity of algorithm characterized by treewidth

Counterfactual queries on structural causal models
- Structural causal models (SCMs) are used to define the unit selection problem

Causal objective functions and unit selection
- Causal objective functions involve observational, interventional or counterfactual probabilities
- Goal is to find objects (units) that optimize the function
- Linear combination of counterfactual probabilities
- Unit variables are exogenous in the SCM

The complexity of unit selection
- Unit selection is $\text{NP}^\text{PP}$-complete for the class of causal objective functions given in Equation (1)....
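
The benefit function mentioned above, a linear combination of the probabilities of the four customer types, can be sketched as follows. The weights, the candidate units, and their probabilities are invented for illustration. The search is a brute-force scan over an explicit candidate list, which stands in for (and is not) the Reverse-MAP algorithm the summary names.

```python
from typing import Dict

# The four response types listed in the summary.
TYPES = ("responder", "always_taker", "always_denier", "contrarian")


def benefit(probs: Dict[str, float], weights: Dict[str, float]) -> float:
    """Linear combination of response-type probabilities for one unit."""
    return sum(weights[t] * probs[t] for t in TYPES)


def select_unit(candidates: Dict[str, Dict[str, float]],
                weights: Dict[str, float]) -> str:
    """Return the candidate with the highest benefit (brute-force scan)."""
    return max(candidates, key=lambda name: benefit(candidates[name], weights))


# Hypothetical weights: reward responders, penalize wasted offers
# (always-takers) and offers that backfire (contrarians).
WEIGHTS = {"responder": 1.0, "always_taker": -0.5,
           "always_denier": 0.0, "contrarian": -1.0}

# Hypothetical units with made-up response-type probabilities.
CANDIDATES = {
    "unit_a": {"responder": 0.5, "always_taker": 0.3,
               "always_denier": 0.1, "contrarian": 0.1},
    "unit_b": {"responder": 0.3, "always_taker": 0.1,
               "always_denier": 0.5, "contrarian": 0.1},
    "unit_c": {"responder": 0.6, "always_taker": 0.3,
               "always_denier": 0.0, "contrarian": 0.1},
}

if __name__ == "__main__":
    print(select_unit(CANDIDATES, WEIGHTS))
```

In the setting the summary describes, the probabilities would come from counterfactual queries on a fully specified SCM rather than from a hand-written table.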

Information-Restricted Neural Language Models Reveal Different Brain Regions' Sensitivity to Semantics, Syntax and Context

Link to paper
The full paper is available here. You can also find the paper on PapersWithCode here.

Abstract
- Investigates brain regions involved in syntactic and semantic processing during speech comprehension
- Trained lexical and supra-lexical language models on text corpus with selectively removed information
- Assessed models’ ability to predict fMRI signal of humans listening to naturalistic text
- Found asymmetry between left and right hemispheres in sensitivity to syntactic and semantic variables

Paper Content

Introduction
- Understanding the neural bases of language processing has been a main research effort in the neuroimaging community for decades
- One open question is whether semantic and syntactic information are encoded and processed jointly or separately in the brain
- Early lesion studies suggested syntactic processing takes place in specialized brain regions
- Neuroimaging studies and simulation work have provided support for a modular view of language processing
- An opposing view has argued that semantics and syntax are processed in a common distributed language processing system
- Neuroimaging studies have used controlled experimental paradigms and naturalistic paradigms
- Neural language models have been increasingly employed in the analysis of data collected from ecological paradigms
- A central puzzle remains in the field: distributed networks for naturalistic stimuli vs....

Im2Hands: Learning Attentive Implicit Representation of Interacting Two-Hand Shapes

Link to paper
The full paper is available here. You can also find the paper on PapersWithCode here.

Abstract
- Presents Im2Hands, a neural implicit representation of two interacting hands
- Im2Hands can produce fine-grained geometry of two hands with high hand-to-hand and hand-to-image coherency
- Im2Hands models the occupancy volume of two hands using two novel attention-based modules
- Optional keypoint refinement module enables robust two-hand shape estimation from predicted hand keypoints
- Achieves state-of-the-art results in two-hand reconstruction

Paper Content

Introduction
- Modeling 3D shapes of two interacting hands is important for various applications
- Existing studies have focused on single-hand reconstruction
- Challenges include inter-hand collisions and mutual occlusions
- Few learning-based methods on two-hand shape reconstruction have been proposed
- Im2Hands is the first neural implicit representation of two interacting hands
- Im2Hands produces two-hand meshes with an arbitrary resolution
- Im2Hands learns output shapes with precise hand-to-hand and hand-to-image alignment
- Im2Hands consists of two novel attention-based modules
- Im2Hands is compared to existing two-hand mesh-based and single-hand implicit function-based reconstruction methods

Related work
- Single-hand reconstruction methods use deep learning to reconstruct 3D keypoints, MANO parameters, or mesh vertex coordinates
- Few recent works use neural implicit functions for single-hand reconstruction
- Most existing methods for hand-object reconstruction use MANO topology-based mesh representations
- Few recent works consider neural implicit representations to model hand-objects
- Two-hand reconstruction is more challenging due to complex occlusions and deformations
- Few methods can directly reconstruct the dense surface of closely interacting two-hands
- Neural articulated implicit representation is used to model articulated objects
- Recent works use attention mechanisms and context-aware shape refinement steps for two-hand reconstruction
- Our occupancy-based method can learn resolution-independent hand surface with better image-shape alignment

Im2Hands: implicit two-hand function
- Im2Hands is a neural occupancy representation of two interacting hands....

HelixSurf: A Robust and Efficient Neural Implicit Surface Learning of Indoor Scenes with Iterative Intertwined Regularization

Link to paper
The full paper is available here. You can also find the paper on PapersWithCode here.

Abstract
Recovery of scene geometry from multiview images is a challenge in computer vision research. Recent methods leverage neural implicit surface learning and differentiable volume rendering. Traditional multi-view stereo can recover geometry of scenes with rich textures. HelixSurf intertwines regularization from two strategies during learning process. HelixSurf is efficient and faster than existing methods....

Goal Driven Discovery of Distributional Differences via Language Descriptions

Link to paper
The full paper is available here. You can also find the paper on PapersWithCode here.

Abstract
- Mining large corpora is time-consuming for humans
- Formulated a new task, D5, to automatically discover differences between two large corpora
- Task input is a research goal and a corpus pair
- Output is a language description of how the corpora differ
- Built a D5 system and contributed a meta-dataset and proposed unified evaluation metrics
- Confirmed language models can use goals to propose relevant, novel, and significant discoveries
- System produces discoveries previously unknown to the authors on a wide range of applications

Paper Content

Introduction
- Processes of generating discoveries from large corpora are ad hoc and laborious
- Machine learning can potentially accelerate these discovery processes
- Formulated one family of these processes as an ML task with unified metrics and input-output space
- Task is goal driven discovery of differences between text distributions via language descriptions
- Input is a “problem” comprising a description of the research goal and a corpus pair
- Output is a “discovery” represented as a natural language predicate
- Evaluate a discovery using two categories of criteria: validity and meaningfulness
- Curate OPEND5, a meta-dataset with 675 open-ended D5 problems
- Built a D5 system to tackle problems in OPEND5
- System produces valid and meaningful discoveries in natural language as outputs
- Evaluated system and found it produces relevant hypotheses more often than baseline
- Automate discoveries, train better D5 systems, and analyze limitations of evaluation
- OPEND5 allows benchmarking, automation, analysis, and learning of D5 task

Evaluation
- Evaluate system-generated discovery by determining if more samples from Corpus A satisfy the predicate
- Subjective judgement needed to determine if discovery is meaningful to research goal of understanding side effects

Validity
- Requires output discovery h to be a truth predicate on a text sample
- Define T(h, x) as certainty that h is true on x
- Approximate T(h, x) by asking three Turkers and averaging responses
- Define validity V as the mean of T(h, x) on a subset from Corpus A minus the mean on a subset from Corpus B
- Compute p-value for null hypothesis that V ≤ 0 by conducting t-test
- Ideal discovery should have large V value and small p-value

Meaningfulness
- Valid discoveries may not be meaningful
- Relevance, novelty and significance can be used to rate how meaningful a discovery is
- Relevance is based on how related the discovery is to the research goal
- Novelty is based on how difficult it is to generate the discovery
- Significance is based on how beneficial it is to learn the discovery
- An ideal discovery should have high ratings for all three submetrics

Method
- System maps from corpus pair and research goal to set of natural language predicates
- Inspired by two-stage model of how humans discover patterns in data
- Propose hypotheses conditioned on research goal and subset of samples from corpus pair
- Validate each hypothesis to see if it is more often true on one corpus than the other
- Leverage research goal to propose more meaningful hypotheses

Hypothesis proposer
- Prompted GPT-3 to propose hypotheses
- Included research goal in prompt to elicit meaningful hypotheses

Hypothesis validator
- Hypotheses in H_init are often invalid
- Use language model T to simulate Turkers’ judgement and approximate validity score V of hypothesis h
- Use FLAN-T5 to ask whether x satisfies h
- Collect additional Turker annotations to fine-tune FLAN-T5
- Perform t-test to compare mean value of V(h, x) on research split of Corpus A and mean value on Corpus B
- Rule out hypotheses with p-value greater than 0....
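
The validity check described under "Validity" can be sketched with the standard library alone. The sketch reads V as the mean of T(h, x) on Corpus A minus the mean on Corpus B, which is the reading under which the null hypothesis V ≤ 0 makes sense. The sample scores are made up, the function names are hypothetical, and Welch's form of the t statistic is an assumption, since the summary does not say which t-test variant is used. Converting the statistic into a p-value needs a t-distribution CDF and is left out.

```python
import math
from statistics import mean, variance
from typing import Sequence


def validity(t_a: Sequence[float], t_b: Sequence[float]) -> float:
    """V = mean T(h, x) on Corpus A samples minus mean on Corpus B samples."""
    return mean(t_a) - mean(t_b)


def welch_t(t_a: Sequence[float], t_b: Sequence[float]) -> float:
    """Welch t statistic for the one-sided test of V <= 0.

    Larger positive values are stronger evidence that the hypothesis is
    true more often on Corpus A than on Corpus B.
    """
    se = math.sqrt(variance(t_a) / len(t_a) + variance(t_b) / len(t_b))
    if se == 0.0:
        raise ValueError("both samples are constant; t statistic undefined")
    return validity(t_a, t_b) / se


# Made-up certainty scores T(h, x), e.g. averaged annotator judgements.
SCORES_A = [1.0, 1.0, 0.5, 1.0]
SCORES_B = [0.0, 0.5, 0.0, 0.0]

if __name__ == "__main__":
    print(validity(SCORES_A, SCORES_B), welch_t(SCORES_A, SCORES_B))
```

A hypothesis with a large V and a large positive t statistic corresponds to the "large V value and small p-value" profile of an ideal discovery.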

Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning

Link to paper
The full paper is available here. You can also find the paper on PapersWithCode here.

Abstract
Introduce Vid2Seq, a multi-modal single-stage dense event captioning model pretrained on narrated videos. Augment language model with special time tokens to predict event boundaries and textual descriptions in same output sequence. Leverage unlabeled narrated videos for dense video captioning by reformulating sentence boundaries of transcribed speech as pseudo event boundaries. Vid2Seq model pretrained on YT-Temporal-1B dataset improves state of the art on dense video captioning benchmarks....
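
The idea of predicting event boundaries and captions in one output sequence can be illustrated by quantizing timestamps into discrete time tokens and interleaving them with the caption text. This is a sketch under assumptions: the `<time_k>` token format, the default of 100 bins, and the function names are invented here and are not taken from the paper.

```python
from typing import List, Tuple

# (start_seconds, end_seconds, caption)
Event = Tuple[float, float, str]


def time_token(t: float, duration: float, n_bins: int = 100) -> str:
    """Map a timestamp to one of `n_bins` relative time tokens.

    The `<time_k>` naming and the bin count are illustrative assumptions.
    """
    idx = min(int(t / duration * n_bins), n_bins - 1)
    return f"<time_{idx}>"


def serialize_events(events: List[Event], duration: float,
                     n_bins: int = 100) -> str:
    """Serialize events as 'start_token end_token caption', ordered by start."""
    parts: List[str] = []
    for start, end, caption in sorted(events):
        parts.append(time_token(start, duration, n_bins))
        parts.append(time_token(end, duration, n_bins))
        parts.append(caption)
    return " ".join(parts)


if __name__ == "__main__":
    demo = [(30.0, 60.0, "plate the dish"), (0.0, 30.0, "chop the onions")]
    print(serialize_events(demo, duration=60.0))
```

Under the same assumptions, sentence boundaries from transcribed speech could be passed in as `events` to produce pseudo-labelled training sequences from unlabeled narrated videos.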

Language Is Not All You Need: Aligning Perception with Language Models

Link to paper
The full paper is available here. You can also find the paper on PapersWithCode here.

Abstract
- Introducing Kosmos-1, a Multimodal Large Language Model (MLLM)
- Trained from scratch on web-scale multimodal corpora
- Evaluated on a wide range of tasks without any gradient updates or finetuning
- Impressive performance on language understanding, generation, and OCR-free NLP
- Also performs well on perception-language tasks and vision tasks
- Cross-modal transfer from language to multimodal and from multimodal to language
- Introducing a dataset of Raven IQ test to diagnose nonverbal reasoning capability of MLLMs

Paper Content

Evaluation
- MLLMs can handle both language tasks and perception-intensive tasks
- Evaluate Kosmos-1 on various types of tasks
- Evaluate perception-language capability of Kosmos-1 under vision-language settings

IQ test: nonverbal reasoning
- Raven’s Progressive Matrices is a test to evaluate nonverbal reasoning....

The ROOTS Search Tool: Data Transparency for LLMs

Link to paper
The full paper is available here. You can also find the paper on PapersWithCode here.

Abstract
ROOTS is a 1.6TB multilingual text corpus. It is used to train BLOOM, the largest language model. ROOTS Search Tool is a search engine for the entire ROOTS corpus. It offers fuzzy and exact search capabilities. ROOTS is the largest corpus to date that can be investigated this way. The ROOTS Search Tool is open-sourced and available on Hugging Face Spaces....