arxiv-summary: AI-summarized AI papers

ParaFormer: Parallel Attention Transformer for Efficient Feature Matching

Link to paper
The full paper is available here. You can also find the paper on PapersWithCode here.
Abstract
Heavy computation is a bottleneck for deep-learning-based feature matching algorithms. Existing lightweight networks cannot address classical feature matching tasks. This paper proposes two concepts: ParaFormer and a graph-based U-Net architecture with attentional pooling. ParaFormer fuses features and keypoint positions and integrates self- and cross-attention. The U-Net architecture and the proposed attentional pooling reduce computational complexity....

Disentangling Linkage and Population Structure in Association Mapping

Link to paper
The full paper is available here. You can also find the paper on PapersWithCode here.
Abstract
GWAS tests SNP markers to identify causal variants of a trait. The paper establishes a connection between the surrogate model and the true causal model. Population structure is accounted for in GWAS by modelling the variant of interest and not the trait. Environmental confounding can be partially corrected using genetic covariates.
Paper Content
Introduction
- GWAS identifies regions in the genome responsible for variation in a trait
- SNPs are tested for association with the trait
- SNPs are dense and widespread across the genome
- GWAS uses a marker-additive model (MAM) to estimate parameters
- MAM parameters have no direct causal interpretation
- This work considers a causal-additive model (CAM) with direct causal interpretation
Setup
- Population membership is described by C_i and M_i, which take values 0, 1, 2
- Marker-additive model (MAM) includes marker effect size β and noise variable δ
- Marginal testing is used, where M_i1 is the marker being tested
- Leave one chromosome out (LOCO) approach removes markers close to the variant being tested
- Causal-additive model (CAM) includes causal effect size α and noise variable ε
- Pritchard-Stephens-Donnelly (PSD) model describes population structure incorporating admixture
- Random mating is assumed, with haplotype frequencies and linkage disequilibrium (LD) parameters
- Linear projection of X with respect to Y is used
- Genotypes at different haplotypes are conditionally independent
- Goal is to characterize the estimand of the regression under the CAM
- Estimand β_1(S_i) is a weighted average over β_1(S_i)
Main results
- GWAS is interested in the contribution of linkage with a physically proximal causal variant
- It is partially achievable to separate path (1) from path (2) under population-based design
- It is achievable under within-sibship design
- Population structure affects β_{1,nocov} in two ways: it attenuates the true signal and puts undesirable signals into the estimand
- Weights of linkage term in Theorem 3....
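
The marginal test described in the Setup notes regresses the trait on one marker at a time. As a minimal sketch, assuming genotypes coded 0, 1, 2 and ignoring population structure and covariates entirely, the slope of the linear projection of the trait on a single marker is Cov(M, Y) / Var(M). The function name `marginal_effect` and the simulated data are hypothetical and do not come from the paper.

```python
import numpy as np

def marginal_effect(marker, trait):
    """Slope of the linear projection of trait on one marker: Cov(M, Y) / Var(M)."""
    marker = np.asarray(marker, dtype=float)
    trait = np.asarray(trait, dtype=float)
    centered = marker - marker.mean()
    return float(centered @ (trait - trait.mean()) / (centered @ centered))

# Hypothetical toy data: genotypes coded 0, 1, 2 and a trait with marker effect 0.5.
rng = np.random.default_rng(0)
marker = rng.integers(0, 3, size=1000)
trait = 1.0 + 0.5 * marker + rng.normal(scale=0.1, size=1000)
beta_hat = marginal_effect(marker, trait)
```

In this toy setting the marker is itself causal, so the estimate recovers the simulated effect. The summary's point is that with a non-causal marker and population structure the same regression targets a different, harder-to-interpret estimand.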

Grounded Decoding: Guiding Text Generation with Grounded Models for Robot Control

Link to paper
The full paper is available here. You can also find the paper on PapersWithCode here.
Abstract
LLMs can learn and leverage Internet-scale knowledge through pre-training with autoregressive models. LLMs are not suitable for settings with embodied agents due to lack of experience with the physical world, inability to parse non-language observations, and ignorance of rewards or safety constraints. Language-conditioned robotic policies can provide the necessary grounding for the agent to be correctly situated in the real world, but are limited by the lack of high-level semantic understanding....

UDAPDR: Unsupervised Domain Adaptation via LLM Prompting and Distillation of Rerankers

Link to paper
The full paper is available here. You can also find the paper on PapersWithCode here.
Abstract
Information retrieval tasks require large labeled datasets for fine-tuning. Large language models can be used to generate large numbers of synthetic queries cheaply. The synthetic queries are used to fine-tune reranker models. Boosts zero-shot accuracy in long-tail domains. Lower latency than standard reranking methods.
Paper Content
Introduction
- Neural IR has led to performance improvements on document and passage retrieval tasks
- Neural retrievers benefit from fine-tuning on large labeled datasets
- IR models can experience significant drops in accuracy due to distribution shifts
- UDAPDR is an efficient strategy for using LLMs to facilitate unsupervised domain adaptation of neural retriever models
- UDAPDR leads to large gains in zero-shot settings on a diverse range of domains
- UDAPDR uses a powerful and expensive LLM to create an initial set of synthetic queries
- These queries are used to train separate rerankers, which are distilled into a single ColBERTv2 retriever
- UDAPDR only requires thousands of synthetic queries to prove effective
- Code and synthetic datasets for UDAPDR will be publicly available
Data augmentation for neural IR
- Generated datasets support domain adaptation in Transformer-based architectures
- LLMs used to improve IR accuracy in new domains via synthetic datasets
- Domain shift is the most pressing challenge for effective domain transfer
- Different types of domain shifts can be addressed with synthetic data and indexing strategies
Pretraining objectives for IR
- Pretraining objectives can help neural IR systems adapt to new domains without annotations
- MLM and ICT are unsupervised approaches for helping retrieval models adapt to new domains
- BFS and WLP are unsupervised pretraining tasks that use sampled in-domain sentences and passages
- NVSM is an unsupervised pretraining task for news article retrieval
- Contrastive learning objective for unsupervised training of dense retrievers
- ICT paired with synthetic query data for domain adaptation
- Contrastive learning objective paired with unsupervised Promptagator strategy
- Unsupervised domain adaptation approach does not require any further pretraining
Methodology
- UDAPDR strategy requires access to in-domain passages, but not queries or labels
- Goal is to generate large numbers of synthetic queries for passages
- Stage 1: X in-domain passages sampled from target domain, 5X synthetic queries generated using GPT-3 and 5 prompting strategies
- Stage 2: Y corpus-adapted prompts created, varying according to demonstrations
- Stage 3: Z queries generated with Flan-T5 XXL, quality filter applied
- Stage 4: Y rerankers trained from scratch, N best rerankers selected
- Stage 5: Multi-teacher distillation process used to create single ColBERTv2 retriever
- Stage 6: Domain-adapted ColBERTv2 retriever tested on evaluation set for target domain
Experiments
Models
- Leveraged Demonstrate-Search-Predict (DSP) codebase for experiments
- Used DeBERTaV3-Large as cross-encoder after comparison experiments
- Used ColBERTv2 retriever for IR system
Datasets
- Used LoTTE, NQ, and SQuAD for experiments
- NQ and SQuAD were part of Flan-T5’s pretraining datasets
- Wikipedia passages used in NQ and SQuAD were part of DeBERTaV3 and GPT-3’s pretraining datasets
Multi-reranker domain adaptation
- UDAPDR accuracy is compared to two baselines in Table 1
- Baseline 1 is a Zero-shot ColBERTv2 retriever with no distillation
- Baseline 2 is a Zero-shot ColBERTv2 retriever paired with a single non-distilled passage reranker, trained on 100K synthetic queries
- UDAPDR is far superior to Zero-shot ColBERTv2 across all domains
- Two settings of UDAPDR are competitive with or superior to Baseline 2
Query latency
- UDAPDR is highly effective
- Table 1 does not take query latency into account
- Table 2 reports latency evaluations
- Zero-shot ColBERTv2 has low retrieval latency
- Zero-shot ColBERTv2 has state-of-the-art accuracy
- UDAPDR has the best accuracy and same latency as Zero-shot ColBERTv2
- Zero-shot ColBERTv2 + Reranker models come close, but with higher latency
Impact of pretrained components
- UDAPDR uses 3 pretrained components: GPT-3, Flan-T5 XXL, and DeBERTaV3-Large
- Variants of UDAPDR were explored and results are summarized in Table 4
- Primary setting of UDAPDR performs best
- Very competitive performance can be obtained without GPT-3
- Flan-T5 XL can be used instead of Flan-T5 XXL
- DeBERTaV3-Base is still effective, but results in a 4....
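
Stage 5 of the methodology distills several rerankers into one retriever. The summary does not say how the teachers are combined, so the sketch below makes a loud assumption: teacher scores are simply averaged, and the student is scored with a KL divergence between softmax distributions over candidate passages. The function names and score arrays are hypothetical, and this is not claimed to be the paper's actual procedure.

```python
import numpy as np

def softmax(scores):
    """Numerically stable softmax over the last axis."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def distillation_loss(teacher_scores, student_scores):
    """KL(target || student) for one query.

    teacher_scores: array of shape (n_teachers, n_passages)
    student_scores: array of shape (n_passages,)
    ASSUMPTION: teachers are combined by averaging their raw scores.
    """
    target = softmax(np.mean(teacher_scores, axis=0))
    predicted = softmax(np.asarray(student_scores, dtype=float))
    return float(np.sum(target * (np.log(target) - np.log(predicted))))

# Hypothetical scores from two rerankers for three candidate passages.
teachers = np.array([[2.0, 0.0, -1.0], [4.0, 0.0, 1.0]])
loss_matched = distillation_loss(teachers, [3.0, 0.0, 0.0])
loss_mismatched = distillation_loss(teachers, [0.0, 0.0, 3.0])
```

A student whose scores match the averaged teacher scores gets a loss of zero, and any other ranking is penalized in proportion to how far its distribution departs from the target.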

Improved Segmentation of Deep Sulci in Cortical Gray Matter Using a Deep Learning Framework Incorporating Laplace's Equation

Link to paper
The full paper is available here. You can also find the paper on PapersWithCode here.
Abstract
Developing tools for automated cortical segmentation requires topologically correct segmentations. Accurate cortical segmentation is difficult due to image artifacts and the highly convoluted anatomy of the cortex. A novel deep learning-based cortical segmentation method is proposed which incorporates prior knowledge about the geometry of the cortex. A loss function is designed which uses Laplace’s equation to penalize unresolved boundaries between tightly folded sulci....

StraIT: Non-autoregressive Generation with Stratified Image Transformer

Link to paper
The full paper is available here. You can also find the paper on PapersWithCode here.
Abstract
- Proposed a non-autoregressive generative model for high-quality image synthesis
- Leveraged the hierarchical nature of images to encode visual tokens into stratified levels
- Improved NAR generation and outperformed existing DMs and AR methods
- Achieved FID scores of 3.96 at 256×256 resolution on ImageNet without guidance
- Achieved FID of 3.36 and IS of 259....

R-U-SURE? Uncertainty-Aware Code Suggestions By Maximizing Utility Across Random User Intents

Link to paper
The full paper is available here. You can also find the paper on PapersWithCode here.
Abstract
Large language models can predict structured text such as code. These models can make mistakes that users must fix or introduce subtle bugs. R-U-SURE is an approach to build uncertainty-aware suggestions based on a decision-theoretic model. R-U-SURE can be applied to different user interaction patterns without retraining the model.
Paper Content
Introduction
- Large language models can generate natural language and source code
- Used in developer assistance tools and services
- Tendency to guess or “hallucinate” unwanted outputs
- Can slow development and lead to undetected problems
- Automation bias can cause users to miss issues in automated systems
- Some parts of user intent can be predicted more easily than others
- Approach proposed to predict parts of generated programs that may need editing
- Approach uses a pretrained language model to generate a suggestion prototype
- Combinatorial optimization used to insert annotations into the prototype
- Indicators of model uncertainty or user-fillable “holes” can be included
- Utility-driven framework proposed to produce uncertainty-aware suggestions
- Dual decomposition used to optimize utility functions
- Variants of utility functions constructed to incorporate tree structure, account for deletions and insertions, and respond to uncertainty
- Demonstrated across three developer-assistance-inspired tasks
Problem statement
- Problem of providing contextual, uncertainty-aware suggestions to assist users of ML-integrated tools
- Focus on assisting software development
- Not enough information to fully determine user’s intent
- Augment space of possible suggestions to account for uncertainty
- Insert visual markers into code-completion suggestion to draw attention to parts of suggestion user may wish to change
- Formalize intuition using decision-theoretic framework
- Maximize utility of suggestion given context and uncertainty about user’s goal
Approach
- We do not have access to the distribution in Equation (1)....
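
The idea of maximizing utility across random user intents can be illustrated with a toy sample-based approximation. The sketch below assumes a handful of sampled token sequences stand in for the unknown intent distribution, uses a hypothetical per-token utility (+1 for a correct token, -1 for a wrong one, 0 for a hole), and replaces the paper's dual-decomposition optimizer with a simple per-token greedy pass. All names, including `HOLE` and `annotate`, are illustrative only.

```python
HOLE = "?"  # hypothetical marker for a user-fillable hole

def utility(suggestion, intent):
    """Toy utility: +1 per matching token, -1 per mismatch, 0 for a hole.

    ASSUMPTION: suggestion and intent have the same number of tokens.
    """
    score = 0
    for shown, wanted in zip(suggestion, intent):
        if shown == HOLE:
            continue
        score += 1 if shown == wanted else -1
    return score

def expected_utility(suggestion, sampled_intents):
    """Average utility over sampled intents (Monte Carlo estimate)."""
    return sum(utility(suggestion, i) for i in sampled_intents) / len(sampled_intents)

def annotate(prototype, sampled_intents):
    """Turn a token into a hole whenever that raises the estimated utility."""
    result = list(prototype)
    for pos in range(len(result)):
        with_hole = result[:pos] + [HOLE] + result[pos + 1:]
        if expected_utility(with_hole, sampled_intents) > expected_utility(result, sampled_intents):
            result = with_hole
    return result

# Hypothetical samples: the callee name is uncertain, the rest is not.
samples = [
    ["x", "=", "foo", "(", ")"],
    ["x", "=", "bar", "(", ")"],
    ["x", "=", "baz", "(", ")"],
]
annotated = annotate(samples[0], samples)
```

Because this toy utility decomposes per token, the greedy pass is exact here. The structured utilities mentioned in the summary (tree structure, insertions and deletions) do not decompose this way, which is presumably why a combinatorial optimizer is needed.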

Finding the right XAI method -- A Guide for the Evaluation and Ranking of Explainable AI Methods in Climate Science

Link to paper
The full paper is available here. You can also find the paper on PapersWithCode here.
Abstract
- XAI methods can explain predictions of DNNs
- XAI methods have been applied in climate science
- Missing ground truth explanations complicate evaluation and validation of XAI methods
- This work introduces XAI evaluation in the context of climate research
- Evaluation properties assessed: robustness, faithfulness, randomization, complexity, localization
- MLP and CNN trained to predict decade based on temperature maps
- Multiple XAI methods applied and performance quantified for each evaluation property
- XAI methods Integrated Gradients, Layer-wise Relevance Propagation, and InputGradients show considerable robustness, faithfulness, and complexity
- Explanations using input perturbations do not improve robustness and faithfulness
Paper Content
Introduction
- Deep learning is used in climate science for tasks such as nowcasting, monitoring, forecasting, model enhancement, and upsampling of satellite data
- Deep neural networks are considered a black box and lack transparency
- Explainable artificial intelligence (XAI) can validate DNNs and provide researchers with new insights
- XAI can be categorized using three aspects: local/global decision-making, self-explaining models, and model-aware/model-agnostic methods
- Output of XAI can differ in terms of meaning
- XAI evaluation quantitatively assesses the reliability of an explanation
- XAI evaluation properties include robustness, complexity, localization, randomization, and faithfulness
- Workflow includes training a model, applying XAI methods, and using XAI evaluation to compare and rank methods
- Evaluation metrics are assessed for compatibility with climate data properties
- Guideline established for using XAI evaluation to choose an optimal explanation
Data and methods
Data
- Data is simulated by the general climate model, CESM1
- Data consists of 40 ensemble members
- Data is global 2-m air temperature maps from 1920 to 2080
- Data is processed by computing annual averages and applying a bilinear interpolation
- Data is standardized by removing the multi-year 1920-2080 mean and dividing by the corresponding standard deviation
Networks
- MLP and CNN are trained to solve a fuzzy classification problem
- MLP takes flattened temperature maps as input
- MLP assigns each map to one of 20 different classes
- Regression is used to predict the year of the input
- MLP and CNN have comparable number of parameters
- Datasets include a training and test set, 80% of data is split into training and validation set
Explainable artificial intelligence (XAI)
- Model-aware explanation methods in climate science are presented
- Model-agnostic explanation methods are not considered due to computational time
- Gradient/Saliency explains network decision by computing first partial derivative of output with respect to input
- InputGradients extends information content towards input image
- Integrated Gradients introduces baseline datapoint and computes explanation based on difference to baseline
- Layer-wise Relevance Propagation computes relevance for each input feature by feeding network’s prediction backwards
- SmoothGrad, NoiseGrad, and FusionGrad perturb input features and/or network weights to account for uncertainties
Evaluation techniques
- XAI research has developed metrics to assess different properties of explanation methods
- Five different evaluation properties have been analyzed, based on a classification task from Labe and Barnes [2021]
Faithfulness
- Table 5 refers to a perturbation function called ‘Indices’
- ‘Indices’ refers to the replacement of the highest value pixels in the explanation
- ‘Linear’ refers to noisy linear imputation
Randomisation
- Calculations for MPT score use ‘bottom up’ approach from output layer to input layer
- Pearson correlation used as similarity function for both metrics
- Top-k considers 10% most relevant pixels of all pixels in temperature map
- Hyperparameters of XAI methods and evaluation metrics reported in Tables 4 and 5 respectively
- Maximum and minimum values of temperature maps in dataset denoted as x_max and x_min
Localisation
- Quality of an explanation is measured based on agreement with user-defined region of interest
- Localization metrics assume that ROI should be mainly responsible for network decision
- Top-k-pixel and relevance-rank-accuracy are used to measure localization
- Complexity assesses how evidence values are distributed across explanation map
Complexity
- Complexity is a measure of conciseness
- Explanations should consist of few strong features
- Complexity and sparseness are used as metric functions
- Low entropy is desirable
Network predictions, explanations and motivating example
- Evaluated network performance and discussed application of explanation methods for both network architectures
- Fixed hyperparameters and fuzzy classification setup for MLP and CNN during training
- MLP and CNN have similar performance compared to primary publication
- Classification accuracy of both networks agrees within error bounds
- Calculated explanation maps for all temperature maps correctly predicted
- Applied XAI methods to explain predictions of both MLP and CNN
- Different XAI methods provide different relevances
Assessment of XAI metrics
- Evaluated XAI evaluation properties for classification task on MLP
- Analyzed two representative metrics for each property
- Based analysis on three criteria: coherence, score stability, and information value
- Provided artificial random explanation baseline for each metric
- Robustness metrics pass random baseline test
- LRP-α-β has highest robustness scores
- FusionGrad and NoiseGrad have lowest robustness scores
- AS and LLE scores do not align
- FC passes random baseline test, ROAD scores of NoiseGrad and FusionGrad overlap with random baseline
- MPT and RL metrics evaluated, random baseline has lowest scores
- Complexity and Sparseness metrics evaluated, LRP-α-β has highest complexity score, InputGradients and LRP-z have highest sparseness scores
- Localization metrics evaluated, FusionGrad has highest score, all other explanation methods have lower but similar scores
Network-based comparison
- MLP and CNN networks compared using one metric per property
- Challenges in defining meaningful ROI for localization and defining localization as an explanation property
- Table 3 displays results for both networks across all properties
- Similarities in ranking across every category, but differences in localization and complexity due to structural differences in learned patterns
- Input contribution methods (Integrated Gradients, InputGradients, LRP) best in faithfulness, robustness, and complexity
- Gradient-based methods (Gradient, SmoothGrad, NoiseGrad, FusionGrad) best in randomization
- LRP-α-β and LRP-composite low rankings in faithfulness category
- Explanation-enhancing procedures (SmoothGrad, Integrated Gradients, FusionGrad, NoiseGrad) no improvement of explanation performance
- Spider plot (Table 3 and Figure 8) used to determine best-performing XAI method
Choosing an XAI method
- XAI evaluation can be used to select an appropriate XAI method....
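
Three of the steps mentioned above are simple enough to sketch in NumPy: standardizing the temperature maps, a top-k localization score, and an entropy-based complexity score. This is a rough illustration under stated assumptions, not the paper's implementation: the time axis is assumed to be axis 0, the localization score is taken as the share of the top 10% most relevant pixels that fall inside the region of interest, and complexity is taken as the Shannon entropy of normalized absolute attributions. All function names and data are hypothetical.

```python
import numpy as np

def standardize(maps):
    """Subtract the multi-year mean and divide by the standard deviation.

    ASSUMPTION: maps has shape (years, lat, lon) and statistics are per pixel.
    """
    return (maps - maps.mean(axis=0)) / maps.std(axis=0)

def top_k_localization(explanation, roi_mask, k_fraction=0.1):
    """Share of the top-k most relevant pixels that lie inside the ROI."""
    relevance = np.abs(explanation).ravel()
    k = max(1, int(round(k_fraction * relevance.size)))
    top_indices = np.argsort(relevance)[-k:]
    return float(roi_mask.ravel()[top_indices].mean())

def complexity_entropy(explanation):
    """Shannon entropy of normalized absolute attributions (lower = more concise)."""
    p = np.abs(explanation).ravel()
    p = p / p.sum()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

# Hypothetical data: 20 yearly maps on a 4 x 6 grid, and a 10 x 10 explanation
# whose relevance is concentrated in the first row, which is also the ROI.
rng = np.random.default_rng(0)
z = standardize(rng.normal(loc=280.0, scale=5.0, size=(20, 4, 6)))
explanation = np.zeros((10, 10))
explanation[0, :] = 1.0
roi = np.zeros((10, 10), dtype=bool)
roi[0, :] = True
localization = top_k_localization(explanation, roi)
concise = complexity_entropy(explanation)
diffuse = complexity_entropy(np.ones((10, 10)))
```

The concentrated explanation scores a lower entropy than the uniform one, which matches the note that low entropy is desirable for the complexity property.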

An Information-Theoretic Perspective on Variance-Invariance-Covariance Regularization

Link to paper
The full paper is available here. You can also find the paper on PapersWithCode here.
Abstract
- Provides an information-theoretic perspective on Variance-Invariance-Covariance Regularization (VICReg)
- Demonstrates how information-theoretic quantities can be obtained for deterministic networks
- Relates VICReg objective to mutual information maximization
- Derives a generalization bound for VICReg
- Presents new self-supervised learning methods derived from a mutual information maximization objective
Paper Content
Introduction
- Self-Supervised Learning (SSL) methods learn representations by optimizing a surrogate objective between inputs and self-defined signals....

FAIR-Ensemble: When Fairness Naturally Emerges From Deep Ensembling

Link to paper
The full paper is available here. You can also find the paper on PapersWithCode here.
Abstract
Ensembling independent deep neural networks (DNNs) can improve top-line metrics and outperform larger single models. Ensembling can improve subgroup performances, such as worst-k and minority group performance. Gains in performance from ensembling for the minority group continue for longer than for the majority group. Ensembling can be a powerful tool for alleviating disparate impact from DNN classifiers....
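
To make the subgroup claim concrete, the following sketch combines independently trained classifiers and reports accuracy per group. It assumes the ensemble averages class probabilities across members, which the summary does not specify, and it uses made-up probabilities, labels, and group ids. The function names are hypothetical.

```python
import numpy as np

def ensemble_predict(member_probs):
    """Average class probabilities over members, then take the argmax.

    member_probs: array of shape (n_models, n_samples, n_classes).
    ASSUMPTION: members are combined by probability averaging.
    """
    return member_probs.mean(axis=0).argmax(axis=1)

def group_accuracies(predictions, labels, groups):
    """Accuracy computed separately for each group id."""
    return {
        int(g): float((predictions[groups == g] == labels[groups == g]).mean())
        for g in np.unique(groups)
    }

# Hypothetical outputs of three models on four samples with two classes.
member_probs = np.array([
    [[0.9, 0.1], [0.2, 0.8], [0.4, 0.6], [0.3, 0.7]],
    [[0.8, 0.2], [0.6, 0.4], [0.9, 0.1], [0.4, 0.6]],
    [[0.7, 0.3], [0.1, 0.9], [0.7, 0.3], [0.6, 0.4]],
])
labels = np.array([0, 1, 0, 1])
groups = np.array([0, 0, 1, 1])  # group 1 plays the role of the minority group

ensemble_scores = group_accuracies(ensemble_predict(member_probs), labels, groups)
single_scores = group_accuracies(member_probs[0].argmax(axis=1), labels, groups)
worst_group_ensemble = min(ensemble_scores.values())
worst_group_single = min(single_scores.values())
```

In this constructed example each member makes one mistake on a different sample, so averaging corrects all of them and the worst-group accuracy rises. Real gains depend on how correlated the members' errors are.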