Causal Lifting and Link Prediction

Abstract: State-of-the-art causal models assume inherent node factors that govern link formation. Link formation can be path-dependent, meaning the outcome of link interventions depends on existing links. Existing causal methods are impractical in these scenarios. This work develops the first causal model capable of dealing with path dependencies in link prediction.

Paper Content

Introduction: Charles Spearman published The Abilities of Man in 1927, which described mathematical tools to uncover latent common factors of intelligence....

February 2, 2023 · 1832 words · Leonardo Cotta, Beatrice Bevilacqua, Nesreen Ahmed, Bruno Ribeiro
Multimodal Chain-of-Thought Reasoning in Language Models

Abstract: LLMs have shown good performance on complex reasoning by using chain-of-thought (CoT) prompting. Existing CoT studies are mostly limited to the language modality. A possible solution is to fine-tune small language models by fusing vision and language features. The challenge is that language models tend to generate wrong reasoning chains. Multimodal-CoT incorporates vision features in a decoupled training framework. Multimodal-CoT outperforms the previous state-of-the-art LLM by 16% on the ScienceQA benchmark.

Paper Content

Introduction: Knowledge acquisition is strengthened by modeling diverse data modalities. LLMs generate intermediate reasoning steps before inferring the answer (CoT reasoning). Existing studies related to CoT reasoning are largely isolated in the language modality. Multimodal-CoT decomposes multi-step problems into intermediate reasoning steps and then infers the answer. There are two ways to perform Multimodal-CoT: prompting LLMs and fine-tuning small models. Multimodal-CoT is proposed to incorporate vision features in a decoupled training framework. It achieves new state-of-the-art performance on the ScienceQA benchmark, outperforming the accuracy of GPT-3....

February 2, 2023 · 1070 words · Zhuosheng Zhang, Aston Zhang, Mu Li, Hai Zhao, George Karypis and 1 other
Collaborating with language models for embodied reasoning

Abstract: Reinforcement learning (RL) agents can solve difficult tasks but require a lot of training data and struggle to generalize. Large-scale language models (LSLMs) have strong reasoning ability and can adapt to new tasks, but cannot interact with the environment. This work combines the complementary abilities of RL and LSLMs into a single system with three parts: Planner, Actor, and Reporter....

February 1, 2023 · 673 words · Ishita Dasgupta, Christine Kaeser-Chen, Kenneth Marino, Arun Ahuja, Sheila Babayan and 2 others
Improving Few-Shot Generalization by Exploring and Exploiting Auxiliary Data

Abstract: Few-shot learning involves learning an effective model from only a few labeled datapoints. Few-shot learning with auxiliary data (FLAD) is a training paradigm that uses auxiliary data to improve generalization. Automated sampling strategies for auxiliary data are related to the explore-exploit dilemma. Two algorithms are proposed and compared with methods that either explore or exploit. Using the proposed algorithms yields a 9% absolute improvement....

February 1, 2023 · 952 words · Alon Albalak, Colin Raffel, William Yang Wang
Continuous U-Net: Faster, Greater and Noiseless

Abstract: Image segmentation is a fundamental task in image analysis and clinical practice. Current state-of-the-art techniques are based on U-shaped encoder-decoder networks with skip connections (U-Net). U-Net has limitations such as hard coding of the receptive field size, not accounting for inherent noise in the data, problems associated with discrete layers, and no theoretical underpinning....

February 1, 2023 · 924 words · Chun-Wun Cheng, Christina Runkel, Lihao Liu, Raymond H Chan, Carola-Bibiane Schönlieb and 1 other
Synthetic Prompting: Generating Chain-of-Thought Demonstrations for Large Language Models

Abstract: Large language models can use chain-of-thought prompting to find answers. Creating many prompts by hand is costly. Synthetic prompting uses a few handcrafted examples to generate more examples. Synthetic prompting alternates between backward and forward processes. Evaluations show that synthetic prompting outperforms existing prompting techniques.

Paper Content

Introduction: Few-shot demonstrations can enable LLMs to perform tasks without fine-tuning. Chain-of-thought prompting can further improve LLMs' performance. The quality of demonstrations is important for complex reasoning tasks. Synthetic prompting augments a limited set of demonstrations with self-synthesized examples. An in-cluster complexity-based scheme is proposed to select diverse and informative demonstrations. Synthetic prompting achieves up to 15....

February 1, 2023 · 625 words · Zhihong Shao, Yeyun Gong, Yelong Shen, Minlie Huang, Nan Duan and 1 other
Zero Shot Transfer of Legal Judgement Prediction as Article-aware Entailment for the European Court of Human Rights

Abstract: Legal Judgment Prediction (LJP) from text on European Court of Human Rights cases is cast as an entailment task. The case outcome is classified from a combined input of case facts and convention articles. The model is evaluated on its ability to generalize to zero-shot settings. Domain adaptation methods are applied to improve zero-shot transfer performance....

February 1, 2023 · 1154 words · Santosh T. Y. S. S, Oana Ichim, Matthias Grabmair
Automatically Marginalized MCMC in Probabilistic Programming

Abstract: Hamiltonian Monte Carlo (HMC) is an algorithm to sample latent variables from Bayesian models. Probabilistic programming languages (PPLs) allow users to focus on modeling instead of writing inference algorithms. HMC can be difficult to use for some models, requiring tricks like reparameterization. Marginalization can simplify models and improve sampling from hierarchical models.

Paper Content

Introduction: PPLs automate Bayesian reasoning. The user specifies a probabilistic model and provides data. PPLs have had tremendous impact in the applied sciences. PPLs vary in their distributions and inference approach. The focus is on generative PPLs and programs that correspond to a graphical model. A model can be reformulated so that some latent variables are generated after all observed variables. Reducing the number of variables for MCMC can lead to performance gains. The goal is to automatically marginalize variables in a user-specified probabilistic program for inference with HMC.

Motivating examples: The Eight Schools model is an example of a hierarchical model to study the effect of coaching on SAT performance in eight schools. The model is mathematically represented by µ ∼ N(0, 5²), τ ∼ HalfCauchy(5). There is another model with the same joint density but a different causal interpretation. HMC inference can be sped up by marginalizing x_{1:8} and running HMC on the reduced model. Conjugacy is a property of distribution families that allows for the transformation of the model. In the Eight Schools model, x_i is conjugate to y_i given µ and τ. Hierarchical linear regression is a more complex model that requires user effort to reformulate.

Automatically marginalized MCMC: Our method will construct a graphical model and manipulate it to reduce the number of variables....
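The marginalization step in the Eight Schools example can be checked numerically. The sketch below is not the paper's method or code. It assumes the standard Eight Schools likelihood, x_i ∼ N(µ, τ²) and y_i ∼ N(x_i, σ_i²), which the excerpt above does not spell out, and it uses made-up values for µ, τ and σ. Under that assumption, integrating out x_i gives y_i ∼ N(µ, τ² + σ_i²), and conjugacy gives x_i | y_i in closed form. All function names are hypothetical.

```python
import random
import statistics


def sample_y_hierarchical(mu, tau, sigma, n, rng):
    """Draw y by sampling the latent x ~ N(mu, tau^2) first, then y ~ N(x, sigma^2)."""
    return [rng.gauss(rng.gauss(mu, tau), sigma) for _ in range(n)]


def sample_y_marginal(mu, tau, sigma, n, rng):
    """Draw y directly from N(mu, tau^2 + sigma^2), i.e. with x integrated out."""
    marginal_sd = (tau ** 2 + sigma ** 2) ** 0.5
    return [rng.gauss(mu, marginal_sd) for _ in range(n)]


def posterior_x_given_y(y, mu, tau, sigma):
    """Conjugate recovery step: mean and variance of x given y, mu and tau."""
    precision = 1.0 / tau ** 2 + 1.0 / sigma ** 2
    mean = (mu / tau ** 2 + y / sigma ** 2) / precision
    return mean, 1.0 / precision


rng = random.Random(0)
mu, tau, sigma = 1.0, 2.0, 3.0  # illustrative values only, not taken from the paper

hier = sample_y_hierarchical(mu, tau, sigma, 100_000, rng)
marg = sample_y_marginal(mu, tau, sigma, 100_000, rng)

# Both sample variances should be close to tau^2 + sigma^2 = 13.
print("hierarchical:", statistics.mean(hier), statistics.variance(hier))
print("marginal:    ", statistics.mean(marg), statistics.variance(marg))
print("x | y=1.0:   ", posterior_x_given_y(1.0, mu, tau, sigma))
```

A sampler such as HMC would then only need to explore µ and τ, with each x_i drawn afterwards from the closed-form conditional.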

February 1, 2023 · 1313 words · Jinlin Lai, Javier Burroni, Hui Guan, Daniel Sheldon
Width and Depth Limits Commute in Residual Networks

Abstract: Taking the width and depth of a deep neural network to infinity with a specific scaling results in the same covariance structure no matter how the limit is taken. This explains why the standard infinite-width-then-depth approach works even for networks with depth and width of the same order. Pre-activations have Gaussian distributions, which has applications in Bayesian deep learning....

February 1, 2023 · 1141 words · Soufiane Hayou, Greg Yang
Emerging Trends in Droplet Microfluidic Screens for Biotechnology

Abstract: Droplet microfluidic screens are used in biotechnology. Interactions of many different biological entities can be screened. Recent breakthroughs have enabled new scales of bioanalysis and biotechnological product design. Recent methodology advances expand droplet-based screens to new environments.

Paper Content

Current applications for droplet screens in biotechnology: Droplet microfluidic screens involve encapsulating cells and reagents into tiny droplets and incubating them. Droplet sorting is used to enrich target droplets. Follow-up analysis can include DNA sequencing, imaging, cultivation, etc....

January 31, 2023 · 511 words · Carlos Vidal-Céspedes, Tobias Wenzel