arxiv-summary: AI-summarized AI papers

Effective Brain Connectome: the whole-brain effective connectivity from neural perturbational inference

Link to paper: The full paper is available here. You can also find the paper on PapersWithCode here.

Abstract: The effective brain connectome (EBC) is important for understanding information processing in the brain. A comprehensive mapping of the human EBC has not been achieved. A data-driven computational framework called Neural Perturbational Inference (NPI) is used to derive the human EBC. The NPI-inferred human EBC reveals log-normally distributed strengths of both excitatory and inhibitory connections....

Integrated information theory (IIT) 4.0: Formulating the properties of phenomenal existence in physical terms

Abstract
- Presents Integrated Information Theory (IIT) 4.0
- Aims to explain properties of experience in physical terms
- Identifies essential properties of experience and expresses them mathematically
- Can be applied to any system to determine if it is conscious
- Makes testable predictions and allows inferences and extrapolations
- Includes more accurate translation of axioms, a measure of intrinsic information, and an assessment of causal relations
- Unfolds a system’s cause-effect power to explain the quality of experience

Paper Content

Introduction
- Consciousness is subjective and IIT aims to explain it in physical terms
- IIT starts with the existence of an experience, which is immediate and irrefutable
- IIT has five axioms of phenomenal existence: intrinsicality, information, integration, exclusion, and composition
- Physical existence is assessed operationally from within consciousness
- IIT proposes a fundamental explanatory identity: an experience is identical to the cause-effect structure unfolded from a maximal substrate
- IIT has a mathematical framework to evaluate self-consistency and make predictions
- IIT has been refined and extended over time
- IIT has a book and wiki to explain motivation, axioms, postulates, and assumptions

Postulates of physical existence
- Realism: Assumption of a world that persists independently of one’s experience
- Physicalism: Something must have the power to take and make a difference in a reliable way to be granted physical existence
- Operational Reductionism: Establishing what exists in physical terms by starting from the smallest units
- Intrinsicality: Substrate of consciousness must have intrinsic cause-effect power
- Information: Substrate of consciousness must have specific cause-effect power
- Integration: Substrate of consciousness must have unitary cause-effect power
- Exclusion: Substrate of consciousness must have definite cause-effect power
- Composition: Substrate of consciousness must have structured cause-effect power

The explanatory identity between experiences and φ-structures
- IIT proposes an explanatory identity that states that the physical properties of a complex can explain the properties of an experience....

MAUVE Scores for Generative Models: Theory and Practice

Abstract: Generative AI can generate text and images that look like they were made by humans. MAUVE is a family of comparison measures that quantify how close generated data is to real data. There are four approaches to estimate these scores. MAUVE can be used to measure the gap between human-written text and modern neural language models....
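
As a rough illustration of what such a comparison measure computes, the sketch below follows the divergence-frontier idea behind MAUVE on two small discrete distributions. It is a toy under stated assumptions, not one of the paper's estimators: the function names `kl` and `mauve_discrete`, the scaling constant `c=5.0`, and the grid size are hypothetical choices, and practical use works on quantized embeddings of text or images rather than hand-written histograms.

```python
import numpy as np

def kl(p, q):
    """KL(p || q) for discrete distributions, treating 0 * log(0 / q) as 0."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def mauve_discrete(p, q, c=5.0, num_points=1000):
    """Area under a divergence frontier between p and q (hypothetical helper).

    For each mixture r = lam * p + (1 - lam) * q, one frontier point is
    (exp(-c * KL(q || r)), exp(-c * KL(p || r))). The score is the area
    under the resulting curve, closed with the corners (1, 0) and (0, 1).
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Stay strictly inside (0, 1) so every mixture covers both supports.
    lambdas = np.linspace(1e-6, 1 - 1e-6, num_points)
    xs, ys = [1.0], [0.0]
    for lam in lambdas:
        r = lam * p + (1 - lam) * q
        xs.append(np.exp(-c * kl(q, r)))
        ys.append(np.exp(-c * kl(p, r)))
    xs.append(0.0)
    ys.append(1.0)
    xs, ys = np.array(xs), np.array(ys)
    # x is non-increasing along the curve, so widths are xs[:-1] - xs[1:].
    return float(np.sum(0.5 * (ys[1:] + ys[:-1]) * (xs[:-1] - xs[1:])))

same = mauve_discrete([0.5, 0.5], [0.5, 0.5])      # identical distributions
close = mauve_discrete([0.5, 0.5], [0.9, 0.1])     # overlapping distributions
disjoint = mauve_discrete([1.0, 0.0], [0.0, 1.0])  # no shared support
```

In this sketch, identical distributions score close to 1 and distributions with no shared support score close to 0, which matches the intuition of a "closeness" measure between generated and real data.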

Learning 3D Human Pose Estimation from Dozens of Datasets using a Geometry-Aware Autoencoder to Bridge Between Skeleton Formats

Abstract: Deep learning-based 3D human pose estimation requires large amounts of labeled data. Different datasets have different skeleton formats, making it difficult to combine them. Separate output heads for different skeletons result in inconsistent depth estimates. A novel affine-combining autoencoder (ACAE) method is proposed to reduce the number of landmarks. 28 3D human pose datasets are used to supervise one model, which outperforms prior work....
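
The "affine-combining" idea can be illustrated with a small sketch: if each latent landmark is a weighted combination of input joints whose weights sum to 1, then translating the skeleton translates the latent landmarks by the same amount. The code below only demonstrates this property with random, untrained weights. The helper name `affine_weights`, the joint and landmark counts, and the array shapes are hypothetical, and the paper's actual architecture and training losses are not reproduced here.

```python
import numpy as np

def affine_weights(rng, n_out, n_in):
    """Random weight matrix whose rows each sum to 1 (affine combinations)."""
    w = rng.normal(size=(n_out, n_in))
    # Spread each row's deficit evenly so the row sums become exactly 1.
    return w + (1.0 - w.sum(axis=1, keepdims=True)) / n_in

rng = np.random.default_rng(0)
n_joints, n_latent = 17, 8                          # hypothetical sizes
W_enc = affine_weights(rng, n_latent, n_joints)     # joints -> latent landmarks
W_dec = affine_weights(rng, n_joints, n_latent)     # latent landmarks -> joints

pose = rng.normal(size=(n_joints, 3))               # one 3D skeleton
shift = np.array([0.5, -1.0, 2.0])                  # a global translation

latent = W_enc @ pose
latent_of_shifted = W_enc @ (pose + shift)
reconstruction = W_dec @ latent

# Because every row of W_enc sums to 1, the shift passes through unchanged.
print(np.allclose(latent_of_shifted, latent + shift))
```

Rotations commute with the encoder as well, since the encoder is a linear map applied to the joint axis only; the sum-to-one constraint is what additionally makes translations carry over.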

GPT Takes the Bar Exam

Abstract
- Nearly all US jurisdictions require a professional license exam (the Bar Exam) to practice law
- To sit for the exam, applicants must complete 7 years of post-secondary education, including 3 years at an accredited law school
- Despite significant investment of time and capital, 1 in 5 test-takers still fail on their first try
- OpenAI’s GPT-3....

Learning One Abstract Bit at a Time Through Self-Invented Experiments Encoded as Neural Networks

Abstract
- Finding answers to given questions is important in science
- Coming up with good questions is important in science
- Artificial scientists can learn to answer given questions and invent new questions
- Artificial scientists are biased towards simpler, least costly experiments with surprising outcomes
- An empirical analysis of automatic generation of interesting experiments is presented

Paper Content

Introduction & previous work
- Two important things in science: finding answers to given questions and coming up with good questions
- Artificial systems can be used to implement creative part of science
- Artificial scientists equipped with artificial curiosity and creativity have been published for 3 decades
- Artificial Q&A system designed to invent and answer questions was the intrinsic motivation-based adversarial system from 1990
- Two artificial NNs: controller C and world model M
- M minimizes its error, C tries to find sequences of output actions that maximize the error of M
- Artificial Q&A system from 1997 can ask arbitrary abstract questions with computable answers
- Reward-maximizing C tries to come up with questions whose answers surprise the other
- Artificial scientists maximize the sum of external rewards and intrinsic rewards
- POWERPLAY framework (2011) enumerates the set of all formalisable questions
- One Big Net For Everything offers a simplified NN version of POWERPLAY
- Empirical investigation of two settings: generation of experiments driven by model prediction error and approach where C generates pure thought experiments in form of weight matrices of RNNs

Self-invented experiments encoded as neural networks
- System allows for design of computational experiments with binary yes/no outcomes
- Experiments can run for multiple time steps
- Controller and model can be implemented as LSTMs
- Controller has START unit to propose experiments
- Experiment has HALT and RESULT units
- Experiment outcome is 1 if RESULT unit > 0....
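
The adversarial relationship between the controller C and the world model M summarized above can be sketched in a few lines. This is a minimal toy under loud assumptions, not the paper's system: both networks are replaced by lookup tables instead of LSTMs, each "experiment" is just an index with a fixed binary outcome, and the names `run_curiosity`, `lr`, and `eps` are hypothetical. M is trained to reduce its prediction error, while C is rewarded with that same error and so prefers experiments whose outcomes still surprise M.

```python
import numpy as np

def bce(p, y):
    """Binary cross-entropy of predicted probability p against outcome y."""
    p = np.clip(p, 1e-9, 1 - 1e-9)
    return float(-(y * np.log(p) + (1 - y) * np.log(1 - p)))

def run_curiosity(outcomes, steps=200, lr=0.5, eps=0.1, seed=0):
    rng = np.random.default_rng(seed)
    n = len(outcomes)
    m_logits = np.zeros(n)   # M: one logit per experiment (tabular stand-in)
    c_values = np.ones(n)    # C: optimistic estimate of intrinsic reward
    rewards = []
    for _ in range(steps):
        # C proposes an experiment, mostly the one it expects to surprise M.
        if rng.random() < eps:
            a = int(rng.integers(n))
        else:
            a = int(np.argmax(c_values))
        p = 1.0 / (1.0 + np.exp(-m_logits[a]))  # M's predicted P(outcome = 1)
        y = outcomes[a]                          # binary yes/no outcome
        r = bce(p, y)                            # C's intrinsic reward = M's error
        m_logits[a] += lr * (y - p)              # gradient step reducing M's error
        c_values[a] += 0.5 * (r - c_values[a])   # C tracks how surprising a was
        rewards.append(r)
    return m_logits, rewards

outcomes = np.array([1, 0, 1, 1])
m_logits, rewards = run_curiosity(outcomes)
early, late = np.mean(rewards[:20]), np.mean(rewards[-20:])
```

In this toy the intrinsic reward shrinks over time because the outcomes are fixed and M eventually predicts all of them; a richer space of experiments is what keeps such a system generating new questions.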

'Real Attackers Don't Compute Gradients': Bridging the Gap Between Adversarial ML Research and Practice

Abstract: Research on adversarial machine learning has increased in recent years. Attackers use simple tactics to subvert ML-driven systems. This paper aims to bridge the gap between researchers and practitioners.

Paper Content

I. Introduction
- Protecting ML models is not a leading security concern for practitioners
- Attackers can break ML systems by guessing or using coarse heuristics
- Defensive recommendations are broader than adversarial training
- ML models are often not directly observable by attackers
- Researchers should not rely on “security by obscurity”
- Thousands of papers have showcased successful security violations of ML models
- Gap between adversarial ML research and practice is real
- Four positions to close the gap between research and practice

Arxiv:2212....

What Estimators Are Unbiased For Linear Models?

Abstract: Hansen [2022] proved that the Gauss-Markov theorem holds without the requirement that competing estimators are linear in the vector of outcomes. Hansen [2022] added statements in the latest version with new conditions under which nonlinear unbiased estimators exist. The paper studies a fundamental problem: what estimators are unbiased for a given class of linear models?...

Cramming: Training a Language Model on a Single GPU in One Day

Abstract
- Recent trends in language modeling focus on increasing performance through scaling
- Training language models is out of reach for most researchers and practitioners
- Investigating how much can be achieved with a single GPU in one day
- Re-analyzing components of the pretraining pipeline and providing a modified pipeline
- Investigating why scaling down is hard and which modifications improve performance
- Performance follows scaling laws observed in large-compute settings
- Categorizing recent improvements to training and architecture and discussing their merit

Paper Content

Scaling up and scaling down
- Large-scale training of machine learning models with transformer architectures has improved natural language processing
- Performance of these systems increases when number of model parameters and amount of data grow
- Power of scale has created an environment where few researchers or practitioners feel capable of training a language model
- BERT model requires significant amount of computation to train
- Competition for largest language model has become a focal point for industrial labs
- Goal is to investigate how to best scale down language model training and what trade-offs emerge
- Scaled-down model pretraining opens up a host of further academic investigations

Tying our hands behind our back: a setup with limited compute
- Training a transformer-based language model from scratch
- No pre-trained models allowed
- Raw text can be included for training
- Pre-processing of raw data is exempted from compute budget
- Training on single GPU for 24 hours
- Downstream performance evaluated on GLUE

Related work on efficient transformers
- Training BERT requires varying hardware and software setups....

Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP

Abstract: Retrieval-augmented in-context learning is a powerful approach for knowledge-intensive tasks. Existing work combines language models and retrieval models in a “retrieve-then-read” pipeline. Demonstrate-Search-Predict (DSP) is a framework that passes natural language texts between an LM and an RM. DSP can express high-level programs that break down problems into small transformations. Novel DSP programs have been written for answering questions in open-domain, multi-hop, and conversational settings....
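
For orientation, the basic "retrieve-then-read" pipeline that DSP generalizes might be sketched as below. Everything here is a hypothetical stand-in rather than the DSP framework's API: `retrieve` ranks passages by simple word overlap instead of using a real retrieval model, `build_prompt` is an assumed prompt layout, and `lm` is any callable from a prompt string to a completion string (an echo function is used so the example runs without a model).

```python
import re

def tokens(text):
    """Lowercased word tokens with punctuation stripped."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query, passages, k=1):
    """Toy retrieval model: rank passages by word overlap with the query."""
    q = tokens(query)
    ranked = sorted(passages, key=lambda p: len(q & tokens(p)), reverse=True)
    return ranked[:k]

def build_prompt(demos, context, question):
    """Demonstrate: prepend worked examples. Then add context and the question."""
    lines = []
    for demo_q, demo_a in demos:
        lines += [f"Q: {demo_q}", f"A: {demo_a}"]
    lines += [f"Context: {passage}" for passage in context]
    lines += [f"Q: {question}", "A:"]
    return "\n".join(lines)

def retrieve_then_read(question, passages, demos, lm, k=1):
    context = retrieve(question, passages, k)        # search
    prompt = build_prompt(demos, context, question)  # demonstrate
    return lm(prompt)                                # predict

passages = [
    "Paris is the capital of France.",
    "Hamlet was written by William Shakespeare.",
]
demos = [("What is the capital of France?", "Paris")]

# An echo "LM" returns the prompt itself, which makes the pipeline inspectable.
prompt_seen = retrieve_then_read("Who wrote Hamlet?", passages, demos, lm=lambda p: p)
```

A multi-hop variant would call `retrieve` again with a query produced by the LM, which is the kind of composition the abstract describes as passing natural language texts between an LM and an RM.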