arxiv-summary: AI-summarized AI papers

Adding Conditional Control to Text-to-Image Diffusion Models

Link to paper: The full paper is available here. You can also find the paper on PapersWithCode here. Abstract: ControlNet is a neural network structure used to control pretrained large diffusion models. ControlNet can be trained on small datasets (fewer than 50k samples), and training is as fast as fine-tuning a diffusion model. ControlNet can be trained on personal devices or powerful computation clusters. ControlNet can be used to enable conditional inputs like edge maps, segmentation maps, keypoints, etc....

Thermodynamic AI and the fluctuation frontier

Abstract: AI algorithms are inspired by physics and use stochastic fluctuations. Thermodynamic AI is a mathematical framework that unifies these algorithms. Thermodynamic AI hardware uses stochastic fluctuations as a computational resource. Thermodynamic AI hardware is a novel form of computing using s-bits and s-modes. Paper Content: II. Stochasticity as a computing resource. "Fluctuation" is used to describe deviation from an average value. Stochasticity is a precise mathematical description. Stochasticity is a resource that can be used to accomplish tasks. Randomness is a resource used in cryptography and computing. Stochasticity and randomness can be interconverted. Stochasticity can be used in generative modeling, optimization algorithms and financial asset integration. III....

Languages are Rewards: Hindsight Finetuning using Human Feedback

Abstract: Learning from human preferences is important for language models to be helpful and useful. Existing works focus on supervised finetuning of pretrained models based on preferred data. Supervised finetuning cannot learn from negative ratings, making it data inefficient. Hindsight Finetuning is proposed to make language models learn from diverse human feedback. Hindsight Finetuning is motivated by how humans learn from hindsight experience....

A Modified CTGAN-Plus-Features Based Method for Optimal Asset Allocation

Abstract: Proposes a new approach to portfolio optimization that combines synthetic data generation and a CVaR constraint. Formulates the portfolio optimization problem as an asset allocation problem. Uses a Modified CTGAN algorithm to generate synthetic return scenarios. Relies on several points along the U.S. Treasury yield curve for contextual information. Demonstrates the merits of the approach with an example based on ten asset classes. The synthetic generation process captures key characteristics of the original data. The optimization scheme results in portfolios with satisfactory out-of-sample performance. It outperforms the conventional equal-weights (1/N) asset allocation strategy and other optimization formulations based on historical data only.

Paper Content. Motivation and previous work: The portfolio selection problem is one of the oldest problems in applied finance. Before 1952, the issue was tackled with gut feeling, intuition, and common sense. Harry Markowitz published a paper in 1952 that showed the portfolio selection problem was an optimization problem. The key ideas behind Markowitz's framework (diversification, risk/return tradeoff, efficient frontier) have survived well. Implementing Markowitz's approach has been problematic due to the difficulty of estimating the correlation-of-returns matrix and the use of standard deviation to describe risk. Most research efforts have been aimed at devising practical strategies to implement the mean-variance (MV) formulation. John Bogle introduced the concept of passive investment in 1975, which shifted the emphasis from asset selection to asset allocation. The Conditional-Value-at-Risk (CVaR) has become the risk metric of choice. Generative Adversarial Networks (GANs) have been used to generate realistic synthetic data. The joint behavior of a group of assets can fluctuate between discrete states, known as market regimes. Features have been incorporated into the formulation of optimization problems. The goal is to propose a method to tackle the portfolio selection problem based on an asset allocation approach.

Problem formulation: The investor seeks to maximize return by selecting an appropriate exposure to each asset class while keeping overall portfolio risk below a predefined tolerance level. The risk metric chosen is the Conditional-Value-at-Risk (CVaR). CVaR is chosen because it captures the risk of losses better than the standard deviation of returns. CVaR is convex and coherent. The problem can be written in a discretized and linear fashion. Data sampled from the relevant probability distribution of returns are used in combination with a discrete probability density function. The weights π can be modified to adjust the formulation for the case with features.

Synthetic data generation: Generating random samples from a given probability density function is a straightforward task in principle....
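
As an illustration of the discretized CVaR described under "Problem formulation", here is a minimal Python sketch. It assumes the standard Rockafellar-Uryasev form, in which CVaR at confidence level alpha is the minimum over c of c + (1 / (1 - alpha)) * sum_i π_i * max(0, loss_i - c). The summary does not spell out the paper's exact formulation, so the function names, the sign convention (loss = negative return), and the toy numbers are all assumptions made for this example.

```python
def portfolio_losses(scenarios, weights):
    """Loss of the portfolio in each scenario (loss = negative return).

    scenarios: list of rows, one per scenario, holding per-asset returns.
    weights:   portfolio exposure to each asset class (hypothetical input).
    """
    return [-sum(w * r for w, r in zip(weights, row)) for row in scenarios]


def cvar(losses, probs, alpha):
    """Discrete CVaR at level alpha with scenario probabilities probs.

    Uses the Rockafellar-Uryasev objective, which is piecewise linear and
    convex in c, so its minimum is attained at one of the scenario losses.
    """
    def objective(c):
        tail = sum(p * max(0.0, loss - c) for p, loss in zip(probs, losses))
        return c + tail / (1.0 - alpha)

    return min(objective(c) for c in losses)


# Toy example: four equally likely scenarios, two assets, equal weights.
scenarios = [[0.02, 0.01], [-0.01, 0.00], [-0.03, -0.01], [0.01, 0.02]]
probs = [0.25, 0.25, 0.25, 0.25]
losses = portfolio_losses(scenarios, [0.5, 0.5])
risk = cvar(losses, probs, alpha=0.5)  # average loss over the worst half
```

In an optimization setting the same expression becomes linear once each max(0, ...) term is replaced by an auxiliary non-negative variable, which is one common way to obtain a linear program. Changing the entries of `probs` is how scenario weights like π would enter this sketch.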

The unreasonable effectiveness of few-shot learning for machine translation

Abstract: Few-shot translation systems can be trained with unpaired language data. With only 5 examples of high-quality translation data, a transformer decoder-only model can match specialized supervised state-of-the-art models. Few-shot translation systems do not require joint multilingual training or back-translation. Few-shot translation systems are two orders of magnitude smaller than state-of-the-art language models....

Effective Robustness against Natural Distribution Shifts for Models with Different Training Data

Abstract: Effective robustness measures the extra out-of-distribution robustness beyond what can be predicted from in-distribution performance. Existing evaluations typically use a single test set to evaluate in-distribution accuracy. The paper proposes a new evaluation metric to compare the effective robustness of models trained on different data distributions. The metric controls for accuracy on multiple in-distribution test sets that cover the training distributions for all evaluated models. Paper Content: Introduction. Robustness against distribution shifts is important for machine learning models....

Dreamix: Video Diffusion Models are General Video Editors

Abstract: Text-driven image and video diffusion models have achieved high generation realism. Few works have done text-based motion and appearance editing of general videos. Our approach combines low-resolution information from the original video with new, high resolution information. We propose a mixed objective to improve motion editability. We introduce a new framework for image animation....

Accelerating Large Language Model Decoding with Speculative Sampling

Abstract: Speculative sampling is an algorithm for accelerating transformer decoding. It enables the generation of multiple tokens from each transformer call. It uses a faster but less powerful draft model to generate short continuations. A modified rejection sampling scheme is used to preserve the distribution of the target model. Speculative sampling was benchmarked with a 70 billion parameter language model, resulting in a 2-2....
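
The modified rejection sampling step mentioned above can be sketched for a single token as follows. This is a minimal illustration and not the paper's implementation. It assumes the commonly described rule: accept a draft token x with probability min(1, target(x) / draft(x)), and on rejection resample from the normalized positive part of (target - draft). The function names and the toy distributions are hypothetical.

```python
import random


def residual_distribution(target, draft):
    """Normalized max(0, target - draft), sampled from after a rejection."""
    diff = [max(0.0, t - d) for t, d in zip(target, draft)]
    total = sum(diff)
    if total == 0.0:
        # target equals draft, so a rejection can never happen.
        return list(target)
    return [x / total for x in diff]


def speculative_step(target, draft, rng):
    """Propose one token from the draft distribution, then accept or resample.

    Returns (token, accepted). target and draft are probability lists over
    the same vocabulary (toy stand-ins for the two models' outputs).
    """
    n = len(draft)
    token = rng.choices(range(n), weights=draft)[0]
    accept_prob = min(1.0, target[token] / draft[token])
    if rng.random() < accept_prob:
        return token, True
    resid = residual_distribution(target, draft)
    return rng.choices(range(n), weights=resid)[0], False


def exact_output_distribution(target, draft):
    """Closed-form distribution of the token returned by speculative_step."""
    accept = [d * min(1.0, t / d) if d > 0 else 0.0
              for t, d in zip(target, draft)]
    reject_mass = 1.0 - sum(accept)
    resid = residual_distribution(target, draft)
    return [a + reject_mass * r for a, r in zip(accept, resid)]


target = [0.5, 0.3, 0.2]  # toy target-model probabilities
draft = [0.2, 0.2, 0.6]   # toy draft-model probabilities
token, accepted = speculative_step(target, draft, random.Random(0))
out = exact_output_distribution(target, draft)
```

For these toy inputs, `exact_output_distribution` reproduces `target` up to floating-point error, which is the sense in which the scheme preserves the target model's distribution. In a full decoder the same check would be applied to each token of a short draft continuation, stopping at the first rejection.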

Double Permutation Equivariance for Knowledge Graph Completion

Abstract: Formalization of Knowledge Graphs as a new class of graphs. Double-permutation equivariance for KG representations. Structural representation of relations allows neural networks to perform complex logical reasoning tasks. General blueprint for equivariant representations. A GNN-based double-permutation equivariant neural architecture achieves 100% Hits@10 test accuracy. Paper Content: Introduction. Knowledge graphs are structured representations of facts in the form of triplets....

Explaining wall-bounded turbulence through deep learning

Abstract: Wall-bounded turbulence is an unresolved problem. Interactions among coherent structures in the flow are explored. A deep-learning method is used to predict the velocity field in time. A game-theoretic algorithm is used to assess the importance of each structure. Results are in agreement with previous observations in the literature. The process has the potential to shed light on numerous fundamental phenomena....