arxiv-summary: AI-summarized AI papers

ReCode: Robustness Evaluation of Code Generation Models

Abstract: Code generation models have achieved impressive performance, but tend to be brittle. Robustness in code generation tasks is an uncharted area and there is no comprehensive benchmark for robustness. ReCode is a comprehensive robustness evaluation benchmark for code generation models. ReCode includes over 30 transformations specifically for code. ReCode takes advantage of the fact that executing the generated code can serve as objective evaluation....
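To make the idea of execution-based robustness evaluation concrete, here is a minimal sketch. It is not ReCode's own code: the snake_case to camelCase rename is just one illustrative perturbation, `fake_model` is a hypothetical stand-in for a real code generation model, and all function names are assumptions made for this example.

```python
import re

def to_camel_case(name: str) -> str:
    """Toy perturbation: snake_case -> camelCase (illustrative only)."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)

def perturb_prompt(prompt: str, func_name: str) -> tuple[str, str]:
    """Rename the target function wherever it appears in the prompt."""
    new_name = to_camel_case(func_name)
    pattern = rf"\b{re.escape(func_name)}\b"
    return re.sub(pattern, new_name, prompt), new_name

def passes(prompt: str, completion: str, func_name: str, cases) -> bool:
    """Execute prompt + completion and check the function against test cases."""
    namespace: dict = {}
    try:
        exec(prompt + completion, namespace)
        return all(namespace[func_name](*args) == expected
                   for args, expected in cases)
    except Exception:
        return False

def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a model; always returns the same body."""
    return "    return a + b\n"

PROMPT = 'def add_numbers(a, b):\n    """Return the sum of a and b."""\n'
CASES = [((1, 2), 3), ((-1, 1), 0)]

perturbed, new_name = perturb_prompt(PROMPT, "add_numbers")
original_ok = passes(PROMPT, fake_model(PROMPT), "add_numbers", CASES)
perturbed_ok = passes(perturbed, fake_model(perturbed), new_name, CASES)

# A model counts as robust on this problem only if it passes on both prompts.
robust = original_ok and perturbed_ok
```

Because correctness is decided by running the tests, the same check applies unchanged to the original and the perturbed prompt, which is what makes the comparison objective.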

On the Role of Parallel Data in Cross-lingual Transfer Learning

Abstract: Parallel data is beneficial for cross-lingual learning. It is unclear if the improvements come from the data or the modeling of parallel interactions. Unsupervised machine translation can generate synthetic parallel data. Synthetic parallel data can be useful for downstream tasks. Real parallel data still yields the best results. Multilingual models do not exploit the full potential of monolingual data....

Multi-asset market making under the quadratic rough Heston

Abstract: A quadratic rough Heston model is used to model the joint behavior of SPX and VIX. A market maker tries to maximize profit from spread capturing while controlling the portfolio's inventory risk. The optimization problem is high-dimensional and is relaxed by several approximations. An asymptotic closed-form solution is obtained. Numerical experiments illustrate the accuracy and relevance of the approximations.

Paper Content:

Introduction: The constant-volatility assumption of the Black-Scholes model is not consistent with empirical observations. Stochastic volatility models can reproduce stylized facts of historical data. Implied volatility surfaces generated by conventional models differ from empirical observations. The rough volatility paradigm brings new solutions that achieve superior fits of implied volatility surfaces. The quadratic rough Heston (QRH) model describes the price of an asset and its spot variance. The QRH model encodes the Zumbach effect. Price returns largely explain volatility. The QRH model gives opportunities to model SPX derivatives. The market making problem is formulated as a dynamic programming problem. A mean-variance type objective function is considered. The multi-asset market making problem is exposed to the curse of dimensionality. Factor decomposition and deep neural networks are used to reduce the dimension of the problem. Closed-form approximations are obtained by replacing Hamiltonian functions with quadratic ones. The market maker decides whether to submit limit orders at the best limits or get immediate execution using market orders. The market maker tries to maximize expected gain from spread capturing while controlling inventory risk.

Multi-factor approximation of the QRH model: The multi-factor approximation of the QRH model was introduced in [28]. The model is Markovian. Prices of derivatives can be obtained as a function of the risk-neutral measure. Vanilla SPX and VIX options can be computed using neural networks. The PDE can be solved using deep learning.

Multi-asset market making: The problem is considered over a period of time T. The market maker decides whether to make a market at the limits $P^j_t$ plus or minus one-half tick size. Two point processes model the number of transactions at the bid and at the ask. The dynamics of the inventory process $(q^j_t)_{t \in [0,T]}$ of asset j and of the cash process $(Y_t)_{t \in [0,T]}$ of the market maker are specified. The objective of the market maker is to maximize expected terminal wealth while penalizing inventory risk. The value function has (2 + n + d) variables.

The Hamilton-Jacobi-Bellman equations: The QRH model has a multidimensional nature. An approximation reduces the dimensionality. The time horizon of the problem is relatively short. Numerical tests on simulated and market data show the effectiveness of daily hedging. The market maker can reset the algorithm with updated parameters. The value function includes a term of the form $(\ldots q^j_s \delta^j)^2\,ds$. The HJB equation associated with the problem is given by a system of ODEs. Existence and uniqueness of the solution of the equation with terminal condition are established. Optimal market making decisions are given by a verification argument. When the market maker controls only the portfolio's net risk, the variable q can be summarized with one variable.

Quadratic approximation: Equation (3....
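The inventory and cash bookkeeping described in this summary can be illustrated with a small single-asset sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: fills are drawn as independent Bernoulli events per time step (a crude discretization of the two point processes), each fill trades one unit at the mid price plus or minus half a tick, the inventory penalty is quadratic, and the function name `simulate_market_maker` is hypothetical.

```python
import random

def simulate_market_maker(prices, tick, bid_fill_prob, ask_fill_prob,
                          penalty, seed=0):
    """Track inventory q and cash Y for a maker quoting at mid +/- tick/2.

    Assumed dynamics (illustrative only):
      bid fill: q += 1, Y -= price - tick / 2
      ask fill: q -= 1, Y += price + tick / 2
    Returns (inventory, cash, penalized terminal wealth).
    """
    rng = random.Random(seed)
    inventory, cash, running_penalty = 0, 0.0, 0.0
    for price in prices:
        if rng.random() < bid_fill_prob:   # someone sells to us at the bid
            inventory += 1
            cash -= price - tick / 2
        if rng.random() < ask_fill_prob:   # someone buys from us at the ask
            inventory -= 1
            cash += price + tick / 2
        # Assumed quadratic inventory-risk penalty, accumulated per step.
        running_penalty += penalty * inventory ** 2
    # Mark the remaining inventory to the last observed price.
    terminal_wealth = cash + inventory * prices[-1]
    return inventory, cash, terminal_wealth - running_penalty

# With both sides always filled, the maker earns one tick per step
# and carries no inventory at the end of each step.
inv, cash, objective = simulate_market_maker([100.0] * 10, 0.1, 1.0, 1.0, 0.0)
```

The sketch only shows the trade-off in the objective: spread capture raises cash, while one-sided fills build inventory that the penalty term punishes.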

Goal-oriented Autonomous Driving

Abstract: Autonomous driving systems are composed of modular tasks in sequential order. There is a trend to develop systems that can perform a wide variety of tasks. Contemporary approaches use either standalone models or multi-task paradigms. A favorable algorithm framework should be devised and optimized for planning. The key components of perception and prediction are analyzed and prioritized....

Large Language Models Are Reasoning Teachers

Abstract: Language models have been used to solve complex reasoning tasks. Chain-of-thought prompting has been used to help language models solve complex tasks, but it requires very large models. This paper proposes a method to enable complex reasoning in smaller models. The method is evaluated on publicly available language models across a range of tasks and model sizes....

Language Modeling with Latent Situations

Abstract: Language models often generate incoherent outputs. SituationSupervision is a family of approaches to improve coherence in LMs. SituationSupervision has two components: an auxiliary situation modeling task and a latent state inference procedure. SituationSupervision can be applied to fine-tuning and prompting. SituationSupervision requires only a small number of state annotations to produce major coherence improvements....

CoCoMIC: Code Completion By Jointly Modeling In-file and Cross-file Context

Abstract: Pre-trained language models for code have achieved success in code completion, but only use in-file context. Cross-file context is a critical source of information for modern modular software development. CCFINDER is a tool that locates and retrieves relevant cross-file context. CoCoMIC is a framework that incorporates cross-file context to learn in-file and cross-file context jointly....

(QA)$^2$: Question Answering with Questionable Assumptions

Abstract: Questions with questionable assumptions are difficult to answer. (QA)$^2$ is an open-domain evaluation dataset of naturally-occurring search engine queries that may or may not contain questionable assumptions. Current models struggle with handling questionable assumptions.

Paper Content:

Introduction: Naturally occurring information-seeking questions often contain false or unverifiable assumptions. Examples of such questions are given....

Defending Against Poisoning Attacks in Open-Domain Question Answering

Abstract: Recent work has shown that adversarial poisoning of input contexts can cause large drops in accuracy for production systems. Little to no work has proposed methods to defend against these attacks. A new method is proposed that uses query augmentation to search for a diverse set of retrieved passages. A novel confidence method is designed to compare the predicted answer to its appearance in the retrieved contexts....
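One way to picture a confidence score based on how often a predicted answer appears in the retrieved contexts is the sketch below. It is a simplification under stated assumptions, not the paper's method: support is measured by plain case-insensitive substring matching, and the helper names `answer_support` and `most_supported` are hypothetical.

```python
def answer_support(answer: str, passages: list[str]) -> float:
    """Fraction of retrieved passages that mention the answer."""
    if not passages:
        return 0.0
    needle = answer.lower()
    return sum(needle in passage.lower() for passage in passages) / len(passages)

def most_supported(predictions: list[str], passages: list[str]) -> tuple[str, float]:
    """Pick the prediction that appears in the largest share of passages."""
    scored = [(answer_support(answer, passages), answer) for answer in predictions]
    score, answer = max(scored)
    return answer, score

# Passages pooled from the original query and its augmentations (toy data);
# the last one plays the role of a poisoned context.
passages = [
    "Paris is the capital of France.",
    "The capital of France is Paris.",
    "France's capital is Lyon.",
]
best_answer, confidence = most_supported(["Lyon", "Paris"], passages)
```

The intuition is that a single poisoned passage can flip one prediction, but it has a harder time dominating a diverse pool of retrieved passages.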

Towards Understanding Chain-of-Thought Prompting: An Empirical Study of What Matters

Abstract: Chain-of-Thought (CoT) prompting can improve the multi-step reasoning abilities of large language models (LLMs). CoT prompting with invalid demonstrations can achieve over 80-90% of the performance obtained using CoT with valid demonstrations. Relevance to the query and correctly ordering the reasoning steps are important for effective CoT reasoning.

Paper Content:

Introduction: LLMs can perform new tasks when prompted with a few demonstrations. Chain-of-Thought (CoT) prompting can improve LLMs' ability to do complex and multi-step reasoning. CoT prompting includes a rationale for each example, which encourages the LLM to generate its intermediate reasoning process. Recent findings show that in-context learning is different from fine-tuning/training. We study how and why CoT prompting works. We find that the validity of the reasoning accounts for only a small portion of the performance. We identify and formulate other aspects of a CoT rationale. LLMs have already gained a lot of “reasoning abilities” from pretraining. Demonstrations specify an output space/format that regularizes the model generation. Evaluation scores should be interpreted in view of the prior knowledge LLMs possess.

Study formulation: A Chain-of-Thought rationale consists of a series of reasoning steps. A CoT rationale has two components: bridging objects and language templates. Bridging objects are key and necessary objects for the model to make a successful prediction. Language templates are textual hints and relations/predicates that guide the model. Questions: Are ground truth bridging objects/language templates important?...
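As a rough illustration of what a CoT demonstration looks like, the sketch below assembles a few-shot prompt in which each example carries a rationale. The `Demo` class, the prompt layout, and the example question are all made up for illustration. Treating the numbers in an arithmetic rationale as its bridging objects, and the text around them as the language template, is likewise an assumption of this sketch.

```python
import re
from dataclasses import dataclass

@dataclass
class Demo:
    question: str
    rationale: str   # intermediate reasoning shown to the model
    answer: str

def build_cot_prompt(demos: list[Demo], query: str) -> str:
    """Concatenate demonstrations (with rationales) followed by the query."""
    blocks = [
        f"Q: {d.question}\nA: {d.rationale} The answer is {d.answer}."
        for d in demos
    ]
    blocks.append(f"Q: {query}\nA:")
    return "\n\n".join(blocks)

def bridging_numbers(rationale: str) -> list[str]:
    """Assumed proxy for bridging objects in arithmetic: the numbers used."""
    return re.findall(r"\d+", rationale)

demo = Demo(
    question="Tom has 3 apples and buys 2 more. How many apples does he have?",
    rationale="Tom starts with 3 apples. He buys 2 more, so 3 + 2 = 5.",
    answer="5",
)
prompt = build_cot_prompt([demo], "Ann has 4 pens and loses 1. How many are left?")
objects = bridging_numbers(demo.rationale)
```

An invalid demonstration in this setup would keep the same layout but use a rationale whose steps do not actually justify the answer.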