Link to paper The full paper is available here.
You can also find the paper on PapersWithCode here.
Abstract A framework for performing valid statistical inference when an experimental data set is supplemented with predictions from a machine-learning system No assumptions are made on the machine-learning algorithm Higher accuracy of predictions leads to smaller confidence intervals Algorithms for computing valid confidence intervals for statistical objects Demonstrated with data sets from proteomics, genomics, electronic voting, remote sensing, census analysis, and ecology Paper Content Introduction Machine-learning algorithms are used to make predictions Predictions can be used to generate predictions for entities not studied experimentally Examples of predictions include molecular activity, tumor prognoses, and microclimatic modeling Analysis was done to form a confidence interval for the fraction of the Amazon rainforest lost between 2000 and 2015 Gold-standard labels were collected through field visits, an expensive process Machine-learning predictions of forest cover were also available, based on satellite imagery Two natural alternatives for constructing confidence intervals were used Imputation approach yields a small confidence interval that fails to cover the true deforestation fraction Classical approach covers the truth at the expense of a wider interval Second example used to construct a confidence interval for an odds ratio between phosphorylation and disorder Imputation approach significantly overestimates the true odds ratio Prediction-powered inference framework provides an affirmative answer to the question of whether predictions can improve inferential quality Confidence intervals cover the truth and are smaller than those obtained using the classical approach General principle Goal is to estimate a quantity θ* Have access to small gold-standard data set and large unlabeled data set Use predictions from machine-learning algorithm to estimate θ* Introduce problem-specific measure of prediction error called rectifier, ∆f Use gold-standard data to construct confidence set for rectifier, R Construct confidence set C PP by rectifying θf with each value in R Confidence intervals and p-values for a broad class of statistical problems Further preliminaries Labeled data set is denoted as (X, Y) ∈ (X ×Y) n Unlabeled data set is denoted as ( X, Y ) ∈ (X × Y) N Data sets are assumed to be i....