Abstract: We define a new measure of variable importance of an exposure on a continuous outcome, accounting for potential confounders. The exposure features a reference level $x_{0}$ with positive mass and a continuum of other levels. To estimate this measure, we fully develop the semi-parametric methodology known as targeted minimum loss estimation (TMLE) [23,22]. We cover the whole spectrum: theoretical study (convergence of the iterative procedure at the core of the TMLE methodology; consistency and asymptotic normality of the estimator), practical implementation, simulation study, and application to the genomic example that originally motivated this article. In that example, the exposure $X$ and response $Y$ are, respectively, the DNA copy number and expression level of a given gene in a cancer cell. Here, the reference level is $x_{0}=2$, that is, the expected DNA copy number in a normal cell. The confounder is a measure of the methylation of the gene. The fact that there is no clear biological indication that $X$ and $Y$ can be interpreted as an exposure and a response, respectively, is not problematic.
Keywords: Variable importance measure; non-parametric estimation; targeted minimum loss estimation; robustness; asymptotics.