Journal: Proceedings of the National Academy of Sciences
Print ISSN: 0027-8424
Electronic ISSN: 1091-6490
Year: 2020
Volume: 117
Issue: 42
Pages: 25963-25965
DOI: 10.1073/pnas.2018002117
Publisher: The National Academy of Sciences of the United States of America
Abstract: “Felix, qui potuit rerum cognoscere causas,” from the Latin poet Virgil (1), literally translated as “Fortunate, who was able to know the causes of things,” hints that the importance of causality has been recognized since antiquity. In PNAS, Bates et al. (2) open their contribution with the sentence “The ultimate aim of genome-wide association studies (GWAS) is to identify regions of the genome containing variants that causally affect a phenotype of interest,” and they provide a highly innovative and original statistical methodology that gives sound answers toward this aim. As we will argue, the causal inference problem is ambitious, and one has to rely on assumptions. The assumptions in ref. 2 are easy to communicate; this makes their approach transparent, and, in our own assessment, their assumptions are very plausible. When we observe correlation or dependence between some variables of interest, a main question concerns directionality: whether one variable is the cause or the effect of another. Of course, neither may be true, because of hidden confounding. See Fig. 1 for a schematic view in which all observed variables are associated with one another, but these associations arise, in part, from unobserved hidden factors. Knowledge of causal directionality would greatly improve the understanding and interpretability of an underlying system. In Fig. 1, this means inferring the directed causal relations between the observed variables.