Journal: Proceedings of the National Academy of Sciences
Print ISSN: 0027-8424
Electronic ISSN: 1091-6490
Publication year: 2016
Volume: 113
Issue: 1
Pages: E61-E70
DOI: 10.1073/pnas.1520109113
Language: English
Publisher: The National Academy of Sciences of the United States of America
Abstract: Genome-wide association studies (GWASs) seek to understand the relationship between complex phenotype(s) (e.g., height) and up to millions of single-nucleotide polymorphisms (SNPs). Early analyses of GWASs are commonly believed to have “missed” much of the additive genetic variance estimated from correlations between relatives. A more recent method, genome-wide complex trait analysis (GCTA), obtains much higher estimates of heritability using a model of random SNP effects correlated between genotypically similar individuals. GCTA has now been applied to many phenotypes from schizophrenia to scholastic achievement. However, recent studies question GCTA’s estimates of heritability. Here, we show that GCTA applied to current SNP data cannot produce reliable or stable estimates of heritability. We show first that GCTA depends sensitively on all singular values of a high-dimensional genetic relatedness matrix (GRM). When the assumptions in GCTA are satisfied exactly, we show that the heritability estimates produced by GCTA will be biased and the standard errors will likely be inaccurate. When the population is stratified, we find that GRMs typically have highly skewed singular values, and we prove that the many small singular values cannot be estimated reliably. Hence, GWAS data are necessarily overfit by GCTA which, as a result, produces high estimates of heritability. We also show that GCTA’s heritability estimates are sensitive to the chosen sample and to measurement errors in the phenotype. We illustrate our results using the Framingham dataset. Our analysis suggests that results obtained using GCTA, and the results’ qualitative interpretations, should be interpreted with great caution.
Keywords: GCTA; GWAS; heritability; SNP; singular value decomposition
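
Illustrative sketch (not part of the original record): the abstract's claim that a stratified population yields a GRM with highly skewed singular values can be seen on toy data. The Python below is not the GCTA software and not the authors' analysis. It simulates genotypes under assumed allele frequencies, builds a simplified GRM (each SNP standardized by its sample mean and standard deviation, which is an assumption made here for brevity), and approximates the largest singular value by power iteration. All function names, sample sizes, and frequency ranges are hypothetical.

```python
import random


def simulate_genotypes(n_per_group, freqs_by_group, rng):
    """Draw 0/1/2 allele counts; each group has its own allele frequencies."""
    rows = []
    for freqs in freqs_by_group:
        for _ in range(n_per_group):
            rows.append([(rng.random() < p) + (rng.random() < p) for p in freqs])
    return rows


def grm(genotypes):
    """Simplified GRM: A = Z Z^T / m, with Z the per-SNP standardized genotypes."""
    n, m = len(genotypes), len(genotypes[0])
    z = [[0.0] * m for _ in range(n)]
    used = 0  # number of polymorphic SNPs actually used
    for j in range(m):
        col = [row[j] for row in genotypes]
        mean = sum(col) / n
        var = sum((x - mean) ** 2 for x in col) / n
        if var == 0:
            continue  # a monomorphic SNP carries no information; skip it
        used += 1
        sd = var ** 0.5
        for i in range(n):
            z[i][j] = (col[i] - mean) / sd
    return [[sum(a * b for a, b in zip(z[i], z[k])) / used for k in range(n)]
            for i in range(n)]


def top_eigenvalue(a, rng, iters=200):
    """Approximate the largest eigenvalue of a symmetric PSD matrix.

    For such a matrix the eigenvalues equal the singular values. The start
    vector is random because the all-ones vector lies in the null space of
    a GRM built from centered columns.
    """
    n = len(a)
    v = [rng.gauss(0, 1) for _ in range(n)]
    lam = 0.0
    for _ in range(iters):
        w = [sum(a[i][k] * v[k] for k in range(n)) for i in range(n)]
        lam = sum(x * x for x in w) ** 0.5
        v = [x / lam for x in w]
    return lam


def top_share(genotypes, rng):
    """Fraction of the GRM's trace carried by its largest singular value."""
    a = grm(genotypes)
    trace = sum(a[i][i] for i in range(len(a)))
    return top_eigenvalue(a, rng) / trace


rng = random.Random(0)
n_snps = 300
shared = [rng.uniform(0.05, 0.95) for _ in range(n_snps)]
other = [rng.uniform(0.05, 0.95) for _ in range(n_snps)]

# Two groups with identical frequencies versus two groups that differ.
homogeneous = simulate_genotypes(20, [shared, shared], rng)
stratified = simulate_genotypes(20, [shared, other], rng)

share_h = top_share(homogeneous, rng)
share_s = top_share(stratified, rng)
print(f"top singular value share, homogeneous: {share_h:.3f}")
print(f"top singular value share, stratified:  {share_s:.3f}")
```

In this toy setting the stratified sample concentrates a much larger share of the trace in a single singular value, which leaves correspondingly less for the many remaining ones. The sketch only illustrates the skew itself; it does not reproduce the paper's proofs or its Framingham results.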