期刊名称:Proceedings of the National Academy of Sciences
印刷版ISSN:0027-8424
电子版ISSN:1091-6490
出版年度:2021
卷号:118
期号:51
DOI:10.1073/pnas.2106844118
语种:English
出版社:The National Academy of Sciences of the United States of America
摘要:Significance
Genetic investigations of most structural birth defects, including spina bifida (SB), congenital heart disease, and craniofacial anomalies, have been underpowered for genome-wide association studies because of their rarity, genetic heterogeneity, incomplete penetrance, and environmental influences. Our systems biology strategy to investigate SB predisposition controls for population stratification and avoids much of the bias inherent in candidate gene searches that are pervasive in the field. We examine both protein coding and noncoding regions of whole genomes to analyze sequence variants, collapsed by gene or regulatory region, and apply machine learning, gene enrichment, and pathway analyses to elucidate molecular pathways and genes contributing to human SB.
Spina bifida (SB) is a debilitating birth defect caused by multiple gene and environment interactions. Though SB shows non-Mendelian inheritance, genetic factors contribute to an estimated 70% of cases. Nevertheless, identifying human mutations conferring SB risk is challenging due to its relative rarity, genetic heterogeneity, incomplete penetrance, and environmental influences that hamper genome-wide association studies approaches to untargeted discovery. Thus, SB genetic studies may suffer from population substructure and/or selection bias introduced by typical candidate gene searches. We report a population based, ancestry-matched whole-genome sequence analysis of SB genetic predisposition using a systems biology strategy to interrogate 298 case-control subject genomes (149 pairs). Genes that were enriched in likely gene disrupting (LGD), rare protein-coding variants were subjected to machine learning analysis to identify genes in which LGD variants occur with a different frequency in cases versus controls and so discriminate between these groups. Those genes with high discriminatory potential for SB significantly enriched pathways pertaining to carbon metabolism, inflammation, innate immunity, cytoskeletal regulation, and essential transcriptional regulation consistent with their having impact on the pathogenesis of human SB. Additionally, an interrogation of conserved noncoding sequences identified robust variant enrichment in regulatory regions of several transcription factors critical to embryonic development. This genome-wide perspective offers an effective approach to the interrogation of coding and noncoding sequence variant contributions to rare complex genetic disorders.