Journal: Proceedings of the National Academy of Sciences
Print ISSN: 0027-8424
Electronic ISSN: 1091-6490
Year of publication: 2009
Volume: 106
Issue: 16
Pages: 6712-6717
DOI: 10.1073/pnas.0901902106
Language: English
Publisher: The National Academy of Sciences of the United States of America
Abstract: Although genomewide association studies have successfully identified associations of many common single-nucleotide polymorphisms (SNPs) with common diseases, the SNPs implicated so far account for only a small proportion of the genetic variability of tested diseases. It has been suggested that common diseases may often be caused by rare alleles missed by genomewide association studies. To identify these rare alleles we need high-throughput, high-accuracy resequencing technologies. Although array-based genotyping has allowed genomewide association studies of common SNPs in tens of thousands of samples, array-based resequencing has been limited for 2 main reasons: the lack of a fully multiplexed pipeline for high-throughput sample processing, and failure to achieve sufficient performance. We have recently solved both of these problems and created a fully multiplexed high-throughput pipeline that results in high-quality data. The pipeline consists of target amplification from genomic DNA, followed by allele enrichment to generate pools of purified variant (or nonvariant) DNA, and ends with interrogation of purified DNA on resequencing arrays. We have used this pipeline to resequence ~5 Mb of DNA (on 3 arrays) corresponding to the exons of 1,500 genes in >473 samples; in total >2,350 Mb were sequenced. In the context of this large-scale study we obtained a false positive rate of ~1 in 500,000 bp and a false negative rate of ~10%.