Journal: Proceedings of the National Academy of Sciences
Print ISSN: 0027-8424
Electronic ISSN: 1091-6490
Publication year: 2010
Volume: 107
Issue: 35
Pages: 15485-15490
DOI: 10.1073/pnas.1010506107
Language: English
Publisher: The National Academy of Sciences of the United States of America
Abstract: CpG dinucleotides contribute to epigenetic mechanisms by being the only site for DNA methylation in mammalian somatic cells. They are also mutation hotspots and ~5-fold depleted genome-wide. We report here a study focused on CpG sites in the coding regions of Hox and other transcription factor genes, comparing methylated genomes of Homo sapiens, Mus musculus, and Danio rerio with nonmethylated genomes of Drosophila melanogaster and Caenorhabditis elegans. We analyzed 4-fold degenerate, synonymous codons with the potential for CpG. That is, we studied "silent" changes that do not affect protein products but could damage epigenetic marking. We find that DNA-binding transcription factors and other developmentally relevant genes show, only in methylated genomes, a bimodal distribution of CpG usage. Several genetic code-based tests indicate, again for methylated genomes only, that the frequency of silent CpGs in Hox genes is much greater than expectation. Also informative are NCG-GNN and NCC-GNN codon doublets, for which an unusually high rate of G to C and C to G transversions was observed at the third (silent) position of the first codon. Together these results are interpreted as evidence for strong "pro-epigenetic" selection acting to preserve CpG sites in coding regions of many genes controlling development. We also report that DNA-binding transcription factors and developmentally important genes are dramatically overrepresented in or near clusters of three or more CpG islands, suggesting a possible relationship between evolutionary preservation of CpG dinucleotides in both coding regions and CpG islands.