

Title | Population diversity of ORFan genes in Escherichia coli. |
Publication Type | Journal Article |
Year of Publication | 2012 |
Authors | Yu, G; Stoltzfus, A |
Journal | Genome Biol Evol |
Volume | 4 |
Issue | 11 |
Pagination | 1176-1187 |
Date Published | 2012 |
ISSN | 1759-6653 |
Keywords | Escherichia coli; Evolution, Molecular; Genes, Bacterial; Genetic Speciation; Genetic Variation; Genome, Bacterial; Open Reading Frames; Phylogeny; Pseudogenes; Shigella |
Abstract | The origin and evolution of "ORFans" (suspected genes without known relatives) remain unclear. Here, we take advantage of a unique opportunity to examine the population diversity of thousands of ORFans, based on a collection of 35 complete genomes of isolates of Escherichia coli and Shigella (which is included phylogenetically within E. coli). As expected from previous studies, ORFans are shorter and AT-richer in sequence than non-ORFans. We find that ORFans often are very narrowly distributed: the most common pattern is for an ORFan to be found in only one genome. We compared within-species population diversity of ORFan genes with that of two control groups of non-ORFan genes. Patterns of population variation suggest that most ORFans are not artifacts, but encode real genes whose protein-coding capacity is conserved, reflecting selection against nonsynonymous mutations. Nevertheless, nonsynonymous nucleotide diversity is higher than for non-ORFans, whereas synonymous diversity is roughly the same. In particular, there is a several-fold excess of ORFans in the highest decile of diversity relative to controls, which might be due to weaker purifying selection, positive selection, or a subclass of ORFans that are decaying. |
DOI | 10.1093/gbe/evs081 |
Alternate Journal | Genome Biol Evol |
PubMed ID | 23034216 |
PubMed Central ID | PMC3514957 |
Grant List | GM081511 / GM / NIGMS NIH HHS / United States |