Schmitz JF, Ullrich K, Bornberg-Bauer E
Incipient De Novo Genes Evolve from "Frozen Accidents" which Escaped Rapid Turnover of Pervasively Transcribed ORFs
2018

A recent surge of studies suggests that many novel genes arise de novo from previously non-coding DNA rather than by duplication. However, because most studies have concentrated on longer evolutionary time scales and rarely considered protein structural properties, it remains unclear how these properties are shaped by evolution, depend on genetic mechanisms, and influence gene survival. Here we compare open reading frames (ORFs) from high-coverage transcriptomes of mouse and four other mammals, covering 160 million years of evolution. We find that novel ORFs pervasively emerge from non-coding regions but are rapidly lost again, whereas relatively fewer arise from divergence of coding sequences, and these are retained over much longer times. We also find that a subset (14%) of the mouse-specific ORFs is translated, showing that such ORFs can be the starting points of gene emergence. Surprisingly, disorder and other protein properties of young ORFs hardly change with gene age over short time frames; only length and nucleotide composition change significantly. Thus, some transcribed de novo genes resemble frozen accidents of randomly emerged ORFs that survived initial purging.