Schmitz JF
Analysis of intergenic ORFs reveals low potential for de novo protein-coding gene emergence in bacteria
, 2017

While de novo gene birth has been extensively
studied in eukaryotes, where a high number of
confirmed cases have been found, the possibility of de
novo protein-coding gene birth in bacteria has rarely
been investigated. Here, we ask whether
de novo gene birth is possible in bacteria and which
conditions could influence this process. To this end, we
create random genomes based on the DNA composition
of a number of different bacterial genomes and analyse
the open reading frames (ORFs) found in these random
genomes. While many ORFs seem to exhibit sequence
properties that could allow for the emergence
of protein-coding genes, ORFs with very high or very low
GC-content seem to be less likely to do so. Additionally,
we analyse intergenic and anti-sense ORFs found in
Escherichia coli and find these to be similar in terms
of sequence properties to actual E. coli protein-coding
genes. Taken together, these results suggest that de novo
protein-coding gene emergence is possible in bacteria but is
complicated by extremely low or high GC-content.
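
The random-genome approach described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes that a random genome is built by sampling bases independently to match a target GC-content only, that ORFs start at ATG and end at the first in-frame stop codon (standard stop codons TAA, TAG, TGA), and that all six reading frames are scanned. The function names and the min_codons threshold are hypothetical, chosen for illustration.

```python
import random

STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def random_genome(length, gc_content, seed=None):
    """Sample bases independently so that the expected GC fraction is gc_content.

    Assumption: A and T are equally likely, as are G and C.
    """
    rng = random.Random(seed)
    at = (1.0 - gc_content) / 2.0
    gc = gc_content / 2.0
    return "".join(rng.choices("ACGT", weights=[at, gc, gc, at], k=length))


def reverse_complement(seq):
    """Return the reverse complement of an uppercase ACGT sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=30):
    """Return (strand, start, end) tuples for ATG...stop ORFs in six frames.

    Coordinates are 0-based, end-exclusive, and refer to the strand that was
    scanned ("-" coordinates index the reverse complement). min_codons counts
    the codons before the stop codon, including the start codon.
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens an ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3))
                    start = None  # look for the next ORF in this frame
    return orfs


if __name__ == "__main__":
    # Compare ORF counts across low, intermediate, and high GC-content.
    for gc in (0.25, 0.50, 0.75):
        genome = random_genome(200_000, gc, seed=1)
        print(gc, len(find_orfs(genome)))
```

Because stop codons are AT-rich, a sketch like this can be used to explore how the number and length of random ORFs change with GC-content; assessing the sequence properties discussed above would require further analysis of the translated ORFs.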