
Genomic repertoires of DNA-binding transcription factors across the tree of life.

Published version
Peer-reviewed

Type

Article

Authors

Charoensawan, Varodom 
Wilson, Derek 
Teichmann, Sarah A 

Abstract

Sequence-specific transcription factors (TFs) are important to genetic regulation in all organisms because they recognize and directly bind to regulatory regions on DNA. Here, we survey and summarize the TF resources available. We outline the organisms for which TF annotation is provided, and discuss the criteria and methods used to annotate TFs by different databases. By using genomic TF repertoires from ∼700 genomes across the tree of life, covering Bacteria, Archaea and Eukaryota, we review TF abundance with respect to the number of genes, as well as their structural complexity in diverse lineages. While typical eukaryotic TFs are longer than the average eukaryotic protein, the inverse is true for prokaryotes. Only in eukaryotes does the same family of DNA-binding domain (DBD) occur multiple times within one polypeptide chain. This potentially increases the length and diversity of DNA-recognition sequences by reusing DBDs from the same family. We examined the increase in TF abundance with the number of genes in genomes, using the largest set of prokaryotic and eukaryotic genomes to date. As pointed out before, the number of prokaryotic TFs increases faster than linearly with the number of genes. We further observe a similar relationship in eukaryotic genomes, though with a slower increase in TFs.
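
As an illustration of what "faster than linearly" means here, the relationship between TF count and gene count can be modelled as a power law, TFs ≈ a · genes^b, where an exponent b greater than 1 indicates a faster-than-linear increase. The sketch below is a minimal example of estimating b by least squares on log-transformed counts. The function name `fit_power_law`, the gene counts, and the exponent of 1.5 used to generate the TF counts are all assumptions made for illustration; none of them comes from the article, and this is not presented as the authors' method.

```python
import math


def fit_power_law(gene_counts, tf_counts):
    """Fit tf = a * genes**b by ordinary least squares in log-log space.

    Returns the tuple (a, b). Hypothetical helper, not from the article.
    """
    xs = [math.log(g) for g in gene_counts]
    ys = [math.log(t) for t in tf_counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope of the regression line in log-log space is the exponent b.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    # Intercept in log space gives log(a).
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Synthetic gene counts, made up for this example (NOT data from the article).
genes = [500, 1000, 2000, 4000, 8000]
# TF counts generated from an arbitrary, assumed exponent of 1.5,
# purely so that the fit has a known answer to recover.
tfs = [0.001 * g ** 1.5 for g in genes]

prefactor, exponent = fit_power_law(genes, tfs)
print(f"fitted exponent b = {exponent:.2f}")
print("faster than linear" if exponent > 1 else "linear or slower")
```

With real annotations, `genes` and `tfs` would be replaced by per-genome counts, and the fitted exponents for prokaryotic and eukaryotic genomes could be compared directly.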

Keywords

Animals; Catalogs as Topic; DNA-Binding Proteins; Databases, Genetic; Eukaryota; Gene Duplication; Genome, Archaeal; Genome, Bacterial; Genomics; Protein Structure, Tertiary; Transcription Factors

Journal Title

Nucleic Acids Res

Journal ISSN

0305-1048
1362-4962

Volume Title

38

Publisher

Oxford University Press (OUP)