The (in)dependence of alternative splicing and gene duplication.

Published version
Peer-reviewed

Type

Article

Authors

Talavera, David 
Vogel, Christine 
Orozco, Modesto 
Teichmann, Sarah A 
de la Cruz, Xavier 

Abstract

Alternative splicing (AS) and gene duplication (GD) are both processes that diversify the protein repertoire. Recent examples have shown that sequence changes introduced by AS may be comparable to those introduced by GD. In addition, the two processes are inversely correlated at the genomic scale: large gene families are depleted in splice variants and vice versa. Altogether, these data strongly suggest that the effects of the two phenomena are interchangeable. Here, we tested the extent to which this applies with respect to various protein characteristics. The amounts of AS and GD per gene are anticorrelated even when accounting for different gene functions or degrees of sequence divergence. In contrast, the two processes appear to be independent in their influence on variation in mRNA expression. Further, we conducted a detailed comparison of the effect of sequence changes in both alternative splice variants and gene duplicates on protein structure, in particular the size, location, and types of sequence substitutions and insertions/deletions. We find that, in general, alternative splicing affects protein sequence and structure in a more drastic way than gene duplication and subsequent divergence. Our results reveal an interesting paradox between the anticorrelation of AS and GD at the genomic level, and their impact at the protein level, which shows little or no equivalence in terms of effects on protein sequence, structure, and function. We discuss possible explanations that relate to the order of appearance of AS and GD in a gene family, and to the selection pressure imposed by the environment.

Keywords

Alternative Splicing; Base Sequence; Computer Simulation; DNA Mutational Analysis; Evolution, Molecular; Gene Duplication; Genetic Variation; Models, Genetic; Molecular Sequence Data; Proteome; Sequence Analysis, DNA

Journal Title

PLoS Comput Biol

Journal ISSN

1553-734X
1553-7358

Volume Title

3

Publisher

Public Library of Science (PLoS)