A method for partitioning the information contained in a protein sequence between its structure and function.

Accepted version
Peer-reviewed

Type

Article

Authors

Possenti, Andrea 
Vendruscolo, Michele 
Camilloni, Carlo 

Abstract

Proteins employ the information stored in the genetic code and translated into their sequences to carry out well-defined functions in the cellular environment. The possibility of encoding such functions is controlled by the balance between the amount of information supplied by the sequence and that left after the protein has folded into its structure. We study the amount of information necessary to specify the protein structure, providing an estimate that takes into account the thermodynamic properties of protein folding. We thus show that the information remaining in the protein sequence after encoding for its structure (the 'information gap') is very close to what is needed to encode for its function and interactions. Then, by predicting the information gap directly from the protein sequence, we show that it may be possible to use these insights from information theory to discriminate between ordered and disordered proteins, to identify unknown functions, and to optimize artificially designed protein sequences.

Keywords

designed proteins; information content; intrinsically disordered proteins; protein folding/function; structure prediction; Amino Acid Sequence; Computational Biology; Models, Molecular; Protein Conformation; Protein Folding; Proteins; Thermodynamics

Journal Title

Proteins

Journal ISSN

0887-3585
1097-0134

Volume

86

Publisher

Wiley

Rights

All rights reserved