Reciprocal best structure hits: using AlphaFold models to discover distant homologues.

Accepted version
Peer-reviewed

Type

Article

Authors

Monzon, Vivian 
Paysan-Lafosse, Typhaine 
Bateman, Alex 

Abstract

MOTIVATION: The conventional methods to detect homologous protein pairs use the comparison of protein sequences. But the sequences of two homologous proteins may diverge significantly and consequently may be undetectable by standard approaches. The release of the AlphaFold 2.0 software enables the prediction of highly accurate protein structures and opens many opportunities to advance our understanding of protein functions, including the detection of homologous protein structure pairs.

RESULTS: In this proof-of-concept work, we search for the closest homologous protein pairs using the structure models of five model organisms from the AlphaFold database. We compare the results with homologous protein pairs detected by their sequence similarity and show that the structural matching approach finds a similar set of results. In addition, we detect potential novel homologs solely with the structural matching approach, which can help to understand the function of uncharacterized proteins and make previously overlooked connections between well-characterized proteins. We also observe limitations of our implementation of the structure-based approach, particularly when handling highly disordered proteins or short protein structures. Our work shows that high-accuracy protein structure models can be used to discover homologous protein pairs, and we expose areas for improvement of this structural matching approach.

AVAILABILITY AND IMPLEMENTATION: Information on the discovered homologous protein pairs can be found at the following URL: https://doi.org/10.17863/CAM.87873. The code can be accessed here: https://github.com/VivianMonzon/Reciprocal_Best_Structure_Hits.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online.
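The abstract does not spell out what "reciprocal best hits" means, so the following is a minimal sketch of the general idea, not the authors' pipeline (their code is in the GitHub repository linked above). It assumes that some structure-comparison step has already produced similarity scores between proteins of two organisms, with a higher score meaning a closer match. The protein identifiers, the scores, and the function names are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, List, Tuple

# One search result: (query protein, target protein, similarity score).
# Assumption: a higher score means the two structures are more similar.
Hit = Tuple[str, str, float]


def best_hits(hits: Iterable[Hit]) -> Dict[str, Tuple[str, float]]:
    """Return the single highest-scoring target for each query."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, target, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return best


def reciprocal_best_hits(a_vs_b: Iterable[Hit],
                         b_vs_a: Iterable[Hit]) -> List[Tuple[str, str]]:
    """Keep pairs (a, b) where b is a's best hit and a is b's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    pairs = []
    for a, (b, _score) in best_ab.items():
        if b in best_ba and best_ba[b][0] == a:
            pairs.append((a, b))
    return sorted(pairs)


# Toy data with made-up identifiers and scores.
a_vs_b = [("A1", "B1", 0.90), ("A1", "B2", 0.40),
          ("A2", "B1", 0.80), ("A2", "B2", 0.30)]
b_vs_a = [("B1", "A1", 0.85), ("B1", "A2", 0.70),
          ("B2", "A2", 0.35), ("B2", "A1", 0.20)]

example_pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(example_pairs)
```

In this toy example only A1 and B1 choose each other: A2's best hit is B1, but B1's best hit is A1, so A2 is left unpaired. Requiring agreement in both directions is what makes the criterion stricter than a one-way best-hit search.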

Description

Keywords

3101 Biochemistry and Cell Biology, 3102 Bioinformatics and Computational Biology, 31 Biological Sciences, 1 Underpinning research, 1.1 Normal biological development and functioning, Generic health relevance

Journal Title

Bioinform Adv

Conference Name

Journal ISSN

2635-0041

Volume Title

Publisher

Oxford University Press (OUP)

Sponsorship

Wellcome Trust (218236/Z/19/Z)

Relationships

Is supplemented by: