Prediction of impacts of mutations on protein structure and interactions: SDM, a statistical approach, and mCSM, using machine learning.
Protein science : a publication of the Protein Society
Pandurangan, A. P., & Blundell, T. (2020). Prediction of impacts of mutations on protein structure and interactions: SDM, a statistical approach, and mCSM, using machine learning. Protein science : a publication of the Protein Society, 29 (1), 247-257. https://doi.org/10.1002/pro.3774
Next‐generation sequencing methods have not only allowed an understanding of genome sequence variation during the evolution of organisms, but have also provided invaluable information about genetic variants in inherited disease and the emergence of resistance to drugs in cancers and infectious disease. A challenge is to distinguish mutations that are drivers of disease or drug resistance from passengers that are neutral or even selectively advantageous to the organism. This requires an understanding of the impacts of missense mutations on gene expression and regulation, and on the disruption of protein function through changes in protein stability or disturbed interactions with proteins, nucleic acids, small‐molecule ligands and other biological molecules. Experimental approaches to understanding differences between wild‐type and mutant proteins are the most accurate, but they are also time‐consuming and costly. Computational tools used to predict the impacts of mutations can provide useful information more quickly. Here we focus on two widely used structure‐based approaches, originally developed in the Blundell lab: site‐directed mutator (SDM), a statistical approach to analysing amino acid substitutions, and mCSM, which uses graph‐based signatures to represent the wild‐type structural environment and machine learning to predict the effect of mutations on protein stability. We also describe DUET, which uses machine learning to combine the two approaches. We briefly discuss the development of mCSM for understanding the impacts of mutations on interfaces with other proteins, nucleic acids and ligands, and we exemplify the wide application of these approaches to understanding human genetic disorders and drug resistance mutations relevant to cancer and mycobacterial infections.
Humans; Genetic Predisposition to Disease; Proteins; Sequence Analysis, DNA; Computational Biology; Protein Conformation; Protein Binding; Drug Resistance; Mutation; Models, Molecular; Software; Protein Stability; High-Throughput Nucleotide Sequencing; Machine Learning
Bill & Melinda Gates Foundation (via Foundation for the National Institutes of Health (FNIH)) (ABELL11HTB0)
Wellcome Trust (200814/Z/16/Z)
Medical Research Council (MR/N501864/1)
EC FP7 CP (260872)
External DOI: https://doi.org/10.1002/pro.3774
This record's URL: https://www.repository.cam.ac.uk/handle/1810/299390
All rights reserved