Abstract
Quantitative evaluation of binding affinity changes upon mutation is crucial for protein engineering and drug design. Machine learning-based methods are gaining momentum in this field. Because experimental data are limited, using a small number of sensitive predictive features is vital to the generalization and robustness of such methods. Here we introduce a fast and reliable predictor of binding affinity changes upon single point mutation, based on a random forest approach. Our method, iSEE, uses a limited number of interface Structure, Evolution and Energy-based features for the prediction. Using only 31 features, iSEE achieves a high prediction performance, with a Pearson correlation coefficient (PCC) of 0.80 and a root mean square error of 1.41 kcal mol⁻¹ on a diverse training dataset consisting of 1102 mutations in 57 protein-protein complexes. It competes with existing state-of-the-art methods on two blind test datasets. Predictions for a new dataset of 540 mutations in 58 protein complexes from the recently published SKEMPI 2.0 database reveal that none of the current methods performs well (PCC < 0.4), although their combination does improve the predictions. Feature analysis for iSEE underlines the significance of evolutionary conservation for quantitative prediction of mutation effects. As an application example, we perform a full mutation scan of the interface residues in the MDM2-p53 complex.
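The two performance metrics quoted in the abstract, the Pearson correlation coefficient and the root mean square error, are both computed from paired experimental and predicted affinity changes. The following is a minimal Python sketch of those standard formulas; it is not taken from iSEE, the function names `pearson_cc` and `rmse` are illustrative, and the numeric values are made up for demonstration only.

```python
import math


def pearson_cc(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rmse(x, y):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Hypothetical affinity changes in kcal/mol (NOT data from the paper).
experimental = [0.5, 1.2, -0.3, 2.8, 0.0]
predicted = [0.7, 0.9, 0.1, 2.1, 0.4]

print(f"PCC  = {pearson_cc(experimental, predicted):.2f}")
print(f"RMSE = {rmse(experimental, predicted):.2f} kcal/mol")
```

A higher PCC indicates that predictions rank and scale with the measurements, while RMSE reports the typical absolute error in the same units as the measurements, so the two are usually reported together.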
Original language | English |
---|---|
Pages (from-to) | 110-119 |
Journal | Proteins: Structure, Function and Genetics |
Volume | 87 |
Issue number | 2 |
Publication status | Published - Feb 2019 |
Keywords
- binding affinity
- full mutation scanning
- machine learning
- protein–protein interactions
- single point mutation