Abstract
Many essential cellular functions are carried out by multi-protein complexes that can be characterized by their protein-protein interactions. Interactions between protein subunits depend critically on their binding strengths and their cellular abundances, both of which span orders of magnitude. Despite many efforts devoted to the global discovery of protein complexes by integrating large-scale protein abundance and interaction features, there is still room for improvement. Here, we integrated >7000 quantitative proteomic samples with three published affinity purification/co-fractionation mass spectrometry datasets into a deep learning framework to predict protein-protein interactions (PPIs), followed by the identification of protein complexes using a two-stage clustering strategy. Our deep learning classifier significantly outperformed recently published machine learning prediction models and, in the process, captured 5010 complexes containing over 9000 unique proteins. The vast majority of proteins in our predicted complexes exhibited low or no tissue specificity, indicating that the observed complexes tend to be ubiquitously expressed across cell types and tissues. Interestingly, our combined approach increased the model sensitivity for low-abundance proteins, which, among other things, allowed us to detect the interaction of MCM10, which connects to the replicative helicase complex via the MCM6 protein. The integration of protein abundances and their interaction features using a deep learning approach provided a comprehensive map of protein-protein interactions and a unique perspective on possible novel protein complexes.
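As an illustration of what an abundance-derived interaction feature might look like, the sketch below computes a co-abundance score (Pearson correlation of log-transformed abundance profiles across samples) for every protein pair. This is a minimal example under stated assumptions, not the classifier or feature set used in the study: the function names, the log10(x + 1) transform, the toy values, and the inclusion of ACTB as an unrelated protein are all hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        # A constant profile has no defined correlation; treat it as uninformative.
        return 0.0
    return cov / math.sqrt(vx * vy)


def coabundance_features(abundance):
    """Map each unordered protein pair to the correlation of their log abundances.

    `abundance` maps a protein name to its abundances across the same samples.
    The log10(x + 1) transform is an assumption made here to compress values
    that span orders of magnitude.
    """
    logged = {p: [math.log10(v + 1.0) for v in vals] for p, vals in abundance.items()}
    return {
        (a, b): pearson(logged[a], logged[b])
        for a, b in combinations(sorted(logged), 2)
    }


# Hypothetical toy abundances (arbitrary units) across four samples.
toy = {
    "MCM6": [100.0, 200.0, 400.0, 800.0],
    "MCM10": [1.0, 2.0, 4.0, 8.0],
    "ACTB": [5000.0, 4000.0, 5200.0, 4100.0],
}
features = coabundance_features(toy)
for pair, score in features.items():
    print(pair, round(score, 3))
```

In this toy input the low-abundance protein tracks its partner across samples despite a hundredfold difference in absolute level, which is why a correlation-style feature can remain informative for low-abundance proteins. In a full pipeline, scores like these would be only one input among many to a trained classifier.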
Original language | English |
---|---|
Article number | 7884 |
Journal | International Journal of Molecular Sciences |
Volume | 24 |
Issue number | 9 |
DOIs | |
Publication status | Published - May 2023 |
Bibliographical note
Publisher Copyright: © 2023 by the authors.
Funding
This work was supported by EPIC-XS (project number 823839), funded by the Horizon 2020 programme of the European Union, and by the NWO-funded Netherlands Proteomics Centre through the National Road Map for Large-scale Infrastructures program X-Omics (project 184.034.019). BL was supported by the China Scholarship Council (CSC), no. 201606300049.
Funders | Funder number |
---|---|
Netherlands Proteomics Centre | |
Horizon 2020 Framework Programme | |
European Proteomics Infrastructure Consortium providing access | 823839
European Commission | |
Nederlandse Organisatie voor Wetenschappelijk Onderzoek | 184.034.019
China Scholarship Council | 201606300049 |
Keywords
- data integration
- deep learning
- human protein–protein interaction
- mass spectrometry
- protein complexes
- proteomics