Identification of Protein Complexes by Integrating Protein Abundance and Interaction Features Using a Deep Learning Strategy

Bohui Li, Maarten Altelaar, Bas van Breukelen*

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Many essential cellular functions are carried out by multi-protein complexes that can be characterized by their protein-protein interactions. The interactions between protein subunits depend critically on their binding strengths and their cellular abundances, both of which span orders of magnitude. Despite many efforts devoted to the global discovery of protein complexes by integrating large-scale protein abundance and interaction features, there is still room for improvement. Here, we integrated >7000 quantitative proteomic samples with three published affinity purification/co-fractionation mass spectrometry datasets into a deep learning framework to predict protein-protein interactions (PPIs), followed by the identification of protein complexes using a two-stage clustering strategy. Our deep-learning-based classifier significantly outperformed recently published machine learning prediction models and in the process captured 5010 complexes containing over 9000 unique proteins. The vast majority of proteins in our predicted complexes exhibited low or no tissue specificity, which indicates that the observed complexes tend to be ubiquitously expressed across cell types and tissues. Interestingly, our combined approach increased the model sensitivity for low-abundance proteins, which, among other things, allowed us to detect the interaction of MCM10, which connects to the replicative helicase complex via the MCM6 protein. The integration of protein abundances and their interaction features using a deep learning approach provided a comprehensive map of protein-protein interactions and a unique perspective on possible novel protein complexes.
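The abstract describes a two-step pipeline: pairwise interaction scores derived from abundance and interaction features, then grouping of the scored pairs into complexes. The toy sketch below only illustrates that general shape and is not the authors' method. It swaps the deep learning classifier for a single Pearson co-abundance correlation with an assumed cutoff, and the two-stage clustering for plain connected components. All protein names, abundance values, function names, and the threshold are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length abundance profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a flat profile carries no co-variation signal
    return cov / sqrt(vx * vy)


def pair_features(abundance):
    """Map each unordered protein pair to its co-abundance correlation."""
    return {
        (p, q): pearson(abundance[p], abundance[q])
        for p, q in combinations(sorted(abundance), 2)
    }


def connected_components(edges):
    """Group proteins linked by accepted edges, using union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for p, q in edges:
        parent[find(p)] = find(q)

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical abundance profiles: four samples per protein.
abundance = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.0],
    "C": [4.0, 3.0, 2.0, 1.0],
    "D": [8.0, 6.0, 4.0, 2.0],
}

features = pair_features(abundance)
THRESHOLD = 0.9  # assumed cutoff, standing in for a trained classifier's score
edges = [pair for pair, r in features.items() if r >= THRESHOLD]
complexes = connected_components(edges)
print(complexes)  # [['A', 'B'], ['C', 'D']]
```

In this toy input, A and B rise together across samples while C and D fall together, so the thresholded pairs split into two groups. A real pipeline would combine many features per pair and use a clustering method that allows proteins to belong to more than one complex.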

Original language: English
Article number: 7884
Journal: International Journal of Molecular Sciences
Volume: 24
Issue number: 9
Publication status: Published - May 2023

Bibliographical note

Publisher Copyright:
© 2023 by the authors.

Funding

This work has been supported by EPIC-XS, project number 823839, funded by the Horizon 2020 programme of the European Union and the NWO funded Netherlands Proteomics Centre through the National Road Map for Large-scale Infrastructures program X-Omics, Project 184.034.019. BL was supported by the China Scholarship Council (CSC) no. 201606300049.

Funders (funder number):
Netherlands Proteomics Centre
Horizon 2020 Framework Programme
European Proteomics Infrastructure Consortium providing access: 184.034.019, 823839
European Commission
Nederlandse Organisatie voor Wetenschappelijk Onderzoek
China Scholarship Council: 201606300049

    Keywords

    • data integration
    • deep learning
    • human protein–protein interaction
    • mass spectrometry
    • protein complexes
    • proteomics
