Abstract
This thesis introduces and showcases novel approaches for explorative and integrative modeling in the presence of cryo-EM data and distance restraints. It presents the PowerFit software, a Python package for fast cross-correlation-based rigid body fitting of high-resolution structures in low-resolution densities. PowerFit comes with a new, more sensitive scoring function, the core-weighted local cross correlation, in addition to an optimized protocol for fast fitting. Subsequently, I report results of an extensive benchmark of the PowerFit software using 379 subunits of 5 ribosome density maps. The success rate of unambiguously fitting subunits larger than 100 residues reached approximately 90% up to 12 Å resolution, showing that objective fitting methods have matured into usable aids in structural modeling. The limits of rigid body fitting can be exploited through the use of image pyramids to gain a speedup of a factor of 30 on CPUs and 40 on GPUs, and they allow possibly over-interpreted regions of the density to be identified on an objective basis.
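To illustrate the general idea behind FFT-accelerated cross-correlation fitting, the following is a minimal sketch assuming NumPy and a purely translational scan on a periodic grid. It computes a plain (unnormalized) cross correlation, not the core-weighted local cross correlation described above, and the function names are hypothetical; none of this is the PowerFit API.

```python
import numpy as np


def fft_cross_correlation(target, template):
    """Cross correlation of `template` with `target` for every cyclic shift.

    Illustrative only: both arrays must be real and have the same shape.
    Element t of the result equals sum_x target[x + t] * template[x].
    """
    ft_target = np.fft.rfftn(target)
    ft_template = np.fft.rfftn(template)
    # Correlation theorem: multiply by the complex conjugate of the template.
    return np.fft.irfftn(ft_target * np.conj(ft_template), s=target.shape)


def best_translation(cc):
    """Grid shift (in voxels) with the highest correlation score."""
    index = np.unravel_index(np.argmax(cc), cc.shape)
    return tuple(int(i) for i in index)
```

A full rigid body search would repeat such a scan for every sampled rotation of the template and would normalize the score locally; both steps are omitted here for brevity.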
I also describe the incorporation and benchmarking of cryo-EM data into the data-driven docking program HADDOCK. The approach is flexible and can be fully combined with other available sources of data in HADDOCK, making it a fully integrative modeling approach. It was demonstrated on two ribosome systems, two virus-antibody systems, and a symmetric pentamer. An update of the HADDOCK web server is presented afterwards, together with extensive usage statistics of the software all over the world.
Next, I deal with explorative modeling using distance restraints in general, and cross-link data specifically. I introduce the concept of the accessible interaction space and present a method to quantify and visualize it. This directly indicates the information content of distance restraints, shows whether all data are self-consistent and, if not, indicates which restraint is a false positive. The method is implemented in another Python package, DisVis. The approach is general and can easily be incorporated into FFT-based docking programs, allowing the use of distance restraints by combining sampling and scoring in a 'marriage made in heaven'.
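The idea of quantifying an accessible interaction space can be sketched as follows, assuming NumPy, a translation-only sampling of one partner, and no check for clashes or actual contacts between the partners. The function names and the restraint tuple layout are hypothetical and do not reflect the DisVis interface.

```python
import numpy as np


def count_consistent_restraints(fixed_xyz, mobile_xyz, restraints, translations):
    """Count, for each trial translation, how many restraints are satisfied.

    restraints: iterable of (fixed_index, mobile_index, max_distance).
    translations: (n, 3) array of shifts applied to the mobile partner.
    """
    fixed_xyz = np.asarray(fixed_xyz, dtype=float)
    mobile_xyz = np.asarray(mobile_xyz, dtype=float)
    translations = np.asarray(translations, dtype=float)
    counts = np.zeros(len(translations), dtype=int)
    for i, j, max_dist in restraints:
        moved = mobile_xyz[j] + translations  # position of atom j per pose
        dist = np.linalg.norm(moved - fixed_xyz[i], axis=1)
        counts += dist <= max_dist
    return counts


def accessible_space_histogram(counts, n_restraints):
    """Number of sampled poses consistent with at least N restraints, N = 0..n."""
    return np.array([int((counts >= n).sum()) for n in range(n_restraints + 1)])
```

If the number of poses drops to zero once all restraints are required, the data are not self-consistent in this simplified picture, and the restraint that is violated most often among the best poses is a natural false-positive candidate.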
I extended this approach further by presenting a method to infer interface residues from distance restraints using the concept of the average-interactions-per-complex (AIC) statistic. The AIC provides an objective probability for a residue to be at the interface based on the available data. Furthermore, I benchmarked the use of cross-link-based distance restraints in HADDOCK using four different approaches. My results show that using solely unambiguous distance restraints is suboptimal; instead they should be complemented with either center-of-mass restraints or DisVis-based ambiguous interaction restraints.
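One possible reading of an average-interactions-per-complex statistic is sketched below, assuming NumPy and a precomputed table of per-residue interaction counts for each sampled complex. The exact definition used in the thesis is not given in this abstract, so the function name, inputs, and averaging rule are assumptions.

```python
import numpy as np


def average_interactions_per_complex(interactions, consistent):
    """Mean number of interactions per residue over restraint-consistent complexes.

    interactions: (n_complexes, n_residues) array of interaction counts.
    consistent: boolean mask marking complexes that satisfy the restraints.
    """
    interactions = np.asarray(interactions, dtype=float)
    consistent = np.asarray(consistent, dtype=bool)
    if not consistent.any():
        raise ValueError("no sampled complex is consistent with the restraints")
    return interactions[consistent].mean(axis=0)
```

Residues with a high average under this sketch would be the more likely interface candidates.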
By showcasing integrative modeling approaches and introducing new methods for quantifying the information content of experimental data, this thesis lays out new building blocks for the field to build upon and move forward.
Original language | English |
---|---|
Awarding Institution | |
Supervisors/Advisors | |
Award date | 25 Nov 2015 |
Publisher | |
Print ISBNs | 978-90-393-6449-9 |
Publication status | Published - 25 Nov 2015 |
Keywords
- Cryo-electron microscopy
- chemical cross-linking
- macromolecular docking
- rigid body fitting
- GPU computing