Abstract
In structure determination by X-ray crystallography and solution NMR spectroscopy, experimental data are collected as time and ensemble averages. Thus, in principle, appropriate time- and ensemble-averaged models should be used. Refinement of an ensemble of conformers rather than a single structure against the experimental NMR data could, however, result in overfitting the data because of the significantly increased number of parameters. To avoid overfitting, complete cross-validation, which provides an unbiased measure of the fit, has been applied to nuclear Overhauser effect (NOE) derived distance refinement. Using two synthetic test cases, a correlation was demonstrated between the cross-validated measure of the fit (defined in terms of root-mean-square deviations from the distance restraints and number of violations) and the number of models that best reproduce the conformational variability in solution. A new method, based on a probability map, has been used to generate good representations of the resulting ensembles of structures. The method has also been applied to observed NMR data for two proteins, interleukin 4 and interleukin 8. For interleukin 4, cross-validation indicates that a single-conformer model gives the most accurate representation of the structure, whereas conventional measures of fit between the experimental data and those calculated from the model decrease as the number of conformers increases, indicating overfitting. For interleukin 8, complete cross-validation predicts a twin-conformer model to be the most faithful representation of the experimental data. Two distinct conformations for the loop formed by residues 16 to 22 emerge from the family of twin-conformer structures. The putative alternate conformation of the loop is not observed in the crystal structure of interleukin 8. However, because of crystal packing contacts in this region, this does not necessarily exclude the presence of the alternate conformation in solution.
The twin-conformer model is supported by observed chemical exchange line broadening for the amide of His18 obtained by 15N relaxation studies. This region has also been implicated in receptor binding.
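The cross-validated measure of fit described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that ensemble-averaged NOE distances use r^-6 averaging over the conformers, that restraints are upper bounds only, and that complete cross-validation means every restraint is left out in exactly one test set; the abstract states none of these details. The function names (`ensemble_distance`, `rms_violation`, `partition_restraints`) are hypothetical, and the refinement step itself is omitted.

```python
import math
import random


def ensemble_distance(distances):
    """Average one interproton distance over the conformers.

    Assumption: r^-6 averaging, a common choice for NOE-derived
    distances; the abstract does not specify the averaging used.
    """
    n = len(distances)
    return (sum(d ** -6 for d in distances) / n) ** (-1.0 / 6.0)


def rms_violation(upper_bounds, per_conformer_distances):
    """RMS deviation above the upper bounds for a set of restraints.

    upper_bounds[i] is the upper distance bound of restraint i;
    per_conformer_distances[i] lists that distance in each conformer.
    Restraints that are satisfied contribute zero.
    """
    squared = []
    for upper, dists in zip(upper_bounds, per_conformer_distances):
        excess = max(0.0, ensemble_distance(dists) - upper)
        squared.append(excess ** 2)
    return math.sqrt(sum(squared) / len(squared))


def partition_restraints(n_restraints, n_groups, seed=0):
    """Randomly split restraint indices into disjoint test sets.

    Assumption: each test set is omitted from one refinement, and the
    fit is then evaluated only on the omitted restraints.
    """
    indices = list(range(n_restraints))
    random.Random(seed).shuffle(indices)
    return [indices[g::n_groups] for g in range(n_groups)]
```

In such a scheme, one would refine an N-conformer model against all restraints except one test set, compute `rms_violation` on the omitted restraints only, repeat for every test set, and compare the averaged result across values of N. A conventional (non-cross-validated) fit can only improve as N grows, whereas the test-set value is expected to stop improving once additional conformers no longer reflect real variability.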
Original language | English |
---|---|
Pages (from-to) | 80-93 |
Number of pages | 14 |
Journal | Journal of Molecular Biology |
Volume | 250 |
Issue number | 1 |
Publication status | Published - 1995 |
Externally published | Yes |
Bibliographical note
Funding Information: The authors thank Greg Warren for providing the Amb t V coordinates, E. T. Baldwin (Upjohn Company) and A. Wlodawer (National Cancer Institute-Frederick Cancer Research and Development Center) for the IL-8 X-ray diffraction data and Erin Duffy, Luke Rice, Greg Warren and Paul Adams for careful reading of the manuscript. A.M.J.J.B. thanks the Swiss National Foundation for Scientific Research for financial support.
Keywords
- Complete cross-validation
- Conformational variability
- Ensemble-averaged NOE restraints
- Multi-conformer refinement
- Solution structure