
Estimation of the breadth of CD4bs targeting HIV antibodies by molecular modeling and machine learning

Abstract

HIV is a highly mutable virus for which all attempts to develop a vaccine have been unsuccessful. Nevertheless, a few long-term-infected patients develop antibodies, called broadly neutralizing antibodies (bnAbs), that have a high breadth and can neutralize multiple variants of the virus. This suggests that a universal HIV vaccine should be possible. A measure of the efficacy of an HIV vaccine is the neutralization breadth of the antibodies it generates. The breadth is defined as the fraction of viruses in the Seaman panel that are neutralized by the antibody. Experimentally, the neutralization ability is measured as the half maximal inhibitory concentration of the antibody (IC50). To avoid such time-consuming experimental measurements, we developed a computational approach to estimate the IC50 and use it to determine the antibody breadth. Given that no direct method exists for calculating IC50 values, we resort to a combination of atomistic modeling and machine learning. For each antibody/virus complex, an all-atom model is built using the amino acid sequence and a known structure of a related complex. A series of descriptors is then derived from the atomistic models, and these are used to train a Multi-Layer Perceptron (an Artificial Neural Network) to predict the value of the IC50 (by regression), or whether or not the antibody binds to the virus (by classification). The neural networks are trained using experimental IC50 values collected in the CATNAP database. The computed breadths obtained by regression and classification are reported, and the importance of having some related information in the data set for obtaining accurate predictions is analyzed. This approach is expected to prove useful for the design of HIV bnAbs, where the computation of the potency must be accompanied by a computation of the breadth, and for evaluating the efficiency of potential vaccination schemes developed through modeling and simulation.

Author summary

Although we are now almost 40 years into the AIDS epidemic, no approved vaccine for HIV exists. This is due to the high mutability of HIV, which allows it to escape immune system control. Nevertheless, a few long-term-infected patients have been able to develop antibodies, called broadly neutralizing antibodies (bnAbs), that have a high neutralization breadth and are effective against a variety of viral strains. The knowledge that bnAbs can develop over time suggests that an HIV vaccine can be found that short-circuits the slow natural production of bnAbs. In this study we present a computational approach to estimate the breadth of HIV antibodies. Experimentally, the breadth is the fraction of viruses in a panel that are neutralized by the antibody, where the ability to neutralize is quantified as the half maximal inhibitory concentration of the antibody (IC50). A method to estimate the IC50 by computer modeling and machine learning is described and used to estimate the antibody breadth. This approach is likely to prove useful in the design of new antibodies effective against HIV and for testing the efficiency of theoretically-designed vaccination protocols.

Introduction

Vaccination is a medical procedure that has played an essential role in protecting mankind against viral and bacterial infections since the time of Edward Jenner, who developed a vaccine for smallpox over 200 years ago. Although for some diseases caused by viruses, such as measles, a small number of vaccinations provides almost permanent immunity, for others, such as influenza, an annual revaccination, which provides only limited protection, is required. Since the measles virus, like the flu virus, undergoes error-prone replication that introduces mutations, it is not clear why the measles vaccination works as well as it does. Recent research [1] suggests that the measles virus remains antigenically monotypic because mutations are almost always lethal, though the reason for that is not known.

For HIV, which is the focus of this paper, no approved vaccine exists, although we are now almost forty years into the HIV/AIDS epidemic. As is the case for the influenza virus, HIV evolves rapidly, so that there exist many different viral strains, some of which can evade the immune response to a vaccination directed against only a small number of strains. Thanks to the development of antiretroviral therapies [2], people who are infected by HIV can live essentially normal lives without succumbing to AIDS. Several years after infection, a small fraction of HIV-infected individuals develop antibodies that are referred to as broadly neutralizing antibodies (bnAbs), i.e. antibodies that are effective against many strains of the virus [3]; an example of a detailed structural study of the germline and mature bnAbs from a single patient is given in Fera et al. [4]. The fact that the immune system can develop bnAbs over time has led to renewed interest in the possibility of developing a vaccine that would be effective against HIV [5,6].

The “neutralization breadth” of an antibody is a measure of its ability to recognize and neutralize different variants of the virus. An antibody with a high breadth can effectively recognize many different variants, while a low-breadth antibody is more specific. Experimentally, the neutralization breadth of an antibody is evaluated by testing its ability to inhibit a panel of antigens from different clades of the target virus [7]. In what follows we simply write “breadth” when referring to “neutralization breadth”. For HIV, the Seaman virus panel contains 109 representatives of HIV clades A, B, C and circulating recombinant forms [7]. To evaluate the breadth of an antibody, the half maximal inhibitory concentration of the antibody (IC50) is measured for each virus in the panel. The breadth is defined as the fraction of viruses for which the IC50 is less than a given cutoff, usually set to 1 μg/ml.
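
This definition translates directly into code. The minimal sketch below (in Python; the function and variable names are ours, for illustration only) computes the breadth from a vector of panel IC50 values, skipping viruses for which no measurement is available, as is done for the experimental breadths in Table 1.

    import numpy as np

    def neutralization_breadth(ic50_values, cutoff=1.0):
        """Fraction of panel viruses neutralized by an antibody.

        ic50_values: IC50 in ug/ml, one entry per virus in the panel
                     (np.nan marks viruses without a measurement).
        cutoff:      neutralization threshold, 1 ug/ml by default.
        """
        ic50 = np.asarray(ic50_values, dtype=float)
        measured = ic50[~np.isnan(ic50)]   # drop missing measurements
        return float(np.mean(measured < cutoff))

    # Toy 5-virus panel: 3 of the 4 measured IC50s fall below 1 ug/ml.
    print(neutralization_breadth([0.02, 0.5, 30.0, np.nan, 0.9]))  # 0.75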

The present paper describes a computational method to estimate the breadth of new HIV antibodies using only the sequence of their heavy and light chains, and the assumption that they form antibody/antigen complexes that are similar to a known crystal structure of an antibody/antigen complex that may not have high sequence homology. Of the known HIV antibodies, we select those that target the CD4 binding site (CD4bs) of the HIV Envelope glycoprotein, a binding site used by diverse bnAbs [5]. Because the CD4bs is highly conserved, an HIV vaccine designed to elicit bnAbs that bind to this site would have a high therapeutic potential: bnAbs would likely recognize the conserved core, as well as the variable regions in the neighborhood that are required for the broad-based character to develop [8].

However, a major obstacle to a successful computational design of bnAbs is the lack of accurate methods for computing the antibody breadth. Here, we use machine learning [9] coupled with descriptors obtained from all-atom models of the antibody/antigen complex to predict the IC50 values for the viruses in the Seaman panel, from which the antibody breadth can be estimated. To predict the IC50 we use a supervised artificial neural network [10,11], a Multi-Layer Perceptron (MLP); see Fig 1. The neural network is trained by backpropagation, in which the errors in the outputs are propagated backwards through the network to optimize the internal parameters of the model. An essential element for the successful application of artificial neural networks is the availability of a large data set for training [11]. For HIV, experimental IC50 values have been collected in the CATNAP database of neutralizing antibodies [12], which contains more than 40000 IC50 values; of these, 6179 satisfy the criteria required for our machine learning approach.

Fig 1. Generic structure of an Artificial Neural Network.

It is composed of an input layer (the Xi nodes) that contains the descriptors developed for the system being studied, a hidden layer (the Hi nodes) that non-linearly combines the descriptors in the input layer (through a logistic activation function), and an output layer (Y) that again non-linearly combines the nodes in the hidden layer and produces the output result.

https://doi.org/10.1371/journal.pcbi.1006954.g001

In the body of the paper, we describe the training of the MLP to predict either the actual value of the IC50 (by regression) or whether or not the antibody recognizes the virus (by classification). The outputs of the two artificial neural networks are then used to estimate the breadth of known antibodies. In the Concluding Discussion, we consider the limitations and extensions of the methodology described here, as well as its utility for vaccine design.

Results

Neural network regression model for the IC50

The first step toward the estimation of the breadth of a given antibody is to evaluate the IC50 for the binding of the antibody to all the antigens in the Seaman virus panel [7]. Here, the IC50 values are estimated using a neural network regression model (an MLP regressor) trained on experimental IC50 values. The experimental values are obtained from the CATNAP database, which contains a total of about 40000 IC50 values [12]. The values in the database were filtered to ensure that the amino acid sequences of both the antibody and the antigen are available, that at least one crystallographic structure of the antibody bound to a related HIV antigen is known, and that the antibody binds to the CD4 binding site (CD4bs). Upon filtering the database for these attributes, a total of 6179 IC50 values were obtained. Of these, 3864 are reported as exact values, while the remaining 2315 are reported as “greater than” a given cutoff (e.g. “>50 μg/ml”); see Table 1. We refer to these as “approximate” values. To train the MLP regression model, only the 3864 exact values are used; the approximate ones were discarded.

Table 1. Antibodies studied in this work.

The second and third columns report the number of experimental IC50 values available as exact or approximate values. The fourth column reports the experimental breadth calculated based on the Seaman antigen panel. The number in parentheses is the number of antigens used to calculate the breadth (the Seaman panel contains 109 antigens, but the experimental IC50 value is not available for all of them). The fifth column indicates the PDB ID used as the template for modeling the antibody/antigen complex. If only one PDB is indicated, that structure is used for the modeling as is; if two are present, the first is used to model the antibody and the second the antigen (after best-fit RMSD superimposition onto the antigen in the first PDB). The last three columns report the breadth calculated using the IC50 regressor with and without approximate values and using the IC50 classifier (see text).

https://doi.org/10.1371/journal.pcbi.1006954.t001

As inputs for the MLP regressor, we use the values of descriptors of the antibody/antigen bound system obtained from an atomistic model of the complex. A total of 24 descriptors were tested; they fall into four classes: atomic descriptors, protein/protein scoring functions, protein stability scoring functions, and entropy models (see Methods). The procedure for building the models for one antibody/antigen complex and evaluating the descriptors requires about 13 minutes; once a model has been constructed, evaluating the chosen subset of 19 descriptors requires only about 20 seconds (see Methods).

For training the MLP regressor, the available experimental IC50 values are randomly divided in half to obtain a training set and a test set. The training is then performed, and the resulting neural network is used to predict the IC50. The training is repeated 30 times, each time with a randomly generated training and test set, and the results are averaged over the 30 neural networks to estimate the error in the predictions.
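
In scikit-learn terms, this protocol can be sketched as follows (X is the descriptor matrix and y the experimental pIC50 vector; the architecture and solver match those given in Methods, while max_iter and the seeding scheme are our illustrative choices):

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPRegressor

    def train_replicas(X, y, n_replicas=30):
        """Train independent MLPs on random half-splits; the spread of
        their predictions provides the statistical error estimate."""
        preds = []
        for seed in range(n_replicas):
            X_tr, _, y_tr, _ = train_test_split(
                X, y, test_size=0.5, random_state=seed)
            mlp = MLPRegressor(hidden_layer_sizes=(10,),
                               activation='logistic', solver='lbfgs',
                               max_iter=2000, random_state=seed)
            mlp.fit(X_tr, y_tr)
            preds.append(mlp.predict(X))   # predictions for every complex
        preds = np.array(preds)
        return preds.mean(axis=0), preds.std(axis=0)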

The correlation between the experimental and computed IC50 is shown in Fig 2 for both the training and the test sets. The Pearson correlation coefficient for the training set is 0.686 (the Spearman coefficient is 0.682). For the test set, the correlation coefficients decrease to 0.410 (Pearson) and 0.408 (Spearman). This is a significant improvement over the use of individual descriptors, which at best reach a Pearson correlation coefficient of 0.28; see Fig 3 (last column).

Fig 2. Correlation between the computed and experimental pIC50 (= −logIC50).

Blue dots represent the set of data used to fit the MLP regressor; the red dots represent the validation (or test) set. The Pearson correlation coefficient for the training set is 0.686 (Spearman 0.682), and it decreases to 0.410 (Spearman 0.408) for the validation set.

https://doi.org/10.1371/journal.pcbi.1006954.g002

Fig 3. Cross correlation between all descriptors and the experimental IC50 values.

The upper triangle reports the Pearson coefficients, the lower triangle the Spearman coefficients. The last column/row reports the correlation with the experimental values.

https://doi.org/10.1371/journal.pcbi.1006954.g003

With these IC50 values, the breadth of known antibodies can be computed and compared with the experimental values. In the CATNAP database, 24 antibodies have a known sequence, bind to the CD4bs, and have a known crystallographic structure; see Table 1. The breadth is calculated for all of these using the computed IC50 values for the viruses in the Seaman panel. Specifically, for each virus in the Seaman panel, the IC50 is estimated using the MLP regressor, and the fraction of viruses for which the IC50 is less than 1 μg/ml is counted. Fig 4 (left) compares the experimental and computed breadths and their estimated errors (see Methods). For most antibodies the predicted breadth is in reasonable agreement with experiment, although there is a tendency to overestimate it. The Pearson correlation coefficient is 0.800 (Spearman is 0.766), indicating a meaningful correlation. The breadth of the antibody CAP257-RH1 is significantly overestimated (experimental 0.013, computed 0.61±0.30), but this is the antibody for which only a single exact IC50 value is available, and its computed uncertainty is the highest.

Fig 4.

Comparison between experimental and predicted neutralization breadths for 24 antibodies, evaluated using the IC50 values obtained via the MLP regressor without approximate values (left), the MLP regressor with approximate values (center), and the MLP classifier (right).

https://doi.org/10.1371/journal.pcbi.1006954.g004

Given the limited accuracy for the prediction of the IC50 values (Fig 2), this result is particularly encouraging. It indicates a low sensitivity of the breadth to the actual IC50 values. This is rooted in the definition of the breadth, which does not require the exact value of the IC50, but only whether the antibody does or does not bind to a given virus.

As a further test, all 6179 experimental values, including the “greater than” approximate values, are used in the regression model. For these, the maximum experimental value is used; e.g. if the IC50 is reported as >10 μg/ml and >50 μg/ml in two different studies, the value 50 μg/ml is used. The resulting regression model shows an increase in accuracy, in particular an increased ability to rank different antibodies (the Spearman correlation coefficient increases from 0.766 to 0.960), but it loses sensitivity for low-breadth antibodies, predicting zero breadth for most antibodies with experimental breadth less than 0.4; see Fig 4 (center). As a result, although the slope is improved, the regression line is shifted to the right, yielding worse values.
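
The rule for the censored values can be stated compactly, as in the illustrative helper below (how duplicate exact values for the same antibody/virus pair are merged is not specified in the text, so keeping the first one here is an assumption):

    def collapse_censored(reports):
        """Collapse the censored IC50 reports for one antibody/virus pair:
        ['>10', '>50'] -> 50.0, as described in the text. If any exact
        value is present it takes precedence (illustrative assumption)."""
        exact = [float(r) for r in reports if not r.startswith('>')]
        if exact:
            return exact[0]
        return max(float(r.lstrip('>')) for r in reports)

    print(collapse_censored(['>10', '>50']))   # 50.0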

Neural network classifier for the IC50

The breadth can be estimated more directly with an MLP classifier, which is trained to predict whether the IC50 is less than or greater than the 1 μg/ml cutoff. A higher accuracy in the correlation between experiments and predictions is expected, because classification is a simpler problem than regression. The methodology used for the MLP classifier is the same as that employed for the MLP regressor, except that the whole set of 6179 experimental IC50 values is used; i.e., the 2315 “approximate” values are included. Table 1 lists the antibodies and the number of “exact” and “approximate” IC50 values available for each.

The correlation between the experimental and the computed breadth using the IC50 classifier is shown in Fig 4 (right). As expected, the results are significantly better than for the MLP regressor, with the correlation with the experimental breadth increasing from 0.800 for the regressor to 0.973 for the classifier. Moreover, no outliers are present, and the estimated breadths of all 24 antibodies lie close to the diagonal with near-unit slope.

The performance of the MLP classifier is evaluated further by analysis of its confusion matrix, which contains the percentages of true negative, true positive, false negative and false positive values. As reported in Table 2, the accuracy on the test set, corresponding to the sum of the true positives and true negatives, is 72.3±1.0%, with almost equal false positive and false negative rates of about 13%.

Table 2. Confusion matrix for the IC50 classifier.

The accuracy (ACC) is defined as the sum of the true positive (TP) and true negative (TN) fractions: ACC = TP + TN. The balanced accuracy (BACC) is defined as the average of the true positives divided by the total positive samples (P) and the true negatives divided by the total negative samples (N): BACC = (TP/P + TN/N)/2.

https://doi.org/10.1371/journal.pcbi.1006954.t002
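
For reference, the two metrics of Table 2 can be computed from binary labels as follows (a direct transcription of the formulas in the caption, with TP and TN expressed as fractions of the full set):

    import numpy as np

    def acc_bacc(y_true, y_pred):
        """Accuracy and balanced accuracy as defined in Table 2."""
        y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
        tp = np.sum((y_true == 1) & (y_pred == 1))  # true positives
        tn = np.sum((y_true == 0) & (y_pred == 0))  # true negatives
        p, n = np.sum(y_true == 1), np.sum(y_true == 0)
        acc = (tp + tn) / len(y_true)       # ACC = TP + TN
        bacc = 0.5 * (tp / p + tn / n)      # BACC = (TP/P + TN/N)/2
        return acc, bacc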

Test of MLP classifier on germline or immature antibodies

Although our primary purpose in developing the MLP classifier is to evaluate antibodies designed to have increasing breadth (see Concluding Discussion), we decided to determine whether it could be used to compute the breadth of putative germlines, as compared to the breadth of the mature antibodies. We consider the bnAbs VRC01, NIH45-46 and 3BNC60, for which the experimental breadths of the mature antibodies are available; see Table 3. For the putative germlines no experimental breadths are available, but the expected breadth should be low, as they appear to have no affinity for native HIV antigens [13–15].

Table 3. Comparison between calculated and experimental breadths for mature and germline forms of three antibodies.

The experimental breadth of the germline antibodies is assumed to be zero (no quantitative data are available). The intermediate VRC01 antibody is DRVIA7.

https://doi.org/10.1371/journal.pcbi.1006954.t003

The computed breadth values are reported in Table 3. For the mature antibodies the computed values are in good agreement with the experimental data. This is particularly interesting for the 3BNC60 antibody, for which no published structure of any antibody/antigen complex is currently available; for this reason, 3BNC60 was excluded from the training of the MLP regressor and classifier. However, a bound crystallographic structure is available for its putative germline precursor [16]. Assuming no significant change in the binding pose upon affinity maturation, the germline crystallographic structure was used for both the mature and germline antibodies. This is likely a valid assumption, as indicated by the VRC01 and NIH45-46 antibodies, for which crystallographic structures are available for both the mature and germline forms bound to an antigen (see PDB 4JPK and 3NGB for VRC01, and 5IGX and 3U7Y for NIH45-46). The breadth of antibody 3BNC60 is slightly underestimated (0.53±0.21 vs 0.78), but it also has a higher-than-usual uncertainty.

For the putative germlines, the calculated breadths range between 0.27 and 0.45. Importantly, in all cases the breadth of the germline is significantly lower than the breadth of the mature antibody. This finding is encouraging, because only mature antibodies were used in the training of the MLP classifier, as no quantitative information for germline binding to HIV viruses is available in the CATNAP database. The MLP classifier is thus biased towards mature antibodies.

Finally, we consider the DRVIA7 antibody, which is an immature form of a VRC01-class antibody [17]. Few experimental IC50 values are available for this antibody; they indicate a breadth of about 0.23. The value estimated by modeling is 0.45±0.17, which is again higher than expected, but lower than the mature VRC01 value (0.79±0.08) and similar to the germline value (0.45±0.23).

Analysis of descriptors used in the MLP classifier

An important aspect of the present application of machine learning to the prediction of antibody breadth is the choice of descriptors. The 24 descriptors considered can be broadly divided into four categories: atomic descriptors such as the buried surface area or the number of hydrogen bonds (15 descriptors in total), protein/protein binding affinity measures (4 descriptors: Prodigy, ZRANK, ZRANK2 and DFIRE), protein folding scoring functions (FoldX and two pairwise statistical potentials: RFHA and RFCB), and entropy models (normal mode entropy from two different elastic network models: ENM_EXP and ENM_R6); see Methods for more details. Their mutual correlation coefficients and their correlations with the experimental values are reported in Fig 3. As expected, some descriptors are highly correlated with each other, such as the number of hydrogen bonds evaluated using different definitions, or the van der Waals energy and the buried solvent-accessible surface area.

Selecting too few descriptors risks making the model less general. Consequently, we kept a large number of descriptors, using their computational cost as the discriminant. From the full set of 24 descriptors, 5 were initially removed because of their computational cost. The remaining 19 descriptors are all very quick to evaluate (about 20 seconds in total for one model), giving a 12.8-fold speedup in the calculation of the descriptors relative to the full set of 24. No decrease in accuracy is observed.

If too many descriptors are used, there is a risk of overfitting. This is avoided here by the very large number of available experimental values. For the MLP classifier, 6179 experimental values are available, half of which are used in the fitting. The MLP classifier with 19 descriptors and 10 hidden nodes contains 200 fitting parameters (see Methods), versus the 3089 (6179/2) experimental values used in the fitting. A ratio of ten or greater between the number of experimental values in the training set and the number of parameters in the model has been suggested [11]; here the ratio is >15. As supporting evidence, Table 2 shows that the accuracy of the MLP classifier is similar in the training and test sets.

We also studied the effect of the number of descriptors used in the MLP classifier on the accuracy of the model, as measured by the confusion matrix; see Fig 5. Each sample in these two plots is a different MLP classifier fitted on a different set of descriptors. The first graph shows the accuracy of the given model (the sum of the percentages of true positives and true negatives from the confusion matrix), while the second shows the Pearson correlation coefficients obtained when comparing the calculated and experimental breadths. The blue points correspond to models with randomly chosen descriptors, while the red points represent models obtained by successively removing the descriptor that is most correlated with all other descriptors and least correlated with the experimental IC50 (see Methods). The accuracy starts to decrease significantly when the number of descriptors falls below about 10. The Pearson coefficient is more robust and starts decreasing at around 7 descriptors; at this point, the maximum cross-correlation (Pearson coefficient) between the descriptors is about 0.4. These data suggest that a simpler model of about 7–10 descriptors could be used with approximately the same accuracy as the 19-descriptor model; see Table 4. As expected from the earlier analysis, the breadth results converge more rapidly than the accuracy.

Fig 5.

Accuracy (left) and correlation with the experimental breadth (right) obtained for MLP classifiers trained with different sets of descriptors. The abscissa is the number of descriptors used in each model. Blue points represent randomly generated models (selecting between 1 and 24 descriptors at random from the 24 available); red points represent models obtained by systematic reduction of the full 19-descriptor model by successive removal of the most correlated descriptor (the 5 most computationally expensive descriptors are removed first to go from 24 to 19 descriptors, and then the most correlated are removed).

https://doi.org/10.1371/journal.pcbi.1006954.g005

Table 4. Successive simplification of the 19-descriptors neural network model by elimination of the highest correlated descriptors.

The first column contains the number of descriptors in the current model. Columns two and three list the two descriptors with the highest pairwise correlation coefficient, which is reported in the fourth column. The correlations with the experimental IC50 values are reported next. The last column reports the descriptor which is deleted.

https://doi.org/10.1371/journal.pcbi.1006954.t004

This raises the question of which descriptors are most important in the prediction of the neutralization ability of an antibody. There are a variety of methods to extract descriptor importance from an artificial neural network, but there is no apparent consensus on which is best, in particular if the descriptors show a relatively high level of cross-correlation [18]. However, some information can be extracted from the correlation of the descriptors with the experimental IC50 values and from the deletion order reported in Table 4. The protein/protein scoring functions are on average more important than the simple atomic descriptors, as are the protein folding propensity descriptor (FoldX) and the two statistical pairwise potentials for estimating protein stability (RFHA and RFCB). The entropy from elastic network models (ENM_EXP and ENM_R6) also plays an important role, with ENM_EXP being the last descriptor deleted and retaining a relatively high correlation with the experiments (correlation coefficient 0.292). These results highlight the importance of entropy in the binding and the usefulness of complex protein potentials with respect to simple molecular descriptors.

MLP classifier generalization ability

An issue in machine learning models is how well they generalize to inputs outside their training set [11,19,20]. In computational studies of HIV neutralization epitopes it is relatively common to train one model for each antibody under study [21–23]. By contrast, in this work a single model was trained to predict the neutralization breadth of all antibodies. We note that our training set did contain some experimental data for each antibody under study, so the generalization ability of the model should be examined. Fig 6 shows how the predicted breadth of each antibody changes as an increasing number of experimental data for that antibody are included in the training set. At zero, no relevant experimental data are included; e.g. for the VRC01 antibody, at zero no experimental IC50 value for VRC01 is included in the training set. In these plots, the blue line is the breadth computed using all available experimental data in the training set, while the red line is the experimental neutralization breadth. Comparing the plots for different antibodies, about half of them have a good breadth prediction even at zero, and the predicted breadth does not change significantly upon inclusion of more data; see, for example, antibodies 1B2530, 8ANC131, 8ANC134, CH103, VRC01, VRC07, and VRC-PG20. For other antibodies the prediction at zero is quite far from the experimental value; e.g. antibodies NIH45-46, CAP257-RH1, CH235.12, and N6. However, all these antibodies converge to the correct experimental value when a relatively small number of experimental values, usually between 10 and 20, are included in the training set, relative to the total number of values available (see Table 1). The Pearson correlation coefficient between the computed and experimental breadth is 0.50 for the classifier with zero experimental data when all but 4 of the 23 antibodies studied are included in the analysis (the excluded antibodies are N6, CAP257-RH1, CH235.12 and NIH45-46, which have the highest error in the computed breadth). When the entire set of antibodies is considered, the correlation coefficient is 0.14; including just one experimental value per antibody increases the Pearson coefficient to 0.54, and it reaches 0.90 when only four experimental values are included. Another point to consider is that the good results obtained for some antibodies are due to their close similarity to other antibodies in the training set; e.g. the 8ANC131/8ANC134 pair and the VRC01/VRC07 pair. These results show that the neural network model is able to predict accurate values for the breadth when a small number of experimental data for the antibody under study, or a close relative, are included in the training set.

Fig 6. Computed breadth for the 24 antibodies studied in this work as a function of the number of experimental IC50 values for that particular antibody used in the training of the MLP classifier.

The blue line is the computed breadth using all available experimental data in the training set; the red line is the experimental breadth. The graphs are shown for up to 50 included experimental values; the total number of available experimental values is reported in Table 1.

https://doi.org/10.1371/journal.pcbi.1006954.g006

Comparison with other machine learning techniques

It is of interest to compare the results from the MLP classifier used here with those from other machine learning techniques. We compared the neural network with one hidden layer used in this paper with a neural network with two hidden layers, k-nearest neighbors (kNN) [24], random forests (RF) [25,26], and a Support Vector Machine (SVM) [27]; see Methods. All methods produce very similar results, in both the correlation with the experimental breadths and the estimated error in the individual breadth values; see Table 5. This supports the robustness of the results obtained with the MLP. Moreover, it suggests that improving the results requires introducing new information about the system, e.g. in the form of different descriptors.

Table 5. Comparison of the MLP classifier results with other machine learning techniques: an MLP classifier with two hidden layers, Random Forests (RF, with 31 trees), a Support Vector Machine (SVM, with a radial basis function kernel), and k-nearest neighbors (with 15 neighbors and weights inversely proportional to the distance).

The reported data are the accuracy of the classifier (from the confusion matrix), and the Pearson, Spearman, slope and intercept for the correlation of the calculated and experimental breadths.

https://doi.org/10.1371/journal.pcbi.1006954.t005

Discussion

To the best of our knowledge, we report in this paper the first attempt to estimate the breadth of HIV antibodies by the use of atomistic modeling and machine learning techniques. We developed a method based on a Multi-Layer Perceptron (an artificial neural network) that is able to accurately reproduce and predict the breadth of CD4bs-targeting HIV antibodies.

For the development of such a model, a large number of experimental IC50 values must be available to train the neural networks. Here we used experimental IC50 values from the CATNAP database [12]. After cleaning and filtering the data, 6179 IC50 values were obtained for the training of the classifier and 3864 for the regressor. These greatly exceed the number of parameters in the neural network (200), reducing the possibility of overfitting. For best results, a small number of experimental values specific to the antibody under study, or to a related antibody, needs to be present in the training set. For the application of the same techniques to different HIV binding sites or different viruses, such as influenza, sufficient high-quality experimental data would have to be collected and validated.

An important concern in any HIV model is the glycosylation of the virus since the HIV Envelope protein is heavily glycosylated. In the CD4bs, the focus of this work, there are at least five glycans (N197, N234, N276, N363, and N462) that can interact, and interfere, with the antibody binding. One glycan, N276, is particularly problematic in a comparison between germline and mature antibodies, as it has been observed that mature antibodies, like VRC01, introduce deletions or mutations to glycine in CDRL1 to avoid clashes and to accommodate the glycan [28,29]. This could be a major reason why the current model overestimates the breadth of putative germlines.

The tools presented here are important for the computational design and screening of potential HIV antibodies. The usual focus in antibody design is the optimization of the potency and specificity for a particular antigen [30]. As pointed out in the Introduction, this is only one part of the problem for a highly mutable virus like HIV, whose exposed antigens have high variability, so that bnAbs are required. Thus, it is important to couple the calculation of the potency with an estimation of the breadth.

The present technique for estimating antibody breadth based on IC50 values could prove useful in the computational design of vaccination protocols to elicit HIV bnAbs. Promising computational/theoretical approaches for the design of vaccination strategies include a description of the affinity maturation (AM) process [31,32]. This is achieved by simulating how antibodies mutate during AM from the germline precursors to the mature antibodies. One essential step in these simulations is evaluating the breadth of the antibodies produced. Since the method described here requires as inputs only the sequence of the antibody and a template for the bound structure, it is likely to be a useful part of the design process. Moreover, the techniques developed here can, in principle, be applied to other highly variable viruses for which a definition of antibody breadth is important, such as influenza or hepatitis C, provided sufficient experimental neutralization data are available for training the models. From a preliminary examination of the literature, this appears to be the case at least for influenza.

An alternative perspective is to consider the evaluation of the breadth from the antibody/antigen binding affinity, see S1 Appendix. This requires an accurate, relatively rapid, method for the calculation of the binding affinity. Research on developing such a methodology is in progress [33]. The use of the “free energy of binding breadth”, rather than the “IC50 neutralizing breadth”, would, in principle, bypass the need for machine learning techniques.

Methods

First, the source of experimental IC50 values used in the training of the neural network is described. Then, the procedure to create an atomistic model of antibody/antigen complexes is presented, followed by the computation of all molecular descriptors derived from the atomistic models. The computational cost of creating the models and evaluating the descriptors is then reported. The final section is dedicated to the details of the neural network models.

Experimental IC50 database

To calculate the breadth of an antibody, the IC50 values for the antigens in the virus panel are required. In this work the experimental IC50 values are used both to calculate the experimental breadth and to train a neural network (see below) to predict the IC50 values and breadth of new antibodies.

Experimental values of the IC50 for different antibody/antigen complexes were obtained from the CATNAP database [12] (accessed on March 5, 2018). The IC50 values are presented in the database in two forms. If available, the exact value is reported; if the complex has a low binding affinity, the exact value is not available, and the IC50 is reported as greater than a given cutoff, e.g. “IC50 > 50 μg/ml”. The full database contains information for 1024 HIV viruses and 334 antibodies, for a total of 23851 exact and 16467 approximate IC50 values (most antibodies are tested on different viruses). Of interest in this work are antibodies binding at the CD4bs. Because our method requires atomistic models of the antibody/antigen complex, IC50 values are selected such that the binding is at the CD4bs (based on known crystallographic structures of the bound antibody/antigen complex), the amino acid sequences of both the antibody and the antigen are known, and at least one crystallographic structure of the complex is experimentally available. After applying these filters, 24 antibodies (of the 334 available) are selected, for which a total of 3864 exact and 2315 approximate IC50 values are available; see Table 1.

Creating 3D atomistic models

The protocol described in this work requires building atomistic models of arbitrary antibody/antigen complexes and computing molecular descriptors to train the neural network models. The first task is to build a full 3D atomistic model of the antibody/antigen complex given only the sequences of the two proteins. A template of the complex is required, which is used for homology modeling. We select one template for each antibody, neglecting differences in the antigens. The underlying assumption is that the binding mode of a given antibody to any antigen is very similar and can be approximated as constant. Moreover, the structure of the antigen is assumed to be conserved over the sequence space. This is not generally true for antigens with significant insertions or deletions in their (hyper)variable loops; because we focus on the CD4bs, the assumption is reasonable. Crystallographic structures of antibody/antigen complexes were used as templates for the 24 antibodies. In most cases, the antibody was bound not to a naturally-occurring antigen but to an engineered monomeric (gp120) domain. In these cases, the engineered protein was substituted with the BG505 SOSIP structure (PDB 5FYJ) using best-fit RMSD alignment for the superimposition. The PDB codes used for each antibody are reported in Table 1.

Given the sequences of the antigen and the antibody and a template crystallographic structure for the complex, Modeller [34] was used to create a model of the complex. The structure was then refined with CHARMM [35], including adding missing hydrogen atoms, creating disulfide bonds, and minimizing the all-atom structure by 100 steps of steepest descent. The system was then further minimized using OpenMM [36] until the potential energy changed by less than 1 kJ/mol. The CHARMM36 force field [37] was used for the all-atom energy minimizations. In the modeling, only one monomer of the Env gp160 trimer was built, gp41 was deleted, and only the variable part of the antibody was kept. Glycosylation was not included.
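
For orientation, the Modeller step can be sketched as below (the alignment file and code names are placeholders, and the subsequent CHARMM/OpenMM refinement is omitted); setting ending_model = 6 produces the six independently seeded models used for descriptor averaging (see “Molecular descriptors”).

    from modeller import environ
    from modeller.automodel import automodel

    env = environ()
    a = automodel(env,
                  alnfile='ab_ag_alignment.ali',  # target/template alignment
                  knowns='template_complex',      # PDB template from Table 1
                  sequence='target_complex')      # antibody + antigen sequences
    a.starting_model = 1
    a.ending_model = 6    # six models from different random seeds
    a.make()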

Molecular descriptors

The optimized structure obtained by molecular modeling (see the previous paragraph) was used for the computation of all descriptors used to train the neural networks. 24 molecular descriptors were computed, divided into four main classes: atomic descriptors, protein/protein binding scoring functions, protein stability scoring functions, and entropy models.

As “atomic descriptors” we consider descriptors that are simple functions of the atomic coordinates. The two most obvious are the electrostatic (Coulomb) interaction between the antibody and the antigen (E_elec) and the dispersive (Lennard-Jones) interaction (E_vdw). These are modeled according to the CHARMM36 force field [37] and computed using OpenMM [36]. To approximate the solvation of the protein, the GBn2 [38,39] generalized Born implicit solvent model was used as a third descriptor (E_gbsa). No cutoff on either the electrostatic or the dispersion force was used. Other descriptors are the surface area buried upon binding [40] (MD_sasa), the number of hydrogen bonds according to the Baker & Hubbard [41] (MD_h1) and Wernet, Nilsson et al. [42] (MD_h2) definitions, and the hydrogen bond energy according to Kabsch & Sander [43] (MD_h3). These four descriptors were computed using the MDTraj [44] python module. To these, the numbers of interchain contacts classified according to polarity and charge are added (a total of six descriptors: IC_AA, IC_PP, IC_CC, IC_AP, IC_CP, IC_AC), as well as the apolar and charged non-interacting surface areas (NIS_A and NIS_C). These last eight descriptors were computed using Prodigy [45].
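
As an illustration, the MDTraj-based descriptors can be computed along the following lines (a sketch with hypothetical atom-index arguments; for brevity the hydrogen-bond counts cover the whole complex, whereas restricting them to antibody/antigen pairs would require filtering the returned atom triplets):

    import mdtraj as md

    def mdtraj_descriptors(complex_pdb, ab_atoms, ag_atoms):
        """Buried SASA and hydrogen-bond descriptors for one model."""
        cplx = md.load(complex_pdb)
        ab, ag = cplx.atom_slice(ab_atoms), cplx.atom_slice(ag_atoms)

        # MD_sasa: surface buried upon binding (Shrake & Rupley)
        sasa = lambda t: md.shrake_rupley(t).sum()
        md_sasa = sasa(ab) + sasa(ag) - sasa(cplx)

        # MD_h1 / MD_h2: hydrogen-bond counts under two definitions
        md_h1 = len(md.baker_hubbard(cplx))      # Baker & Hubbard
        md_h2 = len(md.wernet_nilsson(cplx)[0])  # Wernet, Nilsson et al.

        # MD_h3: summed Kabsch & Sander hydrogen-bond energies
        md_h3 = md.kabsch_sander(cplx)[0].sum()

        return md_sasa, md_h1, md_h2, md_h3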

More complex descriptors are scoring functions developed for estimating the binding affinity of two proteins. These methods were developed to score binding modes in protein/protein docking software. Here, they are used not as independent scoring functions (as originally intended) but as descriptors. The methods used are: Prodigy [45], ZRANK [46], ZRANK2 [47,48], and DFIRE [49].

The third class of descriptors consists of scoring functions optimized to reproduce the folding propensity of a protein or its stability. With these scoring functions, the binding affinity can be estimated as the difference between the “stability” of the complex and that of the separated antibody and antigen. The methods tested are: FoldX [50,51] (sidechains are optimized), and two pairwise statistical potentials (RF_HA_SRS and RF_CB_SRS_OD) [52,53].

The fourth class of descriptors contains two methods used to estimate the entropy change upon binding. Two classical approaches to evaluate the entropy of a macromolecule are normal mode analysis (NMA) and the quasi-harmonic approximation (QHA) [54]. Neither can be used directly here: NMA requires that the structure be at an energy minimum, and QHA requires a large ensemble of structures. Alternatives are Elastic Network Models (ENM) [55,56]. In these models, the protein is represented as a set of atoms interconnected by elastic springs, which vibrate around the given input structure. The total energy E of the system is obtained as a sum of pairwise potentials, each acting between a pair of atoms whose distance is below a given cutoff: E = Σij k(d0)·(dij − d0)², where the sum runs over all atom pairs ij, dij is the distance between atoms i and j, d0 is the reference distance (from the input structure), and k(d0) is the force constant. k(d0) can take different forms: the first model developed used a single constant for all pairs below a given distance cutoff [55], while more accurate models use force constants that depend on the reference distance d0. In this paper two such expressions are tested. In the first (ENM_EXP), the force constant decreases exponentially with d0, k(d0) = a·exp(−d0/r), with r = 7 Å [57]. In the second (ENM_R6), the force constant decreases proportionally to the inverse sixth power of the distance, k(d0) = a·d0⁻⁶ [58]. In both cases an arbitrary proportionality constant a has to be set; here the value 1000 is used (in arbitrary units). A 100-fold increase in the force constant caused no significant change in the results. From the energy expression the Hessian matrix can be calculated and, upon diagonalization of the mass-weighted Hessian matrix, the normal frequencies are obtained, and from them the harmonic entropy. These calculations are fast and do not require energy minimization, as the input structure is taken to be the minimum.
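
A compact numpy implementation of the ENM entropy is sketched below; it follows the exponential and d0⁻⁶ force constants as stated above, while the cutoff value and the unit conventions are arbitrary illustrative choices (consistent with the arbitrary constant a), so only entropy differences are meaningful:

    import numpy as np

    def enm_entropy(coords, masses, a=1000.0, r=7.0, cutoff=12.0,
                    kind='exp', kT=1.0, hbar=1.0):
        """Harmonic entropy (in units of kB) from an elastic network.
        coords: (N, 3) positions in Angstrom; masses: (N,) array."""
        n = len(coords)
        hess = np.zeros((3 * n, 3 * n))
        for i in range(n):
            for j in range(i + 1, n):
                rij = coords[j] - coords[i]
                d0 = np.linalg.norm(rij)
                if d0 > cutoff:
                    continue
                k = a * np.exp(-d0 / r) if kind == 'exp' else a / d0**6
                block = 2.0 * k * np.outer(rij, rij) / d0**2
                si, sj = slice(3*i, 3*i + 3), slice(3*j, 3*j + 3)
                hess[si, sj] -= block; hess[sj, si] -= block
                hess[si, si] += block; hess[sj, sj] += block
        # mass-weight the Hessian and diagonalize
        m = np.repeat(np.asarray(masses, dtype=float), 3)
        evals = np.linalg.eigvalsh(hess / np.sqrt(np.outer(m, m)))
        omega = np.sqrt(evals[6:].clip(min=1e-12))  # drop 6 zero modes
        x = hbar * omega / kT
        # quantum harmonic-oscillator entropy, summed over modes
        return float(np.sum(x / np.expm1(x) - np.log1p(-np.exp(-x))))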

A problem with most of the descriptors used in this work is their high sensitivity to the values of the atomic coordinates. To alleviate this problem, six models for each complex are generated with Modeller using random initial seeds, and the descriptors are averaged over the six models.

Timing

An important factor to consider when designing and choosing descriptors for use in neural network and regressor models is the computational cost, or timing, needed to compute them. In this work there are two main steps: first, generating the 3D atomistic model; second, computing all the descriptors.

As described above, preparation of the atomistic model consists of three steps. Modeller takes on average 69 seconds to generate one model. CHARMM, used to fix the structure and run a quick minimization, takes a negligible amount of time. The deeper energy minimization with OpenMM takes on average 37 seconds. A total of about 100 seconds is thus needed to generate one model.

The time required for each descriptor varies significantly. Most descriptors are computed almost instantaneously, requiring one second or less. There are three exceptions: evaluating the OpenMM-based descriptors (electrostatic, dispersive and generalized Born energies) takes about 51 seconds per model, the ENM entropy takes 15 seconds, and FoldX is the slowest descriptor used, requiring 233 seconds on average. Computing all descriptors requires about 5 minutes. Skipping FoldX and OpenMM (the two most expensive steps) decreases the time from 5 minutes to 20 seconds.

These timings are for a single model. For each antibody/antigen complex six models are used, and the descriptors are averaged. Creating the six models therefore requires 10.6 minutes, and computing all the descriptors an additional 30.8 minutes, for a total of 41 minutes. Without FoldX and OpenMM, the timing for the descriptors decreases to 2.4 minutes, reducing the total time to about 13 minutes.

Neural network models

To predict the IC50 values, a Multi-Layer Perceptron regressor (MLP regressor) is used. This type of artificial neural network takes as input a number of descriptors, passes them through a hidden layer composed, in this case, of ten nodes, and outputs the predicted IC50 value. For the training, the 3864 exact IC50 values are split randomly in half into a training set and a validation set. The total number of parameters Np to fit in the MLP is given by Np = Ni·Nh + Nh·No, where Ni, Nh, and No are the numbers of input descriptors, hidden nodes, and outputs. Using 19 descriptors and requiring one output (the IC50 value), the total number of parameters to fit is 200. As an empirical rule, the ratio between the number of experimental values used in the training and the number of parameters to fit has to be much greater than one; in our case, the number of experimental IC50 values used for the training is 3864/2, about 9 times the number of parameters to fit. A logistic activation function is used in the hidden nodes, and the lbfgs solver is used for the optimization. Fig 3 shows the cross correlation between all descriptors and the experimental IC50 values.

A similar artificial neural network is used to predict whether the IC50 is less than or greater than the 1 μg/ml cutoff used to calculate the antibody breadth. In this case, a Multi-Layer Perceptron classifier (MLP classifier) is trained instead of a regressor on the same set of descriptors. The same settings as for the MLP regressor were used, the only difference being that the output is filtered with a logistic function to lie between 0 and 1. Here the total number of experimental values is 6179. Of these, 3089 are randomly selected and used to train the model. The ratio between the number of experimental values in the training (3089) and the number of parameters (200) is more than 15, decreasing the chance of overfitting. Table 6 compares the composition of the full dataset and a randomly generated set containing 50% of the values; all antibodies are represented in the training set, with the same percentages of binders and non-binders as in the whole set. A minimal sketch of this setup is given after Table 6.

Table 6. Composition of the IC50 database used to train the IC50 classifier.

One set of rows is shown for each antibody: the line “All” gives the statistics for the whole data set, “Random” for a randomly generated set containing 50% of the data. “Count” is the number of complexes containing that antibody, with percentages relative to the full data set. “Bind” and “Not bind” are the numbers (and percentages) of complexes for which binding or no binding is detected.

https://doi.org/10.1371/journal.pcbi.1006954.t006
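
In scikit-learn terms, the classifier and the breadth evaluation reduce to a few lines; the sketch below bundles the standard scaler described later in this section with the MLP (the function names are ours):

    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.neural_network import MLPClassifier

    def fit_breadth_classifier(X_train, y_train):
        """y_train: 1 if the experimental IC50 < 1 ug/ml, else 0."""
        clf = make_pipeline(
            StandardScaler(),
            MLPClassifier(hidden_layer_sizes=(10,), activation='logistic',
                          solver='lbfgs', max_iter=2000))
        return clf.fit(X_train, y_train)

    def classifier_breadth(clf, X_panel):
        """X_panel: descriptors for one antibody against each Seaman
        panel virus; the breadth is the fraction classified as binders."""
        return clf.predict(X_panel).mean()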

It was observed that the outputs of both the MLP regressor and MLP classifier would change, in some cases significantly, upon repeated training using different random splitting of the experimental values into training and validation sets, and upon fitting the neural networks using different random seeds. For this reason, we average over 30 independently generated neural networks using each time a new splitting of the training and validation set and a new random seed in the training. The standard deviation between the 30 replicas is used as an estimate of the statistical error.

To select which descriptors to use in the neural network, the first criterion was to keep the descriptor evaluation as fast as possible, avoiding the more computationally expensive OpenMM energy-based terms and FoldX. Removing these four descriptors does not decrease the accuracy, while reducing the required time by a factor of 12.8. Prodigy was also removed, as it is a linear combination of other descriptors (no effect on the accuracy is observed after removal). In this way, the number of descriptors used decreases from 24 to 19 without loss in accuracy. To study the performance of simplified models with fewer descriptors, the models are sequentially simplified by removing the most correlated descriptor. To do this, the cross-correlation matrix of all descriptors is analyzed to find the pair of descriptors with the highest mutual correlation (see Fig 3), and the member of the pair with the lower correlation with the experimental IC50 values is removed. The process is repeated until only one descriptor remains. The descriptors deleted at each step (and their correlation coefficients) are reported in Table 4. All descriptors are normalized to zero mean and unit variance (standard scaler); a sketch of the elimination loop follows.
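
The elimination loop can be written directly from this description (a sketch; names is a list of descriptor labels, y the experimental IC50 values, and Pearson correlations are used throughout):

    import numpy as np

    def prune_descriptors(X, y, names, n_keep=7):
        """Greedy removal of the most mutually correlated descriptors."""
        keep = list(range(X.shape[1]))
        while len(keep) > n_keep:
            sub = X[:, keep]
            c = np.abs(np.corrcoef(sub, rowvar=False))
            np.fill_diagonal(c, 0.0)
            i, j = np.unravel_index(np.argmax(c), c.shape)  # closest pair
            # drop the member least correlated with experiment
            ri = abs(np.corrcoef(sub[:, i], y)[0, 1])
            rj = abs(np.corrcoef(sub[:, j], y)[0, 1])
            drop = keep[i] if ri < rj else keep[j]
            print('dropping', names[drop], '(pair corr %.2f)' % c[i, j])
            keep.remove(drop)
        return keep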

To verify the robustness of the neural network, the same analyses were repeated using other machine learning methods: k-nearest neighbors (kNN) [24], random forests (RF) [25,26], and a support vector machine (SVM) [27]. First, the MLP was compared with a second MLP containing two hidden layers instead of one. Having observed the same results, it was then compared with a kNN (with distance weighting), an RF (composed of 31 trees) and an SVM (with a radial basis function kernel); no significant improvement was observed with any of them.
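
Schematically, the comparison amounts to swapping the estimator while keeping the rest of the protocol fixed (the stated hyper-parameters are used; other settings are scikit-learn defaults, and the five-fold cross-validation here stands in for the repeated random splits described above):

    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.neural_network import MLPClassifier
    from sklearn.svm import SVC

    def compare_classifiers(X, y):
        models = {
            'MLP, 2 hidden layers': MLPClassifier((10, 10),
                activation='logistic', solver='lbfgs', max_iter=2000),
            'kNN': KNeighborsClassifier(n_neighbors=15, weights='distance'),
            'RF': RandomForestClassifier(n_estimators=31),
            'SVM': SVC(kernel='rbf'),
        }
        for name, model in models.items():
            scores = cross_val_score(model, X, y, cv=5)
            print('%s: %.3f +/- %.3f' % (name, scores.mean(), scores.std()))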

All machine learning methods (MLP, kNN, RF, SVM) and the standard scaler are used as implemented in the scikit-learn [59] python module.

Supporting information

S1 Appendix. Computation of the breadth from the antibody/antigen binding affinity.

https://doi.org/10.1371/journal.pcbi.1006954.s001

(PDF)

S1 Data. Python scripts and data to reproduce the figures and tables in the text.

https://doi.org/10.1371/journal.pcbi.1006954.s002

(TGZ)

Acknowledgments

We thank Victor Ovchinnikov for useful comments throughout this research and for careful reading of the manuscript.

References

  1. Fulton BO, Sachs D, Beaty SM, Won ST, Lee B, Palese P, et al. Mutational Analysis of Measles Virus Suggests Constraints on Antigenic Variation of the Glycoproteins. Cell Reports. 2015;11(9):1331–8. pmid:26004185
  2. Fauci AS, Folkers GK. Toward an AIDS-Free Generation. JAMA. 2012;308(4):343–4. pmid:22820783
  3. Burton DR, Hangartner L. Broadly Neutralizing Antibodies to HIV and Their Role in Vaccine Design. Annual Review of Immunology. 2016;34(1):635–59.
  4. Fera D, Schmidt AG, Haynes BF, Gao F, Liao H-X, Kepler TB, et al. Affinity maturation in an HIV broadly neutralizing B-cell lineage through reorientation of variable domains. Proceedings of the National Academy of Sciences. 2014;111(28):10275–80.
  5. Kwong PD, Mascola JR. Human Antibodies that Neutralize HIV-1: Identification, Structures, and B Cell Ontogenies. Immunity. 2012;37(3):412–25. pmid:22999947
  6. Kwong PD, Mascola JR, Nabel GJ. Broadly neutralizing antibodies and the search for an HIV-1 vaccine: the end of the beginning. Nature Reviews Immunology. 2013;13(9):nri3516.
  7. Seaman MS, Janes H, Hawkins N, Grandpre LE, Devoy C, Giri A, et al. Tiered Categorization of a Diverse Panel of HIV-1 Env Pseudoviruses for Assessment of Neutralizing Antibodies. Journal of Virology. 2010;84(3):1439–52. pmid:19939925
  8. Conti S, Kaczorowski KJ, Chakraborty AK, Karplus M. Design of immunogens to elicit CD4bs targeting broadly neutralizing antibodies against HIV. To be published.
  9. Butler KT, Davies DW, Cartwright H, Isayev O, Walsh A. Machine learning for molecular and materials science. Nature. 2018;559(7715):547–55. pmid:30046072
  10. Lippmann R. An introduction to computing with neural nets. IEEE ASSP Magazine. 1987;4(2):4–22.
  11. Basheer IA, Hajmeer M. Artificial neural networks: fundamentals, computing, design, and application. Journal of Microbiological Methods. 2000;43(1):3–31. pmid:11084225
  12. Yoon H, Macke J, West AP, Foley B, Bjorkman PJ, Korber B, et al. CATNAP: a tool to compile, analyze and tally neutralizing antibody panels. Nucleic Acids Research. 2015;43(W1):W213–9. pmid:26044712
  13. Jardine J, Julien J-P, Menis S, Ota T, Kalyuzhniy O, McGuire A, et al. Rational HIV Immunogen Design to Target Specific Germline B Cell Receptors. Science. 2013;340(6133):711–6. pmid:23539181
  14. Zhou T, Georgiev I, Wu X, Yang Z-Y, Dai K, Finzi A, et al. Structural Basis for Broad and Potent Neutralization of HIV-1 by Antibody VRC01. Science. 2010;329(5993):811–7. pmid:20616231
  15. Scharf L, West AP, Gao H, Lee T, Scheid JF, Nussenzweig MC, et al. Structural basis for HIV-1 gp120 recognition by a germ-line version of a broadly neutralizing antibody. Proceedings of the National Academy of Sciences. 2013;110(15):6049–54.
  16. Scharf L, West AP Jr, Sievers SA, Chen C, Jiang S, Gao H, et al. Structural basis for germline antibody recognition of HIV-1 immunogens. eLife. 2016;5:e13783. pmid:26997349
  17. Kong L, Ju B, Chen Y, He L, Ren L, Liu J, et al. Key gp120 Glycans Pose Roadblocks to the Rapid Development of VRC01-Class Antibodies in an HIV-1-Infected Chinese Donor. Immunity. 2016;44(4):939–50. pmid:27067056
  18. Gevrey M, Dimopoulos I, Lek S. Review and comparison of methods to study the contribution of variables in artificial neural network models. Ecological Modelling. 2003;160(3):249–64.
  19. Jain AK, Mao J, Mohiuddin KM. Artificial neural networks: a tutorial. Computer. 1996;29(3):31–44.
  20. Rumelhart DE, Widrow B, Lehr MA. The basic ideas in neural networks. Communications of the ACM. 1994;37(3):87.
  21. Ferguson AL, Falkowska E, Walker LM, Seaman MS, Burton DR, Chakraborty AK. Computational Prediction of Broadly Neutralizing HIV-1 Antibody Epitopes from Neutralization Activity Data. PLOS ONE. 2013;8(12):e80562. pmid:24312481
  22. West AP, Scharf L, Horwitz J, Klein F, Nussenzweig MC, Bjorkman PJ. Computational analysis of anti–HIV-1 antibody neutralization panel data to identify potential functional epitope residues. Proceedings of the National Academy of Sciences. 2013;110(26):10598–603.
  23. Hake A, Pfeifer N. Prediction of HIV-1 sensitivity to broadly neutralizing antibodies shows a trend towards resistance over time. PLOS Computational Biology. 2017;13(10):e1005789. pmid:29065122
  24. Altman NS. An Introduction to Kernel and Nearest-Neighbor Nonparametric Regression. The American Statistician. 1992;46(3):175–85.
  25. Ho TK. Random decision forests. In: Proceedings of 3rd International Conference on Document Analysis and Recognition. 1995. p. 278–82.
  26. Svetnik V, Liaw A, Tong C, Culberson JC, Sheridan RP, Feuston BP. Random Forest: A Classification and Regression Tool for Compound Classification and QSAR Modeling. Journal of Chemical Information and Computer Sciences. 2003;43(6):1947–58. pmid:14632445
  27. Cortes C, Vapnik V. Support-vector networks. Machine Learning. 1995;20(3):273–97.
  28. Jardine JG, Sok D, Julien J-P, Briney B, Sarkar A, Liang C-H, et al. Minimally Mutated HIV-1 Broadly Neutralizing Antibodies to Guide Reductionist Vaccine Design. PLOS Pathogens. 2016;12(8):e1005815. pmid:27560183
  29. Zhou T, Zhu J, Wu X, Moquin S, Zhang B, Acharya P, et al. Multidonor Analysis Reveals Structural Elements, Genetic Determinants, and Maturation Pathway for HIV-1 Neutralization by VRC01-Class Antibodies. Immunity. 2013;39(2):245–58. pmid:23911655
  30. Tiller KE, Tessier PM. Advances in Antibody Design. Annual Review of Biomedical Engineering. 2015;17(1):191–216.
  31. Wang S, Mata-Fink J, Kriegsman B, Hanson M, Irvine DJ, Eisen HN, et al. Manipulating the Selection Forces during Affinity Maturation to Generate Cross-Reactive HIV Antibodies. Cell. 2015;160(4):785–97. pmid:25662010
  32. Ovchinnikov V, Louveau JE, Barton JP, Karplus M, Chakraborty AK. Role of framework mutations and antibody flexibility in the evolution of broadly neutralizing antibodies. eLife. 2018;
  33. Ovchinnikov V, Nam K, Karplus M. A Simple and Accurate Method To Calculate Free Energy Profiles and Reaction Rates from Restrained Molecular Simulations of Diffusive Processes. The Journal of Physical Chemistry B. 2016;120(33):8457–72. pmid:27135391
  34. Webb B, Sali A. Comparative Protein Structure Modeling Using MODELLER. In: Current Protocols in Bioinformatics. John Wiley & Sons, Inc.; 2002.
  35. Brooks BR, Bruccoleri RE, Olafson BD, States DJ, Swaminathan S, Karplus M. CHARMM: A program for macromolecular energy, minimization, and dynamics calculations. Journal of Computational Chemistry. 1983;4(2):187–217.
  36. Eastman P, Swails J, Chodera JD, McGibbon RT, Zhao Y, Beauchamp KA, et al. OpenMM 7: Rapid development of high performance algorithms for molecular dynamics. PLOS Computational Biology. 2017;13(7):e1005659. pmid:28746339
  37. Best RB, Zhu X, Shim J, Lopes PEM, Mittal J, Feig M, et al. Optimization of the Additive CHARMM All-Atom Protein Force Field Targeting Improved Sampling of the Backbone ϕ, ψ and Side-Chain χ1 and χ2 Dihedral Angles. Journal of Chemical Theory and Computation. 2012;8(9):3257–73. pmid:23341755
  38. Mongan J, Simmerling C, McCammon JA, Case DA, Onufriev A. Generalized Born Model with a Simple, Robust Molecular Volume Correction. Journal of Chemical Theory and Computation. 2007;3(1):156–69. pmid:21072141
  39. Nguyen H, Roe DR, Simmerling C. Improved Generalized Born Solvent Model Parameters for Protein Simulations. Journal of Chemical Theory and Computation. 2013;9(4):2020–34. pmid:25788871
  40. Shrake A, Rupley JA. Environment and exposure to solvent of protein atoms. Lysozyme and insulin. Journal of Molecular Biology. 1973;79(2):351–71. pmid:4760134
  41. Baker EN, Hubbard RE. Hydrogen bonding in globular proteins. Progress in Biophysics and Molecular Biology. 1984;44(2):97–179. pmid:6385134
  42. Wernet P, Nordlund D, Bergmann U, Cavalleri M, Odelius M, Ogasawara H, et al. The Structure of the First Coordination Shell in Liquid Water. Science. 2004;304(5673):995–9. pmid:15060287
  43. Kabsch W, Sander C. Dictionary of protein secondary structure: Pattern recognition of hydrogen-bonded and geometrical features. Biopolymers. 1983;22(12):2577–637. pmid:6667333
  44. McGibbon RT, Beauchamp KA, Harrigan MP, Klein C, Swails JM, Hernández CX, et al. MDTraj: A Modern Open Library for the Analysis of Molecular Dynamics Trajectories. Biophysical Journal. 2015;109(8):1528–32. pmid:26488642
  45. Vangone A, Bonvin AM. Contacts-based prediction of binding affinity in protein–protein complexes. eLife. 2015;4:e07454. pmid:26193119
  46. Pierce B, Weng Z. ZRANK: Reranking protein docking predictions with an optimized energy function. Proteins: Structure, Function, and Bioinformatics. 2007;67(4):1078–86.
  47. Pierce B, Weng Z. A combination of rescoring and refinement significantly improves protein docking performance. Proteins: Structure, Function, and Bioinformatics. 2008;72(1):270–9.
  48. Mintseris J, Pierce B, Wiehe K, Anderson R, Chen R, Weng Z. Integrating statistical pair potentials into protein complex prediction. Proteins: Structure, Function, and Bioinformatics. 2007;69(3):511–20.
  49. Liu S, Zhang C, Zhou H, Zhou Y. A physical reference state unifies the structure-derived potential of mean force for protein folding and binding. Proteins: Structure, Function, and Bioinformatics. 2004;56(1):93–101.
  50. Guerois R, Nielsen JE, Serrano L. Predicting Changes in the Stability of Proteins and Protein Complexes: A Study of More Than 1000 Mutations. Journal of Molecular Biology. 2002;320(2):369–87. pmid:12079393
  51. Schymkowitz J, Borg J, Stricher F, Nys R, Rousseau F, Serrano L. The FoldX web server: an online force field. Nucleic Acids Research. 2005;33:W382–8. pmid:15980494
  52. Rykunov D, Fiser A. Effects of amino acid composition, finite size of proteins, and sparse statistics on distance-dependent statistical pair potentials. Proteins: Structure, Function, and Bioinformatics. 2007;67(3):559–68.
  53. Rykunov D, Fiser A. New statistical potential for quality assessment of protein models and a survey of energy functions. BMC Bioinformatics. 2010;11:128. pmid:20226048
  54. Brooks BR, Janežič D, Karplus M. Harmonic analysis of large systems. I. Methodology. Journal of Computational Chemistry. 1995;16(12):1522–42.
  55. Tirion MM. Large Amplitude Elastic Motions in Proteins from a Single-Parameter, Atomic Analysis. Physical Review Letters. 1996;77(9):1905–8. pmid:10063201
  56. Hinsen K. Analysis of domain motions by approximate normal mode calculations. Proteins: Structure, Function, and Bioinformatics. 1998;33(3):417–29.
  57. Atilgan AR, Durell SR, Jernigan RL, Demirel MC, Keskin O, Bahar I. Anisotropy of Fluctuation Dynamics of Proteins with an Elastic Network Model. Biophysical Journal. 2001;80(1):505–15. pmid:11159421
  58. Hinsen K, Petrescu A-J, Dellerue S, Bellissent-Funel M-C, Kneller GR. Harmonicity in slow protein dynamics. Chemical Physics. 2000;261(1):25–37.
  59. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research. 2011;12:2825–30.