Fully automated antibody structure prediction using BIOVIA tools: Validation study

Helen Kemmish; Marc Fasnacht; Lisa Yan

doi:10.1371/journal.pone.0177923

Abstract

We describe the methodology and results from our validation study of the fully automated antibody structure prediction tool available in the BIOVIA (formerly Accelrys) protein modeling suite. Extending our previous study, we have validated the automated approach using a larger and more diverse data set (157 unique antibody Fv domains versus 11 in the previous study). In the current study, we explore the effect of varying several parameter settings in order to better understand their influence on the resulting model quality. Specifically, we investigated the dependence on different methods of framework model construction, antibody numbering schemes (Chothia, IMGT, Honegger and Kabat), the influence of compatibility of loop templates using canonical type filtering, wider exploration of model solution space, and others. Our results show that our recently introduced Top5 framework modeling method results in a small but significant improvement in model quality whereas the effect of other parameters is not significant. Our analysis provides improved best-practice guidelines for using our protocol to build antibody structures. We also identify some limitations of the current computational model, which will enhance proper evaluation of model quality by users and suggest possible future enhancements.

Citation: Kemmish H, Fasnacht M, Yan L (2017) Fully automated antibody structure prediction using BIOVIA tools: Validation study. PLoS ONE 12(5): e0177923. https://doi.org/10.1371/journal.pone.0177923

Editor: Andrew C. Gill, University of Edinburgh, UNITED KINGDOM

Received: June 10, 2016; Accepted: May 5, 2017; Published: May 18, 2017

Copyright: © 2017 Kemmish et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the paper and its Supporting Information files.

Funding: HK and LY are employed by Dassault Systèmes Biovia Corp. The funder provided support in the form of salaries for authors HK and LY, but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.

Competing interests: HK and LY are employed by Dassault Systèmes Biovia Corp and the study pertains to a marketed product, Discovery Studio. This does not alter our adherence to PLOS ONE policies on sharing data and materials.

Introduction

Recent advances and success in using antibodies in treating diseases, including cancer, inflammation and rheumatoid arthritis [1, 2], have created great interest in designing new antibody biologics. Building three-dimensional models from protein sequences is frequently an important step in the antibody design process, enabling researchers to study antibody properties such as stability, antigenicity, aggregation propensity, solubility, viscosity, and more. In addition, when used in combination with protein-protein docking methods, these models can be used to understand and predict antibody-antigen interactions.

Homology modeling is a well-established method, which has been shown to produce quite accurate models for a protein sequence if an X-ray structure of a protein with a sufficient degree of sequence similarity is available [3, 4]. The area of antibody design and engineering is a case for which homology modeling is particularly well suited, because in general the overall sequence and structural similarity between antibodies is very high. In particular, the framework regions of antibodies are very well conserved, with most of the variability occurring in the complementarity-determining regions (CDRs). This property of antibodies has led to the development of specialized structure prediction methods [5, 6, 7, 8, 9, 10, 11] which have been shown to outperform generalist methods [12].

Antibody structure prediction methods generally follow a two-stage approach. In the first stage, an accurate model of the framework regions (i.e. excluding the CDR regions) is constructed based on appropriate templates [5, 6, 7, 8, 9, 10, 11]. The framework templates are typically selected based on sequence similarity from a curated database of antibody structures. Models are then built either based on a single template for the whole structure [5, 7] or separate templates can be used for the VH and VL chains [5, 6, 10, 11]. In the latter case an additional step is required to determine the relative orientation of the chains [5, 6, 11].

In the second stage, the hypervariable loop regions of the structure are rebuilt. Five of the six CDR regions typically adopt a limited number of conformations [13] and can in most cases be accurately modeled by grafting the regions from an appropriate template [14]. However, a number of different approaches are possible for CDR template selection and loop grafting [5, 6, 7, 8, 9, 10, 11].

The H3 loop is more difficult to model because this region exhibits a much larger degree of variation in loop length and conformations adopted. For H3, ab initio loop modeling methods have been shown to increase accuracy compared to template based models in some cases [7, 8, 9].

A blind prediction experiment assessing various antibody structure prediction methods was performed in 2009 [15]. The results of BIOVIA’s participation in this experiment generally validated our template-based modeling approach. However, it also identified some deficiencies in our modeling process. The lessons learned allowed us to improve our performance in the second instalment of the Antibody Modeling Assessment (AMA-II), which was executed in early 2013 (http://www.3dabmod.com) [5, 12].

Based on the further experience gained from this, we developed a fully automated antibody modeling protocol which can be run through the Discovery Studio [16] graphical user interface or in batch mode from the command line. This method is very fast, and can be run on multiple processors with coarse grain parallelization. In addition, the protocol can process combined heavy and light chain inputs, matching separate heavy and light chains by name, or performing permutations combining a set of heavy and light chains. These features make it an ideal solution for structure prediction of multiple sequences. On a standard desktop PC, a single Fv or Fab structure can be predicted in less than 6 minutes; using a server with 30 processors, predictions for a set of over 150 Fv sequences can be completed within an hour.

The fully automated method was initially validated using the sequences from the AMA-II experiments; that study included a diverse set of sequences, but consisted of only eleven targets. We designed the current work to extend the validation of our methods to 157 antibody sequences for which structures are available, and to analyze the influence of several parameters to obtain better understanding of the effect on model quality. This analysis improves the recommendations we can offer when using our protocol to build antibody structures.

Materials and methods

Computational methodology

The automated structure prediction method consists of three stages: Framework template selection, framework model construction, and CDR refinement. These are run in succession without manual intervention after the specification of the initial run parameters.

Framework template selection.

Templates for each target sequence are selected by aligning against sequences in the Discovery Studio antibody database using a Hidden Markov Model [17], and then identifying those with the highest sequence similarity and identity. By default CDR regions are excluded from consideration but the user may choose to include them. The five best templates are found for the whole Fv or Fab region, and also for each of the light and heavy chains.

Framework model construction.

We evaluated the following framework model construction methods implemented in the software:

Single Template: This is the most straightforward approach, in which a model is built based on a single Fv/Fab framework template. This template, which contains both the light and the heavy chain regions, is selected by sequence similarity from a curated antibody database.
Chimeric: This method builds a model based on a chimeric template. This template is assembled from separate light and heavy chain templates. A third interface template, containing the whole Fv/Fab region, is used to determine the relative spatial orientation of the individual light and heavy templates. The templates are selected from the database by sequence similarity for the relevant regions. Note that the light or heavy templates can be identical to the corresponding domain of the interface template.
Top5: This approach builds a model by using up to five Fv/Fab framework templates simultaneously. The five templates which have the best sequence similarity to the target are identified in the database. However, any template with a similarity not within 10% of the best one is rejected, so occasionally fewer than five templates may be used. The models are built based on a multiple sequence alignment of these templates to the target sequence. This is done using the capability of MODELER [18] to construct models based on multiple templates by simultaneously optimizing restraints from all of the templates. MODELER uses an additive distance restraint function that peaks at the equivalent distance between atoms in each template. The contribution for each template is weighted by local sequence similarity, as described in detail in the MODELER paper [18].

In each case, one or more models are built using MODELER, and the top model as ranked by the MODELER PDF Physical Energy is used for further refinement.
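The Top5 template selection rule described above can be illustrated with a minimal sketch. It is not the Discovery Studio implementation: the function name, the template identifiers and the similarity values are hypothetical, and "within 10% of the best one" is interpreted here as a difference of at most 10 percentage points, which is an assumption.

```python
def select_top5_templates(similarities, max_templates=5, window=10.0):
    """Pick framework templates in the spirit of the Top5 rule.

    similarities: dict mapping a (hypothetical) template ID to its
    percentage sequence similarity to the target.
    Assumption: 'within 10% of the best' means a difference of at most
    `window` percentage points from the best similarity.
    """
    # Rank templates from most to least similar.
    ranked = sorted(similarities.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked:
        return []
    best = ranked[0][1]
    # Reject any template that falls outside the window around the best one.
    kept = [tid for tid, sim in ranked if best - sim <= window]
    # Never use more than five templates.
    return kept[:max_templates]


# Made-up example values, for illustration only.
example = {"T1": 95.0, "T2": 93.0, "T3": 90.0, "T4": 84.0, "T5": 80.0, "T6": 94.0}
selected = select_top5_templates(example)
```

In this made-up example only four templates survive the window, which mirrors the remark that occasionally fewer than five templates may be used.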

CDR refinement.

The top-ranking framework structure can then have any or all of its CDR loops refined. The CDR loops are located using the IMGT [19], Chothia [13, 20, 21], Honegger [22] or Kabat [23] numbering schemes.

Loop templates are identified based on alignment of the target to sequences in the antibody database which have identical CDR loop lengths. The templates may be filtered to use those which have the correct Chothia canonical type if available; the canonical type definitions [21] are shown in S1 File. The templates are ranked with a BLOSUM62 similarity score of the CDR region plus the stem residues. There is an additional ranking which favors templates that have high scores for the other two CDR loops in the domain. This can be beneficial as the conformation of the three loops may be interdependent. The final ranking is by crystallographic resolution. MODELER is used to build one or more new CDR models while keeping the framework region intact.
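As a rough illustration of the loop template ranking just described, the sketch below filters candidates by loop length (and optionally by canonical type) and then sorts them. The text does not say exactly how the three ranking criteria are combined, so the strict lexicographic ordering used here is an assumption, as are the class and function names, the field names, and the default of three templates per loop. The scores are taken as precomputed inputs rather than calculated from BLOSUM62.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LoopTemplate:
    template_id: str               # hypothetical identifier
    loop_length: int               # CDR loop length of the template
    canonical_type: Optional[str]  # e.g. "kL1:2A", or None if unassigned
    loop_score: float              # similarity score of CDR plus stem residues
    other_loops_score: float       # combined score for the other two CDR loops
    resolution: float              # crystallographic resolution in Angstrom


def rank_loop_templates(templates: List[LoopTemplate],
                        target_length: int,
                        target_canonical: Optional[str] = None,
                        n: int = 3) -> List[LoopTemplate]:
    """Return the n best loop templates under an assumed lexicographic ranking."""
    # Only templates with an identical loop length can be used.
    candidates = [t for t in templates if t.loop_length == target_length]
    # Optional canonical type filtering.
    if target_canonical is not None:
        candidates = [t for t in candidates if t.canonical_type == target_canonical]
    # Assumed order: loop score, then other-loop score, then better (lower) resolution.
    candidates.sort(key=lambda t: (-t.loop_score, -t.other_loops_score, t.resolution))
    return candidates[:n]
```

Passing `target_canonical=None` corresponds to the unfiltered case, so the same function can be used to compare the template sets chosen with and without canonical filtering.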

Validation dataset

Validation of the method requires predicting the structures of antibodies for which the structures have been experimentally determined, but which are not yet present in the template database. Therefore, the computations were performed using the templates present in the Discovery Studio 4.1 database, while the validation set was created by searching the Protein Data Bank (PDB) [24] for newer antibody structures. Sequences were retained regardless of their similarity to those in the database because real usage often involves predicting the structures of a highly similar series of sequences, which may have identical frameworks or loop regions. This yielded an initial set of 249 Fv target sequences. Any structure with missing residues within the light or heavy chain was excluded. The set was further pruned to 95% sequence identity, choosing representatives with more complete termini and/or having structures with better crystallographic resolution. This resulted in a validation set of 157 unique Fv sequences. These are listed in S1 Text.
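The pruning step can be sketched as a greedy redundancy filter. This is only an illustration under stated assumptions: the sequences are assumed to be already aligned to a common length, representatives are preferred by crystallographic resolution alone (the completeness of the termini, which was also considered in the study, is ignored), and pairs with identity above the threshold are treated as redundant. The function names and entry format are hypothetical.

```python
def percent_identity(a, b):
    """Percentage identity of two pre-aligned sequences ('-' marks a gap)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Compare only columns where both sequences have a residue.
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def prune_redundant(entries, threshold=95.0):
    """Greedy pruning of near-identical sequences.

    entries: list of (name, aligned_sequence, resolution) tuples.
    Entries are visited from best to worst resolution; an entry is kept
    only if its identity to every entry already kept is at most `threshold`.
    """
    kept = []
    for name, seq, res in sorted(entries, key=lambda e: e[2]):
        if all(percent_identity(seq, kseq) <= threshold for _, kseq, _ in kept):
            kept.append((name, seq, res))
    return [name for name, _, _ in kept]
```

Because the filter is greedy, the result depends on the visiting order; sorting by resolution first is what makes the better-resolved structure the representative of each redundant group.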

While most of the structures were of good or reasonable crystallographic resolution below 3.0 Å, 16 were in the range from 3–5 Å and three had been determined by electron microscopy with ‘resolution’ above 13 Å. The deviations between the experimental and predicted models for the electron microscopy structures are as likely to be due to inaccuracies in the deposited structure as in the prediction, so they were excluded from the analysis, leaving 154 structures.

The organism classifications of this final set are 75 human, 68 mouse, 5 rabbit, 4 rhesus macaque and 2 chicken antibodies; note however that this includes engineered structures. 125 have kappa light chains and 29 have lambda light chains.

Loop length distributions, using the Chothia definitions, are shown in Fig 1.

Fig 1. Length distributions for each of the CDR loops.

https://doi.org/10.1371/journal.pone.0177923.g001

The vast majority of the target sequences have at least one template in the Discovery Studio database with a sequence similarity above 90% for the Fv domain and above 80% for all CDR regions except for H3, as is shown by the histograms in Fig 2.

Fig 2. Distribution of sequence similarity of query sequence to database templates.

CDR loops were defined using the Chothia numbering scheme.

https://doi.org/10.1371/journal.pone.0177923.g002

Validation calculations

Structures were predicted for the validation set using a variety of the available options:

Each of the framework template modeling methods
Building different numbers of models
IMGT, Kabat, Honegger and Chothia loop definitions, the latter with or without canonical filtering

Comparison of predicted and experimental models.

The predicted models were compared with the experimental X-ray structures using the same methods as were applied in the AMA-II assessment [12]. This entailed superimposing each predicted-experimental pair using the β-sheet core, and then calculating RMSDs of the peptide carbonyl atoms for the light and heavy chain framework regions and for each of the CDR loops as defined by the Chothia scheme. Carbonyl RMSDs are used as they are sensitive to variations such as peptide flips which are not revealed by the commonly-used C-α RMSDs. In addition, the deviation in tilt angle between the light and heavy chain regions was calculated, again using the same method as for the AMA-II work [5]. The RMSD and tilt angle information, together with details of the templates used in each prediction, are tabulated in S1 Table.
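For reference, the RMSD over a chosen set of atoms, once the two structures have been superimposed, can be computed as in the sketch below. It does not perform the superposition on the β-sheet core and does not select the carbonyl atoms; both are assumed to have been done by the caller, and the function name is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched coordinate lists.

    coords_a, coords_b: equal-length lists of (x, y, z) tuples in Angstrom,
    for example the peptide carbonyl atoms of one CDR loop in the predicted
    and experimental structures, already in the same frame of reference.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between equivalent atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

Because the superposition is fixed on the core rather than refitted per loop, a loop RMSD computed this way reflects both the loop conformation and its placement relative to the framework.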

Further analysis.

Custom protocols were created in BIOVIA’s Pipeline Pilot to analyze and compare the predictions. Many of the results are presented using box plots, which are a compact means of displaying the distributions for several sets of data on the same chart. The bottom and top of the box are at the first and third quartile respectively, while the line within it marks the median value and a dot marks the mean value. The ‘whiskers’ are calculated by the Tukey method, extending up to 1.5 times the interquartile range beyond the first and third quartiles. In some plots, any outliers beyond these values are plotted as small squares, and this is the definition of ‘outlier’ used in parts of the discussion. The results obtained by different methods were also assessed using a pairwise t-test, with p values below 0.05 being considered to be statistically significant.
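The box plot quantities and the test statistic can be sketched with the Python standard library. This is an illustration, not the Pipeline Pilot protocol used in the study: the quartile interpolation method is an assumption (plotting packages differ), and the t-test is assumed to be a paired test on per-target differences. Only the t statistic and degrees of freedom are computed; a p value would need a t distribution, for example via `scipy.stats.ttest_rel`.

```python
import math
import statistics


def box_stats(values, k=1.5):
    """Quartiles, mean, Tukey whiskers and outliers for one data set."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Tukey fences: k times the interquartile range beyond each quartile.
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "mean": statistics.fmean(values),
        # Whiskers end at the most extreme data points inside the fences.
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }


def paired_t_statistic(a, b):
    """t statistic and degrees of freedom for paired samples a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two samples of equal length, at least 2 values")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = statistics.fmean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation of differences
    return mean_d / (sd_d / math.sqrt(n)), n - 1
```

A paired form is the natural choice here because each method is run on the same set of target sequences, so the per-target differences remove most of the target-to-target variation.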

Detailed analysis was performed within Discovery Studio 4.5, utilizing its sequence alignment, structural superimposition and visualisation capabilities.

Results and discussion

Template based refinement of the CDR loops by homology modeling requires a template with identical loop length. The validation dataset contains 6 sequences in which no template was available for one of the loops using any of the loop definitions, and a further 4 and 5 cases respectively for the Kabat and Honegger definitions. Details can be found in S1 Text. The set of 11 predictions using the Honegger definition were examined; the similarities for the best framework model were all above 77%. The overall and framework RMSDs for this group versus the remainder of the set with framework similarity above 77% were compared. Fig 3 shows that in addition to the unsurprising decrease in accuracy of the overall RMSD, the quality of the framework models is also generally poorer. There are particularly large distortions if long CDRH3 loops are misplaced. These cases have been excluded from the remaining analyses, leaving 148 sequences for the Chothia and IMGT methods.

Fig 3. Effect of missing loop templates.

Framework (FR) and overall (FV) RMSDs for predictions with all loops modeled or not.

https://doi.org/10.1371/journal.pone.0177923.g003

Framework models

The predictions were run for the sequence set using each of the three framework modeling methods (Single, Chimeric and Top5) with all other conditions the same. Fig 4 is a boxplot for the RMSD of the predicted models versus the experimental structures for the framework (FR) and whole Fv region including CDRs (FV), and Table 1 lists the p values which show whether the results are significantly different. These show that the Top5 method yields more accurate framework models which result in better models overall, and there is little difference between the Single and Chimeric approaches.

Fig 4. Comparison of framework modeling methods.

Framework (FR) and overall (FV) RMSDs (Å) for the three methods.

https://doi.org/10.1371/journal.pone.0177923.g004

Table 1. Statistical comparison of framework modeling methods.

https://doi.org/10.1371/journal.pone.0177923.t001

As expected, the accuracy of the predicted model depends on the availability of sufficiently similar templates. Fig 5 shows the framework RMSD and tilt deviation for predictions using the Top5, Chimeric and Single methods. In each case, quite accurate results are generally obtained for all similarities above 85% but markedly worse below that, as shown in Table 2. All but four of the validation set do have at least one template in the database with >85% similarity.

Fig 5. Effect of framework template similarity.

(A) Boxplot of the RMSD binned by percentage similarity to the best template for the three framework modeling methods. (B) Boxplot of the tilt angle deviation.

https://doi.org/10.1371/journal.pone.0177923.g005

Table 2. Statistics for similarities above and below 85% for the Top5 method.

https://doi.org/10.1371/journal.pone.0177923.t002

The similarity value used for the Top5 plot is that of the highest similarity template available, so it is interesting to note that while the similarity of the other four templates used may be lower, the models produced by this method tend to be more accurate than those using just the single best template even for very high similarities.

The CDR loops are by default excluded from the similarity and identity calculations used to select the templates for framework modeling. Including them makes no significant difference to the overall accuracy of the models, as shown in Fig 6. However, as discussed below, there are some sequences for which including the CDRs is beneficial.

Fig 6. Effect of including CDR regions in similarity calculations.

https://doi.org/10.1371/journal.pone.0177923.g006

CDR loop refinement

The accuracy of the CDR loop modeling will depend on the quality of the initial framework model, as shown in Table 3, but also on the similarity of the loop templates. This trend is shown in Fig 7.

Fig 7. Effect of loop template similarity.

Boxplots showing the effect of similarity of each CDR region on the RMSD for that loop.

https://doi.org/10.1371/journal.pone.0177923.g007

Table 3. Statistics for CDR RMSD similarities above and below 85% framework similarity.

https://doi.org/10.1371/journal.pone.0177923.t003

It has been shown in previous studies that the accuracy of loop models is related to loop length [25]. This is particularly relevant in the case of H3.

However, even for quite long loops, reasonable models may be obtained if there is a highly similar template available. This is illustrated by Fig 8, which is a heat map of the average CDR RMSD for each H3 loop length/similarity combination present in the dataset. The Discovery Studio 4.1 database provided templates with a similarity above 85% for over a quarter of H3 loops with lengths greater than 13 (8 out of 29 cases); as the number of structures deposited in the PDB grows this should increase.
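The aggregation behind such a heat map can be sketched as a simple grouping of per-target results by loop length and similarity bin. The bin width, the record format and the function name are assumptions made for illustration; the actual binning used for the figure is not stated in the text.

```python
from collections import defaultdict


def average_rmsd_grid(records, bin_width=10):
    """Average RMSD per (loop length, similarity bin) cell.

    records: iterable of (loop_length, similarity_percent, rmsd) tuples.
    Returns a dict keyed by (loop_length, bin_start), where bin_start is
    the lower edge of the similarity bin in percent.
    """
    sums = defaultdict(lambda: [0.0, 0])  # cell -> [running total, count]
    for length, similarity, value in records:
        # Put 100% similarity into the top bin rather than a bin of its own.
        bin_start = min(int(similarity // bin_width) * bin_width, 100 - bin_width)
        cell = sums[(length, bin_start)]
        cell[0] += value
        cell[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}
```

Cells with no entries are simply absent from the result, which corresponds to length and similarity combinations not present in the dataset.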

Fig 8. Heat map of average RMSD (Å) for CDR H3 length versus % similarity of best loop template.

https://doi.org/10.1371/journal.pone.0177923.g008

The predictions were run for the sequence set using the IMGT, Honegger, Kabat, Chothia and Chothia with canonical filtering loop definitions, all other parameters being the same. In Fig 9, while there is some variation, there is no clear overall best choice. Comparing each of the other definitions against Chothia, the only statistically significant differences are that the framework RMSD and tilt angle deviations are slightly worse using the Kabat definition (p values 0.05 and 0.03).

Fig 9. Effect of different loop definitions on RMSDs and tilt angle deviation.

https://doi.org/10.1371/journal.pone.0177923.g009

To examine the effect of different template choices in more detail, we identified the cases where different sets of templates had been selected by the Chothia definition with and without canonical filtering. Those which had reasonable framework templates (similarity >85% and RMSD < 1.0 Å) and with loop RMSDs differing by more than 0.5 Å were analysed. The predicted structures are available in S1 Dataset and S2 Dataset.

CDR L1.

Using the above criteria, in the comparison of canonical filtering against the unfiltered Chothia definition, there were two cases where canonical filtering was better than unfiltered (4LIQ_LH_FV and 4QWW_CD_FV) and one in which it was worse (4K7P_LH_FV).

In 4LIQ, the choice of the correct kappa kL1:2A canonical for all three templates produces the correct conformation around the Asn30 residue with phi ~ 60° and psi ~ -120°, which is in a ‘marginal’ region of the Ramachandran plot. Without filtering, two of the selected templates are of canonical type kL1:2B which have phi/psi in ‘allowed’ regions but are not correct in this context. The choice was made on the basis of the scores including the stem regions, which were very slightly better; however, apart from being the wrong canonical type, the similarity and identity for the loop itself were lower. The effect, as shown in Fig 10, a plot of backbone and C-Beta atoms for the X-ray structure and the two predictions, is quite localised.

Fig 10. 4LIQ CDR L1.

X-ray structure red, prediction using canonical filtering green, prediction using unfiltered Chothia loop definition blue.

https://doi.org/10.1371/journal.pone.0177923.g010

4QWW is rather more complicated. All the templates selected had the correct kL1:1 canonical type with or without the canonical filtering. In the case of filtering the templates selected were 4EBQ, 1AY1 and 1YQV. 1AY1 does not adopt the canonical conformation, with the Ser30 being flipped; however, the predicted conformation was closer to the other two. The unfiltered prediction chose 4EBQ, 1AY1 and 3C09; the latter has better similarity than 1YQV but lower resolution. While its conformation is broadly similar to that of 4EBQ and 1YQV, favourable interactions with the L3 loop cause it to be displaced. The net result is that the predicted model tends towards the incorrect conformation of 1AY1.

The L1 loop of 4K7P is canonical kL1:2A. The templates of this type selected using filtering are 1MQK, 1F6L and 3V7A. However, the latter is flipped at Tyr30 relative to the other two; while the predicted loop lies closer to 1MQK and 1F6L at residues 29 and 31, it adopts the flipped conformation at residue 30. The templates chosen without canonical filtering are 1F6L, 2FR4, 1P7K. This set has a high ranking because they are also high scoring templates for the L2 and L3 loops. 2FR4 and 1P7K do not belong to any canonical type, but differ from type kL1:2A only in having a leucine rather than isoleucine at residue position 2 and adopt similar conformations to 1F6L.

The results for 4QWW and 4K7P suggest that it might be beneficial to check that the conformation of templates adheres to the canonical type.

CDR L2.

For the L2 loop, 4JO4_LH, which is one of the rabbit sequences, has canonical type kL2:1. With canonical filtering, the selected templates are 2CMR, 1LK3 and 1OP3. Without filtering, the latter is replaced by 1DFB. All templates are of the correct canonical type and have 100% identity. The conformations are all correct except for some deviation at residue 52 for 1DFB, which does not explain well why the predicted loop is flipped at the 50–51 peptide bond. Examination of a run in which 50 models were generated for each loop shows that this is an anomalous result, with only three of the models adopting the flipped conformation relative to the templates. This is shown in Fig 11A.

Fig 11. Predicted CDR L2 loop and templates for 4JO4 and 4C83.

(A) 4JO4; 3 anomalous conformations are shown in red; the other 47 in blue are close to the templates shown in yellow. (B) 4C83; 3 anomalous conformations are shown in red; the other 47 in green are close to the templates shown in yellow.

https://doi.org/10.1371/journal.pone.0177923.g011

Conversely, in the case of 4C83_BA, accurate results are produced by the unfiltered method but there is a flip of the 50–51 peptide bond for the model obtained with filtering. The L2 loop of 4C83 is of type kL2:1 and the canonically-filtered templates are 2W9D, 4F33 and 4I9W. Without filtering, the latter is replaced by 3NCY which is also of the correct canonical type. As before, all the templates are in the correct canonical conformation and are 100% identical. Examination of a run with canonical filtering generating 50 models again shows that most do adopt the same conformation as the template but three are flipped. This is shown in Fig 11B. So, it seems that the differences in these cases are not really due to the method of choosing the templates but are an artefact of model building.

4JG1_LH is a case in which canonical filtering yielded a poor set of templates. The canonical type is kL2:1, which requires isoleucine or valine at residue 48 and glycine at 64. Of the templates matching this type which were selected, 4D9Q had a similarity and identity of 67%, while the other two, 2ADF and 1FJ1, had no similarity at all for the three loop residues. This pair adopted the conformation typical for the canonical type whereas 4D9Q did not. Without filtering, the three templates selected (3BQU, 1I8K and 3I9G) had similarities of 100% with two being 100% identical. This set of templates had similar conformation to 4D9Q and correctly modelled the loop. However, they differed from the canonical definition by having a serine instead of glycine at residue 64. These are shown in Fig 12 together with some examples of typical kL2:1 canonical loops for comparison.

Fig 12. 4JG1 L2 loop.

X-ray structure red, prediction with canonical filtering green, prediction with no filtering blue. Templates with no filtering cyan; 4D9Q orange; 2ADF and 1FJ1 white. Some typical kL2:1 loops are shown in purple for comparison.

https://doi.org/10.1371/journal.pone.0177923.g012

CDR L3.

There were no cases found where the RMSD for the L3 loop differed by more than 0.5Å between the filtered and unfiltered templates.

CDR H1.

4M5Y_LH benefits from canonical filtering, by selecting templates 3CX5, 4HC1 and 2XA8 which are all of the correct canonical type, H1:2. The templates have reasonable similarity and identity to the target, and adopt the same conformation. Without filtering, the first two of these are selected but the highest ranked template is 2XZC, which has a very high score because of high identity of the loop and stem regions. But it does not conform to any canonical type and adopts a different conformation which dominates the prediction.

In the case of 4O02_LH, all the selected templates were of canonical type H1:1 and had 100% similarity, 71.4% identity and a score of 89. They were 1WT4 in both cases, plus 3CMO and 1EHL with filtering. The unfiltered selection, which includes in its ranking criteria the scores of the other two loop regions, chose 1A6V_I and 1A6V_J. All the templates had the correct canonical conformation except for 1A6V_J, which differed at residues 29–30; the predicted model was similar to this. 1A6V is a structure which contains three non-crystallographically related copies, demonstrating the variability in conformation which can arise due to packing. Comparing the loops, the main-chain RMSDs are between 1–1.7Å for H1, 1.3–1.8Å for H2 and 1.1–1.5Å for H3.

4LEO_BA has a >0.5Å worse RMSD for the H1 loop for the prediction using canonical filtering versus unfiltered Chothia, but neither is very accurate (1.7Å and 1.2Å). The templates (3EO0_B and 1MJ8_H for both, plus 1MH5_B with filtering and 2UYL_B without) all had similarity/identity of 71%; this is relatively low compared to most of the other cases.

CDR H2.

4NKI_LH is a case where using canonical filtering yielded a much better result for H2 versus the unfiltered Chothia definition (RMSD for the loop residues 0.4Å vs 2.4Å). The target sequence has canonical type H2:3. With filtering, the selected templates were 3HI6_H, 3HI5_H and 3KYM_B (Fig 13A), whereas without filtering the templates chosen were 3HI6_H, 3K2U_H and 2WUC_H, the last two of which are canonical H2:2 (Fig 13B). The reason for this choice was that they have slightly better scores for the loop region including the stems; however, they adopt a significantly different conformation, especially at Pro52A.

Fig 13. 4NKI_LH H2 loop.

X-ray structure red, predicted structure blue, H2:3 templates green, H2:2 templates yellow. (A) with canonical filtering. (B) without canonical filtering.

https://doi.org/10.1371/journal.pone.0177923.g013

In 4G80_BA, the canonical type is H2:2, and the templates chosen using filtering are 1NJ9_B, 3EFD_H and 3IVK_A. The latter adopts an anomalous conformation at Pro52A but the prediction is closer to the other two; this produces a good fit, with a loop RMSD of 0.66Å. Without filtering, the templates are again 1NJ9_B, 3IVK_A and the H2:3 canonical 1SEQ_H. In addition to being the wrong canonical type, 1SEQ_H also has similarity/identity of only 25% but has a better score including stem regions than 3EFD. It has a drastically different conformation and the resulting prediction, having only one template with a typical canonical conformation, is inaccurate with a loop RMSD of 1.56Å.

Filtering by organism

The protocol allows the choice of templates for both the framework and CDR loops to be restricted to those from a specified organism. Running a prediction on just the sequences classified as human, with canonical filtering and specifying the organism as ‘human’, 47 sequences had at least one loop with no templates found, of which 23 had no loops matched and so no model was created. Predictions on the sequences classified as mouse with filtering by organism ‘mouse’ resulted in 37 cases where at least one loop could not be modelled. Combining the results for which all loops were predicted using organism filtering and comparing against the prediction with no organism filtering showed little overall difference, with some loop types being rather worse with the filtering. So in general it does not appear to be beneficial to use this option. It should be noted that the taxonomic classification will not be correct in the case of engineered antibodies, as one or more loops may not derive from the organism of the rest of the structure.

Effect of number of cycles of refinement

The predictions were run to generate N framework models using the Top5 method, and then for the best of these, N loop models using the Chothia definition, for N = 1, 10, 25 and 50. Fig 14 shows that very little difference can be seen between the overall results for the framework or loop regions. Increasing from 1 model to 10 gives a slight improvement for the framework and for all loop regions except H3, but there is little change thereafter. The differences are only statistically significant for the framework region and loops L2 and H2, and are probably not large enough to be meaningful in practice.

Fig 14. Effect of number of cycles of refinement.

Box plots showing variation in framework and loop RMSDs for different numbers of framework models and CDR refinement cycles.

https://doi.org/10.1371/journal.pone.0177923.g014

Examination of outliers

In order to understand factors affecting the accuracy of the predicted structures, outliers were examined to see what might be giving rise to unusually large discrepancies from the X-ray structures in a small subset of cases. As is clear from Fig 5, a major consideration is whether high similarity templates are available in the database. To account for this, the analysis was performed considering only those for which there was an overall Fv template with similarity above 85%. In addition, cases without loop templates of the correct length are also likely to be unreliable so these were excluded from this part of the analysis. The predictions generated using the Top5 framework template method and Chothia loop numbering were used for the analysis; these structures are available in S1 Dataset.

Framework region.

Fig 15A shows that the framework RMSD for 75% of the predicted structures is within 0.9Å of the experimental structure, with a median value of 0.7Å and all below 1.5Å. The four outliers with RMSDs above 1.3Å are listed in Table 4: 3ZL4_LH (1.5Å), 4QHM_BA (1.5Å), 4LVH_CB (1.4Å) and 4MWF_LH (1.4Å). Fig 15B shows that the tilt angle deviations are generally quite low, with 75% falling below 6°. There are five outliers with angles above 9° shown in Table 5: 3ZL4_LH (15.6°), 4CNI_BA (13°), 4NIK_BB (11.4°), 4MWF_LH (10.0°) and 4FZE_LH (9.9°). Unsurprisingly, some structures are outliers for both RMSD and tilt deviation.

Fig 15. RMSD and tilt angle deviation for predictions with good templates available.

https://doi.org/10.1371/journal.pone.0177923.g015

Table 4. Framework RMSD outliers.

https://doi.org/10.1371/journal.pone.0177923.t004

Table 5. Tilt angle deviation outliers.

https://doi.org/10.1371/journal.pone.0177923.t005
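The kind of summary quoted for Fig 15 (median, 75th percentile, and structures above a fixed cut-off) can be sketched as follows. The function name `summarise` and the quartile convention are assumptions, since the text does not state how the box-plot statistics were computed. The five named tilt deviations are those listed for Table 5; the `placeholder_*` entries are invented values included only so that the quartiles can be evaluated, and are not results from the study.

```python
from statistics import quantiles


def summarise(values_by_model, cutoff):
    """Return median, upper quartile and the models whose value exceeds cutoff."""
    values = sorted(values_by_model.values())
    # Quartile convention is an assumption; "inclusive" treats the data as a full population.
    _, q2, q3 = quantiles(values, n=4, method="inclusive")
    outliers = {name: v for name, v in values_by_model.items() if v > cutoff}
    return {"median": q2, "q3": q3, "outliers": outliers}


tilt_deviation = {
    # Values quoted in the text for the Table 5 outliers (degrees).
    "3ZL4_LH": 15.6,
    "4CNI_BA": 13.0,
    "4NIK_BB": 11.4,
    "4MWF_LH": 10.0,
    "4FZE_LH": 9.9,
    # Invented placeholders standing in for the non-outlier structures.
    "placeholder_1": 2.0,
    "placeholder_2": 4.5,
    "placeholder_3": 5.5,
}

summary = summarise(tilt_deviation, cutoff=9.0)
```

Applied to the full set of predictions, the same call with a 1.3Å cut-off on framework RMSD would give the Table 4 style of listing.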

The best ranked template for 3ZL4_LH is 2XZC_LH. The sequences of the Fv regions have a very high identity, differing only in the last four residues of the L chain and one near the end of the H chain. Superimposing the X-ray structure onto this template and calculating the framework RMSD gives a value of 1.7Å, similar to the discrepancy in the predicted model. However, examination of the full sequence shows that the light chain of 3ZL4 has lambda variable and constant domains, whereas in 2XZC there is a kappa constant domain [26]. The structure was engineered in order to investigate the effect of switching between kappa and lambda constant domains on the structure and functionality of the antibody; this was found to cause a 12° change in the elbow angle. If the structure of the full Fab domain is predicted, all of the templates used for the framework model have a lambda constant domain and the RMSD for the framework of the Fv region of this structure falls to within 1.1Å of the X-ray structure.

Examining 4LVH_CB, the discrepancies in the framework lie mainly at the N-termini of the L and H chains, which appear to be misaligned by one residue. The misalignment in the L chain occurs around Pro8, which is in the trans form in the X-ray structure but adopts the cis conformation in the predicted model. The most similar template has a trans Pro8 but the other four are cis. In the case of the H chain, the discrepancy arises in the region before Gly8, Gly9 and Gly10. This highly flexible region allows for variation in the templates which is reflected in the model structure. An examination of the experimental structure for violations shows that there are 45 non-planar peptide bonds whereas there are none in the predicted structure; comparison of Ramachandran plots similarly shows fewer violations in the predicted structure. It is unsurprising that a modeled structure does not replicate these violations.

In 4QHM_BA, the most obvious structural difference is in the turn between Gln39 and Leu45 of the heavy chain. Examining the relationship to the templates, it is evident that while the overall similarity to the target is within 10% of that of the best template, the discrepancy is greater if the CDR regions are also considered. This is shown in Table 6. In this case the inclusion of the lower-similarity templates appears to be leading to sub-optimal modeling of some regions; the single template method yields a lower framework RMSD (1.0Å). However, the overall RMSDs for the models produced by the two methods are very similar (1.49Å and 1.42Å). Better results (framework RMSD 1.0Å, overall RMSD 1.1Å) are obtained in this case by not excluding the CDR regions from the similarity and identity calculation; the top 5 overall templates are still the same but only the first two are used as the similarities of the others are over 10% poorer.

Table 6. Similarities for the 4QHM_BA templates.

https://doi.org/10.1371/journal.pone.0177923.t006
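The template cut-off described for 4QHM_BA (up to five top-ranked templates, keeping only those whose similarity is within 10% of the best) can be sketched as below. This is an illustration under stated assumptions rather than the protocol's code: the function name `select_framework_templates` is hypothetical, the 10% window is interpreted here as percentage points with an inclusive bound, and the template names and similarities are invented placeholders, not the Table 6 values.

```python
def select_framework_templates(candidates, max_templates=5, window=10.0):
    """Pick framework templates from (name, similarity_percent) pairs.

    Keeps at most max_templates of the highest-similarity candidates and
    drops any whose similarity is more than `window` percentage points
    below the best one (the interpretation of "within 10%" is an assumption).
    """
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)[:max_templates]
    if not ranked:
        return []
    best = ranked[0][1]
    return [name for name, sim in ranked if best - sim <= window]


# Placeholder names and similarities, chosen so that only two templates survive.
placeholder_candidates = [
    ("tmpl_1", 95.0),
    ("tmpl_2", 90.0),
    ("tmpl_3", 84.0),
    ("tmpl_4", 83.0),
    ("tmpl_5", 80.0),
    ("tmpl_6", 79.0),
]

selected = select_framework_templates(placeholder_candidates)
```

Whether the similarity passed in excludes or includes the CDR regions changes which templates fall inside the window, which is the effect discussed for 4QHM_BA.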

4MWF_LH is an outlier both for RMSD and tilt angle deviation. The H3 loop is 16 residues long and includes a disulfide bridge. It adopts a significantly different conformation from the H3 loops used to build the framework model, and even more different from any of the templates used for CDR modeling, which only have similarities of 25–31%. It is likely that this large discrepancy in the final loop conformation causes the inaccurate orientation of the domains.

4CNI_BA and 4FZE_LH may be outliers for tilt angle because they adopt VL-VH orientations towards the extremes of the distributions found using the ABangle webserver [27] for at least one of its measures. In the case of 4FZE, this is particularly marked for the HL angle, whereas for the template 4JZO (which is used twice as it exists in two slightly different forms in the crystal structure) this angle lies towards the other extreme. Using the single template method for framework prediction with the template 3MA9 yields a much more accurate structure with tilt deviation 3.7°, framework RMSD 0.6Å and overall RMSD 1.5Å.

Loop regions.

The loop regions were examined in a similar way to identify factors other than template similarity which could adversely affect accuracy. This part of the analysis was therefore further restricted to the cases for which the loop being investigated had at least one template with similarity above 85%. The RMSD ranges for these are shown in Table 7 and Fig 16.

Fig 16. RMSDs for CDRs for predictions with good templates available.

https://doi.org/10.1371/journal.pone.0177923.g016

Table 7. RMSD statistics for the CDR loops.

https://doi.org/10.1371/journal.pone.0177923.t007
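For reference, the RMSD values quoted for loops throughout this section correspond to the standard root-mean-square deviation over matched atoms. The sketch below assumes that the two coordinate sets have already been superimposed and contain the same atoms in the same order; the choice of atoms and the superposition step are not specified in this section, and the function name `rmsd` is illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Two atoms, each displaced by 1 Å along z.
example = rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
               [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)])
```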