Abstract
Profilin 1 (PFN1) protein plays key roles in neuronal growth and differentiation, membrane trafficking, and regulation of the actin cytoskeleton. Four natural variants of PFN1 were described as related to ALS, the most common adult-onset motor neuron disorder. However, the pathological mechanism of PFN1 in ALS is not yet completely understood. The goal of this work is to thoroughly analyze the effects of the ALS-related mutations on PFN1 structure and function using computational simulations. Here, PhD-SNP, PMUT, PolyPhen-2, SIFT, SNAP, SNPs&GO, SAAP, nsSNPAnalyzer, SNPeffect4.0 and I-Mutant2.0 were used to predict the functional and stability effects of PFN1 mutations. ConSurf was used for the evolutionary conservation analysis, and GROMACS was used to perform the MD simulations. The mutations C71G, M114T, and G118V, but not E117G, were predicted as deleterious by most of the functional prediction algorithms that were used. The stability prediction indicated that the ALS-related mutations could destabilize PFN1. The ConSurf analysis indicated that the mutations C71G, M114T, E117G, and G118V occur in highly conserved positions. The MD results indicated that the studied mutations could affect the PFN1 flexibility at the actin and PLP-binding domains, and consequently, their intermolecular interactions. These changes may therefore be related to the functional impairment of PFN1 upon C71G, M114T, E117G and G118V mutations, and their involvement in ALS development. We also developed a database, SNPMOL (http://www.snpmol.org/), containing the results presented in this paper for biologists and clinicians to exploit PFN1 and its natural variants.
Citation: Pereira GRC, Tellini GHAS, De Mesquita JF (2019) In silico analysis of PFN1 related to amyotrophic lateral sclerosis. PLoS ONE 14(6): e0215723. https://doi.org/10.1371/journal.pone.0215723
Editor: Salvatore Adinolfi, King's College London, UNITED KINGDOM
Received: June 5, 2018; Accepted: April 9, 2019; Published: June 19, 2019
Copyright: © 2019 Pereira et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the manuscript, Supporting Information files, and on Figshare: https://figshare.com/s/d75b0beae0327a7addce. Data is also available at http://www.snpmol.org/.
Funding: This study was supported by Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro (FAPERJ) (http://www.faperj.br/) to GRCP, Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) (http://www.capes.gov.br/) to GRCP, Financiadora de Estudos e Projetos (FINEP) (http://www.finep.gov.br/) to GRCP, Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) (http://cnpq.br/) to GRCP and NVIDIA Corporation to GRCP. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: This study was supported by NVIDIA Corporation to GRCP. This does not alter our adherence to PLOS ONE policies on sharing data and materials. There are no patents, products in development or marketed products associated with this research to declare.
Introduction
Amyotrophic lateral sclerosis (ALS) is a neurodegenerative disease that progressively affects the upper and lower motor neurons, leading to muscular atrophy and paralysis due to neuron injury and death [1]. ALS is the most common adult-onset motor neuron disorder [2] with an estimated economic burden of over one billion dollars a year in the United States alone [3]. Due to the lack of effective treatments, ALS leads to death within 2 to 5 years after the diagnosis, usually due to respiratory paralysis [4]. Most ALS cases are sporadic (sALS); however, 5–10% of the ALS cases are familial (fALS) and related to genetic causes [5].
Four non-synonymous single nucleotide variants (nsSNVs) in the PFN1 gene were described as being involved with fALS development [6,7]. Interestingly, these mutations were also found in sporadic cases of ALS [8]. The PFN1 gene encodes profilin 1 (PFN1), a 140-residue, ubiquitously expressed [9] cytosolic protein [10] that plays key roles in the regulation of the actin cytoskeleton [11].
PFN1 is crucial for monomeric actin conversion into filamentous actin, as it sequesters cytosolic actin monomers and catalyzes the assembly of monomers into filamentous actin [9]. PFN1 also interacts with poly-L-proline (PLP) sequences and major proline-rich protein families, such as vasodilator-stimulated phosphoproteins (VASP), which participate in the nucleation and elongation of actin filaments. PFN1 interaction with these cytoskeleton regulators is an important generator of actin-based structures [12]. Previous studies have shown that PFN1 is also an important regulator of cell motility events, including migration and invasion of breast cancer and vascular endothelial cells. Furthermore, disrupted PFN1 interactions, as well as reduced PFN1 expression, have been shown to cause impaired capillary morphogenesis and defects in neurite development [13].
Moreover, PFN1 is involved in many cellular processes [11] through the interaction with diverse binding partners [14], including structural proteins in neurons, growth factors [9], ribonuclear particles [15] and proteins involved in signaling cascades [9]. PFN1 also plays important roles in membrane trafficking [16], RNA processing and transcription [9], GTPase signaling [17], and neuronal growth and differentiation [16]. In neurons, PFN1 is essential for neuronal development, formation and maintenance of the neuronal cytoskeleton, synaptic formation and activities, as well as growth of dendrites and axons [8].
ALS-related mutations in PFN1 are known to cause cytoskeletal disruption in neurons [10], resulting in axonal dysfunction and retraction. This leads to synaptic failure with consequent denervation of post-synaptic motor neurons [18]. Cytoskeletal defects play a major role in motor neuron diseases and contribute importantly to ALS pathogenesis [19]. It is also known that PFN1 mutations cause proteostasis disturbances [14], which are evidenced by the presence of biological markers, such as formation of cytoplasmic protein inclusions [10] and accumulation of ubiquitin and p62 [20]. PFN1 mutations are known to destabilize PFN1, resulting in structural perturbations that lead to protein aggregation [17]. Protein misfolding and aggregation result in proteostasis network disturbance, which is believed to contribute to early events in ALS pathogenesis [21]. Thus, studying the PFN1 missense mutations may contribute to a better understanding of the ALS pathophysiology.
Next-generation sequencing experiments reveal millions of novel SNVs [22]. However, the experimental characterization of their effects is extremely expensive, time-consuming and difficult [23]. Computational simulations, also known as in silico analyses, allow the prediction of SNV effects in a faster, cheaper and more efficient way [4]. The computational approach is therefore beneficial in prioritizing the most probable disease-related mutations [23] to be closely examined with wet-lab experiments [4]. Moreover, already known disease-related mutations can also be studied in silico to identify pharmacological targets for relevant treatments and to gain insight into their molecular mechanisms of pathology [23]. In this scenario, computational simulations have become an important ally of the experimental methods [4] and an essential approach for the study of SNVs [22,23].
Optimal protein-drug binding is crucial to achieving the desired therapeutic effects, as well as to minimizing associated side effects and toxicity of drugs. Protein-drug interactions are determined by local biochemical and structural features of drug-binding cavities [24]. Residues outside drug-binding cavities can also have long-range effects on these sites and, consequently, influence protein-drug binding [25]. Thus, key amino-acid residues in proteins are essential for maintaining the structural properties of binding sites and for the formation of non-covalent interactions with drug molecules [24]. In this sense, nsSNVs affecting key protein residues can impact drug binding-sites, resulting in alterations in drug binding affinity and selectivity [26].
In this work, we applied computational simulations, following the methodology previously established by our group [4,27,28], to the study of PFN1 nsSNVs, which were described as related to ALS development [6,7]. We aim at the characterization of the PFN1 nsSNVs and their effects on protein structure and function. Here, we applied ten functional and stability prediction algorithms, an evolutionary algorithm and molecular dynamics simulations to a thorough analysis of PFN1 nsSNVs. Our findings suggested that these nsSNVs could affect PFN1 flexibility, which could therefore be related to ALS development. We also developed a database containing the results presented in this paper for biologists and clinicians to exploit PFN1 and its natural variants.
Since these nsSNVs may influence drug selection, dosing, and adverse effects, understanding their effects on PFN1 structure and function may help the development of new drugs and personalized therapies for ALS [22].
Materials and methods
Sequence, structure and natural variants retrieval
The sequence and natural variants of PFN1 were retrieved from the UniProt database (UniProt ID: P07737) [7]. The structure of the wild-type PFN1 was retrieved from the Protein Data Bank (PDB) database (PDB ID: 1PFL) [29].
Functional and stability prediction analysis
The functional and stability effects of the PFN1 nsSNVs were predicted using the following algorithms: PhD-SNP [30], PMUT [31], PolyPhen-2 [32], SIFT [33], SNAP [34], SNPs&GO [35], SAAP [36], nsSNPAnalyzer [37], SNPeffect4.0 [22] and I-Mutant2.0 [38].
Evolutionary conservation analysis
The evolutionary conservation analysis of PFN1 was performed using the ConSurf server, which determined the degree of evolutionary conservation of each amino-acid of PFN1 [39]. The following parameters were selected for this analysis: PDB ID: 1PFL; Chain identifier: A; homologous search algorithm: PSI-BLAST; number of iterations: 3; E-value cut-off: 0.0001; protein database: UniProt; reference sequence: closest; number of reference sequences selected: 150; maximum sequence identity: 95%; minimum identity for counterparts: 35%; alignment method: MAFFT-L-INS-i; calculation method: Bayesian; and evolutionary substitution model: best model (default).
Molecular dynamics simulations
MD simulations of the wild-type PFN1 and its natural variants C71G, M114T, E117G and G118V were performed using the GROMACS 2018.2 package [40]. Mutator Plugin 1.3 [41], which is available in the Visual Molecular Dynamics (VMD) 1.9.1 software [42], was used to induce the C71G, M114T, E117G and G118V substitutions on the experimentally determined structure of wild-type PFN1 (PDB ID: 1PFL) [29].
Following the methodology previously established by our group [4], we selected amber99SB-ILDN as the force field of the simulations. Amber99SB-ILDN is an improved version of the amber99SB force field [43], which is widely used in MD simulations of proteins [44]. The new side-chain torsion potentials of amber99SB-ILDN are clearly improved and do not cause undesirable side effects [43]. Amber99SB-ILDN proved to be a good choice for the MD simulation of proteins [44], since this force field accurately describes many protein structural and dynamical properties [45]. Amber99SB-ILDN is therefore recommended for the simulation of protein dynamics [43,44].
The structures were solvated using the TIP3P water model inside a dodecahedral box of dimensions 44 x 37 x 34 Å. The systems were neutralized by adding Na+ and Cl− ions and minimized for 5000 steps using the steepest descent method.
After system minimization, three other steps were carried out in the MD simulations: NVT (constant number of particles, volume, and temperature), NPT (constant number of particles, pressure, and temperature) and production. The NVT ensemble was followed by the NPT ensemble at 1 atmosphere and temperature of 300 K for the duration of 100 ps [4]. Parrinello-Rahman was selected as the barostat and v-rescale was selected as the thermostat of the NVT and NPT ensembles.
The production simulations were performed in triplicates at 300 K for the duration of 100 ns for the wild-type PFN1 and its variants. The LINCS (linear constraint solver) algorithm was applied to constrain covalent bonds [46], and the electrostatic interactions were processed using the particle mesh Ewald (PME) method [47]. The time step of 0.002 ps was selected for the simulations and the MD trajectories were recorded every 10 ps [4].
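The run length, time step and recording interval stated above fix the number of integration steps and saved frames per replica. The following is a minimal sketch of that arithmetic, which also prints a GROMACS-style parameter fragment. The function names are hypothetical, `integrator = md` is an assumption, and the fragment is an incomplete illustration, not the authors' actual input file.

```python
def production_settings(total_ns=100.0, dt_ps=0.002, save_every_ps=10.0):
    """Derive step counts from run length, time step and recording interval."""
    total_ps = total_ns * 1000.0
    nsteps = round(total_ps / dt_ps)        # integration steps per replica
    nstout = round(save_every_ps / dt_ps)   # steps between recorded frames
    nframes = nsteps // nstout + 1          # recorded frames, counting t = 0
    return nsteps, nstout, nframes


def mdp_fragment(nsteps, nstout, dt_ps=0.002):
    """Format an illustrative (incomplete) GROMACS .mdp fragment."""
    lines = [
        "integrator           = md      ; assumed, not stated in the text",
        f"dt                   = {dt_ps}   ; ps",
        f"nsteps               = {nsteps}",
        f"nstxout-compressed   = {nstout}",
        "constraint-algorithm = lincs",
        "coulombtype          = PME",
    ]
    return "\n".join(lines)


nsteps, nstout, nframes = production_settings()
print(nsteps, nstout, nframes)
print(mdp_fragment(nsteps, nstout))
```

With the values given in the text, each 100 ns replica corresponds to 50,000,000 steps and one recorded frame every 5,000 steps.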
Structural parameters of the wild-type PFN1 and its variants were assessed through the root-mean-square deviation (RMSD), root-mean-square fluctuation (RMSF), radius of gyration (Rg), intramolecular hydrogen bonds (Hb) and B-factor analyses. These parameters were calculated separately for each triplicate trajectory. The means for each triplicate in the RMSD, RMSF, Rg and intramolecular Hb analyses were calculated and plotted using the ggplot2 package in R software [48].
The following GROMACS distribution programs were used to perform the MD analyses: gmx hbond, gmx rms, gmx rmsf, and gmx gyrate.
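For readers unfamiliar with these quantities, the sketch below shows what RMSD, RMSF and Rg measure, using plain Python on an invented two-atom trajectory. It illustrates the definitions only and is not the GROMACS implementation: it assumes the frames are already superimposed (no least-squares fitting is done) and uses equal masses unless others are supplied.

```python
import math


def rmsd(frame, reference):
    """RMSD between two conformations given as lists of (x, y, z) tuples.

    Assumes both conformations are already superimposed.
    """
    squared = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(frame, reference)
        for a, b in zip(atom, ref_atom)
    )
    return math.sqrt(squared / len(frame))


def rmsf(trajectory):
    """Per-atom RMSF about each atom's time-averaged position."""
    n_frames = len(trajectory)
    values = []
    for i in range(len(trajectory[0])):
        mean = [sum(f[i][k] for f in trajectory) / n_frames for k in range(3)]
        msf = sum(
            (f[i][k] - mean[k]) ** 2 for f in trajectory for k in range(3)
        ) / n_frames
        values.append(math.sqrt(msf))
    return values


def radius_of_gyration(frame, masses=None):
    """Mass-weighted radius of gyration (equal masses if none are given)."""
    masses = masses or [1.0] * len(frame)
    total = sum(masses)
    center = [
        sum(m * atom[k] for m, atom in zip(masses, frame)) / total
        for k in range(3)
    ]
    weighted = sum(
        m * sum((atom[k] - center[k]) ** 2 for k in range(3))
        for m, atom in zip(masses, frame)
    )
    return math.sqrt(weighted / total)


# Toy two-atom, two-frame trajectory (coordinates in nm, invented for illustration).
toy = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
print(rmsd(toy[1], toy[0]))        # deviation of frame 2 from frame 1
print(rmsf(toy))                   # per-atom fluctuation over the trajectory
print(radius_of_gyration(toy[0]))  # compactness of frame 1
```

RMSD gives one value per frame for the whole structure, RMSF gives one value per atom (or residue) over the whole trajectory, and Rg gives one value per frame describing overall compactness.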
PFN1 database development
The results presented in this paper were compiled and stored on SNPMOL, an online database. The human-curated database of PFN1 was developed using JSmol, an HTML5-based equivalent of Jmol [49].
Results and discussion
Sequence, structure and natural variants retrieval
PFN1 is a 140-amino acid cytoskeletal protein that is coded by the PFN1 gene [7], which is located on chromosome 17p13.2 [50]. Four natural variants of PFN1 were described as related to the development of ALS type 18 [7]. The PFN1 structure used here, i.e., PDB ID: 1PFL, was experimentally determined by nuclear magnetic resonance (NMR) spectroscopy [9,29].
PFN1 protein has two important domains: an actin-binding domain and a poly-L-proline (PLP) binding domain [9], which are essential for PFN1 to perform its biological functions [9,12]. The actin-binding domain of PFN1 is located on its helix 3 and part of its strands 4, 5 and 6, whereas the PLP binding domain is located on the N and C terminal helices [9,15,16]. Moreover, the residue threonine 89 (T89) is an important site of PFN1, which is phosphorylated by PKA. The phosphorylation of T89 was predicted to potentially increase the PFN1 affinity for actin. This post-translational modification is believed to be a regulatory mechanism of PFN1-dependent actin polymerization processes. Moreover, several changes were observed by inducing the T89D mutation in PFN1, including detergent insolubility, protein aggregation and accelerated proteolysis, which suggested that the T89 residue is structurally important for PFN1 [51].
A schematic representation of PFN1 containing its natural variants and important domains is shown in Fig 1. As shown in Fig 1, all studied PFN1 nsSNVs lead to amino acid substitutions in regions that are spatially close to the actin-binding and PLP-binding domains of the protein. This proximity is believed to be related to the impaired actin-binding ability and altered PLP-binding ability of the PFN1 ALS-related variants [9,16].
The PLP binding domain and the actin-binding domain of PFN1 are represented in blue and green, respectively. The mutation sites: C71, M114, E117 and G118, are represented in red. The dark yellow arrow shows the residue threonine 89. (A) Tridimensional structure of PFN1 (PDB ID: 1PFL). (B) Schematic representation of PFN1.
Functional and stability prediction analysis
The functional and structural consequences of nsSNVs at the protein level can be predicted using computational simulations [52]. The effects of amino acid substitutions on PFN1 function were analyzed using eight different algorithms. The mutations C71G and G118V were predicted as deleterious by the eight functional prediction algorithms that were used. The M114T mutation, in turn, was predicted as deleterious by seven of the eight algorithms, while the E117G mutation was predicted as deleterious by four of the eight algorithms (Fig 2).
The four known nsSNVs of PFN1 were analyzed using eight different functional prediction algorithms. The bar plot indicates the number of neutral and deleterious predictions of each PFN1 nsSNV, according to the used algorithms. Blue bars indicate neutral predictions while red bars indicate the number of deleterious predictions.
In the test case we performed, the algorithms: SAAP, SIFT, SNAP, and SNPs&GO, showed the best accuracy amongst the used functional prediction algorithms. They were able to detect the known deleterious effects of the studied PFN1 mutations [7]. The PhD-SNP algorithm presented the worst accuracy in the test case we performed, as it was not able to detect the known deleterious effects of the M114T and E117G mutations [7] (Table 1).
Despite the high accuracy in detecting the known deleterious effects of C71G, M114T and G118V, the algorithms that were used showed low accuracy in predicting the known deleterious effect of the E117G variant of PFN1. These algorithms apply different strategies to make predictions [28]. Moreover, there is no established gold standard method to predict the functional effects of mutations [53]. Thus, it is important to combine the results of a variety of algorithms to determine the deleterious effects of mutations, as previously demonstrated by our group [4,28,54]. The test case we performed reaffirms the importance of the combined usage of algorithms when performing predictive functional analysis. The divergent results and the weaknesses of functional prediction algorithms highlight the need to improve such methods.
The effects of amino acid substitutions on PFN1 stability were further analyzed using the FoldX [55] and I-Mutant2.0 [38] algorithms. According to I-Mutant2.0 and FoldX, the mutations C71G, M114T and E117G decrease PFN1 stability. The mutation G118V, in turn, was predicted as destabilizing by FoldX and stabilizing by I-Mutant2.0. Recently, Boopathy et al. [16] showed that the ALS-related mutations C71G, M114T, and G118V, but not E117G, destabilize PFN1 in vitro [9,16].
The divergent results presented in the stability prediction analysis may occur due to the different prediction strategies applied by I-Mutant2.0 and FoldX [22,38]. While FoldX is an algorithm trained on a database of engineered proteins [55], I-Mutant2.0 uses information from a database of experimentally determined structures to predict the effect of mutations on protein stability [38].
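The combined use of predictors described above amounts to a simple vote count per variant. The sketch below illustrates this under two assumptions: that the eight functional predictors are the ten listed in the Methods minus SNPeffect4.0 and I-Mutant2.0, and that the per-algorithm neutral calls can be reconstructed from the counts and statements in the text. It is an inference for illustration, not a copy of Table 1, and the function names are hypothetical.

```python
# Eight functional predictors assumed to be the ones counted in the text.
ALGORITHMS = [
    "PhD-SNP", "PMUT", "PolyPhen-2", "SIFT",
    "SNAP", "SNPs&GO", "SAAP", "nsSNPAnalyzer",
]

# Neutral calls per variant, reconstructed from the counts stated in the text
# (an inference, not a copy of Table 1).
NEUTRAL = {
    "C71G": set(),
    "M114T": {"PhD-SNP"},
    "E117G": {"PhD-SNP", "PMUT", "PolyPhen-2", "nsSNPAnalyzer"},
    "G118V": set(),
}


def tally(neutral_calls, algorithms):
    """Count deleterious predictions per variant."""
    known = set(algorithms)
    return {
        variant: len(known) - len(neutral & known)
        for variant, neutral in neutral_calls.items()
    }


def consensus(counts, n_algorithms):
    """Flag a variant as deleterious only on a strict majority of predictors."""
    return {variant: count > n_algorithms / 2 for variant, count in counts.items()}


counts = tally(NEUTRAL, ALGORITHMS)
calls = consensus(counts, len(ALGORITHMS))
for variant in counts:
    print(variant, f"{counts[variant]}/{len(ALGORITHMS)}", calls[variant])
```

Under a strict-majority rule, E117G (four of eight) is not flagged, which matches the statement that most algorithms predicted C71G, M114T and G118V, but not E117G, as deleterious.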
Lastly, the effects of amino acid substitutions on PFN1 aggregation tendency (TANGO), amyloid propensity (WALTZ), and chaperone binding tendency (LIMBO) were analyzed using the SNPeffect4.0 algorithm [22]. According to SNPeffect4.0, none of these mutations affect the PFN1 aggregation tendency, amyloid propensity, and chaperone binding tendency. Interestingly, the protein variants: C71G, M114T and G118V, are known to aggregate in vitro [19].
Evolutionary conservation analysis
ConSurf is a bioinformatics tool that analyzes the evolutionary conservation of protein regions and calculates the conservation score of each amino acid based on statistical inference methods, machine learning, and multiple sequence alignments. The conservation scores are associated with a coloring scheme and projected on the protein’s surface. ConSurf is widely used to detect functional regions on proteins as important residues are usually conserved throughout evolution [39].
The evolutionary conservation score of each amino acid of PFN1 was calculated by ConSurf (Fig 3). Highly conserved positions are colored maroon, average conserved positions are colored white, and variable positions are colored turquoise [39]. According to ConSurf, all PFN1 mutations occur in conserved positions, which indicates that these variants probably affect important PFN1 sites. This might explain the association of these mutations with ALS development. Moreover, PFN1 has two major areas composed of structurally conserved amino acids, which correspond to the actin-binding domain and adjacent residues, as well as the PLP-binding domain. These regions are crucial for PFN1 to perform its biological function [14], which probably contributed to their structural conservation throughout evolution [39].
The PFN1 conservation profile shown in three different angles. Each PFN1 amino acid is represented as a space-filling model and colored according to its conservation score. The ConSurf coloring scheme is shown in the color-coding bar. According to ConSurf, the positions 71, 114 and 118 are highly conserved, while the position 117 is average conserved.
In addition to showing the conservation scores of PFN1 mutated sites, the ConSurf analysis also provided an interesting graphical representation in which the conservation score of each amino acid of PFN1 is plotted on its three-dimensional protein structure, highlighting its conserved regions and structural proximities.
Molecular dynamics simulations
MD is an in silico method of solving Newton's equations of motion for a given set of atoms [56]. This method aims to reproduce the real behavior of molecules, such as proteins, in their environment. Unlike the static pictures obtained from methods such as X-ray crystallography [4], the molecular trajectories generated by MD simulations provide detailed information on changes in protein conformation and fluctuation. This information can be used to assess structural parameters of proteins, such as flexibility and stability [57]. As changes in protein flexibility and stability may lead to the development of pathologies [52,58,59], the impact of mutations on protein structure and function can be understood using MD simulations (Vinay Kumar et al., 2014).
To further analyze the effects of PFN1 nsSNVs, we carried out MD simulations of the wild-type PFN1 and its four natural variants using the GROMACS 5.0.7 package [40]. The NMR structure of PFN1 (PDB ID: 1PFL) was used as the wild-type structure. The tridimensional structures of the C71G, M114T, E117G and G118V variants were generated by inducing the respective amino acid substitutions on the wild-type PFN1 using the VMD software (Version 1.9.1) [42]. The MD simulations of the wild-type PFN1 and its natural variants were carried out for 100 ns. The generated trajectories were evaluated according to their RMSD, RMSF, Rg, intramolecular Hb and B-factor characteristics.
RMSD is a useful parameter to analyze the motions of a structure over time and to determine its spatial convergence throughout the simulation [4,56,60]. As shown in Fig 4, the average RMSD values of the C71G (0.1875±0.02 nm), M114T (0.2091±0.02 nm), E117G (0.2480±0.02 nm), and G118V (0.2415±0.3 nm) variants are similar to that of the wild-type PFN1 (0.2248±0.04 nm). This indicates that the PFN1 variants diverge from the initial position as much as the wild-type PFN1. Moreover, the establishment of a plateau in the RMSD values, observed in all simulations (Fig 4), suggests that the structures fluctuate around a stable average conformation, so that it makes sense to assess their local fluctuations [56,60]. The E117G simulation reached a plateau of RMSD values first (around 25 ns), followed by the wild-type (around 40 ns), M114T (around 60 ns), G118V (around 65 ns), and C71G simulations (around 70 ns), respectively.
The RMSD for the backbone atoms of the wild-type structure and variants at 300 K shown as a function of time. The wild type is represented in black, variant C71G is represented in red, variant M114T is represented in blue, variant E117G is represented in green, and variant G118V is represented in purple.
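To make the idea of numerically solving the equations of motion concrete, the following toy sketch integrates a single particle on a one-dimensional harmonic spring with the velocity Verlet scheme. It is an illustration under simplifying assumptions (one particle, invented units and parameters), not the integrator configuration used in this study; GROMACS's default `md` integrator is a leap-frog scheme.

```python
import math


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate one particle in 1D; returns lists of positions and velocities."""
    xs, vs = [x], [v]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
        xs.append(x)
        vs.append(v)
    return xs, vs


K = 1.0  # spring constant (arbitrary units, chosen for illustration)


def energy(x, v, mass=1.0):
    """Total energy of the harmonic oscillator."""
    return 0.5 * mass * v * v + 0.5 * K * x * x


xs, vs = velocity_verlet(1.0, 0.0, lambda x: -K * x, mass=1.0, dt=0.01, n_steps=1000)
drift = abs(energy(xs[-1], vs[-1]) - energy(xs[0], vs[0]))
print(xs[-1], math.cos(10.0), drift)
```

The final position stays close to the analytical solution cos(t), and the total energy is nearly conserved, which is the property that makes such integrators suitable for long trajectories.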
The RMSD analysis, however, only provides information about the overall structure fluctuations [61]. We then performed RMSF analysis to obtain local information. RMSF is a useful parameter to describe the flexibility of protein residues throughout the simulation [4,61]. As shown in Fig 5, all studied variants presented altered flexibility in the actin-binding, PLP-binding domains and adjacent regions throughout the simulations when compared to the wild-type PFN1. However, none of the variants presented altered flexibility at the residue threonine 89.
The RMSF of each residue of the PFN1 wild-type and variants at 300 K is shown. Schematic representations of PFN1 domains and secondary structure are shown for further comparison. The PLP binding domain and actin-binding domains of PFN1 are represented in blue and green, respectively. The PFN1 mutation sites are colored red. Alpha-helices are represented by magenta arrows, beta-strands are represented by yellow barrels, and the coils are represented by the thin black lines. The dark yellow line shows the residue threonine 89. (A) The wild type is represented in black and variant C71G is represented in red. (B) The wild type is represented in black and variant M114T is represented in blue. (C) The wild type is represented in black and variant E117G is represented in green. (D) The wild type is represented in black and variant G118V is represented in purple.
The C71G variant presented increased flexibility at the actin-binding domain and adjacent regions, especially at the regions comprising residues 50–56 and 75–79. It also had increased flexibility at the N and C-terminal helices of the PLP-binding domain. In addition, this variant presented increased flexibility especially at the coil regions.
The M114T variant, in turn, presented reduced flexibility at the actin-binding domain and adjacent regions, especially at the region comprised between the residues 73–82 and 92–94. It also had increased flexibility at the N-terminal helix of the PLP-binding domain and decreased flexibility at the C-terminal helix of the PLP-binding domain. Moreover, this variant presented decreased flexibility especially at the coil and helices regions.
The E117G variant presented reduced flexibility at the actin-binding domain and adjacent regions, especially at the region comprised between the residues 64–68 and 77–81. It also had increased flexibility at the N-terminal helix of the PLP-binding domain and an increased flexibility in a region adjacent to the C-terminal helix of the PLP-binding domain (residues 116–120). In addition, this variant presented decreased flexibility especially at the coil regions, except for the region comprised between the residues 36–42, which had an increased flexibility when compared to the wild-type.
The G118V variant, in turn, presented decreased flexibility in regions adjacent to the actin-binding domain, except for the region comprising residues 93–96. It also had increased flexibility at the N-terminal helix of the PLP-binding domain. Moreover, this variant presented decreased flexibility especially at the coil regions, except for the region comprising residues 36–50, which had increased flexibility when compared to the wild-type PFN1.
Since protein flexibility has a wide influence on the thermodynamics of binding [62,63], the flexibility changes observed in the PLP-binding and actin-binding domains and adjacent regions of PFN1 variants might be related to the known altered binding ability of these variants [16].
The structural flexibility can also be assessed throughout the simulation by analyzing the B-factor [4]. Like the RMSF, the B-factor is useful for describing the flexibility of protein residues [4,64]. The distribution of B-factors along a protein structure is an important indicator of its dynamics [65]. We then projected the B-factor values calculated for each PFN1 residue onto the protein surface (Fig 6).
The B-factor for each residue of the PFN1 wild-type and variants represented in a coloring-thickness scheme. Red and bulky structures represent high values and dark blue and thin structures represent low values. (A) B-factor representation of the wild type PFN1. (B) B-factor representation of the C71G variant. (C) B-factor representation of the M114T variant. (D) B-factor representation of the E117G variant. (E) B-factor representation of the G118V variant. (F) Schematic representation of PFN1 structure for further comparison. The PLP binding domain and actin-binding domains of PFN1 are represented in blue and green, respectively. The PFN1 mutation sites are colored red. The dark yellow arrow shows the residue threonine 89.
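The text does not state how the B-factors were derived from the trajectories. A common convention, assumed here, is the isotropic relation B = (8π²/3)·RMSF², which explains why the two analyses describe the same flexibility. The sketch below applies it with a unit conversion from nm to Å; the function name is hypothetical.

```python
import math


def bfactor_from_rmsf(rmsf_nm):
    """Convert an RMSF in nm to an isotropic B-factor in square angstroms.

    Assumes the standard isotropic relation B = (8 * pi^2 / 3) * RMSF^2.
    """
    rmsf_angstrom = rmsf_nm * 10.0  # 1 nm = 10 angstroms
    return (8.0 * math.pi ** 2 / 3.0) * rmsf_angstrom ** 2


# Example: a residue fluctuating by 0.1 nm (1 angstrom) about its mean position.
print(round(bfactor_from_rmsf(0.1), 2))
```

Because B grows with the square of the fluctuation, modest RMSF differences between variants appear more pronounced in a B-factor representation.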
The C71G variant presented increased flexibility at adjacent regions to the actin-binding domain of PFN1. The M114T variant, in turn, presented decreased flexibility at the actin-binding domain and adjacent regions, as well as increased flexibility at the PLP-binding domain and adjacent regions. The E117G variant presented decreased flexibility in adjacent regions of the actin-binding domain, as well as increased flexibility in adjacent regions of the PLP-binding domain. The G118V variant, in turn, presented decreased flexibility at the actin binding domain and adjacent regions, except for the loop that connects the fifth and sixth beta-strands, which presented increased flexibility when compared to the wild-type. In addition to reaffirming the flexibility alterations observed in the RMSF analysis, B-factor analysis also provided an interesting graphical representation of structural flexibility.
The Rg analysis is useful for describing the overall dimensions of protein structures throughout the simulation [4,52,61]. As shown in Fig 7, the average Rg value of the wild-type structure (1.383±0.02 nm) is similar to those of the C71G (1.379±0.01 nm), M114T (1.375±0.01 nm), E117G (1.378±0.01 nm), and G118V (1.381±0.01 nm) variants. These results suggest that the C71G, M114T, E117G, and G118V variants are as compact as the wild-type PFN1.
The Rg for the Cα atoms of the wild-type PFN1 and its natural variants at 300 K are shown as a function of time. (A) The wild type is represented in black and variant C71G is represented in red. (B) The wild type is represented in black and variant M114T is represented in blue. (C) The wild type is represented in black and variant E117G is represented in green. (D) The wild type is represented in black and variant G118V is represented in purple.
The stability of protein structures can be assessed throughout the simulation by analyzing the formation of intramolecular hydrogen bonds [66]. As shown in Fig 8, the average number of intramolecular hydrogen bonds formed in the wild-type simulation (101.26±5.85) is similar to those of the C71G (98.67±6.80), M114T (102.04±5.13), E117G (98.50±5.46), and G118V (100.41±6.11) simulations. This suggests that all studied variants are as stable as the wild-type PFN1.
The number of intramolecular Hb formed at 300 K throughout the simulations is shown as a function of time. (A) The wild-type is represented in black and variant C71G is represented in red. (B) The wild type is represented in black and variant M114T is represented in blue. (C) The wild type is represented in black and variant E117G is represented in green. (D) The wild type is represented in black and variant G118V is represented in purple.
The MD analyzes therefore suggested that the studied mutations could affect the PFN1 flexibility at the actin and PLP-binding domains, and, consequently, their intermolecular interactions. It may explain the known altered binding ability of the C71G, M114T, E117G and G118V variants [16]. Moreover, considering that the PFN1 functions are mediated by its actin and PLP-binding ability [9,12], these findings could be also related to the functional impairment of PFN1 upon C71G, M114T, E117G, and G118V mutations (Fig 9), and their involvement in ALS development [9,17].
PFN1 is represented in green, the actin monomer is represented in blue, Ena/VASP is represented in orange, and the actin polymer is represented by the blue chained filament. Black arrows indicate the normal PFN1 mechanism of action, while the inhibitory arrow (red) indicates how this mechanism could be disrupted by missense mutations. i) The unbound PFN1 is able to interact with actin monomers. ii) PFN1 interacts through its actin-binding domain with an actin monomer. iii) Upon binding to the actin monomer, PFN1 interacts through its PLP-binding domain with Enabled/vasodilator-stimulated phosphoproteins (Ena/VASP). iv) Ena/VASP, in turn, is responsible for adding the actin monomer captured by PFN1 to the growing actin filament. v) After the delivery of the actin monomer, PFN1 is released from Ena/VASP. The C71G, M114T, E117G, and G118V missense mutations in PFN1 are known to affect its actin and PLP binding. We propose that this may occur due to the flexibility alterations at the actin and PLP-binding domains and adjacent residues of PFN1.
PFN1 database
Visualization and analysis of intricate 3D structures of macromolecules, such as proteins, are essential to provide insights into their biological processes [49]. For this purpose, a wide range of graphics software and web-based viewers is currently available [29,67]. Among them is Jmol, a widely used open-source viewer of 3D structures [49]. However, this application is falling into disuse because its web-based version is embedded as a Java applet, a plug-in that is no longer supported on many devices and browsers due to security concerns [29,68,69]. In this scenario, JSmol, an HTML5-based equivalent of Jmol [49], is a suitable alternative, because it requires no Java applets to run and produces identical graphical results [68]. We therefore developed a curated database of human variants using JSmol.
The PFN1 results presented in this paper are stored in SNPMOL, the human-curated database developed by our group (http://www.snpmol.org/). The database is freely available for biologists and clinicians to exploit the PFN1 variants described here and their functional and structural alterations. The SNPMOL interface allows users to quickly retrieve and analyze the predicted effects and theoretical models of PFN1 variants. Understanding their effects on PFN1 structure and function may help the development of new drugs and treatments for ALS [22], as well as facilitate the design of further experiments [70].
Conclusions
In this paper, we analyzed the effects of PFN1 nsSNVs using ten functional and stability prediction algorithms, an evolutionary algorithm, and MD simulations. The functional prediction algorithms used here showed high accuracy in detecting the known deleterious potential of the C71G, M114T, and G118V mutations, but not E117G. The functional prediction analysis also showed that it is important to use a variety of algorithms to determine the deleterious effects of mutations. The stability prediction suggested that the ALS-related mutations could destabilize PFN1. The evolutionary conservation analysis indicated that the mutations C71G, M114T, E117G, and G118V occur in highly conserved positions. The MD analyses suggested that the studied mutations could affect the PFN1 flexibility at the actin and PLP-binding domains and, consequently, their intermolecular interactions. This may therefore be related to the functional impairment of PFN1 upon C71G, M114T, E117G, and G118V mutations, and their involvement in ALS development. We also developed a human-curated database, SNPMOL (http://www.snpmol.org/), containing the results presented in this paper for biologists and clinicians to exploit PFN1 and its natural variants. Furthermore, we conclude that computational simulations are an effective approach for the study of disease-related mutations, as well as an important ally of experimental methods.
References
- 1. Cox LE, Ferraiuolo L, Goodall EF, Heath PR, Higginbottom A, Mortiboys H, et al. Mutations in CHMP2B in lower motor neuron predominant amyotrophic lateral sclerosis (ALS). PLoS One. 2010;5. pmid:20352044
- 2. Callister JB, Pickering-Brown SM. Pathogenesis/genetics of frontotemporal dementia and how it relates to ALS. Exp Neurol. 2014;262: 84–90. pmid:24915640
- 3. Gladman M, Dharamshi C, Zinman L. Economic burden of amyotrophic lateral sclerosis: A Canadian study of out-of-pocket expenses. Amyotroph Lateral Scler Front Degener. 2014;15: 426–432. pmid:25025935
- 4. Krebs BB, De Mesquita JF. Amyotrophic Lateral Sclerosis Type 20—In Silico Analysis and Molecular Dynamics Simulation of hnRNPA1. Xia XG, editor. PLoS One. Public Library of Science; 2016;11: e0158939. pmid:27414033
- 5. Dekker AM, Seelen M, van Doormaal PTC, van Rheenen W, Bothof RJP, van Riessen T, et al. Large-scale screening in sporadic amyotrophic lateral sclerosis identifies genetic modifiers in C9orf72 repeat carriers. Neurobiol Aging. Elsevier Inc; 2016;39: 220.e9-220.e15. pmid:26777436
- 6. Ingre C, Landers JE, Rizik N, Volk AE, Akimoto C, Birve A, et al. A novel phosphorylation site mutation in profilin 1 revealed in a large screen of US, Nordic, and German amyotrophic lateral sclerosis/frontotemporal dementia cohorts. Neurobiol Aging. 2013;34: 1708.e1-6. pmid:23141414
- 7. Bateman A, Martin MJ, O’Donovan C, Magrane M, Alpi E, Antunes R, et al. UniProt: The universal protein knowledgebase. Nucleic Acids Res. 2017;45: D158–D169. pmid:27899622
- 8. Kiaei M, Balasubra M, Govind V, Reis RJS, Moradi M, Varughese KI. ALS-causing mutations in profilin-1 alter its conformational dynamics: A computational approach to explain propensity for aggregation. 2018; 1–10.
- 9. Alkam D, Feldman EZ, Singh A, Kiaei M. Profilin1 biology and its mutation, actin(g) in disease. Cell Mol Life Sci. Springer International Publishing; 2017;74: 967–981. pmid:27669692
- 10. Tanaka Y, Nonaka T, Suzuki G, Kametani F, Hasegawa M. Gain-of-function profilin 1 mutations linked to familial amyotrophic lateral sclerosis cause seed-dependent intracellular TDP-43 aggregation. 2016; 1–14.
- 11. Gau D, Lewis T, Mcdermott L, Wipf P, Koes D, Roy P. Structure-based virtual screening identifies small molecule inhibitor of the profilin1-actin interaction. J Biol Chem. 2017;1: jbc.M117.809137. pmid:29282288
- 12. Ding Z, Gau D, Deasy B, Wells A, Roy P. Both actin and polyproline interactions of profilin-1 are required for migration, invasion and capillary morphogenesis of vascular endothelial cells. Exp Cell Res. 2009;315: 2963–2973. pmid:19607826
- 13. Ding Z, Bae YH, Roy P. Molecular insights on context-specific role of profilin-1 in cell migration. Cell Adhes Migr. 2012;6: 442–449. pmid:23076048
- 14. Yang C, Danielson EW, Qiao T, Metterville J, Brown RH, Landers JE. Mutant PFN1 causes ALS phenotypes and progressive motor neuron degeneration in mice by a gain of toxicity. PNAS. 2016;September: E6209–E6218. pmid:27681617
- 15. Witke W. The role of profilin complexes in cell motility and other cellular processes. Trends Cell Biol. 2004;14: 461–9. pmid:15308213
- 16. Boopathy S, Silvas T V., Tischbein M, Jansen S, Shandilya SM, Zitzewitz JA, et al. Structural basis for mutation-induced destabilization of profilin 1 in ALS. Proc Natl Acad Sci. 2015;112: 7984–7989. pmid:26056300
- 17. Lim L, Kang J, Song J. ALS-causing profilin-1-mutant forms a non-native helical structure in membrane environments. Biochim Biophys Acta—Biomembr. Elsevier; 2017;1859: 2161–2170. pmid:28847504
- 18. Robberecht W, Philips T. The changing scene of amyotrophic lateral sclerosis. Nat Rev Neurosci. 2013;14: 1–17.
- 19. Wu C-H, Fallini C, Ticozzi N, Keagle PJ, Sapp PC, Piotrowska K, et al. Mutations in the profilin 1 gene cause familial amyotrophic lateral sclerosis. Nature. 2012;488: 499–503. pmid:22801503
- 20. Figley MD, Bieri G, Kolaitis R, Taylor JP, Gitler AD. Profilin 1 Associates with Stress Granules and ALS-Linked Mutations Alter Stress Granule Dynamics. 2014;34: 8083–8097. pmid:24920614
- 21. Medinas DB, Valenzuela V, Hetz C. Proteostasis disturbance in amyotrophic lateral sclerosis. Hum Mol Genet. 2017;26: 91–104. pmid:28977445
- 22. De Baets G, Van Durme J, Reumers J, Maurer-Stroh S, Vanhee P, Dopazo J, et al. SNPeffect 4.0: On-line prediction of molecular and structural effects of protein-coding variants. Nucleic Acids Res. 2012;40: D935–D939. pmid:22075996
- 23. Thusberg J, Vihinen M. Pathogenic or not? and if so, then how? Studying the effects of missense mutations using bioinformatics methods. Hum Mutat. 2009;30: 703–714. pmid:19267389
- 24. Roy Choudhury A, Cheng T, Phan L, Bryant SH, Wang Y. Supporting precision medicine by data mining across multi-disciplines: An integrative approach for generating comprehensive linkages between single nucleotide variants (SNVs) and drug-binding sites. Bioinformatics. 2017;33: 1621–1629. pmid:28158543
- 25. Stank A, Kokh DB, Fuller JC, Wade RC. Protein Binding Pocket Dynamics. Acc Chem Res. 2016;49: 809–815. pmid:27110726
- 26. Yan C, Pattabiraman N, Goecks J, Lam P, Nayak A, Pan Y, et al. Impact of germline and somatic missense variations on drug binding sites. Pharmacogenomics J. Nature Publishing Group; 2017;17: 128–136. pmid:26810135
- 27. De Carvalho MDC, De Mesquita JF. Structural Modeling and In Silico Analysis of Human Superoxide Dismutase 2. PLoS One. 2013;8. pmid:23785434
- 28. Moreira LGA, Pereira LC, Drummond PR, De Mesquita JF, Andersen P, Phukan J, et al. Structural and Functional Analysis of Human SOD1 in Amyotrophic Lateral Sclerosis. Le W, editor. PLoS One. Public Library of Science; 2013;8: e81979. pmid:24312616
- 29. Rose PW, Prlić A, Altunkaya A, Bi C, Bradley AR, Christie CH, et al. The RCSB protein data bank: Integrative view of protein, gene and 3D structural information. Nucleic Acids Res. 2017;45: D271–D281. pmid:27794042
- 30. Capriotti E, Calabrese R, Casadio R. Predicting the insurgence of human genetic diseases associated to single point protein mutations with support vector machines and evolutionary information. Bioinformatics. 2006;22: 2729–2734. pmid:16895930
- 31. López-Ferrando V, Gazzo A, De La Cruz X, Orozco M, Gelpí JL. PMut: A web-based tool for the annotation of pathological variants on proteins, 2017 update. Nucleic Acids Res. 2017;45: W222–W228. pmid:28453649
- 32. Adzhubei I, Jordan DM, Sunyaev SR. Predicting Functional Effect of Human Missense Mutations Using PolyPhen-2. Curr Protoc Hum Genet. 2013;7: Unit7.20. pmid:23315928
- 33. Vaser R, Adusumalli S, Leng SN, Sikic M, Ng PC. SIFT missense predictions for genomes. Nat Protoc. Nature Publishing Group; 2015;4: 1073–1081. pmid:26633127
- 34. Bromberg Y, Rost B. SNAP: Predict effect of non-synonymous polymorphisms on function. Nucleic Acids Res. 2007;35: 3823–3835. pmid:17526529
- 35. Capriotti E, Calabrese R, Fariselli P, Martelli PL, Altman RB, Casadio R. WS-SNPs&GO: a web server for predicting the deleterious effect of human protein variants using functional annotation. BMC Genomics. BioMed Central Ltd; 2013;14: S6. pmid:23819482
- 36. Al-Numair NS, Martin ACR. The SAAP pipeline and database: tools to analyze the impact and predict the pathogenicity of mutations. BMC Genomics. 2013;14 Suppl 3: S4. pmid:23819919
- 37. Bao L, Zhou M, Cui Y. nsSNPAnalyzer: Identifying disease-associated nonsynonymous single nucleotide polymorphisms. Nucleic Acids Res. 2005;33: 480–482. pmid:15980516
- 38. Capriotti E, Fariselli P, Casadio R. I-Mutant2.0: Predicting stability changes upon mutation from the protein sequence or structure. Nucleic Acids Res. 2005;33: W306–W310. pmid:15980478
- 39. Ashkenazy H, Abadi S, Martz E, Chay O, Mayrose I, Pupko T, et al. ConSurf 2016: an improved methodology to estimate and visualize evolutionary conservation in macromolecules. Nucleic Acids Res. 2016;44: 344–350. pmid:27166375
- 40. Abraham MJ, Murtola T, Schulz R, Páll S, Smith JC, Hess B, et al. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX. 2015;1: 19–25.
- 41. Gajula KS, Huwe PJ, Mo CY, Crawford DJ, Stivers JT, Radhakrishnan R, et al. High-throughput mutagenesis reveals functional determinants for DNA targeting by activation-induced deaminase. Nucleic Acids Res. 2014;42: 9964–9975. pmid:25064858
- 42. Humphrey W, Dalke A, Schulten K. VMD: visual molecular dynamics. J Mol Graph. 1996;14: 33–8, 27–8. Available: http://www.ncbi.nlm.nih.gov/pubmed/8744570 pmid:8744570
- 43. Lindorff-Larsen K, Piana S, Palmo K, Maragakis P, Klepeis JL, Dror RO, et al. Improved side-chain torsion potentials for the Amber ff99SB protein force field. Proteins Struct Funct Bioinforma. 2010;78: 1950–1958. pmid:20408171
- 44. Petrović D, Wang X, Strodel B. How accurately do force fields represent protein side chain ensembles? Proteins Struct Funct Bioinforma. 2018;86: 935–944. pmid:29790608
- 45. Frezza E, Martin J, Lavery R. A molecular dynamics study of adenylyl cyclase: The impact of ATP and G-protein binding. PLoS One. 2018;13: 1–17. pmid:29694437
- 46. Hess B. P-LINCS: A Parallel Linear Constraint Solver for Molecular Simulation. J Chem Theory Comput. 2008;4: 116–22. pmid:26619985
- 47. Essmann U, Perera L, Berkowitz ML, Darden T, Lee H, Pedersen LG. A smooth particle mesh Ewald method. J Chem Phys. AIP Publishing; 1995;103: 8577.
- 48. Wickham H. Ggplot2: elegant graphics for data analysis. Springer; 2009.
- 49. Hanson RM, Lu XJ. DSSR-enhanced visualization of nucleic acid structures in Jmol. Nucleic Acids Res. 2017;45: W528–W533. pmid:28472503
- 50. Amberger JS, Hamosh A. Searching Online Mendelian Inheritance in Man (OMIM): A Knowledgebase of Human Genes and Genetic Phenotypes. Curr Protoc Bioinforma. 2017;58. pmid:28654725
- 51. Gau D, Veon W, Zeng X, Yates N, Shroff SG, Koes DR, et al. Threonine 89 is an important residue of profilin-1 that is phosphorylatable by protein kinase A. PLoS One. 2016;11: 1–20. pmid:27228149
- 52. Vinay Kumar C, Kumar KM, Swetha R, Ramaiah S, Anbarasu A. Protein aggregation due to nsSNP resulting in P56S VABP protein is associated with amyotrophic lateral sclerosis. J Theor Biol. Elsevier; 2014;354: 72–80. pmid:24681403
- 53. Karchin R. Next generation tools for the annotation of human SNPs. Brief Bioinform. 2009;10: 35–52. pmid:19181721
- 54. Pereira GRC, Da Silva ANR, Do Nascimento SS, De Mesquita JF. In silico analysis and molecular dynamics simulation of human superoxide dismutase 3 (SOD3) genetic variants. J Cell Biochem. 2018; 1–16. pmid:30206983
- 55. Schymkowitz J, Borg J, Stricher F, Nys R, Rousseau F, Serrano L. The FoldX web server: an online force field. Nucleic Acids Res. 2005;33: W382–8. pmid:15980494
- 56. Knapp B, Frantal S, Cibena M, Schreiner W, Bauer P. Is an intuitive convergence definition of molecular dynamics simulations solely based on the root mean square deviation possible? J Comput Biol. 2011;18: 997–1005. pmid:21702691
- 57. Khan FI, Wei DQ, Gu KR, Hassan MI, Tabrez S. Current updates on computer aided protein modeling and designing. Int J Biol Macromol. Elsevier B.V.; 2016;85: 48–62. pmid:26730484
- 58. Worth CL, Bickerton GRJ, Schreyer A, Forman JR, Cheng TMK, Lee S, et al. A structural bioinformatics approach to the analysis of nonsynonymous single nucleotide polymorphisms (nsSNPs) and their relation to disease. J Bioinform Comput Biol. 2007;5: 1297–318. doi:S0219720007003120 [pii] pmid:18172930
- 59. Kumar CV, Swetha RG, Anbarasu A, Ramaiah S. Computational analysis reveals the association of threonine 118 methionine mutation in PMP22 resulting in CMT-1A. Adv Bioinformatics. 2014;2014: 10. pmid:25400662
- 60. Martinez L. Automatic Identification of Mobile and Rigid Substructures in Molecular Dynamics Simulations and Fractional Structural Fluctuation Analysis. Kleinjung J, editor. PLoS One. 2015;10: e0119264. pmid:25816325
- 61. Kuzmanic A, Zagrovic B. Determination of ensemble-average pairwise root mean-square deviation from experimental B-factors. Biophys J. Biophysical Society; 2010;98: 861–871. pmid:20197040
- 62. Eschweiler JD, Kerr R, Rabuck-gibbons J, Ruotolo BT. Sizing Up Protein–Ligand Complexes: The Rise of Structural Mass Spectrometry Approaches in the Pharmaceutical Sciences. Rev Adv. 2017; 1–20.
- 63. Grünberg R, Nilges M, Leckner J. Flexibility and Conformational Entropy in Protein-Protein Binding. Structure. 2006;14: 683–693. pmid:16615910
- 64. Craveur P, Joseph AP, Esque J, Narwani TJ, Noël F, Shinada N, et al. Protein flexibility in the light of structural alphabets. Front Mol Biosci. 2015;2. pmid:26075209
- 65. Yuan Z, Bailey TL, Teasdale RD. Prediction of protein B-factor profiles. Proteins Struct Funct Genet. 2005;58: 905–912. pmid:15645415
- 66. Pikkemaat MG, Linssen ABM, Berendsen HJC, Janssen DB. Molecular dynamics simulations as a tool for improving protein stability. Protein Eng. 2002;15: 185–192. pmid:11932489
- 67. Herráez A. Biomolecules in the computer: Jmol to the rescue. Biochem Mol Biol Educ. 2006;34: 255–261. pmid:21638687
- 68. Hanson RM, Prilusky J, Renjian Z, Nakane T, Sussman JL. JSmol and the next-generation web-based representation of 3D molecular structure as applied to proteopedia. Isr J Chem. 2013;53: 207–216.
- 69. Shahzad F, Sheltami TR, Shakshuki EM, Shaikh O. A Review of Latest Web Tools and Libraries for State-of-the-art Visualization. Procedia Comput Sci. The Author(s); 2016;58: 100–106.
- 70. Venselaar H, Te Beek TAH, Kuipers RKP, Hekkelman ML, Vriend G. Protein structure analysis of mutations causing inheritable diseases. An e-Science approach with life scientist friendly interfaces. BMC Bioinformatics. 2010;11. pmid:21059217