Exploring novel and potent cell penetrating peptides in the proteome of SARS-COV-2 using bioinformatics approaches

Abstract

Among the various systems for vaccine and drug delivery, cell-penetrating peptides (CPPs) are known as potent carriers because they can penetrate cell membranes and deliver several types of cargo into cells. Several CPPs have been found in viral proteomes, such as Tat from human immunodeficiency virus-1 (HIV-1) and VP22 from herpes simplex virus-1 (HSV-1). In the current study, a wide range of CPPs was identified in the proteome of SARS-CoV-2, a new member of the coronavirus family, using in silico analyses. These CPPs may play a main role in the high penetration of the virus into cells and infection of the host. First, we submitted the proteome of SARS-CoV-2 to the CellPPD web server, which returned a large number of CPPs ten residues in length. Afterward, we submitted the predicted CPPs to the C2Pred web server to evaluate the CPP probability of each peptide. Then, the uptake efficiency of each peptide was investigated using the CPPred-RF and MLCPP web servers. Next, the physicochemical properties of the predicted CPPs, including net charge, theoretical isoelectric point (pI), amphipathicity, molecular weight, and water solubility, were calculated using the ProtParam and PepCalc tools. In addition, the membrane-binding potential and cellular localization of each CPP were estimated by the Boman index (APD3 web server), the D factor, and the TMHMM web server. The immunogenicity, toxicity, allergenicity, hemolytic potency, and half-life of the CPPs were also predicted using various web servers. Finally, the tertiary structure and the helical wheel projection of some CPPs were predicted by the PEP-FOLD3 and Heliquest web servers, respectively. These CPPs were divided into: a) CPPs containing the tumor homing motif (RGD) and/or the tumor penetrating motif (RXXR); b) the CPP with the highest Boman index; c) CPPs with a long half-life (~100 hours) in mammalian cells; and d) CPPs with +5.00 net charge.
Based on the results, we found a large number of novel CPPs with various features. Some of these CPPs possess tumor-specific motifs which can be evaluated in cancer therapy. Furthermore, the novel and potent CPPs derived from SARS-CoV-2 may be used alone or conjugated to some sequences such as nuclear localization sequence (NLS) for vaccine and drug delivery.

Introduction

Therapeutic and preventive vaccines are promising approaches to solving health issues globally [1]. Although several vaccines have saved millions of lives so far, such as those against rubella, mumps, varicella, rotavirus, human papillomavirus (HPV) and hepatitis B virus (HBV), effective vaccines are still needed against other pathogens for which no cure or protection exists [2,3]. In this regard, the development of effective and novel delivery systems is vital for the delivery of vaccine components into cells. In general, the delivery systems used to transfer different biomolecules into cells include nanoparticles [4], polymers [5], chitosan [6], liposomes [7], physical tools [8], and cell penetrating peptides (CPPs) [9,10]. The current focus in developing novel delivery systems has moved to the peptide-based systems known as CPPs [11]. CPPs are 5–50 amino acids in length, can cross cell membranes efficiently, and deliver a wide range of cargoes including peptides, proteins, nanoparticles and nucleic acids into cells [12,13]. After the discovery of the first CPP, the Tat peptide (originating from the human immunodeficiency virus type-1 (HIV-1) trans-activating regulatory (Tat) protein), the number of new CPPs grew rapidly [14]. CPPs are natural (e.g., CyLoP-1) or synthetic (e.g., oligoarginine) peptides. These short peptides are heterogeneous in sequence and structure, and can be internalized through endocytosis or direct penetration [1,10,15–17]. The mechanism of internalization depends on diverse factors such as CPP sequence, cell type, CPP concentration, temperature, incubation time, and type of cargo [18]. To date, a large number of CPPs have been identified, but some of them show low uptake [19]. Studies have demonstrated that predicting CPPs with bioinformatics tools prior to lab-based experiments can save time and money [20].
For instance, machine-learning-based algorithms permit users to predict CPPs from large sequence data sets or whole proteomes. Prediction methods rely on machine learning models built with various algorithms, including neural networks (NN) [21,22], kernel extreme learning machines [23,24], random forests (RF) [25], and support vector machines (SVM) [26,27].

In December 2019, a new member of the coronavirus family was found, which was first named 2019-nCoV. Then, on February 11, 2020, the disease was named coronavirus disease 2019 (COVID-19) and the virus severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) [28]. SARS-CoV-2 is an enveloped, positive-sense, single-stranded RNA virus that has 10 open reading frames (ORFs). ORF1ab makes up about 66.66% of the virus genome and encodes two large polypeptides, pp1a and pp1ab. Meanwhile, ORF2-10 make up about 33.33% of the virus genome. The SARS-CoV-2 genome encodes 28 proteins, which are classified into three classes: structural proteins, non-structural proteins (nsp), and accessory proteins. The structural proteins are the spike (S), nucleoprotein (N), membrane (M), and envelope (E) proteins that form the virus particles. In addition, the non-structural proteins (e.g., nsp1-nsp16) are generated only during the translation of virus RNA in the infected host cell. The accessory proteins possess crucial functions in the assembly, virulence and pathogenesis of the virus [28,29].

Previously, several CPPs were derived from viruses, such as Tat (from HIV-1 transcriptional activator protein), C105Y (from HIV-1 glycoprotein 41), MPG (from HIV-1 glycoprotein 41 conjugated to the nuclear localization sequence (NLS) from simian virus 40 (SV40)), Pep-1 (from HIV-1 reverse transcriptase conjugated to SV40 NLS), pepR and pepM (originating from Dengue virus), and VP22 (originating from Herpes simplex virus (HSV)-1) [14,30–34]. To date, no complete report has been available on CPPs derived from the total proteins of MERS-CoV, SARS-CoV and SARS-CoV-2. A few studies indicated that some peptides of the SARS-CoV spike glycoprotein are responsible for membrane fusion or membrane binding activity. For example, the upstream region of heptad repeat 1 (HR1) (residues 892–972) in the S2 domain of the SARS-CoV spike glycoprotein was involved in membrane fusion. Moreover, some scientists have identified membrane binding peptides and membrane fusogenic peptides or potential fusion peptides from the upstream region of HR1 (residues 758–890) [35]. Indeed, an efficient membrane fusion mechanism between the host cell and SARS-CoV-2 can be responsible for virus infection. Sequence comparison of S protein domains between SARS-CoV-2 and SARS-CoV-1 showed a high level of conservation for both the S1 and S2 domains. However, variation in the fusogenic regions of the S2 domain was observed between SARS-CoV-2 and SARS-CoV-1 [36–38]. Hence, due to the high potency of SARS-CoV-2 to spread and infect people, we decided to investigate new and potent CPPs in the proteome of this newly isolated virus using in silico approaches.

Materials and methods

Study design

The current study comprises several main steps to find and characterize novel and potent CPPs as a vaccine and drug delivery system. The flowchart of the overall prediction and analysis procedure is illustrated in S1 Fig.

Identification of potential SARS-CoV-2-derived CPPs

Cell penetrating or non-cell penetrating peptides (CPP or non-CPP) can be predicted in the proteome of SARS-CoV-2 using bioinformatics approaches. Hence, to explore novel CPPs, we used Wuhan-Hu-1 (GenBank accession number MN908947.3) as the reference sequence. This strain was isolated from a patient in Wuhan, China. The whole viral genome contains 29,903 nucleotides, and phylogenetic analysis showed that it has 89.1% nucleotide similarity to a group of SARS-like coronaviruses (genus Betacoronavirus, subgenus Sarbecovirus) that had previously been identified in bats [39].

At first, the CellPPD web server (https://webs.iiitd.edu.in/raghava/cellppd/index.html) was applied to determine the cell penetrating peptides. CellPPD is a support vector machine (SVM)-based web server [40,41]. To utilize this web server, the sequences of Spike (S) protein (GenBank ID: QHD43416.1), Membrane (M) glycoprotein (GenBank ID: QHD43419.1), Nucleocapsid (N) phosphoprotein (GenBank ID: QHD43423.2), Envelope (E) protein (GenBank ID: QHD43418.1), Orf1ab polyprotein (GenBank ID: QHD43415.1), ORF3a protein (GenBank ID: QHD43417.1), ORF6 protein (GenBank ID: QHD43420.1), ORF7a protein (GenBank ID: QHD43421.1), ORF8 protein (GenBank ID: QHD43422.1), and ORF10 protein (GenBank ID: QHI42199.1) were submitted to the protein scanning tool with the default threshold of the SVM-based prediction method (SVM threshold was set at 0.0). Moreover, the CPP probability of each peptide was investigated using the C2Pred web server (http://lin-group.cn/server/C2Pred), which Tang and colleagues [42] established on a method with an overall prediction accuracy of 83.6%.
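The protein scanning step can be pictured as a sliding window over each protein sequence. The following is a minimal sketch, assuming a window of ten residues moved one residue at a time (consistent with the ten-residue peptides reported in this study). The function name `scan_windows` is hypothetical, and the sketch only enumerates candidate peptides; it does not reproduce the SVM scoring performed by CellPPD.

```python
def scan_windows(sequence, window=10):
    """Yield (1-based start position, peptide) for every overlapping window."""
    sequence = sequence.strip().upper()
    for start in range(len(sequence) - window + 1):
        yield start + 1, sequence[start:start + window]


# 20-residue stretch spanned by the N protein-derived peptides
# listed in the Discussion of this article.
fragment = "KKSAAEASKKPRQKRTATKA"
peptides = [pep for _, pep in scan_windows(fragment)]
print(len(peptides), peptides[0], peptides[-1])
```

A sequence of length n yields n - 9 candidate decamers, which is why long polyproteins such as Orf1ab contribute the largest number of candidates to the scan.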

Uptake efficiency analysis of the identified CPPs

In the next step, to evaluate the uptake efficiency of the CPPs identified in the previous step, two web servers, CPPred-RF (http://server.malab.cn/CPPred-RF/) and MLCPP (http://www.thegleelab.org/MLCPP/), were used. For this purpose, all of the CPPs detected by the CellPPD web server were submitted to these web servers. CPPred-RF is a sequence-based predictor for identifying CPPs and their uptake efficiency; it uses a two-layer prediction framework based on the random forest (RF) algorithm [43]. Manavalan et al. established a two-layer prediction framework termed machine-learning-based prediction of cell-penetrating peptides (MLCPP). The first layer predicts whether a submitted peptide is a CPP or a non-CPP. Meanwhile, the second layer predicts the uptake efficiency of the predicted CPPs [44].

Peptides property calculation

It is crucial to compute the physicochemical properties of peptides for predicting and designing novel and potent CPPs. Therefore, we calculated various physicochemical features of the CPPs such as net charge, theoretical isoelectric point (pI), amphipathicity, molecular weight (MW), water solubility, hydrophobicity (H), hydrophobicity ratio, and polar-, non-polar-, uncharged- and charged residues. To calculate net charge, theoretical pI, and amphipathicity, the CellPPD web server (https://webs.iiitd.edu.in/raghava/cellppd/index.html) was utilized. In addition, the ProtParam tool (https://web.expasy.org/protparam/) was used to compute the molecular weight of the CPPs. Furthermore, to obtain the water solubility of the peptides, the Peptide property calculator (PepCalc) (https://pepcalc.com/) was applied. Also, hydrophobicity (H), and polar-, non-polar-, uncharged- and charged residues were estimated using the Heliquest web server (https://heliquest.ipmc.cnrs.fr/cgi-bin/ComputParams.py).
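As an illustration of the simplest of these properties, net charge can be approximated by counting charged residues. The sketch below is a simplifying assumption rather than the CellPPD implementation: it counts Lys and Arg as +1 and Asp and Glu as -1, and ignores histidine and the peptide termini. Under this rule the three example peptides give +4, +1 and +5, in agreement with the net charges reported for them in this article.

```python
POSITIVE = set("KR")  # Lys, Arg counted as +1 (assumed rule)
NEGATIVE = set("DE")  # Asp, Glu counted as -1 (assumed rule)


def net_charge(peptide):
    """Approximate net charge from residue counts; His and termini are ignored."""
    peptide = peptide.upper()
    positives = sum(aa in POSITIVE for aa in peptide)
    negatives = sum(aa in NEGATIVE for aa in peptide)
    return positives - negatives


for pep in ("EASKKPRQKR", "GIEFLKRGDK", "RSGARSKQRR"):
    print(pep, net_charge(pep))
```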

Evaluation of membrane-binding ability of CPPs

To investigate the potential of the peptides to bind to the membrane, two different methods were utilized. At first, we evaluated the Boman index or protein-binding potential using the APD3 web server (http://aps.unmc.edu/AP/prediction/prediction_main.php). The Boman index is the sum of solubility values for all amino acids present in a peptide sequence and illustrates the potential of a peptide for binding to the membrane or other proteins [45]. Secondly, to evaluate the membrane-binding potential of each peptide, the discrimination factor (D) was calculated [46]. For this purpose, we used the Heliquest web server (https://heliquest.ipmc.cnrs.fr/cgi-bin/ComputParams.py) to obtain the hydrophobic moment (μH). After determination of the hydrophobic moment and the net charge (Z), the D factor was calculated according to the following equation: D = 0.944(<μH>) + 0.33(Z).
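The D factor equation translates directly into code. The sketch below assumes that the hydrophobic moment has already been obtained (for example from Heliquest); the function name and the input values in the usage line are illustrative placeholders and are not results from this study.

```python
def discrimination_factor(mu_h, net_charge):
    """D = 0.944 * <muH> + 0.33 * Z, as given in the text."""
    return 0.944 * mu_h + 0.33 * net_charge


# Illustrative inputs only: mu_h = 0.5 is a placeholder value.
d_example = discrimination_factor(mu_h=0.5, net_charge=4)
print(round(d_example, 3))
```

Because the charge term contributes 0.33 per unit of net charge, a strongly cationic peptide can reach a high D value even when its hydrophobic moment is small.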

In addition, TMHMM web server (http://www.cbs.dtu.dk/services/TMHMM/) was utilized to investigate the cellular localization of CPPs [47]. This web server analyzes the probability of binding a peptide to the bacterial cell membrane (BCM) which possesses negative charge.

Assessment of the immunogenicity

Immunogenicity of the CPPs is one of their disadvantages. It was confirmed that peptides could induce immunologic responses in vivo, resulting in allergic reactions. The existence of peptides in body can stimulate the generation of antibodies which may neutralize therapeutic effects and reduce their efficacy [48, 49]. Hence, to assess the immunogenicity of CPPs, each peptide was submitted to IEDB Immunogenicity Predictor (http://tools.iedb.org/immunogenicity/) [50].

Determination of toxicity and allergenicity

To investigate the toxicity and allergenicity of CPPs, each peptide was submitted to ToxinPred web server (https://webs.iiitd.edu.in/raghava/toxinpred/algo.php), and AllerTop (https://www.ddg-pharmfac.net/AllerTOP/) and AllergenFP (http://ddg-pharmfac.net/AllergenFP/) web servers, respectively [51–53].

Estimation of hemolytic potency and half-life

The hemolytic property of the peptides was predicted by HemoPI using the SVM-based method (https://webs.iiitd.edu.in/raghava/hemopi/design.php). Furthermore, the half-life in E. coli and in mammalian cells was calculated using the ProtLifePred web server, which is based on the N-end rule (http://protein-n-end-rule.leadhoster.com/) [54].

Prediction of structure

The three-dimensional (3D) structure of some predicted CPPs was analyzed by the de novo peptide structure prediction server PEP-FOLD3 (https://bioserv.rpbs.univ-paris-diderot.fr/services/PEP-FOLD3/). PEP-FOLD3 is a de novo method that predicts peptide structures from amino acid sequences. This approach determines the conformation of four consecutive amino acid residues according to structural alphabet (SA) letters [55]. Additionally, the helical wheel diagram of the CPPs was produced by Schiffer-Edmundson wheel modelling using the Heliquest web server (https://heliquest.ipmc.cnrs.fr/cgi-bin/ComputParams.py) [46].

Results

Identification of potential SARS-CoV-2-derived CPPs

To obtain cell penetrating peptides in the proteome of SARS-CoV-2, the sequences of S protein, M glycoprotein, N phosphoprotein, E protein, Orf1ab polyprotein, and ORF3a, ORF6, ORF7a, ORF8 and ORF10 proteins were submitted to the protein scanning tool of the CellPPD web server. Then, we applied the C2Pred web server to obtain the CPP probability of the peptides. All of the detected CPPs, with their SVM scores and probability scores, are listed in Table 1. No CPP was found in the E protein, and only one CPP was identified in ORF6. Meanwhile, Orf1ab contained the most CPPs. The C2Pred web server classifies peptides with a probability score below 0.5 as non-CPPs, and peptides with a score above 0.5 as CPPs. Some peptides were predicted as CPPs by CellPPD but as non-CPPs by C2Pred. For instance, the DMSKFPLKLR peptide derived from the Orf1ab polyprotein was predicted as a CPP by CellPPD with an SVM score of 0.11, while C2Pred classified this peptide as a non-CPP with a score of 0.167688.
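The two cut-offs can be combined into a simple consensus check. The sketch below uses the CellPPD SVM threshold of 0.0 and the C2Pred cut-off of 0.5 stated in this article; the function name `classify` and the treatment of scores that fall exactly on a threshold are assumptions made for illustration.

```python
CELLPPD_SVM_THRESHOLD = 0.0  # default SVM threshold stated in the Methods
C2PRED_THRESHOLD = 0.5       # cut-off stated in the Results


def classify(svm_score, c2pred_score):
    """Return each server's call and whether both agree that the peptide is a CPP."""
    cellppd_cpp = svm_score > CELLPPD_SVM_THRESHOLD  # boundary handling assumed
    c2pred_cpp = c2pred_score > C2PRED_THRESHOLD     # boundary handling assumed
    return {
        "cellppd": cellppd_cpp,
        "c2pred": c2pred_cpp,
        "consensus": cellppd_cpp and c2pred_cpp,
    }


# DMSKFPLKLR (Orf1ab), using the scores quoted in the text.
result = classify(svm_score=0.11, c2pred_score=0.167688)
print(result)
```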

Table 1. Predicted CPPs and their uptake efficiency using various web servers.

https://doi.org/10.1371/journal.pone.0247396.t001

Uptake efficiency analysis of the identified CPPs

The uptake efficiency of the predicted CPPs was evaluated using two web servers, CPPred-RF and MLCPP. These web servers classify CPPs into two categories: high or low uptake efficiency (Table 1).

Calculation of peptide properties

Various physicochemical characteristics of the peptides, such as net charge, pI, MW, amphipathicity, water solubility, hydrophobicity, hydrophobicity ratio, and polar-, non-polar-, uncharged- and charged residues, were determined by diverse web servers. These properties matter for delivery; for instance, a cationic CPP can bind to the negatively charged cell membrane, then penetrate and deliver cargoes into cells [9]. All of the physicochemical properties of the CPPs are listed in Table 2.

Table 2. The properties of peptides determined by diverse web servers and tools.

https://doi.org/10.1371/journal.pone.0247396.t002

Evaluation of membrane-binding potential of CPPs

One of the principal criteria for designing a potent CPP is the prediction of membrane-binding ability and cellular localization. Hence, the Boman index of each peptide was estimated using the APD3 web server. Values higher than 2.48 kcal/mol indicate high binding potential. For example, the SSRSRNSSRN peptide derived from the N protein had the highest Boman index among all of the predicted CPPs (Boman index: 7.5). Moreover, the D factor was calculated for each peptide based on net charge and μH. According to the computed D factor, CPPs can be divided into three categories: D < 0.68 as non-lipid binding (helix/random coil), 0.68 < D < 1.34 as possible lipid-binding helix, and D > 1.34 as lipid-binding helix [46]. Additionally, the cellular localization of each CPP was evaluated by the TMHMM server to determine the probability that the CPP can enter the cell. The results for membrane-binding potential and cellular localization of the CPPs are shown in Table 3. Also, some examples of TMHMM prediction results are illustrated in Fig 1.
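The two membrane-binding criteria can be written as a pair of small helper functions. This is a sketch using the thresholds quoted above; the function names are hypothetical, and because the text gives strict inequalities, the assignment of values falling exactly on 0.68 or 1.34 is an assumption.

```python
BOMAN_HIGH_BINDING = 2.48  # kcal/mol, threshold quoted in the text


def has_high_binding_potential(boman_index):
    """True when the Boman index exceeds the 2.48 kcal/mol threshold."""
    return boman_index > BOMAN_HIGH_BINDING


def classify_d_factor(d):
    """Map a D factor onto the three categories quoted in the text."""
    if d > 1.34:
        return "lipid-binding helix"
    if d > 0.68:  # values exactly at a boundary fall into the lower class (assumed)
        return "possible lipid-binding helix"
    return "non-lipid binding (helix/random coil)"


print(has_high_binding_potential(7.5))  # Boman index reported for SSRSRNSSRN
print(classify_d_factor(2.46))          # D factor reported for the designed 21-residue CPP
```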

Fig 1.

Prediction of cellular localization for CPPs using TMHMM web server: (A) EASKKPRQKR peptide containing tumor penetrating motif (RXXR); (B) GIEFLKRGDK peptide containing tumor homing motif (RGD); (C) RSGARSKQRR peptide with +5.00 net charge; (D) VVLKKLKKSL peptide with high half-life (~100 hours) in mammalian cells; (E) SSRSRNSSRN peptide with the highest Boman index (~7.5); (F) MCYKRNRATR peptide containing tumor penetrating motif (RXXR). All of the prediction results showed the cell localization of CPPs.

https://doi.org/10.1371/journal.pone.0247396.g001

Table 3. Membrane-binding potential and cellular localization of CPPs.

https://doi.org/10.1371/journal.pone.0247396.t003

Assessment of the immunogenicity

As mentioned earlier, CPPs used as a delivery system should not be immunogenic. Hence, we analyzed the immunogenicity of each peptide using the IEDB Immunogenicity Predictor. The results are listed in Table 4.

Table 4. Evaluation of immunogenicity, toxicity, allergenicity, half-life and hemolytic potency.

https://doi.org/10.1371/journal.pone.0247396.t004

Determination of toxicity and allergenicity

The toxicity and allergenicity of each peptide were determined using diverse web servers (Table 4). In detail, most of the predicted CPPs were non-toxic. The toxic CPPs were derived from ORF3a and Orf1ab polyproteins. Furthermore, there are some differences between allergenicity prediction by AllerTop, and AllergenFP web servers. Some CPPs were determined as probable allergen by AllerTop, whereas they were identified as probable non-allergen by AllergenFP. It is rational to select CPPs which were determined as probable non-allergen by both web servers.

Estimation of hemolytic potency and half-life

It should be considered that high hydrophobicity of a peptide enhances its probability of causing hemolysis in the host; therefore, the hemolytic potency and the half-life of each peptide in E. coli and mammalian cells were evaluated using the HemoPI and ProtLifePred web servers, respectively (Table 4). The hemolytic potency score varies between 0 and 1 (i.e., 0 very unlikely to be hemolytic, and 1 very likely to be hemolytic). Regarding half-life, seven predicted CPPs had the highest half-life in mammalian cells (~100 hours), all of which were derived from the Orf1ab polyprotein: VAYRKVLLRK, VVLKKLKKSL, VLKKLKKSLN, VGKPRPPLNR, VVNARLRAKH, VNARLRAKHY, and VLRQWLPTGT.
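Because ProtLifePred is based on the N-end rule, the estimate depends on the N-terminal residue of the peptide, which explains why all seven long-lived peptides begin with valine. The lookup below is a deliberately partial sketch: it contains only the three mammalian values that can be read from this article (V about 100 hours, G about 30 hours, E about 1 hour), and returns `None` for any other residue rather than guessing. The names are hypothetical and this is not the ProtLifePred implementation.

```python
# Partial table: only values quoted in this article (hours, mammalian cells).
MAMMALIAN_HALF_LIFE_HOURS = {"V": 100, "G": 30, "E": 1}


def mammalian_half_life(peptide):
    """Look up the half-life estimate from the N-terminal residue, or None if unknown."""
    return MAMMALIAN_HALF_LIFE_HOURS.get(peptide[0].upper())


long_lived = [
    "VAYRKVLLRK", "VVLKKLKKSL", "VLKKLKKSLN", "VGKPRPPLNR",
    "VVNARLRAKH", "VNARLRAKHY", "VLRQWLPTGT",
]
print({pep: mammalian_half_life(pep) for pep in long_lived})
```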

Prediction of CPP structure

The 3D spatial shapes of the CPPs were predicted by the PEP-FOLD3 web server (Fig 2). Also, the helical wheel projections of these short peptides were obtained via the Heliquest web server, as shown in Fig 3. A peptide comprising at least five adjacent hydrophobic residues (such as Leu, Ile, Ala, Val, Pro, Met, Phe, Trp, and Tyr) shows a hydrophobic face on a helical wheel projection [46].

Fig 2.

The 3D spatial shape of CPPs predicted by PEP-FOLD3 web server: (A) EASKKPRQKR peptide containing tumor penetrating motif (RXXR); (B) GIEFLKRGDK peptide containing tumor homing motif (RGD); (C) RSGARSKQRR peptide with +5.00 net charge; (D) VVLKKLKKSL peptide with high half-life (~100 hours) in mammalian cells; (E) SSRSRNSSRN peptide with the highest Boman index (~7.5); (F) MCYKRNRATR peptide containing tumor penetrating motif (RXXR).

https://doi.org/10.1371/journal.pone.0247396.g002

Fig 3.

Helical wheel projection of the six selected CPPs by HeliQuest. These data indicate the possible amphipathic α-helical conformation of the selected CPPs: (A) EASKKPRQKR peptide containing tumor penetrating motif (RXXR); (B) GIEFLKRGDK peptide containing tumor homing motif (RGD); (C) RSGARSKQRR peptide with +5.00 net charge; (D) VVLKKLKKSL peptide with high half-life (~100 hours) in mammalian cells; (E) SSRSRNSSRN peptide with the highest Boman index (~7.5); (F) MCYKRNRATR peptide containing tumor penetrating motif (RXXR). The structural motifs are shown as hydrophobic (yellow) and cationic (blue). The arrow illustrates the direction of the hydrophobic moment (μH).

https://doi.org/10.1371/journal.pone.0247396.g003

Discussion

Vaccination is one of the most effective strategies for the control of dangerous pathogens. A potent vaccine must stimulate strong humoral and cellular immune responses in the host [56]. Vaccine efficacy relies on various factors including the selected antigen, adjuvant and delivery system [57]. Therefore, many researchers have focused on the development of novel and powerful delivery systems [9,58–60]. Since their discovery, CPPs have been considered a significant delivery system for carrying diverse types of cargo into cells due to their high cellular uptake efficiency. Several viruses such as HIV-1, Influenza A virus subtype H5N1, Dengue virus and HSV-1 contain CPPs in their proteome [11,30,34,61].

Bioinformatics strategies take scientists one step forward in screening and evaluating CPPs. Hence, the current study was planned to screen and identify novel and potent CPPs in the proteome of SARS-CoV-2 using in silico tools. To achieve this aim, we extracted the sequences of the S, M, N, E, ORF1ab, ORF3a, ORF6, ORF7a, ORF8, and ORF10 proteins and submitted them to the CellPPD web server. CellPPD is a support vector machine (SVM)-based prediction approach which was established to predict highly efficient cell penetrating peptides. The CellPPD method is based on the binary profile of peptides, which captures both the composition and the order of residues in a peptide [40]. The output of the CellPPD analysis was a large number of CPPs, which were subjected to several web servers for further analysis of their physicochemical properties, uptake efficiency, toxicity, allergenicity, cellular localization, tendency for binding to the plasma membrane, and 3D structure. Our results showed that the proteome of SARS-CoV-2 contains a large number of cell penetrating peptides. Most of the predicted CPPs originated from the Orf1ab polyprotein. The Orf1ab region covers about two thirds of the SARS-CoV-2 genome and is translated into two polypeptides, pp1a and pp1ab. Next, these two polypeptides are processed and cleaved into sixteen non-structural proteins (nsp). Non-structural proteins possess crucial functions in the replication and transcription of viral RNA and in viral pathogenesis [29]. In contrast to the Orf1ab polyprotein, no CPP was found in the E protein. This protein is responsible for virus production and maturation [28]. Twenty-four CPPs were also predicted in the spike (S) protein, and most of them are amphipathic in nature. In addition, most of the predicted CPPs showed high uptake efficiency in the in silico analyses.
Studies have demonstrated that several factors affect uptake efficiency, such as the number of arginine residues, the presence of tryptophan and its affinity to form helical structure, and the orientation of tryptophan and arginine around the helix [62–64]. In addition, it should be considered that CPPs, because of their natural pore-forming propensity and high hydrophobic moment (μH), could damage or destabilize lipid bilayers irreversibly and thus show cytotoxic effects. Hence, μH should be minimized to reduce membrane disturbance by CPPs [65,66]. Our data indicated that most of the CPPs predicted from the proteome of SARS-CoV-2 were neither toxic nor allergenic, had an appropriate half-life, and could bind to the plasma membrane with high potential and subsequently penetrate into cells. Viral CPPs have been linked to cell entry in other viruses. For example, Kajiwara et al. showed that H5N1 highly pathogenic avian influenza virus (HPAIV) infects host cells by recruiting the CPP activity of the C-terminal domain of the HA1 protein (HA314-346) [61]. Moreover, the N-terminal tail of the capsid protein (CaP) from the plant-infecting brome mosaic virus (BMV), which contains an arginine-rich motif, was essential for penetration through cellular membranes [67]. Thus, it is possible that the CPPs found in the SARS-CoV-2 proteome contribute to virus penetration into host cells.

On the other hand, CPPs are not cell-specific and thus they are internalized in most cell types through a receptor-independent approach. Hence, for a CPP to be cancer-specific or to enter cancerous cells effectively, the peptide sequence should possess the tumor homing motif (RGD) and/or the tumor penetrating motif (RXXR). Moreover, peptides harboring the RXXR motif at their C-terminal region can enter tumor cells through binding to the neuropilin receptor, which is commonly expressed at the surface of tumor cells [19]. In our study, one of the SARS-CoV-2-derived CPPs (i.e., GIEFLKRGDK) contains the RGD motif. This CPP, with +1.00 net charge, was soluble in water and non-toxic, and its half-life was about 30 hours in mammalian cells. Its cellular localization was predicted using the TMHMM server. Interestingly, two CPPs, EASKKPRQKR and MCYKRNRATR, included the RXXR motif at their C-terminal regions. In detail, the EASKKPRQKR peptide had +4.00 net charge and good water solubility. This peptide was non-toxic and non-allergenic, with a half-life of about one hour in mammalian cells. Also, the Boman index was 6.04 for this CPP (values higher than 2.48 kcal/mol indicate high binding potential), and its cellular localization was confirmed by the TMHMM web server. Moreover, the MCYKRNRATR peptide had +4.00 net charge and good water solubility, but this CPP was predicted to be toxic and allergenic, with an estimated Boman index of about 5.42. Additionally, the TMHMM web server predicted its localization inside the cell. Therefore, based on our data, the efficiency of the GIEFLKRGDK and EASKKPRQKR peptides can be further evaluated in vitro and in vivo as a delivery system in cancer therapy.
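Screening a list of predicted CPPs for these two motifs can be done with simple pattern matching. The sketch below is an illustration under one stated assumption: "RXXR at the C-terminal region" is read strictly as the last four residues matching R, any residue, any residue, R. The function name `find_motifs` is hypothetical.

```python
import re

RGD_MOTIF = re.compile(r"RGD")          # tumor homing motif, anywhere in the peptide
C_TERMINAL_RXXR = re.compile(r"R..R$")  # RXXR as the last four residues (assumed reading)


def find_motifs(peptide):
    """Return the tumor-related motifs present in a peptide sequence."""
    peptide = peptide.upper()
    found = []
    if RGD_MOTIF.search(peptide):
        found.append("RGD")
    if C_TERMINAL_RXXR.search(peptide):
        found.append("RXXR (C-terminal)")
    return found


for pep in ("GIEFLKRGDK", "EASKKPRQKR", "MCYKRNRATR", "VVLKKLKKSL"):
    print(pep, find_motifs(pep))
```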

In the present study, only CPPs with 10 residues in length were predicted. As is known, CPPs are 5–50 residues in length [11]. Thus, longer CPPs with higher efficiency can be designed by adding sequences suited to the delivery of different cargoes. For instance, a hydrophilic lysine-rich domain derived from the NLS of SV40 large T-antigen (KKKRKV) and a spacer domain (WSQP) can be added to improve the efficiency of CPPs in DNA delivery, as used in other studies [33]. In this study, as an example, by merging 11 overlapping CPPs derived from the N protein, namely KKSAAEASKK, KSAAEASKKP, SAAEASKKPR, AAEASKKPRQ, AEASKKPRQK, EASKKPRQKR, ASKKPRQKRT, SKKPRQKRTA, KKPRQKRTAT, KPRQKRTATK, and PRQKRTATKA (with net charges of +4.00 and +5.00), a novel CPP was designed with 21 residues in length (i.e., KKSAAEASKKPRQKRTATKAY). This CPP had +7.00 net charge and good water solubility. Moreover, it was non-allergenic and non-toxic, with an immunogenicity score of about -0.70123 and a D factor of about 2.46, and it was predicted by the TMHMM web server to be located inside cells. When the SV40 large T-antigen NLS sequence and a spacer domain were conjugated to this CPP, we obtained a new CPP with 31 amino acids in length (i.e., KKSAAEASKKPRQKRTATKAYWSQPKKKRKV), +12.00 net charge, and good water solubility. This peptide was non-allergenic and non-toxic, with an immunogenicity score of about -1.49065 and a D factor of about 4.11, and it was predicted by the TMHMM web server to be localized inside cells. Indeed, conjugation of the NLS and spacer to the designed CPP enhanced both the net charge and the probability of localization inside cells. Our predicted and designed CPP is similar to the MPG CPP (27 residues in length, and +4.00 net charge), which is composed of a peptide derived from HIV-1 glycoprotein 41, the SV40 NLS and a spacer domain. The MPG peptide was reported for delivery of DNA-based vaccines both in vitro and in vivo [33,68,69].
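The design step can be sketched as merging consecutive overlapping peptides and then appending the spacer and NLS. The code below is a minimal illustration with hypothetical function names. Merging the 11 listed decamers yields a 20-residue stretch; the 21-residue CPP quoted in the text carries one further C-terminal tyrosine, which the sketch takes verbatim from the text rather than deriving it.

```python
def merge_overlapping(peptides):
    """Merge consecutive peptides by their longest suffix/prefix overlap."""
    merged = peptides[0]
    for pep in peptides[1:]:
        for k in range(min(len(merged), len(pep)), 0, -1):
            if merged.endswith(pep[:k]):
                merged += pep[k:]
                break
        else:
            raise ValueError(f"{pep} does not overlap the merged sequence")
    return merged


n_protein_cpps = [
    "KKSAAEASKK", "KSAAEASKKP", "SAAEASKKPR", "AAEASKKPRQ", "AEASKKPRQK",
    "EASKKPRQKR", "ASKKPRQKRT", "SKKPRQKRTA", "KKPRQKRTAT", "KPRQKRTATK",
    "PRQKRTATKA",
]
core = merge_overlapping(n_protein_cpps)  # 20-residue merged stretch

designed_cpp = "KKSAAEASKKPRQKRTATKAY"    # 21-residue CPP as quoted in the text
SPACER = "WSQP"                           # spacer domain from the text
SV40_NLS = "KKKRKV"                       # SV40 large T-antigen NLS from the text
construct = designed_cpp + SPACER + SV40_NLS

print(core, len(core))
print(construct, len(construct))
```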

Conclusion

In conclusion, novel and potent CPPs derived from the proteome of SARS-CoV-2 were identified using in silico methods. These CPPs may be related to the rapid spread of the virus in the host. Moreover, we designed a long and novel CPP conjugated to the SV40 NLS and a spacer domain that showed high predicted membrane-binding ability and localization inside cells. The designed CPP was similar to the MPG CPP and can be further evaluated for DNA delivery in vitro and in vivo in the future. Generally, the predicted and designed CPPs derived from the proteome of SARS-CoV-2, with their different properties, can be applied to deliver different cargoes in vaccine and drug development.

Supporting information

S1 Fig. The flowchart of overall study plan.

https://doi.org/10.1371/journal.pone.0247396.s001

(TIF)

References

1. Florindo HF, Kleiner R, Vaskovich-Koubi D, Acúrcio RC, Carreira B, Yeini E, et al. Immune-mediated approaches against COVID-19. Nature Nanotechnology. 2020; 15: 630–645. pmid:32661375
2. Zareba G. A new combination vaccine for measles, mumps, rubella and varicella. Drugs Today. 2006; 42: 321–329. pmid:16801995
3. Kardani K, Basimi P, Fekri M, Bolhassani A. Antiviral therapy for the sexually transmitted viruses: recent updates on vaccine development. Expert Rev. Clin. Pharmacol. 2020; 13 (9): 1001–1046.
4. Garg A, Dewangan HK. Nanoparticles as adjuvants in vaccine delivery. Crit. Rev. Ther. Drug. 2020; 37 (2): 183–204. pmid:32865905
5. Nevagi RJ, Skwarczynski M, Toth I. Polymers for subunit vaccine delivery. Eur. Polym. J. 2019; 114: 397–410.
6. Şenel S, Yüksel S. Chitosan-based particulate systems for drug and vaccine delivery in the treatment and prevention of neglected tropical diseases. Drug. Deliv. Transl. Res. 2020; 10: 1644–1674. pmid:32588282
7. Yu R, Mai Y, Zhao Y, Hou Y, Liu Y, Yang J. Targeting strategies of liposomal subunit vaccine delivery systems to improve vaccine efficacy. J. Drug Target. 2019; 27: 780–789. pmid:30589361
8. Du X, Wang J, Zhou Q, Zhang L, Wang S, Zhang Z, et al. Advanced physical techniques for gene delivery based on membrane perforation. Drug Deliv. 2018; 25: 1516–1525. pmid:29968512
9. Bolhassani A, Jafarzade BS, Mardani G. In vitro and in vivo delivery of therapeutic proteins using cell penetrating peptides. Peptides. 2017; 87: 50–63. pmid:27887988
10. Shahbazi S, Bolhassani A. Comparison of six cell penetrating peptides with different properties for in vitro and in vivo delivery of HPV16 E7 antigen in therapeutic vaccines. Int. Immunopharmacol. 2018; 62: 170–180. pmid:30015237
11. Kardani K, Milani A, Shabani SH, Bolhassani A. Cell penetrating peptides: the potent multi-cargo intracellular carriers. Expert Opin. Drug Deliv. 2019; 16: 1227–1258. pmid:31583914
12. Hoffmann K, Milech N, Juraja SM, Cunningham PT, Stone SR, Francis RW, et al. A platform for discovery of functional cell-penetrating peptides for efficient multi-cargo intracellular delivery. Sci. Rep. 2018; 8: 1–6. pmid:29311619
13. Xia H, Gao X, Gu G, Liu Z, Hu Q, Tu Y, et al. Penetratin-functionalized PEG-PLA nanoparticles for brain drug delivery. Int. J. Pharm. 2012; 436: 840–850. pmid:22841849
14. Yang J, Luo Y, Shibu MA, Toth I, Skwarczynski M. Cell-penetrating peptides: Efficient vectors for vaccine delivery. Curr. Drug Deliv. 2019; 16: 430–443. pmid:30760185
15. Milletti F. Cell-penetrating peptides: classes, origin, and current landscape. Drug Discov. Today. 2012; 17: 850–860. pmid:22465171
16. Maiolo JR, Ferrer M, Ottinger EA. Effects of cargo molecules on the cellular uptake of arginine-rich cell-penetrating peptides. Biochim. Biophys. Acta Biomembr. 2005; 1712: 161–172. pmid:15935328
17. Futaki S. Membrane-permeable arginine-rich peptides and the translocation mechanisms. Adv. Drug Deliv. Rev. 2005; 57: 547–558. pmid:15722163
18. Madani F, Lindberg S, Langel Ü, Futaki S, Gräslund A. Mechanisms of cellular uptake of cell-penetrating peptides. J. Biophys. 2011; 2011: 1–10. pmid:21687343
19. Gautam A, Sharma M, Vir P, Chaudhary K, Kapoor P, Kumar R, et al. Identification and characterization of novel protein-derived arginine-rich cell-penetrating peptides. Eur. J. Pharm. Biopharm. 2015; 89: 93–106. pmid:25459448
20. Kardani K, Bolhassani A. CPPsite 2.0: An available database of experimentally validated cell-penetrating peptides predicting their secondary and tertiary structures. J. Mol. Biol. 2020; 166703. pmid:33186582
21. Liu Z, Cui Y, Xiong Z, Nasiri A, Zhang A, Hu J. DeepSeqPan, a novel deep convolutional neural network model for pan-specific class I HLA-peptide binding affinity prediction. Sci. Rep. 2019; 9: 794. pmid:30692623
22. Tang J, Fu J, Wang Y, Li B, Li Y, Yang Q, et al. ANPELA: analysis and performance assessment of the label-free quantification workflow for metaproteomic studies. Brief Bioinformatics. 2020; 21: 621–636. pmid:30649171
  23. 23. Li Y, Niu M, Zou Q. ELM-MHC: an improved MHC identification method with extreme learning machine algorithm. J. Proteome Res. 2019; 18: 1392–1401. pmid:30698979
  24. 24. Yin J, Sun W, Li F, Hong J, Li X, Zhou Y, et al. VARIDT 1.0: variability of drug transporter database. Nucleic Acids Res. 2020; 48: D1042–D1050. pmid:31495872
  25. 25. Chen L, Chu C, Huang T, Kong X, Cai YD. Prediction and analysis of cell-penetrating peptides using pseudo-amino acid composition and random forest models. Amino Acids. 2015; 47: 1485–1493. pmid:25894890
  26. 26. Tang H, Su ZD, Wei HH, Chen W, Lin H. Prediction of cell-penetrating peptides with feature selection techniques. Biochem. Biophys. Res. Commun. 2016; 477: 150–154. pmid:27291150
  27. 27. Liu B, Chen J, Guo M, Wang X. Protein remote homology detection and fold recognition based on Sequence-Order Frequency Matrix. TCBB. 2017; 16: 292–300. pmid:29990004
  28. 28. Kardani K, Bolhassani A. Vaccine development against SARS-CoV-2: From virology to vaccine clinical trials. Coronaviruses. 2020; 1: 1–13.
  29. 29. Wabalo EK, Dubiwak AD, Gizaw TS, Kotu UG. Role of structural and functional proteins of SARS-COV-2. GSC. Biol. Pharm. Sci. 2020; 12: 117–129.
  30. 30. Freire JM, Veiga AS, Rego de Figueiredo I, de la Torre BG, Santos NC, Andreu D, et al. Nucleic acid delivery by cell penetrating peptides derived from dengue virus capsid protein: design and mechanism of action. FEBS. J. 2014; 281: 191–215. pmid:24286593
  31. 31. Rhee M, Davis P. Mechanism of uptake of C105Y, a novel cell-penetrating peptide. J. Biol. Chem. 2006; 281: 1233–1240. pmid:16272160
  32. 32. Morris MC, Depollier J, Mery J, Heitz F, Divita G. A peptide carrier for the delivery of biologically active proteins into mammalian cells. Nat. Biotechnol. 2001; 19: 1173–1176. pmid:11731788
  33. 33. Kadkhodayan S, Bolhassani A, Sadat SM, Irani S, Fotouhi F. The efficiency of Tat cell penetrating peptide for intracellular uptake of HIV-1 Nef expressed in E. coli and mammalian cell. Curr. Drug Deliv. 2017; 14(4): 536–542. pmid:27719633
  34. 34. Elliott G, O’Hare P. Intercellular trafficking and protein delivery by a herpes virus structural protein. Cell. 1997; 88: 223–233. pmid:9008163
  35. 35. Chakraborty H, Bhattacharjya S. Mechanistic insights of host cell fusion of SARS-CoV-1 and SARS-CoV-2 from atomic resolution structure and membrane dynamics. Biophysical Chemistry. 2020; 106438. pmid:32721790
  36. 36. Mahajan M, Chatterjee D, Bhuvaneswari K, Pillay S, Bhattacharjya S. NMR structure and localization of a large fragment of the SARS-CoV fusion protein: Implications in viral cell fusion. Biochimica et Biophysica Acta (BBA)-Biomembranes. 2018; 1860: 407–415. pmid:28988778
  37. 37. Mahajan M, Bhattacharjya S. NMR structures and localization of the potential fusion peptides and the pre-transmembrane region of SARS-CoV: Implications in membrane fusion. Biochimica et Biophysica Acta (BBA)-Biomembranes. 2015; 1848: 721–730. pmid:25475644
  38. 38. Kumar S, Maurya VK, Prasad AK, Bhatt MLB, Saxena SK. Structural, glycosylation and antigenic variation between 2019 novel coronavirus (2019-nCoV) and SARS coronavirus (SARS-CoV). Virus Disease. 2020; 31: 13–21. pmid:32206694
  39. 39. Wu F, Zhao S, Yu B, Chen YM, Wang W, Song ZG, et al. A new coronavirus associated with human respiratory disease in China. Nature. 2020; 579: 265–269. pmid:32015508
  40. 40. Gautam A, Chaudhary K, Kumar R, Sharma A, Kapoor P, Tyagi A, et al. In silico approaches for designing highly effective cell penetrating peptides. J. Transl. Med. 2013; 11: 74. pmid:23517638
  41. 41. Gautam A, Chaudhary K, Kumar R, Raghava GP. Computer-aided virtual screening and designing of cell-penetrating peptides. In Cell-Penetrating Peptides 2015; 59–69); Humana Press, New York.
  42. 42. Tang H, Su ZD, Wei HH, Chen W, Lin H. Prediction of cell-penetrating peptides with feature selection techniques. Biochem. Biophys. Res. Commun. 2016; 477: 150–154. pmid:27291150
  43. 43. Wei L, Xing P, Su R, Shi G, Ma ZS, Zou Q. CPPred-RF: a sequence-based predictor for identifying cell-penetrating peptides and their uptake efficiency. J. Proteome Res. 2017; 16: 2044–2053. pmid:28436664
  44. 44. Manavalan B, Subramaniyam S, Shin TH, Kim MO, Lee G. Machine-learning-based prediction of cell-penetrating peptides and their uptake efficiency with improved accuracy. J. Proteome Res. 2018; 17: 2715–2726. pmid:29893128
  45. 45. Boman HG. Antibacterial peptides: basic facts and emerging concepts. J. Intern. Med. 2003; 254: 197–215. pmid:12930229
  46. 46. Gautier R, Douguet D, Antonny B, Drin G. HELIQUEST: a web server to screen sequences with specific α-helical properties. Bioinformatics. 2008; 24: 2101–2102. pmid:18662927
  47. 47. Sonnhammer EL, Von Heijne G, Krogh A. A hidden Markov model for predicting transmembrane helices in protein sequences. InIsmb 1998; 6: 175–182. pmid:9783223
  48. 48. Shankar G, Arkin S, Cocea L, Devanarayan V, Kirshner S, Kromminga A, et al. Assessment and reporting of the clinical immunogenicity of therapeutic proteins and peptides-harmonized terminology and tactical recommendations. AAPS. J. 2014; 16: 658–673. pmid:24764037
  49. 49. Kuriakose A, Chirmule N, Nair P. Immunogenicity of biotherapeutics: causes and association with posttranslational modifications. J. Immunol. Res. 2016; 2016: 1–18. pmid:27437405
  50. 50. Calis JJ, Maybeno M, Greenbaum JA, Weiskopf D, De Silva AD, Sette A, et al. Properties of MHC class I presented peptides that enhance immunogenicity. PLoS Comput. Biol. 2013; 9: e1003266. pmid:24204222
  51. 51. Gupta S, Kapoor P, Chaudhary K, Gautam A, Kumar R, Raghava GP. In silico approach for predicting toxicity of peptides and proteins. PLoS One. 2013; 8: e73957. pmid:24058508
  52. 52. Dimitrov I, Bangov I, Flower DR, Doytchinova I. AllerTOP v. 2-a server for in silico prediction of allergens. J. Mol. Model. 2014; 20: 2278. pmid:24878803
  53. 53. Dimitrov I, Naneva L, Doytchinova I, Bangov I. AllergenFP: allergenicity prediction by descriptor fingerprints. Bioinformatics. 2014; 30: 846–851. pmid:24167156
  54. 54. Chaudhary K, Kumar R, Singh S, Tuknait A, Gautam A, Mathur D, et al. A web server and mobile app for computing hemolytic potency of peptides. Sci. Rep. 2016; 6: 1–3. pmid:28442746
  55. 55. Lamiable A, Thévenet P, Rey J, Vavrusa M, Derreumaux P, Tufféry P. PEP-FOLD3: faster de novo structure prediction for linear peptides in solution and in complex. Nucleic Acids Res. 2016; 44: W449–W454. pmid:27131374
  56. 56. Nabel GJ. Designing tomorrow’s vaccines. N. Engl. J. Med. 2013; 368: 551–560. pmid:23388006
  57. 57. Dong Y, Dai T, Wei Y, Zhang L, Zheng M, Zhou F. A systematic review of SARS-CoV-2 vaccine candidates. Signal Transduct. Target Ther. 2020; 5: 1–4. pmid:32296011
  58. 58. Dhakal S, Renukaradhya GJ. Nanoparticle-based vaccine development and evaluation against viral infections in pigs. Vet Res. 2019; 50: 90. pmid:31694705
  59. 59. Wallis J, Katti P, Martin AM, Hills T, Seymour LW, Shenton DP, et al. A liposome-based cancer vaccine for a rapid and high-titre anti-ErbB-2 antibody response. Eur. J. Pharm. Sci. 2020; 152: 105456.
  60. 60. Zhang M, Hong Y, Chen W, Wang C. Polymers for DNA vaccine delivery. ACS Biomater. Sci. Eng. 2017; 3: 108–125. pmid:33450790
  61. 61. Kajiwara N, Nomura N, Ukaji M, Yamamoto N, Kohara M, Yasui F, et al. Cell-penetrating peptide-mediated cell entry of H5N1 highly pathogenic avian influenza virus. Scientific Reports. 2020; 10: 1–3. pmid:31913322
  62. 62. Rydberg HA, Matson M, Amand HL, Esbjorner EK, Nordén B. Effects of tryptophan content and backbone spacing on the uptake efficiency of cell-penetrating peptides. Biochemistry. 2012; 51: 5531–5539. pmid:22712882
  63. 63. Wender PA, Mitchell DJ, Pattabiraman K, Pelkey ET, Steinman L, Rothbard JB. The design, synthesis, and evaluation of molecules that enable or enhance cellular uptake: peptoid molecular transporters. Proc. Natl. Acad. Sci. 2000; 97: 13003–13008. pmid:11087855
  64. 64. Caesar CE, Esbjörner EK, Lincoln P, Nordén B. Membrane interactions of cell-penetrating peptides probed by tryptophan fluorescence and dichroism techniques: correlations of structure to cellular uptake. Biochemistry. 2006; 45: 7682–7692. pmid:16768464
  65. 65. Hällbrink M, Florén A, Elmquist A, Pooga M, Bartfai T, Langel Ü. Cargo delivery kinetics of cell-penetrating peptides. Biochim. Biophys. Acta Biomembr. 2001; 1515: 101–109. pmid:11718666
  66. 66. Räägel H, Pooga M. Peptide and protein delivery with cell-penetrating peptides. In Peptide and Protein Delivery 2011; 221–246; Academic Press.
  67. 67. Qi X, Droste T, Kao CC. Cell-penetrating peptides derived from viral capsid proteins. Molecular Plant-Microbe Interactions 2011; 24: 25–36. pmid:21138375
  68. 68. Habault J, Poyet JL. Recent advances in cell penetrating peptide-based anticancer therapies. Molecules. 2019; 24: 927. pmid:30866424
  69. 69. Kwon SJ, Han K, Jung S, Lee JE, Park S, Cheon YP, Lim HJ. Transduction of the MPG-tagged fusion protein into mammalian cells and oocytes depends on amiloride-sensitive endocytic pathway. BMC Biotechnology. 2009; 9: 73. pmid:19706197