Abstract
Plant protein phosphatase 2C (PP2C) plays vital roles in responding to various stresses, stimulating growth factors, phytohormones, and metabolic activities in many important plant species. However, the PP2C gene family has not been investigated in the economically valuable plant species sunflower (Helianthus annuus L.). This study used comprehensive bioinformatics tools to identify and characterize the PP2C gene family members in the sunflower genome (H. annuus r1.2). Additionally, we analyzed the expression profiles of these genes using RNA-seq data under four different stress conditions in both leaf and root tissues. A total of 121 PP2C genes were identified in the sunflower genome, distributed unevenly across the 17 chromosomes and all containing the Type-2C phosphatase domain. The HanPP2C genes were divided into 15 subgroups (A-L) based on phylogenetic tree analysis. Analyses of conserved domains, gene structures, and motifs revealed high structural and functional similarity within each subgroup. Gene duplication and collinearity analysis showed that among the 53 HanPP2C gene pairs, 48 arose from segmental duplications under strong purifying selection pressure, and only five arose from tandem duplications. Segmental duplication was therefore far more abundant than tandem duplication and was the major factor underlying the expansion of the PP2C gene family in sunflower. Most HanPP2C proteins were localized in the nucleus, cytoplasm, and chloroplast. We identified 71 miRNAs targeting 86 of the 121 HanPP2C genes, which are involved in plant developmental processes and responses to abiotic stresses. By analyzing cis-elements, we identified 63 cis-regulatory elements in the promoter regions of HanPP2C genes associated with light responsiveness, tissue-specificity, phytohormone, and stress responses.
Based on RNA-seq data from two sunflower tissues (leaf and root), 47 HanPP2C genes exhibited varying expression levels in leaf tissue, while 49 HanPP2C genes showed differential expression patterns in root tissue across all stress conditions. Transcriptome profiling revealed that nine HanPP2C genes (HanPP2C12, HanPP2C36, HanPP2C38, HanPP2C47, HanPP2C48, HanPP2C53, HanPP2C54, HanPP2C59, and HanPP2C73) exhibited higher expression in leaf tissue, and five HanPP2C genes (HanPP2C13, HanPP2C47, HanPP2C48, HanPP2C54, and HanPP2C95) showed enhanced expression in root tissue in response to the four stress treatments, compared to the control conditions. These results suggest that these HanPP2C genes may be potential candidates for conferring tolerance to multiple stresses and merit further detailed characterization to elucidate their functions. From these candidates, 3D structures were predicted for six HanPP2C proteins (HanPP2C47, HanPP2C48, HanPP2C53, HanPP2C54, HanPP2C59, and HanPP2C73), which provided satisfactory models. Our findings provide valuable insights into the PP2C gene family in the sunflower genome, which could play a crucial role in responding to various stresses. This information can be exploited in sunflower breeding programs to develop improved cultivars with increased abiotic stress tolerance.
Citation: Akter N, Islam MSU, Rahman MS, Zohra FT, Rahman SM, Manirujjaman M, et al. (2024) Genome-wide identification and characterization of protein phosphatase 2C (PP2C) gene family in sunflower (Helianthus annuus L.) and their expression profiles in response to multiple abiotic stresses. PLoS ONE 19(3): e0298543. https://doi.org/10.1371/journal.pone.0298543
Editor: Shailender Kumar Verma, University of Delhi, INDIA
Received: June 19, 2023; Accepted: January 25, 2024; Published: March 20, 2024
Copyright: © 2024 Akter et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the manuscript and its Supporting Information files.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
1.0 Introduction
Plants are usually exposed to adverse environmental conditions, including high and low temperatures, salinity, water deficit, heavy metals, herbivory (wounding), and pathogen infection. These conditions have all been demonstrated to harm plant growth and development. Plants have evolved various signaling pathways that respond to these stresses and transfer stimuli to cellular compartments [1]. Biochemical and physiological reactions in plants are regulated by reversible protein phosphorylation, a crucial protein modification process that plays an essential role in stress signaling and is catalyzed by protein phosphatases (PPs) and protein kinases (PKs) [2]. PKs have been thoroughly studied and demonstrated to be regulatory factors in response to diverse biotic and abiotic stresses [3].
Protein phosphatases (PPs) are categorized into two major classes: protein serine (Ser)/threonine (Thr) phosphatases (PSPs) and protein tyrosine phosphatases (PTPs) [4]. Based on their biochemical properties and their sensitivity to activators and inhibitors, PSPs are divided into two subcategories: protein phosphatase 1 (PP1) and protein phosphatase 2 (PP2). Based on cofactor requirements, PP2 proteins are further classified into protein phosphatases 2A (PP2As), 2B (PP2Bs), and 2C (PP2Cs). Whereas PP2As have no metal ion requirement for activity, PP2Bs require calcium ions (Ca2+), and PP2Cs require magnesium (Mg2+) or manganese (Mn2+) ions [5]. Moreover, according to amino acid sequences, PSPs can be categorized into two subclasses: phosphoprotein metallophosphatase (PPM), which depends on Mg2+ or Mn2+ ions and includes the PP2C group, and phospho-protein phosphatase (PPP), which includes the PP1, PP2A, and PP2B groups [6].
PP2C genes are evolutionarily conserved from prokaryotes to eukaryotes and are found in animals, plants, fungi, bacteria, and archaea. They modulate stress-signaling mechanisms by reversing stress-induced protein kinase (PK) cascades [7]. PP2Cs are the largest phosphatase gene family in plants (60–65% of all phosphatases), with a unique structure containing a conserved catalytic domain at the N-terminus or C-terminus and a non-conserved domain at the opposite end [8]. Owing to their structural diversity, PP2C genes have various functions in signal transduction pathways [9]. A total of 80, 62, 78, 131, and 257 PP2C genes have been identified using various bioinformatics techniques in Arabidopsis (Arabidopsis thaliana) [10], strawberry (Fragaria vesca) [11], rice (Oryza sativa) [12], mustard (Brassica rapa) [13], and wheat (Triticum aestivum) [14], respectively. According to their evolutionary relationships, the PP2C gene family in Arabidopsis was classified into ten subgroups (A-J) [15]. Subgroup A includes nine members, six of which (ABI1, ABI2, AHG1, HAB1, HAB2, and AHG3/ATPP2CA) negatively modulate ABA signaling, while the other three (HAI1, HAI2/AIP1, and HAI3) respond individually to stress [16]. Two candidate inhibitors of ABI1 protein phosphatase were identified, and they have the potential to play a role in modulating ABA responses [17]. The PP2C gene family plays crucial roles in the modulation of stress signaling and plant development. In tomato, PP2C plays a significant role in fruit maturation by modulating the expression of ethylene-related genes [18]. However, only a limited number of PP2C genes have been functionally investigated in economically important plant species. To our knowledge, no genome-wide identification of the PP2C gene family of H. annuus has been reported. Therefore, it is important to identify and analyze the functions of the PP2C gene family in sunflower (Helianthus annuus L.).
Sunflower (Helianthus annuus L.) is an important annual dicot plant from the Asteraceae family, native to North America and cultivated worldwide as an oilseed crop [19]. It is the fourth most economically profitable oilseed plant after soybean, rapeseed, and safflower and is used as a source of edible oil, for medical purposes, and as an ornamental plant [20]. Approximately 47,347,175 tons of sunflowers are produced yearly (www.atlasbig.com). Sunflower has several agricultural benefits, such as fast growth, limited water requirement, and an extended flowering period [21]. The phenolic compounds, flavonoid compounds, polyunsaturated fatty acids, and vitamins of sunflower provide antioxidant, antimicrobial, anti-hypertensive, cardiovascular, anti-inflammatory, and wound-healing benefits [22]. Nonetheless, under the recent worldwide climate-change scenario, many crop species with a high economic value, including sunflower, have been affected by various stresses that severely reduce yield and oil quality [23–25]. Advanced breeding approaches must be developed to cope with global climate change and meet food security challenges. Comprehensive bioinformatics analysis of target gene family members, such as HanPP2C, followed by validation of their expression levels, can be completed within a short period and has become increasingly useful in breeding programs to improve this crop species. Previous studies identified the PP2C genes in the genomes of various economically important plants. For example, 81 PP2C genes (0.29%) were found among the 27,029 protein-coding genes of the Arabidopsis genome [10, 26]. Moreover, 78 PP2C genes (0.14%) among 56,221 protein-coding genes, 134 PP2C genes (0.29%) among 46,430 protein-coding genes, and 91 PP2C genes (0.26%) among 34,727 protein-coding genes were identified in the rice [10, 27], soybean [28, 29], and tomato [30, 31] genomes, respectively.
However, identifying the HanPP2C gene family and analyzing its expression experimentally is limited by human resources, time, the availability of well-equipped laboratories, and experimental expenditure. As an alternative to extensive laboratory-based research on target gene family members, substantial genomic information can be collected from a variety of important plant species using comprehensive bioinformatics tools, which reduce labor, funding, and time investments. In this study, we used integrated bioinformatics analysis to characterize the sunflower PP2C (HanPP2C) genes, including genome-wide identification, physical and chemical properties, phylogenetic comparison, genomic evolution, gene structure, conserved domains, motifs, gene duplication, chromosome mapping, subcellular localization, cis-acting regulatory elements, tissue-specific expression analysis under various stress conditions, and 3D homology modeling of selected proteins. The findings presented here will lay the foundation for functional investigations of the HanPP2C genes and offer opportunities to improve this crop species in future breeding programs.
2.0 Materials and methods
2.1 Database search and retrieval of PP2C protein sequences in sunflower (H. annuus) genome
Initially, the A. thaliana PP2C protein sequences were used as queries to retrieve PP2C proteins from the H. annuus (Helianthus annuus r1.2) genome at Phytozome v13 (https://phytozome-next.jgi.doe.gov/) using BLASTp (protein basic local alignment search tool) [32], with an expected (E) threshold value of -1, the BLOSUM62 comparison matrix, and other default parameters. The Pfam PP2C domain “PF00481” was also used as a query term to confirm the presence of PP2C proteins. Furthermore, the retrieved amino acid sequences were analyzed for conserved PP2C domains using SMART (Simple Modular Architecture Research Tool, http://smart.embl-heidelberg.de/) [33] and the NCBI CDD (Conserved Domain Database) (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) [34] with default parameters. Predicted proteins lacking the PP2C conserved domain (PF00481) were excluded from the candidate list. The genes encoding PP2C domains were renamed according to the order of their physical chromosomal positions.
2.2 Determination of physicochemical properties of sunflower PP2C proteins
The number of amino acid residues, molecular weight, isoelectric point (pI), instability index, aliphatic index, and grand average of hydropathicity (GRAVY) of PP2C proteins were predicted using the ProtParam online tool (http://web.expasy.org/protparam/) [35].
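As an illustration of one of these properties, GRAVY is the mean Kyte-Doolittle hydropathy value over all residues of a protein, with negative values indicating a hydrophilic protein. The following is a minimal sketch of that calculation, not the ProtParam implementation itself; the function names and the toy peptides are hypothetical and serve only as examples.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Return the grand average of hydropathicity of a protein sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    # Sum the per-residue hydropathy values and divide by the length.
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def is_hydrophilic(sequence: str) -> bool:
    """A negative GRAVY value is interpreted as hydrophilic."""
    return gravy(sequence) < 0


# Toy peptides (hypothetical, not HanPP2C sequences).
charged = gravy("RKDE")     # charged residues give a negative score
aliphatic = gravy("ILV")    # aliphatic residues give a positive score
```

Sequences containing non-standard residue codes would raise a KeyError in this sketch and would need to be filtered beforehand.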
2.3 Phylogenetic analysis of Arabidopsis and sunflower PP2Cs
A. thaliana and H. annuus PP2C protein sequences were retrieved from Phytozome v13, and a phylogenetic tree was constructed using MEGA11 software [36] with the ClustalW program [37, 38] for sequence alignment (S1 Data). The Maximum Likelihood (ML) method was used with default parameters, except that 1000 bootstrap replicates were used to support branch values and the Poisson correction was applied. The constructed tree was then uploaded to iTOL v6.7.4 (https://itol.embl.de/) [39] for visualization.
2.4 Gene structure analysis
To determine the gene structure of PP2Cs, CDS and genomic DNA sequences in FASTA format were retrieved from Phytozome v13 (S2 and S3 Data). Moreover, the "gff3" file of the sunflower genome data was retrieved from Phytozome v13. The Gene Structure Display Server (GSDS v2.0) [40] (available at http://gsds.cbi.pku.edu.cn/) was used to analyze and display the exon-intron structures of the H. annuus PP2C genes.
2.5 Conserved domain and motif analysis
The InterPro database (http://www.ebi.ac.uk/interpro/) was used to predict the PP2C conserved domains, and TBtools software-v1.116 [41] was used to display the results. The structural motifs of the PP2C protein sequences were analyzed using the Multiple EM for Motif Elicitation (MEME) (https://meme-suite.org/meme/tools/meme) (http://meme.nbcr.net/meme/) tool of the MEME-suite (https://meme-suite.org/meme/) [42], with the maximum number of motifs set to 20 and other parameters left at their defaults (S4 Data). MEME and the motif scanning method (MSA), enabled by the MEME web interface, were used to visualize the motifs.
2.6 Gene duplication analysis and synonymous (Ks) and non-synonymous (Ka) substitution ratios calculation
Synonymous (Ks) and non-synonymous (Ka) substitution ratios of the sunflower PP2C gene family were determined using the Ka/Ks calculation tool (http://services.cbu.uib.no/tools/kaks) with HanPP2C CDS sequences of duplicated genes. The rates of molecular evolution were determined for each pair of paralogous genes using the Ka/Ks ratios. Duplication and time of divergence (million years ago, MYA) (T) of the HanPP2C gene were calculated by T = Ks/2λ (λ = 6.5×10−9) [43].
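The divergence-time formula above can be sketched as a short calculation. This is an illustrative example rather than the tool used in the study; the function name and the example Ks value are hypothetical, and the division by 10^6 simply converts years into million years (MYA).

```python
# Clock-like rate (lambda) taken from the text above.
SUBSTITUTION_RATE = 6.5e-9


def divergence_time_mya(ks: float, rate: float = SUBSTITUTION_RATE) -> float:
    """Estimate divergence time in million years ago from T = Ks / (2 * lambda)."""
    if ks < 0 or rate <= 0:
        raise ValueError("ks must be non-negative and rate must be positive")
    years = ks / (2 * rate)
    return years / 1e6  # convert years to million years


# Hypothetical example: a gene pair with Ks = 0.013 diverged about 1 MYA.
example_time = divergence_time_mya(0.013)
```

Because T scales linearly with Ks, pairs with larger synonymous distances are inferred to be older duplications under the assumed constant rate.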
2.7 Collinearity and synteny analysis of the PP2C gene family of sunflower
For collinearity and synteny analysis, gene duplications of H. annuus and A. thaliana PP2C genes were analyzed, and the identified HanPP2C collinear pairs and their collinear pairs with Arabidopsis were illustrated using TBtools version-v1.116 [41].
2.8 Analysis of chromosomal location
The sunflower (H. annuus) "Hannuus_494_r1.0 gff3" file was retrieved from the Phytozome v13 database. Information on the chromosomal length and the start and end positions of the 121 HanPP2C genes was collected using TBtools software version-v1.116 [41]. The distribution of HanPP2C genes across the chromosomes was mapped from the collected information using the MapGene2Chrom v2 (MG2C) web server (http://mg2c.iask.in/mg2c_v2.0/) [44].
2.9 Prediction of the subcellular localization of PP2C family members of sunflower
The subcellular localization of the PP2C proteins in sunflower was predicted using the WoLF PSORT (https://wolfpsort.hgc.jp/) online tool [45, 46]. The predicted protein signals of each HanPP2C protein were displayed using TBtools version-v1.116 [41].
2.10 Cis-acting regulatory elements analysis of HanPP2C gene promoters
To investigate the cis-acting regulatory elements (CAREs), the 2000 bp upstream promoter region of each HanPP2C sequence was obtained from the Phytozome v13 database. The CAREs of the PP2C genes were predicted using the PlantCARE database (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/) [47]. The predicted CAREs were categorized and illustrated as a heatmap using TBtools version-v1.116 [41].
2.11 Putative microRNA target site analysis
The micro-RNA (miRNA) datasets of sunflowers were downloaded from the plant microRNA encyclopedia (http://pmiren.com/) [48]. The CDS sequences of all sunflower HanPP2C genes were analyzed for sequences complementary to miRNAs (Helianthus annuus (Sunflower), unigene, DFCI Gene Index (HAGI), version 6, released on 2009_05_24) with an expectation value of 5.0 and other default parameters of psRNATarget (https://www.zhaolab.org/psRNATarget/analysis) [49] to identify miRNAs that potentially target the sunflower HanPP2C genes.
2.12 Transcriptomic expression pattern analysis in various stress conditions
Previously generated RNA-seq transcriptomic data [50] from seedlings of an inbred sunflower line (HA412-HO; PI 642777) were used to examine the expression patterns of PP2C genes in root and leaf tissues. The seedlings were maintained for 20 days after germination under a control condition (C) and four abiotic stress treatments (DD = Dry Down, N = Nutrient, P = PEG (polyethylene glycol), and S = Salt). Treatments were implemented at the V2 stage of sunflower seedling development [51]. Dry-down treatments were performed by repeatedly drying the top-down soil. To induce low-nutrient stress, sunflower seedlings were surface-watered daily using deionized (DI) water only. For the PEG treatment, an osmotic challenge was induced with PEG at 8.25% by volume (-0.25 MPa) [52]. For the salt treatment, a 100 mM NaCl solution was used for daily watering. Fragments per kilobase of transcript per million mapped reads (FPKM) values from the RNA-seq data were log2 transformed for expression profiling. Using TBtools version-v1.116 [41], a heatmap was generated to illustrate the expression patterns via hierarchical clustering.
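The log2 transformation step can be sketched as follows. This is a minimal illustration under stated assumptions: the pseudocount of 1 (used so that FPKM values of zero remain defined) is a common convention but is not specified in the text, and the function names and example FPKM values are hypothetical.

```python
import math

# Assumption: a pseudocount of 1 avoids log2(0); the study does not state one.
PSEUDOCOUNT = 1.0


def log2_fpkm(values, pseudocount=PSEUDOCOUNT):
    """Log2-transform a list of FPKM values after adding a pseudocount."""
    return [math.log2(v + pseudocount) for v in values]


def log2_fold_change(treated, control, pseudocount=PSEUDOCOUNT):
    """Log2 ratio of a stress-treated FPKM value to its control value."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


# Hypothetical FPKM values for one gene: control, then one stress treatment.
transformed = log2_fpkm([0.0, 1.0, 3.0])
change = log2_fold_change(treated=7.0, control=1.0)
```

A positive log2 fold change indicates higher expression under stress than in the control, which is the pattern used to flag candidate genes in a heatmap.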
2.13 Homology-based modeling of HanPP2C proteins
The 3D (three-dimensional) homology-based models of sunflower PP2C proteins were predicted using YASARA homology software (version 22.9.24.W.64) [53]. Homology modeling was performed with PSI-BLAST iterations and a PSI-BLAST E-value set to 3 and 0.1, respectively. The alignments per template parameter were set to 5, and the terminal extension and loop number per sample were set to 10 and 50, respectively.
3.0 Results
3.1 Identification of PP2C genes in sunflower genome
This study identified 121 genes encoding PP2C proteins in the sunflower (H. annuus r1.2) genome at Phytozome v13 through BLASTp using Arabidopsis PP2C genes as references. The hidden Markov model (HMM) was used to confirm the presence of the protein phosphatase 2C (PP2C) domain using the SMART and Pfam databases. The identified PP2C genes in the sunflower genome were labeled HanPP2C1 to HanPP2C121 according to their order along the chromosomes. Basic information, such as gene ID, amino acid (aa) length, molecular weight (MW), isoelectric point (pI), instability index, and hydropathicity, was analyzed for the 121 HanPP2C genes (Table 1). Protein length varied from 121 aa residues (HanPP2C12) to 1070 aa residues (HanPP2C109), with most proteins between 300 aa and 400 aa. MW values ranged from 13.24 kDa (HanPP2C12) to 119.34 kDa (HanPP2C70), and pI values ranged from 4.72 (HanPP2C15) to 9.50 (HanPP2C113). Of the 121 HanPP2C proteins, 77 (63.64%) were predicted to be unstable according to the instability index, whereas 44 were predicted to be stable. According to the hydropathicity (GRAVY) results, all HanPP2C proteins except HanPP2C12, HanPP2C22, and HanPP2C84 were hydrophilic.
3.2 Phylogenetic analysis of Arabidopsis and sunflower PP2Cs
We constructed a phylogenetic tree to determine the relationship between 121 HanPP2C and 80 AtPP2C proteins based on multiple sequence alignment (MSA) (Fig 1). This analysis revealed that, with the exception of subgroup K, each subgroup contained PP2C genes from both Arabidopsis and sunflower, and genes of the two species tended to form separate branches within each subgroup. As a result, HanPP2C genes from the same sub-clade tended to cluster together, while AtPP2C genes from the same sub-clade also formed clusters. The 121 HanPP2C genes were categorized into 15 subgroups, while HanPP2C22, HanPP2C37, HanPP2C45, HanPP2C49, HanPP2C76, HanPP2C93, and HanPP2C106 were not clustered with any subgroup (S5 Data). Following previous studies, we used the same nomenclature for the 15 common subgroups (A-L) [56]. In this analysis, the genes of groups A, B, and F were each divided into two subgroups, named A1, A2, B1, B2, F1, and F2. Subgroups A1, A2, B1, B2, C, D, E, F1, F2, G, H, I, J, K, and L contain 8, 9, 7, 1, 7, 20, 14, 8, 5, 15, 8, 4, 6, 0, and 2 HanPP2C genes, respectively. D (20 HanPP2Cs) and G (15 HanPP2Cs) were the largest subgroups, while B2 (1 HanPP2C) and L (2 HanPP2Cs) were the smallest subgroups containing sunflower genes. The distribution of AtPP2C and HanPP2C genes was comparable except for subgroup K, which contains only AtPP2C genes.
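The subgroup counts reported above can be checked for internal consistency with a short tally. The sketch below simply restates the numbers from the text, assuming the counts are listed in the order A1 to L; the variable names are illustrative.

```python
# HanPP2C genes per subgroup, in the order given in the text (A1 ... L).
SUBGROUP_COUNTS = {
    "A1": 8, "A2": 9, "B1": 7, "B2": 1, "C": 7,
    "D": 20, "E": 14, "F1": 8, "F2": 5, "G": 15,
    "H": 8, "I": 4, "J": 6, "K": 0, "L": 2,
}

# Seven genes did not cluster with any subgroup.
UNGROUPED = 7

grouped_total = sum(SUBGROUP_COUNTS.values())
overall_total = grouped_total + UNGROUPED

# Largest subgroup, and smallest subgroup that still contains sunflower genes.
largest = max(SUBGROUP_COUNTS, key=SUBGROUP_COUNTS.get)
smallest_nonempty = min(
    (name for name, n in SUBGROUP_COUNTS.items() if n > 0),
    key=SUBGROUP_COUNTS.get,
)
```

The 114 grouped genes plus the seven ungrouped genes account for all 121 HanPP2C genes.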
Multiple sequence alignments (MSA) were generated using MEGA 11.0 [36] with full-length protein sequences of 121 HanPP2C and 80 AtPP2C members by the Maximum Likelihood (ML) method. Bootstrap values, with 1000 replications for each branch, were calculated. The HanPP2Cs were categorized into 15 subgroups (A1, A2, B1, B2, C, D, E, F1, F2, G, H, I, J, K, and L), each marked by distinct colors. Meanwhile, the AtPP2C genes were marked by a blue triangle, and a red star marked the HanPP2C genes. Additionally, 7 HanPP2Cs were included in the out-group and are represented without color. The corresponding bootstrap values for each branch are displayed within the respective node clusters.
3.3 Gene structure analysis
To understand the gene structure of HanPP2C, we analyzed the exon-intron structural patterns based on phylogenetic relationships, because exon-intron organization is an important evolutionary indicator for gene families. The structural patterns of HanPP2C genes were consistent with the phylogenetic analysis (Fig 2 and S6 Data). In subgroup C, HanPP2C96 contains the longest 5’ and 3’ untranslated regions (UTRs), whereas HanPP2C4, HanPP2C6, HanPP2C8, HanPP2C12, HanPP2C22, HanPP2C23, HanPP2C37, HanPP2C39, HanPP2C40, HanPP2C41, HanPP2C42, HanPP2C43, HanPP2C44, HanPP2C49, HanPP2C56, HanPP2C57, HanPP2C58, HanPP2C60, HanPP2C68, HanPP2C108, HanPP2C110, and HanPP2C119 have no UTR. The number of introns varied from 0 to 19 among the 121 genes; only four genes (HanPP2C49, HanPP2C57, HanPP2C61, and HanPP2C72) have no introns, whereas HanPP2C70 contains 19 introns. In addition, HanPP2C70 in group J has 20 exons, the highest exon number among all groups, and HanPP2C45 has the longest gene region, at 14 kb. Furthermore, HanPP2C49 and HanPP2C57 each consist of a single exon with no intron or UTR, the lowest exon number among all HanPP2C genes. The gene structures revealed that most HanPP2C members of the same group have similar numbers of exons and introns but vary in length. The HanPP2C gene structure was highly similar within each group; however, variations in exon/intron patterns were identified in some genes. This result suggests that the HanPP2C genes were relatively conserved during evolution, preserving the integrity of gene structure, while the observed variations could cause slight changes in function.
Gene structure analyses for HanPP2C genes were carried out using Gene Structure Display Server (GSDS 2.0, http://gsds.cbi.pku.edu.cn/index.php). The lengths of exons and introns for each HanPP2C gene are demonstrated proportionally. Gene families are categorized and colored based on their phylogenetic relationships. For all HanPP2C genes, black lines represent introns, red-bold lines represent exons, and blue-bold lines represent 5’ and 3’ untranslated regions (UTR). The structure of each HanPP2C gene exon/intron is displayed proportionally according to the scale mentioned at the bottom.
3.4 Protein conserved domain analysis of HanPP2C genes
The conserved domain analysis showed that 120 HanPP2C proteins (all except HanPP2C49) contained the typical structural PP2C domain, which in some proteins is accompanied by other domains such as PKc-like superfamily, cNMP binding, FHA superfamily, RT-like superfamily, and NADB Rossmann (Fig 3A). Among these, three HanPP2C proteins (HanPP2C70, HanPP2C76, and HanPP2C109) contained the PKc-like superfamily domain, and HanPP2C109 also exhibited another conserved domain (RT-like superfamily) in addition to the PP2C and PKc-like superfamily domains. Furthermore, HanPP2C22, HanPP2C32, and HanPP2C93 each possessed one additional conserved domain (NADB Rossmann, FHA superfamily, and cNMP binding domain, respectively) alongside the typical PP2C domain. Consequently, it would be interesting to analyze the diverse biological roles of HanPP2C proteins with these distinctive domains.
The distribution of conserved domains and motifs in HanPP2C proteins (A-B). (A) The positions of each conserved domain are shown as differently colored boxes, with the domain names presented inside each box. (B) The identification of conserved motifs in HanPP2C proteins was carried out using the Multiple EM for Motif Elicitation (MEME) (https://meme-suite.org/meme/tools/meme) tool at MEME-suite (https://meme-suite.org/meme/) [42], with a maximum of 20 motifs selected. Each motif is represented by a specific-colored box aligned on the right side of the figure. Different colors indicate individual motifs identified within each protein domain.
3.5 Protein motifs analysis
Conserved motifs in HanPP2C proteins were investigated using MEME tools (Fig 3B; S1 Fig and Table 2). Motif composition followed the phylogeny obtained in this study: the motif compositions within each HanPP2C subgroup were similar but varied among subgroups. Motifs 1, 2, 3, and 7 were present in all subgroups, while some motifs were specific to particular subgroups. For instance, motifs 4, 8, and 18 were absent in subgroups L, C, and D but were present in all other subgroups. Motifs 9, 12, and 14 were found in all subgroups except C and D. In contrast, some motifs were unique to only one or two subgroups. For example, motifs 5 and 6 were exclusively present in subgroups C (excluding HanPP2C110) and D. Motif 10 was identified in subgroups F1 and D, and motif 13 in subgroups L and D. Motifs 19 and 20 were present only in subgroups D and H (except HanPP2C12), respectively. This investigation suggests that specific motifs may govern the distinct functions of genes within different subgroups. Furthermore, HanPP2C genes within the same subgroup exhibited similar motif distributions, strongly indicating close evolutionary relationships among these genes.
3.6 Ka/Ks analysis of HanPP2C gene family
To determine the evolutionary relationships and selection pressures acting on the protein-coding HanPP2C genes, we calculated the Ka (nonsynonymous) and Ks (synonymous) values and the Ka/Ks ratios for 53 homologous pairs (Fig 4 and S7 Data). These values play a pivotal role in the evolutionary analysis of the HanPP2C gene family members. If the value of Ka/Ks is less than 1, the duplicated gene pairs may have evolved through purifying selection, also known as negative selection. A Ka/Ks value equal to 1 indicates neutral selection, while a Ka/Ks value greater than 1 indicates positive selection. This analysis revealed that the Ka/Ks ratios for the 53 duplicated pairs of HanPP2C genes ranged from 0.07 (HanPP2C74-HanPP2C114 pair) to 0.96 (HanPP2C111-HanPP2C29 pair). All ratios were less than 1, indicating that these pairs evolved under purifying selection. The divergence time (T = Ks/2λ) of the 53 pairs of duplicated HanPP2C genes was also estimated, using a clock-like rate (λ) of 6.5×10−9 mutations per synonymous site per year. The estimated divergence times ranged from 1.29 MYA (Million Years Ago) for the pair HanPP2C44-HanPP2C43 to 47.10 MYA for the pair HanPP2C36-HanPP2C113. This finding helps to elucidate the extensive evolutionary history of the HanPP2C gene family.
Gene duplication analyses were conducted using TBtools software version-v1.116 [41]. Ka represents the number of nonsynonymous substitutions per nonsynonymous site, while Ks represents the number of synonymous substitutions per synonymous site. The ratio of nonsynonymous (Ka) to synonymous (Ks) changes is represented by Ka/Ks.
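The interpretation rule for Ka/Ks described above can be expressed as a small classifier. This is a hedged sketch: the function name, the numerical tolerance used to treat a ratio as equal to 1, and the example Ka and Ks values are assumptions made for illustration and are not taken from the study.

```python
def classify_selection(ka: float, ks: float, tolerance: float = 1e-6) -> str:
    """Classify the selection mode of a duplicated gene pair from Ka and Ks."""
    if ks <= 0:
        raise ValueError("Ks must be positive to compute Ka/Ks")
    ratio = ka / ks
    # A ratio of exactly 1 is rare in practice, so a small tolerance is used.
    if abs(ratio - 1.0) <= tolerance:
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"


# Hypothetical Ka and Ks values giving ratios at the extremes reported above.
low_ratio_pair = classify_selection(ka=0.07, ks=1.0)    # Ka/Ks = 0.07
high_ratio_pair = classify_selection(ka=0.48, ks=0.5)   # Ka/Ks = 0.96
```

Both example pairs fall below 1 and are therefore classified as evolving under purifying selection, matching the interpretation given for the reported range of 0.07 to 0.96.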
3.7 Collinearity and synteny analysis of the PP2C gene family of sunflower
Collinearity and synteny analyses were performed to examine duplication and evolutionary relationships within the HanPP2C gene family. Collinearity, a specific type of synteny, requires the conservation of gene order. The study revealed that 53 gene pairs within the HanPP2C gene family showed a collinear relationship, with chromosome 9 harboring the highest number of collinear genes (14 pairs), while chromosomes 12 and 14 contained the lowest number, only two pairs each (Fig 5A). Notably, in 40 of the 53 pairs, both genes belonged to the same subgroup, indicating strong homology within these pairs. In the remaining 13 pairs, the two genes were distributed across different subgroups. Within the collinear pairs HanPP2C20-HanPP2C45 and HanPP2C53-HanPP2C106, the genes HanPP2C45 and HanPP2C106 are located in the outgroup. Furthermore, the conserved motifs and gene structures of genes within the same subfamily were nearly identical. Additionally, synteny analysis identified ten pairs of homologous PP2C genes between Arabidopsis and sunflower (Fig 5B). The highest number of sunflower homologs (three pairs) was located on chromosome 13, while the lowest number (one pair each) was found on chromosomes 7, 10, and 17. These results strongly indicate a high degree of homology between AtPP2C and HanPP2C genes.
The collinearity analysis of the PP2C gene family in Sunflower and the synteny analysis of PP2Cs between sunflower and Arabidopsis (A and B). (A) Various colored rectangles represent chromosomes 1–17. Different colored lines represent collinear blocks within the sunflower genome, while the colored lines linked between chromosomes represent segmental and tandem duplicated gene pairs, respectively. (B) The aqua-colored blocks represent the collinear blocks in sunflower, while light green-colored blocks represent the synteny blocks in Arabidopsis. Different colored lines represent the syntenic gene pairs of PP2Cs. The aqua rectangles represent sunflower chromosomes (1–17), and the light green rectangles represent Arabidopsis chromosomes (1–5), respectively.
3.8 Chromosomal distribution and gene duplication events for HanPP2C genes
To visualize the genomic organization of HanPP2C genes, we determined their locations on chromosomes based on information retrieved from the Phytozome v13 genome database. This analysis revealed that the 121 HanPP2C genes were unevenly distributed across 17 chromosomes, forming clusters of two or more genes on the same chromosome (S2 Fig). The largest number of PP2C genes was located on chromosome 09 (16 genes), followed by chromosomes 10 and 13 (11 genes each), and then chromosome 07 (9 genes). Chromosomes 01, 03, 05, 08, and 16 each contained eight genes, while the fewest HanPP2C genes were found on chromosomes 12 and 14 (only two genes each). Notably, HanPP2C42 and HanPP2C44 were found to encode similar amino acid sequences but were located at different positions on chromosome 07. Generally, no correlation was observed between the number of HanPP2C genes on a chromosome and its length. Furthermore, most genes on the same chromosome did not share a common subclade in the phylogenetic tree, indicating that different HanPP2C genes on a single chromosome can encode proteins with distinct functions. In plant genomes, gene families expand through two major duplication patterns: segmental and tandem duplication. Duplicated gene pairs located within 200 kb of each other on the same chromosome were classified as tandem duplications; all other pairs were classified as segmental duplications. In this investigation, out of 53 pairs of HanPP2C genes, 48 gene pairs were identified as segmental duplications, denoted by light blue arrows, and five pairs as tandem duplications, indicated by light orange lines.
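The 200 kb rule used to separate tandem from segmental duplications can be sketched as follows. This is an illustrative example only: the gene names and coordinates are hypothetical, and measuring the distance as the gap between the two gene bodies is an assumption, since the text does not specify how the 200 kb distance was measured.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    """Minimal gene record; names and coordinates below are hypothetical."""
    name: str
    chrom: str
    start: int
    end: int


# Threshold from the text: pairs within 200 kb on one chromosome are tandem.
TANDEM_MAX_DISTANCE = 200_000


def classify_duplication(a: Gene, b: Gene,
                         max_distance: int = TANDEM_MAX_DISTANCE) -> str:
    """Label a duplicated gene pair as 'tandem' or 'segmental'."""
    if a.chrom != b.chrom:
        return "segmental"
    # Assumption: distance is the gap between the two gene bodies
    # (negative or zero if the genes overlap or are adjacent).
    gap = max(a.start, b.start) - min(a.end, b.end)
    return "tandem" if gap <= max_distance else "segmental"


gene_a = Gene("geneA", "Chr07", 1_000_000, 1_005_000)
gene_b = Gene("geneB", "Chr07", 1_100_000, 1_104_000)   # 95 kb from geneA
gene_c = Gene("geneC", "Chr07", 5_000_000, 5_003_000)   # far from geneA
gene_d = Gene("geneD", "Chr09", 1_050_000, 1_055_000)   # other chromosome

near_pair = classify_duplication(gene_a, gene_b)
far_pair = classify_duplication(gene_a, gene_c)
cross_pair = classify_duplication(gene_a, gene_d)
```

Applied to the 53 collinear pairs with their real coordinates, a rule of this kind would yield the split into tandem and segmental pairs reported above.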
3.9 Prediction of subcellular localization of PP2C family members of sunflower
Proteins are distributed throughout the cell, and their localization plays a vital role in various physiological processes. Predicting the subcellular localization of HanPP2C proteins therefore facilitates the analysis of gene function. The analysis predicted HanPP2C protein signals in various cellular compartments, including the nucleus, chloroplast, cytoplasm, mitochondria, cytoskeleton, peroxisome, Golgi apparatus, vacuole, endoplasmic reticulum, plasma membrane, and extracellular space (Fig 6A). Among these, the largest numbers of members were predicted in the cytoplasm (94 proteins, 77.69%), followed by the chloroplast (93, 76.86%) and the nucleus (92, 76.03%). Conversely, the fewest were predicted in the Golgi apparatus (13, 10.74%) and the peroxisome (14, 11.57%) (Fig 6B). The prediction indicated that most HanPP2C proteins may be located in the chloroplast, nucleus, cytoplasm, and mitochondria, while some might be found in the extracellular space. These results suggest that HanPP2C proteins have compartment-specific localization and may function in different cellular microenvironments.
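The percentages reported above follow directly from the counts and the family size of 121 proteins. The short Python sketch below reproduces that arithmetic; the dictionary layout and the `percentage` helper are illustrative choices, and only the five compartments whose counts are quoted in the text are included.

```python
TOTAL_PROTEINS = 121  # size of the HanPP2C family reported in the text

# Number of HanPP2C proteins with a predicted signal in each compartment,
# as quoted in the paragraph above.
predicted_counts = {
    "cytoplasm": 94,
    "chloroplast": 93,
    "nucleus": 92,
    "peroxisome": 14,
    "Golgi apparatus": 13,
}


def percentage(count: int, total: int = TOTAL_PROTEINS) -> float:
    """Share of the family with a signal in a compartment, in percent."""
    return round(100 * count / total, 2)


percentages = {name: percentage(n) for name, n in predicted_counts.items()}

for name, value in percentages.items():
    print(f"{name}: {value}%")
```

Because the three largest counts already add up to more than 121, a single protein evidently can carry predicted signals in several compartments, so the percentages are not expected to sum to 100.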
(A) A heatmap represents the sub-cellular localization analysis of sunflower PP2C proteins. The names of each HanPP2C protein are shown on the left side of the heatmap, while the terms of the corresponding cellular organelles are shown at the bottom of the heatmap. The intensity of color on the right side of the heatmap indicates the presence of protein signals corresponding to the genes. The cellular organelles include nuclear, mitochondrial, cytoplasmic, chloroplast, cytoskeletal, peroxisomal, Golgi, vacuole, endoplasmic reticulum (E.R.), plasma membrane (P.M.), and extracellular locations. (B) The percentage distribution of sunflower PP2C protein signals across various cellular organelles is represented by a bar diagram. The percentages of protein signals appearing in different cellular organelles are shown on the left side of the diagram. These organelles include nuclear, mitochondrial, cytoplasmic, chloroplast, cytoskeletal, peroxisomal, Golgi, vacuole, endoplasmic reticulum (E.R.), plasma membrane (P.M.), and extracellular locations.
3.10 Cis-acting element analysis in the promoter regions
Cis-acting elements in the promoter region play a crucial role in achieving cell-specific, temporal, and spatial control over gene expression. This implies that promoters of genes with similar expression patterns also contain comparable regulatory elements. The interaction between transcription factors and the cis-acting elements in the promoter region regulates gene transcription levels. In our analysis, we screened the 2000 bp region upstream of the 5′-UTR of HanPP2C genes against the PlantCARE database and identified 63 cis-regulatory elements, including elements associated with light responsiveness, tissue specificity, phytohormone responsiveness, and stress responsiveness (Fig 7 and S8 and S9 Data). Of these 63 cis-acting elements, 31 were associated with light responsiveness, 16 with tissue-specific expression, 11 with phytohormone responsiveness, and 5 with stress responsiveness.
The names of each HanPP2C gene are shown on the left side of the heatmap. The number of putative cis-acting elements for each HanPP2C gene is displayed on the right side of the heatmap and is represented by five different colors (black = 0, green = 1–6, yellow = 7–12, purple = 13–18, and red = 19–24). Functions associated with cis-acting elements of the corresponding genes, such as light responsiveness, tissue-specific expression, phytohormone responsiveness, and stress responsiveness, are shown at the bottom of the heatmap and denoted by bold lines in red, yellow, green, and blue, respectively.
Light-responsive elements such as MRE, Box-4, and G-box were abundant in the HanPP2C promoter regions. Among them, the MRE motif was most abundant in HanPP2C20, followed by HanPP2C29 and HanPP2C90, which contained 22, 9, and 9 MRE elements, respectively, while Box-4 and G-box were present in high numbers in most HanPP2C genes. The tissue-specific ARE element was abundant in HanPP2C84 and HanPP2C101, which contained 9 and 13 ARE elements, respectively. Furthermore, ABRE was the most abundant phytohormone-responsive cis-element in the HanPP2C promoter regions. Among all HanPP2Cs, HanPP2C64 and HanPP2C94 contained the largest number of ABREs (13), followed by HanPP2C10 (10), HanPP2C63 (10), and HanPP2C92 (10). Among stress-responsive cis-elements, LTR was most abundant in HanPP2C48, HanPP2C101, and HanPP2C104, which contained 5, 8, and 6 LTR elements, respectively. In summary, cis-regulatory elements associated with light responsiveness, tissue specificity, phytohormone responsiveness, and stress responsiveness were abundant in the promoter regions of HanPP2C genes, with the number of copies of a given element per gene ranging from 0 to 22. Therefore, analyzing the cis-elements of PP2C genes in sunflower can help to identify the functions of HanPP2C genes.
3.11 Putative microRNA target site analysis
The Plant miRNA Encyclopedia (http://pmiren.com/) database was used to retrieve microRNA sequences targeting HanPP2C genes [48]. Seventy-one miRNAs were identified, targeting 86 of the 121 HanPP2C genes (S10 Data). None of these miRNAs were found to target the remaining 35 HanPP2C genes. The retrieved miRNAs ranged from 20 to 22 nucleotides in length, and the number of miRNAs targeting each HanPP2C gene varied from 1 to 13. Based on our observations, Han-miR172 (27), Han-miR156 (19), Han-miR167 (19), and Han-miR170 (14) were identified as highly abundant miRNAs, comprising 10 (a-j), 13 (a-m), 5 (a-e), and 7 (a-c, e-h) members, respectively (Table 3). Han-miR172 was found to target six HanPP2C genes (HanPP2C22, HanPP2C38, HanPP2C64, HanPP2C67, HanPP2C71, and HanPP2C118), with HanPP2C22 targeted by nine Han-miR172 members. In contrast, Han-miR156 targeted four HanPP2C genes (HanPP2C14, HanPP2C70, HanPP2C100, and HanPP2C114), with HanPP2C14 being the most heavily targeted gene, targeted by seven Han-miR156 members. Moreover, Han-miR167 targeted four HanPP2C genes (HanPP2C15, HanPP2C61, HanPP2C72, and HanPP2C84), while seven HanPP2C genes (HanPP2C10, HanPP2C19, HanPP2C32, HanPP2C67, HanPP2C78, HanPP2C95, and HanPP2C111) were found to be targeted by Han-miR170. These findings can be useful for understanding gene regulation and the responsiveness of the sunflower HanPP2C gene family to various stresses.
3.12 Transcriptomic expression pattern analysis in different tissues and stresses
Expression profiles of the 121 HanPP2C genes at various developmental phases were investigated across five conditions (control plus four stresses) and two tissue types (leaf and root), with each stress compared to the control, using previously generated RNA-seq data (Fig 8) [50]. The expression patterns of HanPP2C genes are illustrated as a heatmap showing expression levels across tissues and stresses. Among the identified HanPP2C genes, 47 were differentially expressed genes (DEGs) in leaf tissue, while 49 were DEGs in root tissue under various stress conditions compared to the control. Additionally, 23 HanPP2C genes in leaf tissue and 22 HanPP2C genes in root tissue were not expressed in any treatment, suggesting potential functions in other developmental stages.
The clustering of HanPP2C genes in leaf (CL = Control, DDL = Dry Down, NL = Low nutrient, PL = PEG, and SL = Salt) and root tissue (CR = Control, DDR = Dry Down, NR = Low nutrient, PR = PEG, and SR = Salt) is based on their expression profiles under different abiotic stress treatments [50]. The FPKM values were log2-transformed and compared with the control. The expression values were clustered and visualized using TBtools v1.116 [41]. The color gradient from low to high expression (white to red) is shown on the right side of the heatmap.
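The log2 transformation and comparison with the control mentioned in the legend can be illustrated with a short Python sketch. It is not the TBtools procedure itself: the pseudocount of 1 (added so that genes with an FPKM of zero remain defined) and the helper names `log2_fpkm` and `log2_fold_change` are assumptions made for this example, and the FPKM values are hypothetical.

```python
import math

PSEUDOCOUNT = 1.0  # assumption: avoids log2(0) for unexpressed genes


def log2_fpkm(fpkm: float, pseudocount: float = PSEUDOCOUNT) -> float:
    """Log2-transform an FPKM value after adding a pseudocount."""
    return math.log2(fpkm + pseudocount)


def log2_fold_change(treated: float, control: float,
                     pseudocount: float = PSEUDOCOUNT) -> float:
    """Difference of log2-transformed FPKM values (treated minus control)."""
    return log2_fpkm(treated, pseudocount) - log2_fpkm(control, pseudocount)


# Hypothetical FPKM values for one gene in one tissue.
control_fpkm = 1.0
salt_fpkm = 7.0
print(log2_fold_change(salt_fpkm, control_fpkm))
```

A positive value indicates higher expression under the treatment than in the control and a negative value indicates lower expression; the size of the pseudocount mainly affects genes with very low FPKM.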
The largest number of DEGs in leaf tissue (32) was observed in the nutrient stress treatment. HanPP2C12, HanPP2C16, HanPP2C80, HanPP2C88, and HanPP2C98 exhibited high expression levels, while HanPP2C19, HanPP2C29, HanPP2C59, HanPP2C63, HanPP2C69, HanPP2C73, HanPP2C90, and HanPP2C103 were moderately expressed. In contrast, HanPP2C14, HanPP2C25, HanPP2C26, HanPP2C32, HanPP2C36, HanPP2C38, HanPP2C46, HanPP2C47, HanPP2C50, HanPP2C53, HanPP2C54, HanPP2C61, HanPP2C62, HanPP2C67, HanPP2C73, HanPP2C83, HanPP2C106, HanPP2C107, and HanPP2C113 showed low expression levels based on the heatmap. HanPP2C12, HanPP2C53, HanPP2C63, and HanPP2C80 were highly expressed in the PEG stress treatment, while HanPP2C53, HanPP2C63, and HanPP2C80 exhibited high expression levels in the salt treatment. Additionally, HanPP2C64, HanPP2C103, and HanPP2C88 were moderately expressed in the dry-down treatment.
In root tissue, the highest number of DEGs (25) was observed in the PEG stress treatment. HanPP2C7, HanPP2C48, HanPP2C53, HanPP2C63, HanPP2C77, HanPP2C80, and HanPP2C106 exhibited higher expression levels in this treatment. Under the dry-down, nutrient stress, and salt treatments, 17, 17, and 13 DEGs exhibited varying expression levels, respectively. Five genes (HanPP2C2, HanPP2C47, HanPP2C50, HanPP2C72, and HanPP2C74) showed the highest expression levels in the dry-down treatment. In the nutrient stress treatment, HanPP2C18, HanPP2C12, HanPP2C21, and HanPP2C73 were highly expressed, while under salt treatment, HanPP2C2, HanPP2C53, HanPP2C63, and HanPP2C80 exhibited the highest expression levels. Among the 121 HanPP2C genes, nine genes (HanPP2C12, HanPP2C36, HanPP2C38, HanPP2C47, HanPP2C48, HanPP2C53, HanPP2C54, HanPP2C59, and HanPP2C73) in leaf tissue and five genes (HanPP2C13, HanPP2C47, HanPP2C48, HanPP2C54, and HanPP2C95) in root tissue showed higher expression under all four stress treatments compared to the control. The differential expression levels of HanPP2C genes suggest distinct roles and functions in response to various treatments in leaf and root tissues.
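Selecting genes that respond under every stress treatment amounts to intersecting the per-treatment gene sets. The Python sketch below shows the idea under the assumption that a set of up-regulated genes is already available for each treatment; the helper name `shared_across_conditions`, the treatment labels, and the gene identifiers are all hypothetical placeholders rather than the study's data.

```python
def shared_across_conditions(up_by_condition: dict[str, set[str]]) -> set[str]:
    """Return the genes present in the up-regulated set of every condition."""
    gene_sets = list(up_by_condition.values())
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


# Hypothetical up-regulated gene sets for one tissue (placeholder names).
up_in_tissue = {
    "dry_down": {"geneA", "geneB", "geneC"},
    "low_nutrient": {"geneA", "geneC", "geneD"},
    "peg": {"geneA", "geneC", "geneE"},
    "salt": {"geneA", "geneB", "geneC", "geneE"},
}

candidates = shared_across_conditions(up_in_tissue)
print(sorted(candidates))
```

Running the same helper separately on the leaf and root sets and then intersecting the two results would give genes that respond to all four stresses in both tissues.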
3.13 Homology-based modeling of HanPP2C proteins
Homology-based 3D modeling of protein structures is widely used because it is reliable, sensitive, time-saving, cost-effective, and rapid compared with NMR or X-ray diffraction analyses. This alternative has emerged from advances in in silico analysis tools [67–69]. Based on the transcriptomic expression analysis under different stress conditions, six HanPP2C proteins (HanPP2C47, HanPP2C48, HanPP2C53, HanPP2C54, HanPP2C59, and HanPP2C73) were selected to construct 3D homology models (Fig 9). Modeling templates were selected based on higher total Z-scores. In this analysis, the 3D structures of the six candidate HanPP2C proteins were predicted, providing acceptable models.
Three-dimensional homology-based models of selected sunflower PP2C proteins with their predicted co-factor visualized using the YASARA homology software (version 22.9.24.W.64) [53]. The panels represent A. HanPP2C47 with predicted co-factor (pea green spheres), B. HanPP2C48 with predicted co-factor (purple spheres), C. HanPP2C53 with predicted co-factor (pea green spheres), D. HanPP2C54 with predicted co-factor (purple spheres), E. HanPP2C59 with predicted co-factor (purple spheres), F. HanPP2C73 with predicted co-factor (purple spheres).
4.0 Discussion
Following the completion of sunflower (H. annuus) whole-genome sequencing, several important gene families have been identified at the genome level, including the PM H+-ATPase gene family [70], WRKY [71], MAPK [72], OSCA [73], WSD [74], VQ [75], trihelix transcription factor [76], NBS-LRR [77], and NAC-TF [78]. In our analysis, we identified a total of 121 PP2C genes in sunflower, more than in Arabidopsis (80) [10], cucumber (56) [56], rice (78) [10], and maize (97) [79]. This observation suggests that the HanPP2C gene family is larger than those of the plant species compared here. Notably, the expansion of PP2C genes varies among species and may be relevant to their adaptation to stressful environmental conditions.
According to phylogenetic analysis, members of the PP2C protein family have been classified into 13 subgroups in several plant species, such as A. thaliana [10], rice [10], cucumber [56], and wheat [14]. Moreover, in woodland strawberries and pineapple strawberries, PP2C proteins were clustered into 11 subgroups [80]. In contrast, HanPP2C proteins were clustered into 15 subgroups, which differs from the above species. Additionally, FaPP2C and FvPP2C in woodland and pineapple strawberries were identified in 10 and 9 of the 11 subgroups, respectively [80]. Similar findings were observed in this study, where HanPP2C proteins were present in 14 of the 15 subgroups, the exception being subgroup K. The presence of HanPP2C and AtPP2C genes within each subfamily, closely linked to genes of the same species, implies the existence of an ancestral set of genes defining each subfamily before the monocot-eudicot separation. Furthermore, the phylogenetic structure of the PP2C family aligns well with the concept of birth-and-death evolution within the flowering plant lineage [81, 82]. Branches containing more than one HanPP2C or AtPP2C gene likely resulted from gene duplication, whereas branches containing only HanPP2C or only AtPP2C genes probably experienced gene loss. Similar birth-and-death evolution patterns are observed in other gene families, such as the MADS-box family involved in flower development [83]. The phylogenetic tree results indicate that the closer the grouping, the higher the likelihood of having similar functions. Our findings suggest a consistent paralogous sequence for HanPP2C gene divergences through gene duplication.
We analyzed the gene structure of the HanPP2C family, a crucial indicator of the ancestral relationships among all members of the targeted gene family [84]. Gene structure can be valuable for exploring evolutionary links among organisms or genes [85]. According to our findings, HanPP2C genes within the same group exhibited similar exon-intron structures, with 1–20 exons, although some variation in the exon-intron distribution pattern was observed, which can be attributed to various factors. A similar number of exons was previously identified in the PP2C gene family of tomato [31]. The gene structure patterns of HanPP2C genes were similar to those of the PP2C genes of Arabidopsis [10], rice [10], cucumber [56], and woodland and pineapple strawberries [80], indicating the conservation of these gene structures throughout evolution [86]. However, our results showed considerable differences in exon-intron number from the above species. The intron numbers of PP2C genes in Arabidopsis, rice, woodland strawberries, and pineapple strawberries ranged from 0–12, 0–18, 0–14, and 0–34, respectively [10, 80]. Our findings align with previous investigations revealing the presence of intron-less genes in the PP2C family. The individual genes in the sunflower PP2C family showed structural differences, and the variable exon-intron structure contributes to the diversity of gene functions.
We conducted domain analysis to detect the types and numbers of conserved domains within the HanPP2C gene family. Among the identified domains, the PKc domain has also been reported in barley PP2C genes, along with the CAP_ED, PKc_like, MSCRAMM_ClfB, NB-ARC, PLN03200, Arm, LRR, and Rx_N domains [87]. The PKc domain contains phosphorylation binding sites, is highly conserved, and acts as a converter of external signals into secondary signals within plant cells [88]. Under salt stress conditions, the PKc domain of receptor-like kinase (RLK) signaling pathways may initiate signal transduction by phosphorylating target proteins or kinases [89]. Further, the FHA (forkhead-associated) domain was found in Brachypodium distachyon PP2C genes, along with S-TKc (Ser/Thr kinase catalytic domain) and CNB (cyclic nucleotide-binding domain) [90]. The FHA domain is a phosphothreonine recognition module in various signaling proteins, including Arabidopsis kinase-associated protein phosphatase (KAPP). KAPP’s kinase-interacting FHA (KI-FHA) domain functions as a negative regulator in several RLK signaling pathways involved in plant growth, development, and responses to environmental stresses [91].
Furthermore, the diverse distribution of motifs in protein sequences can serve as a potential indicator of the divergence of gene functions within different subgroups. Proteins in the same subgroup demonstrate similar motif distributions, suggesting a close evolutionary relationship. However, a few genes lacked specific motifs, and these differences in motif composition may cause functional diversity. Previous studies on motif distribution in Arabidopsis [10], rice [10], and cucumber [56] PP2C genes demonstrated that motifs were specific to one or a few subgroups, which supports our findings. Notably, the number of conserved motifs identified in HanPP2C proteins (20 motifs) is higher than in cucumber (10 motifs) [56], tomato (10 motifs) [31], peanut (10 motifs) [92], Arabidopsis (11 motifs) [10], and Medicago truncatula (15 motifs) [9].
Evaluating selective pressure offers valuable guidance for identifying amino acid sequences within a protein. It is also necessary for analyzing functional residues and structural protein shifts [93]. This study revealed that all 53 pairs of duplicated HanPP2C genes predominantly evolved through purifying selection. Previous investigations on tomato [31] and peanut [92] PP2C genes showed that all duplicated SlPP2C and AhPP2C genes have evolved under purifying selection. However, both purifying and positive selection were identified in the duplicated PP2C genes of woodland strawberries and pineapple strawberries [80]. Furthermore, Ks values were analyzed to estimate the divergence period; the Ks values of these paralogous genes ranged from 0.02 to 0.63, with an average duplication time of 21.72 MYA. These results suggest that, on average, the sunflower PP2C paralogs diverged earlier than those of Arabidopsis (16.10 MYA) [11] but more recently than those of tomato (31.59 MYA) [31].
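The selection and dating analysis described above rests on two standard quantities: the Ka/Ks ratio (values below 1 indicate purifying selection, values near 1 neutral evolution, and values above 1 positive selection) and a divergence time conventionally estimated as T = Ks / (2λ) × 10^-6 MYA. The Python sketch below illustrates both under stated assumptions: the substitution rate λ = 1.5 × 10^-8 substitutions per synonymous site per year is a value often used for dicots and is not stated in this section, and the function names and example Ka and Ks values are hypothetical.

```python
ASSUMED_LAMBDA = 1.5e-8  # assumption: substitutions per synonymous site per year


def classify_selection(ka: float, ks: float) -> str:
    """Classify selection pressure on a duplicated pair from its Ka/Ks ratio."""
    if ks <= 0:
        raise ValueError("Ks must be positive to compute Ka/Ks")
    ratio = ka / ks
    if ratio < 1:
        return "purifying"
    if ratio > 1:
        return "positive"
    return "neutral"


def divergence_time_mya(ks: float, rate: float = ASSUMED_LAMBDA) -> float:
    """Estimate divergence time in million years ago as Ks / (2 * rate) * 1e-6."""
    if ks < 0 or rate <= 0:
        raise ValueError("Ks must be non-negative and rate must be positive")
    return ks / (2 * rate) * 1e-6


# Hypothetical Ka and Ks values for one paralogous pair.
example_ka, example_ks = 0.05, 0.30
print(classify_selection(example_ka, example_ks))
print(round(divergence_time_mya(example_ks), 2))
```

Because the estimated time scales inversely with λ, a different rate assumption would shift every duplication date proportionally without changing the Ka/Ks classification.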
The collinearity analysis revealed the presence of 53 homologous HanPP2C gene pairs and suggested that the abundance of PP2C genes in sunflower can be attributed to whole-genome duplication. Collinearity and chromosomal distribution indicated a significant role for segmental duplication in the abundance of the HanPP2C gene family, which is similar to previous findings in rice [10], Arabidopsis [10], and cucumber [56]. It is noteworthy that the number of collinear gene pairs in the HanPP2C family is higher than in the PP2C families of tomato (17 pairs) [31], Arabidopsis (9 pairs), and rice (3 pairs) [31]. Moreover, we identified syntenic relationships for 10 PP2C gene pairs between sunflower and Arabidopsis. The results of the synteny analysis can be used to highlight the functional and evolutionary relationship between sunflower and Arabidopsis. By comparison, 12, 6, and 5 PP2C gene pairs were previously reported as syntenic pairs between tomato and Arabidopsis, tomato and rice, and Arabidopsis and rice, respectively [31]. Our study revealed a higher degree of homology between the HanPP2C and AtPP2C gene families, consistent with previous findings between Arabidopsis and cucumber [56], which showed 59 syntenic gene pairs between 48 AtPP2Cs and 41 CsPP2Cs, further supporting the notion of greater homology between AtPP2C and CsPP2C genes.
According to chromosomal localization, HanPP2C genes were unevenly scattered across 17 chromosomes, while the CsPP2C, MtPP2C, OsPP2C, and AhPP2C genes of cucumber [56], Medicago truncatula [9], rice [10], and peanut [92] were distributed across 7, 8, 12, and 20 chromosomes, respectively [9, 56]. Chromosomal localization of genes indicates gene duplication, a significant driving force in biological evolution, and it can lead to HanPP2C variation [94]. The HanPP2C genes expanded through both segmental and tandem duplication, with segmental duplication predominating over tandem duplication. These results align with a previous investigation of the PP2C genes of rice, which reported a total of 12 segmental and 4 tandem duplicated gene clusters [10]. However, no tandem duplications have been reported in the PP2C gene families of Arachis hypogaea [92], Brachypodium distachyon [90], Medicago truncatula [9], and Cucumis sativus L. [56]. Gene duplication plays a primary role in gene expansion, and the increasing number of HanPP2C genes in higher plants may be attributed to domain duplication throughout eukaryotic plant evolution [95]. In the gene replication process, segmental duplication is more favorable for maintaining gene function than tandem repeats [96].
The prediction of subcellular localization helps investigate the roles of gene families more precisely and conveniently. HanPP2C proteins were predominantly predicted in the cytoplasm, chloroplast, and nucleus. The PP2C proteins of barley [87] and tomato [31] were primarily localized in the chloroplast, cytosol, and nucleus. In cucumber [56], Brachypodium distachyon [90], Medicago truncatula [9], and woodland and pineapple strawberries [80], PP2C proteins were also highly present in the nucleus, chloroplast, and cytoplasm, consistent with our findings. The subcellular localization prediction suggests that proteins located in the cytoplasm are mainly involved in cytoskeleton formation, acting as actin regulators. The differential localization of proteins within distinct subcellular compartments indicates significant functional differences [97, 98]. Understanding protein subcellular localization provides crucial information for determining protein biological activity. Our analysis suggests that HanPP2C genes are likely involved in respiration, photosynthesis, cellular growth, and development processes, as their proteins were predominantly localized in the cytoplasm, chloroplast, and nucleus.
Cis-regulatory elements within the upstream promoter regions of genes play vital roles in plant stress responses. For example, ABA-responsive elements (ABREs) respond to ABA, dryness, or salt signals [99], while LTR is an essential element for low-temperature regulation and stress responsiveness [100]. The analysis of cis-regulatory elements in the 121 HanPP2C genes revealed the presence of one or more ARE, ABRE, G-Box, MRE, GTI, LTR, DRE, and other cis-acting elements, indicating a significant link between HanPP2C genes and plant stress responses. Among these, light-responsive elements comprised a significant portion of the elements in HanPP2C promoters, suggesting photo-regulated activity of PP2C genes in sunflower. Light-responsive elements may play a crucial role in the photosynthetic mechanism of sunflower, potentially enhancing productivity and grain quality [101]. Furthermore, various plant biological processes are associated with the predicted tissue-specific motifs, such as ARE, CAT-box, AT-rich elements, MBS 1-motif, MSA-like elements, HD-Zip 1, and HD-Zip 3. Plant hormones, also known as growth regulators, have regulatory functions in growth, development, metabolism, and seed germination [102, 103]. In this study, TGA-elements, ABRE, TCA-elements, TATC-box, and GC-motif were predicted as phytohormone-responsive elements, responsible for various biological functions involved in the growth and development of sunflower. Moreover, LTR, DRE, MBS, TC-rich repeats, and WUN motifs control the expression of genes associated with stress responsiveness, enabling plants to adapt to harsh conditions [104, 105]. The type 2C protein phosphatase HIGHLY ABA-INDUCED PP2C1 (HAI1) interacted with FL7 (FORKED-LIKE7), where HAI1 was suppressed. FL7 enhanced the plant immunity response through the phosphorylation of MPK3 (MITOGEN-ACTIVATED PROTEIN KINASE 3) and MPK6 [106].
Our results are consistent with previous findings that ABREs (involved in ABA responsiveness) are abundant in the promoter regions of PP2C genes in Arabidopsis and rice [10]. In woodland strawberries, most PP2C genes contain ABA-responsive elements [80]. Moreover, ARE, LTR, MBS, TC-rich repeats, ABRE, CGTCA-motif, TATC-motif, TCA-element, and TGA-element were also identified as abundant cis-regulatory elements in Medicago truncatula, consistent with our findings [9].
MicroRNAs play a crucial role as plant regulators, regulating various biological processes, including plant growth and development [107]. Numerous miRNAs have recently been found in various species, such as soybean (Glycine max) [108], Brassica napus [109, 110], Arachis hypogaea [111], and maize (Zea mays) [112], where they are involved in various developmental and biological processes as well as stress response mechanisms. The number of miRNAs targeting HanPP2C genes (71) is considerably higher than the number targeting AhPP2C genes in peanut (14), with five miRNAs in common (miR156, miR159, miR167, miR408, and miR1516) [92]. Among these, miR172, which was highly abundant for the HanPP2C family, has various functions, such as controlling flowering time, transitioning between different plant growth stages, and shifting from vegetative to reproductive stages [57, 58]. In Arabidopsis, miR172 plays a vital role in controlling stem cell fate, flowering time, and responses to photoperiod changes [59–61]. Another miRNA, miR156, shows versatile functions in different plant developmental stages and is an essential integrator of responses to multiple stresses [62, 63]. Previous research has examined the functions of miR156 in response to drought, salt, and cold stresses through microRNA sequencing [113]. In Triticum aestivum, miR156 increased plant susceptibility to heat stress [114]. Similarly, miR167, another abundant microRNA, primarily regulates plant reproduction [64]. In Oryza sativa, miR167 plays a crucial role in gene expression, auxin response, and overall plant growth and development. miR170 has been shown to be induced in response to drought stress in plants [65]. The miR170/SCL (transcription factor) node is involved in the gibberellin signaling pathway, promoting cell elongation during root developmental stages [66].
This suggests the potential involvement of HanPP2C genes in promoting vegetative growth while limiting reproductive and floral development and mediating different abiotic stress responses.
According to previously generated RNA-seq data, we observed significant differences in gene expression patterns in both root and leaf tissues in response to different stress conditions [50]. Furthermore, several HanPP2C genes demonstrated a consistent expression pattern across all four stress treatments, comparable with previous findings in cucumber [56]. Notably, nutrient stress induced the highest number of differentially expressed genes in leaf tissue, suggesting a significant impact on leaf mass fraction (LMF). Conversely, no phenotypic divergence was observed between control and PEG-stressed samples in root tissue, although PEG stress resulted in the highest number of differentially expressed genes in root tissue. However, many genes were either silenced or showed lower expression levels. In this study, drought stress in leaf tissues and salinity stress in root tissues suppressed the expression of most HanPP2C genes, indicating their potential role as negative regulators. In wheat, TaPP2C59 was also suppressed by salinity stress and ABA signaling, suggesting it may act as a negative regulator in abiotic stress and ABA signaling [115]. Additionally, in maize, the ZmPP2C genes act as negative regulators of drought and salinity stress [116]. However, HanPP2C genes that were highly expressed under all four treatments may act as positive regulators. Previous studies in Arabidopsis identified several AtPP2C genes as positive regulators of salt tolerance and drought stress in peaches [117, 118]. Under salt stress, Na+ is transported out of the cell through the activity of a protein phosphatase. PP2C activates SOS1, a component of the salt overly sensitive (SOS) pathway, in cooperation with SOS2 to increase salt tolerance in plants [119]. PtPP2C genes in Populus euphratica exhibited higher expression under cold, drought, and high salt stress [120]. VvPP2C02 in grapes responded strongly to high salt, drought, and ABA signaling [121].
The expression patterns of HanPP2C genes under all four treatments align with the findings in cucumber [56] and Medicago truncatula [9]. These findings serve as a foundation for further research on the functions and expression patterns of HanPP2C genes under various stress conditions.
Homology modeling is employed to overcome multiple challenges related to protein crystallization, thus providing a pathway to gaining structural insights into proteins through in silico approaches [122]. The results presented here are proposed as the first homology-based structural predictions for the HanPP2C protein family, covering HanPP2C47, HanPP2C48, HanPP2C53, HanPP2C54, HanPP2C59, and HanPP2C73. The homology model of HanPP2C47 showed structural similarity to P2C56_ARATH, a key negative modulator of the abscisic acid (ABA) signaling pathway in A. thaliana [123, 124]. The structural model of HanPP2C48 is closely similar to the homology model of PPM1A_HUMAN Protein phosphatase 1A (PP1A) of humans (Homo sapiens), which exhibited broad specificity [125]. The HanPP2C53 homology model template closely resembled P2C37_ARATH of A. thaliana, another negative regulator of ABA responses across biological activities [126]. The 3D homology model of HanPP2C54 demonstrated a close similarity to PP2C6 (2C68_ORYSJ) of rice (Oryza sativa), which is associated with the regulation of different abiotic stress responses [127]. The homology model of HanPP2C59 closely resembles human PPM1B, a member of the PP2C family of Ser/Thr protein phosphatases. Previous studies have shown that PP2C family members act as negative regulators of cell stress response pathways [128]. The predicted homology model of HanPP2C73 also exhibited similarity to Arabidopsis P2C56_ARATH, as did the HanPP2C47 model. These structural models of candidate HanPP2C proteins could provide valuable insights into the biological functions associated with phosphatase activity at the molecular level. However, further detailed studies are needed to identify the factors regulating phosphatase activity.
5.0 Conclusion
In the present study, we used comprehensive bioinformatics analysis to identify 121 HanPP2C genes distributed across the 17 chromosomes of the sunflower genome. We observed segmental duplication in 48 of the 53 duplicated gene pairs. The gene structures, conserved domains, and motifs of these 121 HanPP2C genes exhibited remarkable similarity within subgroups. Analysis of selection pressure and collinearity suggested that HanPP2C genes evolved through purifying selection, maintaining functional stability. Further, cis-acting element analysis revealed the presence of regulatory elements associated with light responsiveness, tissue specificity, phytohormone responsiveness, and stress responsiveness. The expression profiles of the HanPP2C gene family members display varying responses to multiple abiotic stresses, including dry-down, low nutrient availability, PEG-induced stress, and salt treatments, compared to the control conditions. The homology modeling of six candidate HanPP2C proteins provides valuable information for understanding their biological functions at the molecular level. The findings provide useful insight for future research on stress mechanisms, gene selection, and gene function elucidation, and for the development of stress-tolerant sunflower cultivars in breeding programs.
Supporting information
S1 Data. Full-length protein sequences of PP2C gene families of A. thaliana and H. annuus plant species for constructing a phylogenetic tree.
https://doi.org/10.1371/journal.pone.0298543.s001
(TXT)
S2 Data. Full-length coding sequences of HanPP2C gene families of H. annuus plant species.
https://doi.org/10.1371/journal.pone.0298543.s002
(TXT)
S3 Data. Full-length genomic sequences of HanPP2C gene families of H. annuus plant species.
https://doi.org/10.1371/journal.pone.0298543.s003
(TXT)
S4 Data. Full-length protein sequences of HanPP2C gene families of H. annuus plant species.
https://doi.org/10.1371/journal.pone.0298543.s004
(TXT)
S5 Data. Sunflower PP2C gene family distribution among groups based on phylogenetic analysis with Arabidopsis PP2C members.
https://doi.org/10.1371/journal.pone.0298543.s005
(DOCX)
S6 Data. In silico predicted numbers of introns and exons in HanPP2C genes.
https://doi.org/10.1371/journal.pone.0298543.s006
(DOCX)
S7 Data. Time of gene duplication estimated for different paralogous pairs of HanPP2C genes based on Ka and Ks values.
https://doi.org/10.1371/journal.pone.0298543.s007
(XLSX)
S8 Data. The upstream promoter region (2.0 kb genomic sequences) of HanPP2C gene families of H. annuus for analysis of cis-acting regulatory elements.
https://doi.org/10.1371/journal.pone.0298543.s008
(TXT)
S9 Data. The predicted cis-acting regulatory elements of the upstream promoter region (2.0 kb genomic sequences) of HanPP2C gene families of H. annuus.
https://doi.org/10.1371/journal.pone.0298543.s009
(XLSX)
S10 Data. Predicted miRNA targeting of HanPP2C genes.
The miRNA data were downloaded from the plant micro-RNA encyclopedia (http://pmiren.com/).
https://doi.org/10.1371/journal.pone.0298543.s010
(DOCX)
S1 Fig. The sequence logos of 20 motifs present in H. annuus HanPP2C proteins.
https://doi.org/10.1371/journal.pone.0298543.s011
(TIF)
S2 Fig. The chromosomal locations and duplications of sunflower HanPP2C genes.
The number of distinct chromosomes is at the top of each chromosome bar. The chromosome-scale is in millions of bases (Mb), indicating the length of each chromosome on the left, using the information retrieved from Phytozome v13. Light orange lines indicate tandem duplications, while light blue lines indicate segmental duplications.
https://doi.org/10.1371/journal.pone.0298543.s012
(TIF)
Acknowledgments
We are immensely grateful to Max H. Barnhart, The Burke Lab, Department of Plant Biology, University of Georgia, Athens, GA 30602, United States, for providing the sunflower RNA-seq data obtained from different stress treatments, which we used in this study for transcriptomic profiling of the candidate sunflower PP2C genes. The authors wish to thank Mr. Tanzir Ahmed, Assistant Professor, Department of English, Faculty of Arts and Social Science, Jashore University of Science and Technology, Jashore 7408, Bangladesh, for extensively editing the manuscript to correct grammatical errors. The authors also thank the reviewers and the editorial panel members for their critical comments and suggestions for improving the quality of this manuscript.
References
- 1. Zhu JK. Abiotic Stress Signaling and Responses in Plants. Cell. 2016;167(2):313–24. Epub 2016/10/08. pmid:27716505; PubMed Central PMCID: PMC5104190.
- 2. Luan S. Protein phosphatases and signaling cascades in higher plants. Trends in plant science. 1998;3(7):271–5.
- 3. Kulik A, Wawer I, Krzywińska E, Bucholc M, Dobrowolska G. SnRK2 protein kinases—key regulators of plant response to abiotic stresses. Omics. 2011;15(12):859–72. Epub 2011/12/06. pmid:22136638; PubMed Central PMCID: PMC3241737.
- 4. Ehsan M, Wang W, Gadahi JA, Hasan MW, Lu M, Wang Y, et al. The Serine/Threonine-Protein Phosphatase 1 From Haemonchus contortus Is Actively Involved in Suppressive Regulatory Roles on Immune Functions of Goat Peripheral Blood Mononuclear Cells. Front Immunol. 2018;9:1627. Epub 2018/08/01. pmid:30061894; PubMed Central PMCID: PMC6054924.
- 5. Luan S. Protein phosphatases in plants. Annu Rev Plant Biol. 2003;54:63–92. Epub 2003/09/25. pmid:14502985.
- 6. Rogers JP, Beuscher AEt, Flajolet M, McAvoy T, Nairn AC, Olson AJ, et al. Discovery of protein phosphatase 2C inhibitors by virtual screening. J Med Chem. 2006;49(5):1658–67. Epub 2006/03/03. pmid:16509582; PubMed Central PMCID: PMC2538531.
- 7. Fuchs S, Grill E, Meskiene I, Schweighofer A. Type 2C protein phosphatases in plants. Febs j. 2013;280(2):681–93. Epub 2012/06/26. pmid:22726910.
- 8. Schweighofer A, Hirt H, Meskiene I. Plant PP2C phosphatases: emerging functions in stress signaling. Trends Plant Sci. 2004;9(5):236–43. Epub 2004/05/08. pmid:15130549.
- 9. Yang Q, Liu K, Niu X, Wang Q, Wan Y, Yang F, et al. Genome-wide Identification of PP2C Genes and Their Expression Profiling in Response to Drought and Cold Stresses in Medicago truncatula. Sci Rep. 2018;8(1):12841. Epub 2018/08/29. pmid:30150630; PubMed Central PMCID: PMC6110720.
- 10. Xue T, Wang D, Zhang S, Ehlting J, Ni F, Jakab S, et al. Genome-wide and expression analysis of protein phosphatase 2C in rice and Arabidopsis. BMC Genomics. 2008;9:550. Epub 2008/11/22. pmid:19021904; PubMed Central PMCID: PMC2612031.
- 11. Haider MS, Khan N, Pervaiz T, Zhongjie L, Nasim M, Jogaiah S, et al. Genome-wide identification, evolution, and molecular characterization of the PP2C gene family in woodland strawberry. Gene. 2019;702:27–35. Epub 2019/03/21. pmid:30890476.
- 12. Singh A, Giri J, Kapoor S, Tyagi AK, Pandey GK. Protein phosphatase complement in rice: genome-wide identification and transcriptional analysis under abiotic stress conditions and reproductive development. BMC Genomics. 2010;11:435. Epub 2010/07/20. pmid:20637108; PubMed Central PMCID: PMC3091634.
- 13. Khan N, Ke H, Hu CM, Naseri E, Haider MS, Ayaz A, et al. Genome-Wide Identification, Evolution, and Transcriptional Profiling of PP2C Gene Family in Brassica rapa. Biomed Res Int. 2019;2019:2965035. Epub 2019/05/11. pmid:31073524; PubMed Central PMCID: PMC6470454.
- 14. Yu X, Han J, Wang E, Xiao J, Hu R, Yang G, et al. Genome-Wide Identification and Homoeologous Expression Analysis of PP2C Genes in Wheat (Triticum aestivum L.). Front Genet. 2019;10:561. Epub 2019/06/30. pmid:31249596; PubMed Central PMCID: PMC6582248.
- 15. Wang G, Sun X, Guo Z, Joldersma D, Guo L, Qiao X, et al. Genome-wide Identification and Evolution of the PP2C Gene Family in Eight Rosaceae Species and Expression Analysis Under Stress in Pyrus bretschneideri. Front Genet. 2021;12:770014. Epub 2021/12/04. pmid:34858482; PubMed Central PMCID: PMC8632025.
- 16. Rubio S, Rodrigues A, Saez A, Dizon MB, Galle A, Kim TH, et al. Triple loss of function of protein phosphatases type 2C leads to partial constitutive response to endogenous abscisic acid. Plant Physiol. 2009;150(3):1345–55. Epub 2009/05/22. pmid:19458118; PubMed Central PMCID: PMC2705020.
- 17. Janicki M, Marczak M, Cieśla A, Ludwików A. Identification of Novel Inhibitors of a Plant Group A Protein Phosphatase Type 2C Using a Combined In Silico and Biochemical Approach. Front Plant Sci. 2020;11:526460. Epub 2020/10/13. pmid:33042170; PubMed Central PMCID: PMC7524867.
- 18. Liang B, Sun Y, Wang J, Zheng Y, Zhang W, Xu Y, et al. Tomato Protein Phosphatase 2C (SlPP2C3) negatively regulates fruit ripening onset and fruit gloss. bioRxiv. 2020:2020.05.25.114587.
- 19. Adeleke BS, Babalola OO. Oilseed crop sunflower (Helianthus annuus) as a source of food: Nutritional and health benefits. Food Sci Nutr. 2020;8(9):4666–84. Epub 2020/10/01. pmid:32994929; PubMed Central PMCID: PMC7500752.
- 20. Radanović A, Miladinović D, Cvejić S, Jocković M, Jocić S. Sunflower Genetics from Ancestors to Modern Hybrids-A Review. Genes (Basel). 2018;9(11). Epub 2018/11/02. pmid:30380768; PubMed Central PMCID: PMC6265698.
- 21. Merah O, Langlade N, Alignan M, Roche J, Pouilly N, Lippi Y, et al. Genetic analysis of phytosterol content in sunflower seeds. Theor Appl Genet. 2012;125(8):1589–601. Epub 2012/07/25. pmid:22824968.
- 22. Guo S, Ge Y, Na Jom K. A review of phytochemistry, metabolite changes, and medicinal uses of the common sunflower seed and sprouts (Helianthus annuus L.). Chem Cent J. 2017;11(1):95. Epub 2017/11/01. pmid:29086881; PubMed Central PMCID: PMC5622016.
- 23. Adiredjo AL, Navaud O, Muños S, Langlade NB, Lamaze T, Grieu P. Genetic control of water use efficiency and leaf carbon isotope discrimination in sunflower (Helianthus annuus L.) subjected to two drought scenarios. PLoS One. 2014;9(7):e101218. Epub 2014/07/06. pmid:24992022; PubMed Central PMCID: PMC4081578.
- 24. Anastasi U, Santonoceto C, Giuffrè A, Sortino O, Gresta F, Abbate V. Yield performance and grain lipid composition of standard and oleic sunflower as affected by water supply. Field Crops Research. 2010;119(1):145–53.
- 25. Ghobadi M, Taherabadi S, Ghobadi M-E, Mohammadi G-R, Jalali-Honarmand S. Antioxidant capacity, photosynthetic characteristics and water relations of sunflower (Helianthus annuus L.) cultivars in response to drought stress. Industrial Crops and Products. 2013;50:29–38.
- 26. Swarbreck D, Wilks C, Lamesch P, Berardini TZ, Garcia-Hernandez M, Foerster H, et al. The Arabidopsis Information Resource (TAIR): gene structure and function annotation. Nucleic Acids Res. 2008;36(Database issue):D1009–14. Epub 2007/11/08. pmid:17986450; PubMed Central PMCID: PMC2238962.
- 27. Sang J, Zou D, Wang Z, Wang F, Zhang Y, Xia L, et al. IC4R-2.0: Rice Genome Reannotation Using Massive RNA-seq Data. Genomics Proteomics Bioinformatics. 2020;18(2):161–72. Epub 2020/07/20. pmid:32683045; PubMed Central PMCID: PMC7646092.
- 28. Fan K, Chen Y, Mao Z, Fang Y, Li Z, Lin W, et al. Pervasive duplication, biased molecular evolution and comprehensive functional analysis of the PP2C family in Glycine max. BMC Genomics. 2020;21(1):465. Epub 2020/07/08. pmid:32631220; PubMed Central PMCID: PMC7339511.
- 29. Schmutz J, Cannon SB, Schlueter J, Ma J, Mitros T, Nelson W, et al. Genome sequence of the palaeopolyploid soybean. Nature. 2010;463(7278):178–83. Epub 2010/01/16. pmid:20075913.
- 30. The tomato genome sequence provides insights into fleshy fruit evolution. Nature. 2012;485(7400):635–41. Epub 2012/06/05. pmid:22660326; PubMed Central PMCID: PMC3378239.
- 31. Qiu J, Ni L, Xia X, Chen S, Zhang Y, Lang M, et al. Genome-Wide Analysis of the Protein Phosphatase 2C Genes in Tomato. Genes (Basel). 2022;13(4). Epub 2022/04/24. pmid:35456410; PubMed Central PMCID: PMC9032827.
- 32. Goodstein DM, Shu S, Howson R, Neupane R, Hayes RD, Fazo J, et al. Phytozome: a comparative platform for green plant genomics. Nucleic Acids Res. 2012;40(Database issue):D1178–86. Epub 2011/11/24. pmid:22110026; PubMed Central PMCID: PMC3245001.
- 33. Letunic I, Khedkar S, Bork P. SMART: recent updates, new developments and status in 2020. Nucleic Acids Res. 2021;49(D1):D458–d60. Epub 2020/10/27. pmid:33104802; PubMed Central PMCID: PMC7778883.
- 34. Lu S, Wang J, Chitsaz F, Derbyshire MK, Geer RC, Gonzales NR, et al. CDD/SPARCLE: the conserved domain database in 2020. Nucleic Acids Res. 2020;48(D1):D265–d8. Epub 2019/11/30. pmid:31777944; PubMed Central PMCID: PMC6943070.
- 35. Gasteiger E, Hoogland C, Gattiker A, Duvaud S, Wilkins MR, Appel RD, et al. Protein identification and analysis tools on the ExPASy server: Springer; 2005.
- 36. Tamura K, Stecher G, Kumar S. MEGA11: Molecular Evolutionary Genetics Analysis Version 11. Mol Biol Evol. 2021;38(7):3022–7. Epub 2021/04/24. pmid:33892491; PubMed Central PMCID: PMC8233496.
- 37. Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994;22(22):4673–80. Epub 1994/11/11. pmid:7984417; PubMed Central PMCID: PMC308517.
- 38. Thompson JD, Gibson TJ, Higgins DG. Multiple sequence alignment using ClustalW and ClustalX. Current Protocols in Bioinformatics. 2003;(1):2.3.1–2.3.22.
- 39. Letunic I, Bork P. Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 2021;49(W1):W293–w6. Epub 2021/04/23. pmid:33885785; PubMed Central PMCID: PMC8265157.
- 40. Hu B, Jin J, Guo AY, Zhang H, Luo J, Gao G. GSDS 2.0: an upgraded gene feature visualization server. Bioinformatics. 2015;31(8):1296–7. Epub 2014/12/17. pmid:25504850; PubMed Central PMCID: PMC4393523.
- 41. Chen C, Chen H, Zhang Y, Thomas HR, Frank MH, He Y, et al. TBtools: An Integrative Toolkit Developed for Interactive Analyses of Big Biological Data. Mol Plant. 2020;13(8):1194–202. Epub 2020/06/26. pmid:32585190.
- 42. Bailey TL, Johnson J, Grant CE, Noble WS. The MEME Suite. Nucleic Acids Res. 2015;43(W1):W39–49. Epub 2015/05/09. pmid:25953851; PubMed Central PMCID: PMC4489269.
- 43. Lynch M, Conery JS. The evolutionary fate and consequences of duplicate genes. Science. 2000;290(5494):1151–5. Epub 2000/11/10. pmid:11073452.
- 44. Chao J, Li Z, Sun Y, Aluko OO, Wu X, Wang Q, et al. MG2C: A user-friendly online tool for drawing genetic maps. 2021;1(1):1–4. pmid:37789491
- 45. Xiong E, Zheng C, Wu X, Wang W. Protein subcellular location: the gap between prediction and experimentation. Plant Molecular Biology Reporter. 2016;34:52–61.
- 46. Horton P, Park K-J, Obayashi T, Nakai K, editors. Protein subcellular localization prediction with WoLF PSORT. Proceedings of the 4th Asia-Pacific bioinformatics conference; 2006: World Scientific.
- 47. Rombauts S, Déhais P, Van Montagu M, Rouzé P. PlantCARE, a plant cis-acting regulatory element database. Nucleic Acids Res. 1999;27(1):295–6. Epub 1998/12/10. pmid:9847207; PubMed Central PMCID: PMC148162.
- 48. Guo Z, Kuang Z, Wang Y, Zhao Y, Tao Y, Cheng C, et al. PmiREN: a comprehensive encyclopedia of plant miRNAs. Nucleic Acids Res. 2020;48(D1):D1114–d21. Epub 2019/10/12. pmid:31602478; PubMed Central PMCID: PMC6943064.
- 49. Samad AFA, Sajad M, Nazaruddin N, Fauzi IA, Murad AMA, Zainal Z, et al. MicroRNA and Transcription Factor: Key Players in Plant Regulatory Network. Front Plant Sci. 2017;8:565. Epub 2017/04/28. pmid:28446918; PubMed Central PMCID: PMC5388764.
- 50. Barnhart MH, Masalia RR, Mosley LJ, Burke JM. Phenotypic and transcriptomic responses of cultivated sunflower seedlings (Helianthus annuus L.) to four abiotic stresses. PLoS One. 2022;17(9):e0275462. Epub 2022/10/01. pmid:36178944; PubMed Central PMCID: PMC9524668.
- 51. Schneiter A, Miller J. Description of sunflower growth stages. Crop Science. 1981;21(6):901–3.
- 52. Masalia RR, Temme AA, Torralba NL, Burke JM. Multiple genomic regions influence root morphology and seedling growth in cultivated sunflower (Helianthus annuus L.) under well-watered and water-limited conditions. PLoS One. 2018;13(9):e0204279. Epub 2018/09/21. pmid:30235309; PubMed Central PMCID: PMC6147562.
- 53. Krieger E, Vriend G. YASARA View—molecular graphics for all devices—from smartphones to workstations. Bioinformatics. 2014;30(20):2981–2. Epub 2014/07/06. pmid:24996895; PubMed Central PMCID: PMC4184264.
- 54. Kyte J, Doolittle RF. A simple method for displaying the hydropathic character of a protein. J Mol Biol. 1982;157(1):105–32. Epub 1982/05/05. pmid:7108955.
- 55. Guruprasad K, Reddy BV, Pandit MW. Correlation between stability of a protein and its dipeptide composition: a novel approach for predicting in vivo stability of a protein from its primary sequence. Protein Eng. 1990;4(2):155–61. Epub 1990/12/01. pmid:2075190.
- 56. Zhang G, Zhang Z, Luo S, Li X, Lyu J, Liu Z, et al. Genome-wide identification and expression analysis of the cucumber PP2C gene family. BMC Genomics. 2022;23(1):563. Epub 2022/08/07. pmid:35933381; PubMed Central PMCID: PMC9356470.
- 57. Zhu QH, Helliwell CA. Regulation of flowering time and floral patterning by miR172. J Exp Bot. 2011;62(2):487–95. Epub 2010/10/19. pmid:20952628.
- 58. Lian H, Wang L, Ma N, Zhou CM, Han L, Zhang TQ, et al. Redundant and specific roles of individual MIR172 genes in plant development. PLoS Biol. 2021;19(2):e3001044. Epub 2021/02/03. pmid:33529193; PubMed Central PMCID: PMC7853526.
- 59. Aukerman MJ, Sakai H. Regulation of flowering time and floral organ identity by a MicroRNA and its APETALA2-like target genes. Plant Cell. 2003;15(11):2730–41. Epub 2003/10/14. pmid:14555699; PubMed Central PMCID: PMC280575.
- 60. Jung JH, Seo YH, Seo PJ, Reyes JL, Yun J, Chua NH, et al. The GIGANTEA-regulated microRNA172 mediates photoperiodic flowering independent of CONSTANS in Arabidopsis. Plant Cell. 2007;19(9):2736–48. Epub 2007/09/25. pmid:17890372; PubMed Central PMCID: PMC2048707.
- 61. Zhao L, Kim Y, Dinh TT, Chen X. miR172 regulates stem cell fate and defines the inner boundary of APETALA3 and PISTILLATA expression domain in Arabidopsis floral meristems. Plant J. 2007;51(5):840–9. Epub 2007/06/19. pmid:17573799; PubMed Central PMCID: PMC2629596.
- 62. Jerome Jeyakumar JM, Ali A, Wang WM, Thiruvengadam M. Characterizing the Role of the miR156-SPL Network in Plant Development and Stress Response. Plants (Basel). 2020;9(9). Epub 2020/09/19. pmid:32942558; PubMed Central PMCID: PMC7570127.
- 63. Cho SH, Coruh C, Axtell MJ. miR156 and miR390 regulate tasiRNA accumulation and developmental timing in Physcomitrella patens. Plant Cell. 2012;24(12):4837–49. Epub 2012/12/25. pmid:23263766; PubMed Central PMCID: PMC3556961.
- 64. Yao X, Chen J, Zhou J, Yu H, Ge C, Zhang M, et al. An Essential Role for miRNA167 in Maternal Control of Embryonic and Seed Development. Plant Physiol. 2019;180(1):453–64. Epub 2019/03/15. pmid:30867333; PubMed Central PMCID: PMC6501067.
- 65. Zhou L, Liu Y, Liu Z, Kong D, Duan M, Luo L. Genome-wide identification and analysis of drought-responsive microRNAs in Oryza sativa. J Exp Bot. 2010;61(15):4157–68. Epub 2010/08/24. pmid:20729483.
- 66. Inada S, Tominaga M, Shimmen T. Regulation of root growth by gibberellin in Lemna minor. Plant Cell Physiol. 2000;41(6):657–65. Epub 2000/08/17. pmid:10945334.
- 67. Rahman MS, Hossain MS, Saha SK, Rahman S, Sonne C, Kim KH. Homology Modeling and Probable Active Site Cavity Prediction of Uncharacterized Arsenate Reductase in Bacterial spp. Appl Biochem Biotechnol. 2021;193(1):1–18. Epub 2020/08/19. pmid:32809107.
- 68. Gupta CL, Akhtar S, Bajpai P. In silico protein modeling: possibilities and limitations. Excli j. 2014;13:513–5. Epub 2014/01/01. pmid:26417278; PubMed Central PMCID: PMC4467082.
- 69. Fiser A. Template-based protein structure modeling. Methods Mol Biol. 2010;673:73–94. Epub 2010/09/14. pmid:20835794; PubMed Central PMCID: PMC4108304.
- 70. Xu Z, Marowa P, Liu H, Du H, Zhang C, Li Y. Genome-Wide Identification and Analysis of P-Type Plasma Membrane H(+)-ATPase Sub-Gene Family in Sunflower and the Role of HHA4 and HHA11 in the Development of Salt Stress Resistance. Genes (Basel). 2020;11(4). Epub 2020/04/02. pmid:32230880; PubMed Central PMCID: PMC7231311.
- 71. Li J, Islam F, Huang Q, Wang J, Zhou W, Xu L, et al. Genome-wide characterization of WRKY gene family in Helianthus annuus L. and their expression profiles under biotic and abiotic stresses. PLoS One. 2020;15(12):e0241965. Epub 2020/12/04. pmid:33270651; PubMed Central PMCID: PMC7714227.
- 72. Neupane S, Schweitzer SE, Neupane A, Andersen EJ, Fennell A, Zhou R, et al. Identification and Characterization of Mitogen-Activated Protein Kinase (MAPK) Genes in Sunflower (Helianthus annuus L.). Plants (Basel). 2019;8(2). Epub 2019/01/27. pmid:30678298; PubMed Central PMCID: PMC6409774.
- 73. Shan F, Wu Y, Du R, Yang Q, Liu C, Wang Y, et al. Evolutionary analysis of the OSCA gene family in sunflower (Helianthus annuus L) and expression analysis under NaCl stress. PeerJ. 2023;11:e15089. pmid:37090105
- 74. Zhang C, Yang J, Meng W, Zeng L, Sun L. Genome-wide analysis of the WSD family in sunflower and functional identification of HaWSD9 involvement in wax ester biosynthesis and osmotic stress. Front Plant Sci. 2022;13:975853. Epub 2022/10/11. pmid:36212375; PubMed Central PMCID: PMC9539440.
- 75. Ma J, Ling L, Huang X, Wang W, Wang Y, Zhang M, et al. Genome-wide identification and expression analysis of the VQ gene family in sunflower (Helianthus annuus L.). 2021;30:56–66.
- 76. Song J, Shen W, Shaheen S, Li Y, Liu Z, Wang Z, et al. Genome‑wide identification and analysis of the trihelix transcription factors in sunflower. 2021;65:80–7.
- 77. Neupane S, Andersen EJ, Neupane A, Nepal MP. Genome-Wide Identification of NBS-Encoding Resistance Genes in Sunflower (Helianthus annuus L.). Genes. 2018;9(8). Epub 2018/08/01. pmid:30061549; PubMed Central PMCID: PMC6115920.
- 78. Li W, Zeng Y, Yin F, Wei R, Mao X. Genome-wide identification and comprehensive analysis of the NAC transcription factor family in sunflower during salt and drought stress. Sci Rep. 2021;11(1):19865. Epub 2021/10/08. pmid:34615898; PubMed Central PMCID: PMC8494813.
- 79. Fan K, Yuan S, Chen J, Chen Y, Li Z, Lin W, et al. Molecular evolution and lineage-specific expansion of the PP2C family in Zea mays. Planta. 2019;250(5):1521–38. Epub 2019/07/28. pmid:31346803.
- 80. Guo L, Lu S, Liu T, Nai G, Ren J, Gou H, et al. Genome-Wide Identification and Abiotic Stress Response Analysis of PP2C Gene Family in Woodland and Pineapple Strawberries. Int J Mol Sci. 2023;24(4). Epub 2023/02/26. pmid:36835472; PubMed Central PMCID: PMC9961684.
- 81. Nei M, Gu X, Sitnikova T. Evolution by the birth-and-death process in multigene families of the vertebrate immune system. Proc Natl Acad Sci U S A. 1997;94(15):7799–806. Epub 1997/07/22. pmid:9223266; PubMed Central PMCID: PMC33709.
- 82. Nei M, Rogozin IB, Piontkivska H. Purifying selection and birth-and-death evolution in the ubiquitin gene family. Proc Natl Acad Sci U S A. 2000;97(20):10866–71. Epub 2000/09/27. pmid:11005860; PubMed Central PMCID: PMC27115.
- 83. Nam J, dePamphilis CW, Ma H, Nei M. Antiquity and evolution of the MADS-box gene family controlling flower development in plants. Mol Biol Evol. 2003;20(9):1435–47. Epub 2003/06/05. pmid:12777513.
- 84. Long M, Betrán E, Thornton K, Wang W. The origin of new genes: glimpses from the young and old. Nat Rev Genet. 2003;4(11):865–75. Epub 2003/11/25. pmid:14634634.
- 85. Bondarenko VS, Gelfand MS. Evolution of the Exon-Intron Structure in Ciliate Genomes. PLoS One. 2016;11(9):e0161476. Epub 2016/09/08. pmid:27603699; PubMed Central PMCID: PMC5014332.
- 86. Shen X, Nan H, Jiang Y, Zhou Y, Pan X. Genome-Wide Identification, Expression and Interaction Analysis of GmSnRK2 and Type A PP2C Genes in Response to Abscisic Acid Treatment and Drought Stress in Soybean Plant. Int J Mol Sci. 2022;23(21). Epub 2022/11/12. pmid:36361951; PubMed Central PMCID: PMC9653956.
- 87. Wu XT, Xiong ZP, Chen KX, Zhao GR, Feng KR, Li XH, et al. Genome-Wide Identification and Transcriptional Expression Profiles of PP2C in the Barley (Hordeum vulgare L.) Pan-Genome. Genes (Basel). 2022;13(5). Epub 2022/05/29. pmid:35627219; PubMed Central PMCID: PMC9140614.
- 88. Ye Y, Ding Y, Jiang Q, Wang F, Sun J, Zhu C. The role of receptor-like protein kinases (RLKs) in abiotic stress response in plants. Plant Cell Rep. 2017;36(2):235–42. Epub 2016/12/10. pmid:27933379.
- 89. Wang CF, Han GL, Yang ZR, Li YX, Wang BS. Plant Salinity Sensors: Current Understanding and Future Directions. Front Plant Sci. 2022;13:859224. Epub 2022/04/26. pmid:35463402; PubMed Central PMCID: PMC9022007.
- 90. Cao J, Jiang M, Li P, Chu Z. Genome-wide identification and evolutionary analyses of the PP2C gene family with their expression profiling in response to multiple stresses in Brachypodium distachyon. BMC Genomics. 2016;17:175. Epub 2016/03/05. pmid:26935448; PubMed Central PMCID: PMC4776448.
- 91. Ding Z, Wang H, Liang X, Morris ER, Gallazzi F, Pandit S, et al. Phosphoprotein and phosphopeptide interactions with the FHA domain from Arabidopsis kinase-associated protein phosphatase. Biochemistry. 2007;46(10):2684–96. Epub 2007/02/17. pmid:17302430.
- 92. Wu Z, Luo L, Wan Y, Liu F. Genome-wide characterization of the PP2C gene family in peanut (Arachis hypogaea L.) and the identification of candidate genes involved in salinity-stress response. Front Plant Sci. 2023;14:1093913. Epub 2023/02/14. pmid:36778706; PubMed Central PMCID: PMC9911800.
- 93. Morgan CC, Loughran NB, Walsh TA, Harrison AJ, O’Connell MJ. Positive selection neighboring functionally essential sites and disease-implicated regions of mammalian reproductive proteins. BMC Evol Biol. 2010;10:39. Epub 2010/02/13. pmid:20149245; PubMed Central PMCID: PMC2830953.
- 94. Magadum S, Banerjee U, Murugan P, Gangapur D, Ravikesavan R. Gene duplication as a major force in evolution. J Genet. 2013;92(1):155–61. Epub 2013/05/04. pmid:23640422.
- 95. Panchy N, Lehti-Shiu M, Shiu SH. Evolution of Gene Duplication in Plants. Plant Physiol. 2016;171(4):2294–316. Epub 2016/06/12. pmid:27288366; PubMed Central PMCID: PMC4972278.
- 96. Cannon SB, Mitra A, Baumgarten A, Young ND, May G. The roles of segmental and tandem gene duplication in the evolution of large gene families in Arabidopsis thaliana. BMC Plant Biol. 2004;4:10. Epub 2004/06/03. pmid:15171794; PubMed Central PMCID: PMC446195.
- 97. Tran TC, Singleton C, Fraley TS, Greenwood JA. Cysteine-rich protein 1 (CRP1) regulates actin filament bundling. BMC Cell Biol. 2005;6:45. Epub 2005/12/13. pmid:16336664; PubMed Central PMCID: PMC1318456.
- 98. Eliasson A, Gass N, Mundel C, Baltz R, Kräuter R, Evrard JL, et al. Molecular and expression analysis of a LIM protein gene family from flowering plants. Mol Gen Genet. 2000;264(3):257–67. Epub 2000/11/21. pmid:11085265.
- 99. Fernando VD, Schroeder DF. Role of ABA in Arabidopsis salt, drought, and desiccation tolerance. Abiotic and biotic stress in plants-recent advances and future perspectives: IntechOpen; 2016.
- 100. Maestrini P, Cavallini A, Rizzo M, Giordani T, Bernardi R, Durante M, et al. Isolation and expression analysis of low temperature-induced genes in white poplar (Populus alba). J Plant Physiol. 2009;166(14):1544–56. Epub 2009/05/26. pmid:19464753.
- 101. Mosharaf MP, Rahman H, Ahsan MA, Akond Z, Ahmed FF, Islam MM, et al. In silico identification and characterization of AGO, DCL and RDR gene families and their associated regulatory elements in sweet orange (Citrus sinensis L.). PLoS One. 2020;15(12):e0228233. Epub 2020/12/22. pmid:33347517; PubMed Central PMCID: PMC7751981.
- 102. Nishida J, Yoshida M, Arai K, Yokota T. Definition of a GC-rich motif as regulatory sequence of the human IL-3 gene: coordinate regulation of the IL-3 gene by CLE2/GC box of the GM-CSF gene in T cell activation. Int Immunol. 1991;3(3):245–54. Epub 1991/03/01. pmid:2049340.
- 103. Martin-Malpartida P, Batet M, Kaczmarska Z, Freier R, Gomes T, Aragón E, et al. Structural basis for genome wide recognition of 5-bp GC motifs by SMAD transcription factors. Nat Commun. 2017;8(1):2070. Epub 2017/12/14. pmid:29234012; PubMed Central PMCID: PMC5727232.
- 104. Liu J, Wang F, Yu G, Zhang X, Jia C, Qin J, et al. Functional Analysis of the Maize C-Repeat/DRE Motif-Binding Transcription Factor CBF3 Promoter in Response to Abiotic Stress. Int J Mol Sci. 2015;16(6):12131–46. Epub 2015/06/02. pmid:26030672; PubMed Central PMCID: PMC4490434.
- 105. Chen W, Provart NJ, Glazebrook J, Katagiri F, Chang HS, Eulgem T, et al. Expression profile matrix of Arabidopsis transcription factor genes suggests their putative functions in response to environmental stresses. Plant Cell. 2002;14(3):559–74. Epub 2002/03/23. pmid:11910004; PubMed Central PMCID: PMC150579.
- 106. Ai G, Li T, Zhu H, Dong X, Fu X, Xia C, et al. BPL3 binds the long non-coding RNA nalncFL7 to suppress FORKED-LIKE7 and modulate HAI1-mediated MPK3/6 dephosphorylation in plant immunity. Plant Cell. 2023;35(1):598–616. Epub 2022/10/22. pmid:36269178; PubMed Central PMCID: PMC9806616.
- 107. Carbone F, Bruno L, Perrotta G, Bitonti MB, Muzzalupo I, Chiappetta A. Identification of miRNAs involved in fruit ripening by deep sequencing of Olea europaea L. transcriptome. PLoS One. 2019;14(8):e0221460. Epub 2019/08/23. pmid:31437230; PubMed Central PMCID: PMC6705801.
- 108. Song QX, Liu YF, Hu XY, Zhang WK, Ma B, Chen SY, et al. Identification of miRNAs and their target genes in developing soybean seeds by deep sequencing. BMC Plant Biol. 2011;11:5. Epub 2011/01/12. pmid:21219599; PubMed Central PMCID: PMC3023735.
- 109. Su W, Raza A, Zeng L, Gao A, Lv Y, Ding X, et al. Genome-wide analysis and expression patterns of lipid phospholipid phospholipase gene family in Brassica napus L. BMC Genomics. 2021;22(1):548. Epub 2021/07/19. pmid:34273948; PubMed Central PMCID: PMC8286584.
- 110. Wen Y, Raza A, Chu W, Zou X, Cheng H, Hu Q, et al. Comprehensive In Silico Characterization and Expression Profiling of TCP Gene Family in Rapeseed. Front Genet. 2021;12:794297. Epub 2021/12/07. pmid:34868279; PubMed Central PMCID: PMC8635964.
- 111. Zhao C, Xia H, Cao T, Yang Y, Zhao S, Hou L, et al. Small RNA and degradome deep sequencing reveals peanut microRNA roles in response to pathogen infection. Plant molecular biology reporter. 2015;33:1013–29.
- 112. Aravind J, Rinku S, Pooja B, Shikha M, Kaliyugam S, Mallikarjuna MG, et al. Identification, characterization, and functional validation of drought-responsive microRNAs in subtropical maize inbreds. Frontiers in plant science. 2017;8:941. pmid:28626466
- 113. Lee H, Yoo SJ, Lee JH, Kim W, Yoo SK, Fitzgerald H, et al. Genetic framework for flowering-time regulation by ambient temperature-responsive miRNAs in Arabidopsis. Nucleic Acids Res. 2010;38(9):3081–93. Epub 2010/01/30. pmid:20110261; PubMed Central PMCID: PMC2875011.
- 114. Xin M, Wang Y, Yao Y, Xie C, Peng H, Ni Z, et al. Diverse set of microRNAs are responsive to powdery mildew infection and heat stress in wheat (Triticum aestivum L.). BMC Plant Biol. 2010;10:123. Epub 2010/06/25. pmid:20573268; PubMed Central PMCID: PMC3095282.
- 115. We H, editor. Cloning and Expression Analysis of TaPP2C59 Gene in Wheat. 2014.
- 116. Liu L, Hu X, Song J, Zong X, Li D, Li D. Over-expression of a Zea mays L. protein phosphatase 2C gene (ZmPP2C) in Arabidopsis thaliana decreases tolerance to salt and drought. J Plant Physiol. 2009;166(5):531–42. Epub 2008/10/22. pmid:18930563.
- 117. Liu X, Zhu Y, Zhai H, Cai H, Ji W, Luo X, et al. AtPP2CG1, a protein phosphatase 2C, positively regulates salt tolerance of Arabidopsis in abscisic acid-dependent manner. Biochem Biophys Res Commun. 2012;422(4):710–5. Epub 2012/05/26. pmid:22627139.
- 118. Haider MS, Kurjogi MM, Khalil-ur-Rehman M, Pervez T, Songtao J, Fiaz M, et al. Drought stress revealed physiological, biochemical and gene-expressional variations in ‘Yoshihime’ peach (Prunus Persica L) cultivar. 2018;13(1):83–90.
- 119. Fu H, Yu X, Jiang Y, Wang Y, Yang Y, Chen S, et al. SALT OVERLY SENSITIVE 1 is inhibited by clade D Protein phosphatase 2C D6 and D7 in Arabidopsis thaliana. Plant Cell. 2023;35(1):279–97. Epub 2022/09/24. pmid:36149299; PubMed Central PMCID: PMC9806586.
- 120. Chen J, Zhang D, Zhang C, Xia X, Yin W, Tian Q. A Putative PP2C-Encoding Gene Negatively Regulates ABA Signaling in Populus euphratica. PLoS One. 2015;10(10):e0139466. Epub 2015/10/03. pmid:26431530; PubMed Central PMCID: PMC4592019.
- 121. He H, Lu Z, Ma Z, Liang G, Ma L, Wan P, et al. Genome-wide identification and expression analysis of the PP2C gene family in Vitis vinifera. 2018;45(7):1237–50.
- 122. Sailapathi A, Gunalan S, Somarathinam K, Kothandan G, Kumar D. Importance of homology modeling for predicting the structures of GPCRs. Homology Molecular Modeling - Perspectives and Applications. 2021.
- 123. Leung J, Bouvier-Durand M, Morris PC, Guerrier D, Chefdor F, Giraudat J. Arabidopsis ABA response gene ABI1: features of a calcium-modulated protein phosphatase. Science. 1994;264(5164):1448–52. Epub 1994/06/03. pmid:7910981.
- 124. Mitula F, Tajdel M, Cieśla A, Kasprowicz-Maluśki A, Kulik A, Babula-Skowrońska D, et al. Arabidopsis ABA-Activated Kinase MAPKKK18 is Regulated by Protein Phosphatase 2C ABI1 and the Ubiquitin-Proteasome Pathway. Plant Cell Physiol. 2015;56(12):2351–67. Epub 2015/10/08. pmid:26443375; PubMed Central PMCID: PMC4675898.
- 125. Mann DJ, Campbell DG, McGowan CH, Cohen PT. Mammalian protein serine/threonine phosphatase 2C: cDNA cloning and comparative analysis of amino acid sequences. Biochim Biophys Acta. 1992;1130(1):100–4. Epub 1992/02/28. pmid:1311954.
- 126. Tähtiharju S, Palva T. Antisense inhibition of protein phosphatase 2C accelerates cold acclimation in Arabidopsis thaliana. Plant J. 2001;26(4):461–70. Epub 2001/07/06. pmid:11439132.
- 127. Singh A, Jha SK, Bagri J, Pandey GK. ABA inducible rice protein phosphatase 2C confers ABA insensitivity and abiotic stress tolerance in Arabidopsis. PLoS One. 2015;10(4):e0125168. Epub 2015/04/18. pmid:25886365; PubMed Central PMCID: PMC4401787.
- 128. Fagerberg L, Hallström BM, Oksvold P, Kampf C, Djureinovic D, Odeberg J, et al. Analysis of the human tissue-specific expression by genome-wide integration of transcriptomics and antibody-based proteomics. Mol Cell Proteomics. 2014;13(2):397–406. Epub 2013/12/07. pmid:24309898; PubMed Central PMCID: PMC3916642.