
Genomic Adaptation of the Lactobacillus casei Group

  • Hidehiro Toh ,

    Contributed equally to this work with: Hidehiro Toh, Kenshiro Oshima

    Affiliation Medical Institute of Bioregulation, Kyushu University, Higashi-ku, Fukuoka, Japan

  • Kenshiro Oshima ,

    Contributed equally to this work with: Hidehiro Toh, Kenshiro Oshima

    Affiliation Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba, Japan

  • Akiyo Nakano,

    Affiliation School of Veterinary Medicine, Azabu University, Sagamihara, Kanagawa, Japan

  • Muneaki Takahata,

    Affiliation School of Veterinary Medicine, Azabu University, Sagamihara, Kanagawa, Japan

  • Masaru Murakami,

    Affiliation School of Veterinary Medicine, Azabu University, Sagamihara, Kanagawa, Japan

  • Takashi Takaki,

    Affiliation JEOL Ltd., Akishima, Tokyo, Japan

  • Hidetoshi Nishiyama,

    Affiliation JEOL Ltd., Akishima, Tokyo, Japan

  • Shizunobu Igimi,

    Affiliation Division of Biomedical Food Research, National Institute of Health Sciences, Kamiyoga, Setagaya, Tokyo, Japan

  • Masahira Hattori,

    Affiliation Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba, Japan

  • Hidetoshi Morita

    morita@azabu-u.ac.jp

    Affiliation School of Veterinary Medicine, Azabu University, Sagamihara, Kanagawa, Japan

Abstract

Lactobacillus casei, L. paracasei, and L. rhamnosus form a closely related taxonomic group (Lactobacillus casei group) within the facultatively heterofermentative lactobacilli. Here, we report the complete genome sequences of L. paracasei JCM 8130 and L. casei ATCC 393, and the draft genome sequence of L. paracasei COM0101, all of which were isolated from dairy products. Furthermore, we re-annotated the genome of L. rhamnosus ATCC 53103 (also known as L. rhamnosus GG), which we have previously reported. We confirmed that ATCC 393 is distinct from other strains previously described as L. paracasei. The core genome of 10 completely sequenced strains of the L. casei group comprised 1,682 protein-coding genes. Although extensive genome-wide synteny was found among the L. casei group, the genomes of ATCC 53103, JCM 8130, and ATCC 393 contained genomic islands compared with L. paracasei ATCC 334. Several genomic islands, including carbohydrate utilization gene clusters, were found at the same loci in the chromosomes of the L. casei group. The spaCBA pilus gene cluster, which was first identified in GG, was also found in other strains of the L. casei group, but several L. paracasei strains including COM0101 contained a truncated spaC gene. ATCC 53103 encoded more proteins involved in carbohydrate utilization than intestinal lactobacilli, as well as extracellular adhesion proteins, several of which are absent in other strains of the L. casei group. In addition to the previously fully sequenced L. rhamnosus and L. paracasei strains, the complete genome sequences of L. casei will provide valuable insights into the evolution of the L. casei group.

Introduction

The genus Lactobacillus is the largest group of the family Lactobacillaceae and contains more than 130 species. The species Lactobacillus casei, L. paracasei, and L. rhamnosus are phylogenetically and phenotypically closely related and are regarded together as the Lactobacillus casei group within the facultatively heterofermentative lactobacilli [1]. The classification and nomenclature of this group are controversial [2]–[7]. Some strains of L. casei, L. paracasei, and L. rhamnosus have long been used as probiotics in a wide range of products marketed in many countries. L. casei and L. paracasei have also been isolated from a variety of environmental habitats, including raw and fermented dairy (especially cheese) and plant materials (e.g., wine, pickle, silage, and kimchi). They are used as acid-producing starter cultures in milk fermentation and as adjunct cultures for intensification and acceleration of flavor development in bacterial-ripened cheeses. They are commonly the dominant species of nonstarter lactic acid bacteria in ripening cheese.

In the L. casei group, the genomes of five L. paracasei strains (ATCC 334, BD-II, BL23, LC2W, and Zhang) and three L. rhamnosus strains (ATCC 53103, Lc 705, and ATCC 8530) have been fully sequenced to date [8]–[14]. We have also previously reported the complete genome sequence of L. rhamnosus ATCC 53103 [15]. L. rhamnosus GG, the original strain of L. rhamnosus ATCC 53103, was isolated from healthy human intestinal flora and is one of the most widely used and well-documented probiotics, that is, microorganisms that confer a health benefit on the host when administered in adequate amounts [16]. It has been reported that L. rhamnosus GG can shorten the duration of infectious diarrhea, reduce antibiotic-associated symptoms, and alleviate food allergy and atopic dermatitis in children [16].

In this paper, we present the complete genome sequences of L. casei ATCC 393 and L. paracasei JCM 8130 (also known as ATCC 25302), which were isolated from a cheese and milk product, respectively, and the draft genome sequence of L. paracasei COM0101 isolated from a commercial fermented milk product. Furthermore, we re-annotated the genome of L. rhamnosus ATCC 53103. We then compared sequenced genomes of the L. casei group to gain a broader view of the genetic variability within the group. Comparison of the genome sequences of strains isolated from the human gut and dairy products can provide valuable insights into the lifestyle adaptation of the L. casei group.

Materials and Methods

Genome Sequencing

L. paracasei JCM 8130 and L. casei ATCC 393 were obtained from the Japan Collection of Microorganisms (JCM) and the American Type Culture Collection (ATCC), respectively. In this study, ten strains of putative L. paracasei isolated from the fermented milk product Yakult (Yakult Ltd., Japan) exhibited the same pattern by random amplification of polymorphic DNA fingerprinting [17]. We thus selected one L. paracasei strain designated as COM0101 for sequencing. L. paracasei JCM 8130, L. casei ATCC 393, and L. paracasei COM0101 were cultured in MRS (deMan, Rogosa and Sharpe) broth (Difco) at 37°C for 24 h, and the genomic DNAs were isolated and purified as previously described [18].

The genome sequences of L. paracasei JCM 8130, L. casei ATCC 393, and L. paracasei COM0101 were determined by the whole-genome shotgun strategy using Sanger sequencing (3730xl DNA sequencers) and 454 pyrosequencing (GS-FLX sequencers). We generated 19,200 (3.9-fold, 3730xl) and 284,003 (25.7-fold, GS-FLX) sequences from the L. paracasei JCM 8130 genome, 28,416 (5.9-fold, 3730xl) sequences from the L. casei ATCC 393 genome, and 131,707 (15.4-fold, GS-FLX) sequences from the L. paracasei COM0101 genome. The 454 pyrosequencing reads were assembled using the Newbler assembler software. A hybrid assembly of 454 and Sanger reads was performed using the Phred-Phrap-Consed program. Gap closing and re-sequencing of low-quality regions were conducted by Sanger sequencing to obtain the high-quality finished sequence. The overall error rate of the finished sequence was estimated to be <1 per 10,000 bases (Phrap score of ≥40). The deep sequencing datasets of L. paracasei JCM 8130 and L. paracasei COM0101 are deposited in the DDBJ/GenBank/EMBL Sequence Read Archive under the accession numbers DRA000955 and DRA000956, respectively.
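The correspondence between a Phrap score of 40 and an error rate of 1 per 10,000 bases follows from the standard Phred-scale definition, Q = −10·log10(P). The short sketch below illustrates that relationship and the usual fold-coverage estimate (reads × mean read length / genome size). The function names are hypothetical, and the mean read lengths are not given in the text, so any value passed for them is an assumption.

```python
import math


def phred_to_error_probability(q):
    """Convert a Phred-scaled quality score to a per-base error probability."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p):
    """Convert a per-base error probability to a Phred-scaled quality score."""
    return -10 * math.log10(p)


def fold_coverage(n_reads, mean_read_length_bp, genome_size_bp):
    """Estimate sequencing depth as total sequenced bases over genome size.

    mean_read_length_bp is an assumed input; the paper reports only the
    number of reads and the resulting fold coverage.
    """
    return n_reads * mean_read_length_bp / genome_size_bp


# A score of 40 corresponds to one expected error per 10,000 bases.
print(phred_to_error_probability(40))
```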

Informatics

An initial set of predicted protein-coding genes was identified using Glimmer 3.0 [19]. Genes consisting of <120 base pairs (bp) and those containing overlaps were eliminated. All predicted proteins were searched against a non-redundant protein database (nr, NCBI) using BLASTP with a bit-score cutoff of 60. The start codon of each protein-coding gene was manually refined from BLASTP alignments. The tRNA genes were predicted by tRNAscan-SE [20], and the rRNA genes were detected by BLASTN search using known Lactobacillus rRNA sequences as queries. Protein domains were identified using HMMER with the Pfam database. Orthology across whole genomes was determined using BLASTP reciprocal best hits in all-against-all comparisons of amino acid sequences. Two sequences were identified as highly conserved orthologs if their BLAST score ratio was more than 0.8. When two genome sequences were compared using BLASTN, non-matching regions were predicted as genomic islands. The presence of an N-terminal signal peptide sequence was predicted using SignalP [21]. Clustered regularly interspaced short palindromic repeats (CRISPR) were predicted using CRISPRFinder [22]. Draft genome sequences of L. rhamnosus ATCC 21052 (accession no. AFZY01000000), L. rhamnosus HN001 (ABWJ00000000), L. rhamnosus LMS2-1 (ACIZ00000000), L. paracasei 8700:2 (ABQV00000000), and L. casei (zeae) KCTC 3804 (BACQ01000000) were obtained from GenBank.
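The orthology criterion described above (reciprocal best BLASTP hits, filtered by a BLAST score ratio above 0.8) can be sketched as follows. This is an illustration only, not the authors' pipeline: the input layout (tuples of query, subject, bit score), the function names, and the definition of the score ratio as the pairwise bit score divided by the query's self-hit bit score are all assumptions.

```python
def best_hits(hits):
    """Return {query: (subject, bitscore)} keeping the top-scoring hit per query."""
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return best


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b, score) where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    pairs = []
    for a, (b, score) in best_ab.items():
        if b in best_ba and best_ba[b][0] == a:
            pairs.append((a, b, score))
    return pairs


def highly_conserved_orthologs(pairs, self_scores, threshold=0.8):
    """Keep pairs whose score ratio (pair score / self-hit score) exceeds threshold."""
    return [(a, b) for a, b, score in pairs if score / self_scores[a] > threshold]


# Toy data (hypothetical gene names and bit scores).
a_vs_b = [("a1", "b1", 200), ("a1", "b2", 90), ("a2", "b2", 120), ("a3", "b1", 150)]
b_vs_a = [("b1", "a1", 198), ("b2", "a2", 118), ("b2", "a1", 85)]
self_scores = {"a1": 210, "a2": 200, "a3": 180}

rbh = reciprocal_best_hits(a_vs_b, b_vs_a)
print(highly_conserved_orthologs(rbh, self_scores))
```

In the toy data, a3 is dropped because its best hit is not reciprocal, and the a2/b2 pair is dropped because its score ratio (0.6) falls below the threshold.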

The complete genome sequences of L. paracasei JCM 8130, L. casei ATCC 393, and L. rhamnosus ATCC 53103 are deposited in the DDBJ/GenBank/EMBL database under the accession numbers AP012541–AP012543, AP012544–AP012546, and AP011548, respectively. The draft genome sequence of COM0101 has been deposited in a public database under the accession numbers BAGT01000001–BAGT01000184.

Results and Discussion

Comparative Genome Analysis within the L. casei Group

We first re-annotated the genome of L. rhamnosus ATCC 53103, which we previously reported in a short paper [15]. Next, we determined and annotated the complete genome sequences of L. paracasei JCM 8130 and L. casei ATCC 393. The genome of L. paracasei JCM 8130 consists of a circular chromosome of 2,995,875 bp and two plasmids, and that of L. casei ATCC 393 consists of a circular chromosome of 2,924,929 bp and two plasmids (Fig. 1). The chromosomes of L. paracasei JCM 8130 and L. casei ATCC 393 contained 2,848 and 2,737 predicted protein-coding genes, respectively. The larger plasmid (27 kilobases [kb]) of ATCC 393 shared 14 genes, such as those for beta-galactosidase and cystathionine beta-synthase, with a 65-kb plasmid (accession no. FM179324) of L. rhamnosus Lc 705 (Fig. S1), indicating that the two plasmids may share a common origin. Furthermore, we generated a draft genome sequence of L. paracasei COM0101 that consists of 184 contigs (>500 bp) with a total length of 3,003,364 bp. The COM0101 genome contained 2,767 predicted protein-coding genes. One of the highly redundant contigs contained a gene for a plasmid replication protein that showed 100% amino acid identity with that of L. paracasei strains, indicating that the COM0101 genome probably has at least one plasmid. The chromosome sizes of these strains (2.9–3.0 megabases [Mb]) were among the largest of the Lactobacillus genomes, which have an average size of 1.8–2.0 Mb (Fig. 2A). General features of these genomes are summarized in Table S1.

Figure 1. Circular representations of the chromosomes of L. rhamnosus ATCC 53103, L. paracasei JCM 8130, and L. casei ATCC 393.

From the outside: circles 1 and 2 of the chromosome show the positions of protein-coding genes on the positive and negative strands, respectively. Circle 3 shows the positions of protein-coding genes that are shared among the 10 completely sequenced genomes of the L. casei group. Circle 4 shows the positions of tRNA genes (orange) and rRNA genes (blue). Circle 5 shows a plot of GC skew [(G − C)/(G+C); orange indicates values >0; blue indicates values <0]. Circle 6 shows a plot of G+C content (outward: higher values than the average). The genomic islands in each strain are boxed: regions including carbohydrate utilization gene cluster (pink), prophage-like regions (green), and the others (blue).

https://doi.org/10.1371/journal.pone.0075073.g001
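The GC skew plotted in circle 5 of Figure 1 is defined in the legend as (G − C)/(G+C). A minimal sketch of how such a profile could be computed over non-overlapping or sliding windows is given below; the window and step sizes are illustrative assumptions, since the legend does not state the values used for the figure.

```python
def gc_skew(sequence):
    """Return (G - C) / (G + C) for a DNA sequence, or 0.0 if it has no G or C."""
    seq = sequence.upper()
    g = seq.count("G")
    c = seq.count("C")
    if g + c == 0:
        return 0.0
    return (g - c) / (g + c)


def windowed_gc_skew(sequence, window, step):
    """GC skew for each full-length window along the sequence (linear, not circular)."""
    values = []
    for start in range(0, len(sequence) - window + 1, step):
        values.append(gc_skew(sequence[start:start + window]))
    return values


# Toy sequence: the first window is balanced, the second contains only G.
print(windowed_gc_skew("GGCCGGGG", window=4, step=4))
```

Positive values would be drawn in orange and negative values in blue under the color scheme given in the legend.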

Figure 2. Genome-based phylogenetic analysis of the L. casei group.

(A) Phylogenetic relationships between the genomes of sequenced lactobacilli inferred from 34 concatenated ribosomal protein amino acid sequences. The scale bar represents an evolutionary distance. Sequences were aligned with ClustalW with a bootstrap trial of 1,000 and bootstrap values (%) are indicated at the nodes. An unrooted tree was generated using NJplot. The chromosome size is shown in parentheses. (B) Three-way comparison of L. casei ATCC 393 with L. rhamnosus ATCC 53103 and L. paracasei ATCC 334. The 2,191 genes shared by the three strains were classified into three categories on the basis of the BLAST score ratio analysis [23]. (C) Venn diagram comparing the gene inventories of four strains of the L. casei group. Data resulted from reciprocal BLASTP analysis. The numbers of shared and unique genes are shown.

https://doi.org/10.1371/journal.pone.0075073.g002

We constructed a phylogenetic tree for concatenated sequences of ribosomal proteins from sequenced Lactobacillus (Fig. 2A). L. casei ATCC 393 and the L. casei/paracasei lineage were found to form distinct clades within the L. casei group, and L. casei ATCC 393 was shown to be closer to L. casei (zeae) KCTC 3804. A three-way comparison between the genomes of L. casei ATCC 393, L. rhamnosus ATCC 53103, and L. paracasei ATCC 334 using the BLAST score ratio analysis [23] revealed a greater number of proteins in L. casei ATCC 393 showing a high score for L. rhamnosus ATCC 53103 than those showing a high score for L. paracasei ATCC 334 (Fig. 2B). Moreover, L. casei ATCC 393 shared more genes with L. rhamnosus ATCC 53103 than with L. paracasei ATCC 334 (Fig. 2C). We thus found that L. casei ATCC 393 is more closely related to L. rhamnosus strains than to L. paracasei strains based on the phylogeny, overall protein similarities, and number of shared genes. This result supports the previous reports that L. casei ATCC 393 is distinct from other strains previously described as L. paracasei [2], [3], [5], [6]. Furthermore, we constructed a multi-locus sequence typing (MLST)-based phylogenetic tree [24] for L. paracasei strains (Fig. S2A), showing that COM0101 shares the same MLST lineage as BL23, LC2W, and BD-II. Moreover, COM0101 shared more genes with BL23 than with ATCC 334 and JCM 8130 (Fig. S2B). Thus, among the L. paracasei strains, COM0101 is phylogenetically closely related to BL23, LC2W, and BD-II.

We compared the genomes of L. rhamnosus ATCC 53103, L. paracasei JCM 8130, L. casei ATCC 393, and L. paracasei ATCC 334 (Fig. 2C). In total, 1,793 genes were common to the four strains, and 4,315 ortholog clusters were assigned to the pan-genome of the four strains. Of the 1,793 core genes, 1,682 (94%) were also conserved among the other six completely sequenced strains (BD-II, BL23, LC2W, Zhang, Lc 705, and ATCC 8530) of the L. casei group. Broadbent et al. (2012) showed that 1,715 protein-coding genes were common to 17 sequenced L. casei strains [25]. These results suggest that approximately 1,700 genes constitute the core genome of the L. casei group, likely inherited from their common ancestor. All dispensable protein-coding genes, which were found in one or more but not all of the 10 completely sequenced strains of the L. casei group, were functionally classified based on the clusters of orthologous groups from the NCBI COGs database, and the gene repertoires were compared (Fig. S3). There was a considerable difference in the number of genes assigned to COG category G (carbohydrate transport and metabolism) and category L (replication, recombination, and repair) among the strains. L. rhamnosus strains had a lower number of genes assigned to COG category L because the L. rhamnosus genomes contained a lower number of transposase genes compared with the other strains, suggesting that insertion element-mediated genome diversification is less frequent in L. rhamnosus strains. In contrast, L. paracasei JCM 8130 and L. casei ATCC 393 contained a higher number of transposase genes. Most of the genes assigned to COG category G were encoded in hypervariable regions in the genomes of the L. casei group (described later). We next classified all protein-coding genes of L. rhamnosus ATCC 53103 and sequenced intestinal lactobacilli on the basis of the COGs database (Fig. 3A). L. rhamnosus ATCC 53103 contained a higher number of genes assigned to COG category G compared with intestinal lactobacilli. The abundance of genes related to carbohydrate transport and metabolism in L. rhamnosus ATCC 53103 may contribute to the wide range of properties of this strain compared with other probiotics.
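The core genome, pan-genome, and dispensable genes discussed above are set operations on ortholog clusters: the core is the intersection across strains, the pan-genome is the union, and the dispensable set is their difference. The sketch below shows this under the assumption that each strain has already been reduced to a set of ortholog-cluster identifiers; the strain labels and cluster identifiers are toy values, not data from this study.

```python
def core_pan_dispensable(clusters_by_strain):
    """Return (core, pan, dispensable) sets of ortholog-cluster identifiers.

    clusters_by_strain maps a strain name to the cluster IDs present in it.
    """
    cluster_sets = [set(ids) for ids in clusters_by_strain.values()]
    core = set.intersection(*cluster_sets)   # present in every strain
    pan = set.union(*cluster_sets)           # present in at least one strain
    dispensable = pan - core                 # present in some but not all strains
    return core, pan, dispensable


# Toy presence data (hypothetical strains and cluster IDs).
toy = {
    "strain_A": {1, 2, 3},
    "strain_B": {2, 3, 4},
    "strain_C": {2, 3, 5},
}
core, pan, dispensable = core_pan_dispensable(toy)
print(len(core), len(pan), len(dispensable))
```

Applied to the four strains compared in Fig. 2C, the sizes of the first two sets would correspond to the 1,793 common genes and the 4,315 ortholog clusters of the pan-genome.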

Figure 3. Abundance of genes related to carbohydrate transport and metabolism in L. rhamnosus ATCC 53103.

(A) Comparative analysis by functional categories of the gene repertoires of sequenced intestinal lactobacilli. The number of genes assigned to COG category G in each genome is shown. (B) Carbohydrate utilization gene clusters of L. rhamnosus ATCC 53103. Genes and their orientations are depicted with arrows. Regions -5 and -6 are compared with the corresponding genomic locations in L. rhamnosus Lc 705. Gray bars indicate orthologous regions.

https://doi.org/10.1371/journal.pone.0075073.g003

Bacteriocins are small antimicrobial peptides produced widely by lactic acid bacteria. The L. rhamnosus ATCC 53103 genome encoded the bacteriocin gene cluster (LRHM_2289 to LRHM_2312), which contained genes encoding the two-component sensor and regulator, four bacteriocin immunity proteins, ATP-binding cassette (ABC) transporter with the proteolytic domain, and small peptides. The cluster was conserved in the genomes of the L. casei group, but in the corresponding region of L. casei ATCC 393, a gene for bacteriocin ABC transporter was interrupted by transposase (LBCZ_2129 to LBCZ_2133) and genes for immunity proteins were absent, suggesting that L. casei ATCC 393 may not be able to produce bacteriocin.

CRISPRs, along with their associated cas genes, are known to constitute a defense system against the propagation of phages and plasmids; these were observed in the genomes of a number of lactic acid bacteria [26]. L. rhamnosus ATCC 53103 contained a CRISPR region (2,260,261–2,261,880) and four CRISPR-associated genes (LRHM_2116 to LRHM_2119). The 36-bp-long repeat sequence was present 25 times and separated by 30-bp unique spacer sequences. It has been reported that two distinct types (Lsal1 and Ldbu1) of CRISPR loci were identified in the L. casei genomes [25]. L. casei strains BD-II, BL23, LC2W, and Zhang also have an Lsal1-type CRISPR region at the same locus on the chromosome, suggesting that the ancestral strain of the L. casei group encoded a CRISPR region. However, the 36-bp repeat sequence of the four L. casei strains differs by two bases from that of L. rhamnosus ATCC 53103, and the number of repeat sequences differed (17–22) among these strains. COM0101 has the orthologs of the four CRISPR-associated genes, indicating that COM0101 may also have a CRISPR region. In contrast, L. paracasei JCM 8130, L. casei ATCC 393, L. rhamnosus Lc 705, and L. rhamnosus ATCC 8530 had no CRISPR, suggesting that these strains may have lost a CRISPR region during adaptation to environments where defense against phages is not essential.
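The reported coordinates of the CRISPR region in L. rhamnosus ATCC 53103 can be checked against the repeat and spacer lengths with simple arithmetic. The sketch below assumes an array of n repeats separated by n − 1 spacers of uniform length, with no flanking sequence counted; the function name is hypothetical.

```python
def crispr_array_length(n_repeats, repeat_length, spacer_length):
    """Length of an array of n repeats interleaved with n - 1 uniform spacers."""
    return n_repeats * repeat_length + (n_repeats - 1) * spacer_length


# 25 repeats of 36 bp separated by 24 spacers of 30 bp.
expected = crispr_array_length(25, 36, 30)

# Span of the reported region, using inclusive 1-based coordinates.
reported = 2_261_880 - 2_260_261 + 1

print(expected, reported)  # both 1620
```

Under these assumptions, the 1,620-bp span of the annotated region matches the length expected from 25 repeats and 24 spacers.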

Genomic Islands

Whole-genome alignment showed a high level of synteny among the strains of the L. casei group (Fig. S4). A previous report showed that there was a high degree of synteny among the genomes of 17 L. casei strains [25]. These results indicate that strains of the L. casei group have a stable genome structure. However, each genome contained specific genes, many of which were grouped in clusters as genomic islands (GIs). It has been reported that the comparison of the genomes of L. paracasei ATCC 334 and BL23 revealed 12 and 19 GIs (>5 kb) in ATCC 334 and BL23, respectively [27]. Similarly, we identified 26 GIs (>5 kb) in L. rhamnosus ATCC 53103 that were not conserved in L. paracasei ATCC 334 (a cheese isolate) (Table 1, Fig. 1). The 26 genomic islands of L. rhamnosus ATCC 53103 included six carbohydrate utilization gene clusters (regions −1 to −6), four of which were completely or partially present in L. paracasei BL23, whose ecological origin is unclear. This result supports the previous findings that cheese isolates, including L. paracasei ATCC 334, have undergone significant gene decay, including loss of many genes involved in carbohydrate utilization [25], [27]. Thus, L. paracasei ATCC 334 contains a lower number of genes related to carbohydrate transport and metabolism compared with the other sequenced L. paracasei strains (Fig. S3). In probiotic lactobacilli, horizontal gene transfer played an important role in shaping the common ancestor [28]. Such acquisition of new genes can expand a bacterium’s potential for adaptation to a new niche. The common ancestor of L. rhamnosus ATCC 53103 and L. paracasei ATCC 334 seems to have acquired carbohydrate utilization gene clusters via lateral gene transfer. These carbohydrate utilization gene clusters may have provided adaptive features to some strains including ATCC 53103 for their survival and proliferation in the human intestine. 
In contrast, these carbohydrate utilization gene clusters may have been lost in the lineage to ATCC 334 during its adaptation to the cheese environment.
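Genomic islands were predicted in this study as regions that did not match in a BLASTN comparison of two genomes, and only those longer than 5 kb were counted. A minimal sketch of that idea, given a list of aligned intervals on the query chromosome, is shown below. The interval representation (inclusive 1-based coordinates), the function name, and the treatment of the chromosome as linear are simplifying assumptions; the authors' actual procedure is not described at this level of detail.

```python
def unaligned_regions(aligned, genome_length, min_length=5000):
    """Return regions longer than min_length that no aligned interval covers.

    aligned: list of (start, end) intervals, 1-based and inclusive.
    The chromosome is treated as linear, which ignores wrap-around at the origin.
    """
    regions = []
    cursor = 1  # first position not yet covered by any alignment
    for start, end in sorted(aligned):
        if start > cursor and start - cursor > min_length:
            regions.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    if cursor <= genome_length and genome_length - cursor + 1 > min_length:
        regions.append((cursor, genome_length))
    return regions


# Toy alignment blocks on a 30-kb sequence (hypothetical coordinates).
blocks = [(1, 1000), (900, 5000), (12000, 20000)]
print(unaligned_regions(blocks, genome_length=30000))
```

Overlapping alignment blocks are handled by advancing the cursor to the furthest aligned position seen so far, so only truly uncovered stretches are reported.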

Table 1. Genomic islands in L. rhamnosus ATCC 53103, L. paracasei JCM 8130, and L. casei ATCC 393.

https://doi.org/10.1371/journal.pone.0075073.t001

Similarly, compared with L. paracasei ATCC 334, 15 and 24 GIs were found in L. paracasei JCM 8130 and L. casei ATCC 393, respectively (Table 1, Fig. 1). Of these GIs, 6 (JCM 8130) and 10 (ATCC 393) were found at the same loci as those of L. rhamnosus ATCC 53103. A comparative genome hybridization study of 22 L. casei strains isolated from various habitats revealed 25 hypervariable regions [27], of which 11 were found at the same loci as the GIs in L. rhamnosus ATCC 53103. These results suggest that the chromosomes of the L. casei group contain several hypervariable regions at the same loci.

The six carbohydrate utilization gene clusters of L. rhamnosus ATCC 53103 contained the genes for phosphoenolpyruvate-carbohydrate phosphotransferase (PTS)-type transporter systems, glycosyl hydrolases, transcriptional regulators, and other carbohydrate-related proteins (Fig. 3B). L. rhamnosus ATCC 53103 encoded 28 complete PTS-type transporter systems, 11 of which were encoded adjacent to genes for glycosyl hydrolase and transcriptional regulator, thereby allowing localized transcriptional control. The organization (carbohydrate transporter, glycosyl hydrolase, and transcriptional regulator) is reminiscent of the many clusters found in Bifidobacterium longum [29].

Six of the 26 GIs of L. rhamnosus ATCC 53103 overlapped with all the hypervariable regions among the sequenced L. rhamnosus strains (ATCC 53103, Lc 705, ATCC 8530, ATCC 21052, HN001, and LMS2-1). Three of the six hypervariable regions were prophage-like regions (LRHM_1038 to LRHM_1090, LRHM_1455 to LRHM_1475, and LRHM_2779 to LRHM_2794 in ATCC 53103). The other three regions corresponded to regions containing carbohydrate utilization gene clusters (regions -3, -5, and -6), indicating that L. rhamnosus strains show flexibility in sugar utilization. Two of the five PTS-type transporter systems in region -5 and two in region -6 were missing in the Lc 705, ATCC 8530, and LMS2-1 strains (Fig. 3B). Comparative genomic hybridization analyses have shown that the region corresponding to regions -5 and -6 contains an overrepresentation of genes involved in carbohydrate utilization and transcriptional regulation in 22 L. casei strains [27]. Taken together, the region corresponding to regions -5 and -6 in the genomes of the L. casei group may be required to fine-tune the ability to utilize carbohydrates.

Extracellular Components

Another group has also determined the complete genome sequence of L. rhamnosus GG, and revealed the presence of the SpaCBA pili on the cell surface of L. rhamnosus GG [9]. SpaA is a backbone-forming major pilin, SpaB is a minor pilin, and SpaC, located at the pilus tip, is essential for the mucus adherence of L. rhamnosus GG [9], [30]. The spaCBA genes are encoded in the largest GI (LRHM_0376 to LRHM_0466) in L. rhamnosus ATCC 53103 (Fig. S5). The L. paracasei Zhang, L. paracasei BL23, and L. paracasei ATCC 334 genomes also encode the spaCBA genes (Fig. S5). In contrast, L. casei ATCC 393 completely lacks the spaCBA genes. The spaCBA genes were also encoded in L. paracasei COM0101, but the spaC gene was truncated by a nonsense mutation [25] (Fig. S5) and therefore probably encodes a non-functional protein. Douillard et al. (2013) clearly showed by immunoelectron microscopy using immunogold staining that the L. paracasei strain isolated from Yakult produced no pilus structures [31]. It has been reported that the adhesion capacity of L. rhamnosus GG to Caco-2 cells and intestinal mucus was approximately 10 times that of strain Shirota, which was obtained from Yakult [32]. This may be because L. rhamnosus GG encodes the intact SpaCBA and L. paracasei COM0101 encodes a truncated SpaC. Furthermore, L. paracasei JCM 8130, L. paracasei BD-II, and L. paracasei LC2W also contained a truncated spaC gene (Fig. S5), and L. rhamnosus Lc 705 and ATCC 8530 completely lacked the spaCBA genes. The spaCBA genes have been found only in the L. casei group to date. Because different lineages in L. casei strains contained the spaCBA genes, it has been suggested that the spaCBA genes were not recently acquired [25]. It could thus be speculated that the ancestral strain of the L. casei group encoded the intact spaCBA genes, which were then lost or disrupted in certain strains of the L. casei group.

L. rhamnosus ATCC 53103 had three gene clusters encoding proteins with a C-terminal WxL domain (Fig. 4A). The WxL domain is conserved in the surface proteins in low-GC gram-positive bacteria [33] and attaches to the peptidoglycan on the cell surface [34]. The WxL protein cluster was not found in other sequenced intestinal lactobacilli. The proteins with the WxL domain were present together with the proteins containing the DUF916 domain (PF06030) of unknown function and the small proteins with the LPXTG-like sorting motif, and their gene organizations were similar to that in L. plantarum WCFS1 [35]. Of the three WxL protein clusters, one (LRHM_1699 to LRHM_1702) was not conserved in the sequenced L. paracasei strains (Fig. 4A, Table 2). There were 14 genes encoding proteins that had both a signal sequence for secretion and an LPXTG-type motif for covalent anchoring to the peptidoglycan matrix (Table 2), and these proteins can be cleaved by sortase. The protein LRHM_1529 was composed of 3,275 amino acid residues, representing the largest protein in this genome, and it contained imperfect repeats consisting of serine, alanine, and aspartic acid. This serine-rich motif has been found in the extracellular proteins in the genomes of other gram-positive bacteria such as L. plantarum, L. johnsonii, and Streptococcus pneumoniae [29], [36], [37]. The protein LRHM_1529 was encoded in the region (LRHM_1518 to LRHM_1530), which contained two glycosyltransferase genes (Fig. 4B). It has been suggested that glycosyltransferase, encoded by the adjacent genes, caused O-linked glycosylations on the serines in the putative cell surface protein, thus producing mucin-like structures [36]. Similarly, the protein LRHM_2193 had an LPXTG-type motif, and it contained imperfect repeats consisting of serine and alanine and two adjacent glycosyltransferase genes (Fig. 4B). Thus, LRHM_1529 and LRHM_2193 could encode glycosylated cell-surface adhesives. 
The protein LRHM_1797 (2,357 amino acids) plays an important modulating role in adhesion to intestinal epithelial cells and biofilm formation [38]. These genes (LRHM_1529, LRHM_1797, and LRHM_2193) were absent in the sequenced L. paracasei strains. The presence of a variety of the cell surface adherence proteins could contribute to the probiotic properties of L. rhamnosus ATCC 53103.
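Sortase substrates such as those listed in Table 2 carry an LPXTG-type motif near the C-terminus, where X is any residue. As a rough illustration of how candidate proteins might be screened for this motif, the sketch below scans the C-terminal tail of an amino acid sequence with a regular expression. This is not the method used in the study (signal peptides were predicted with SignalP, and the motif search procedure is not described); the function name, the strict LPXTG pattern, and the 50-residue tail length are assumptions chosen for illustration.

```python
import re

# Strict LPXTG pattern; real sorting signals can vary (e.g. LPXTG-like variants).
LPXTG_PATTERN = re.compile(r"LP.TG")


def has_c_terminal_lpxtg(protein, tail_length=50):
    """Return True if an LPXTG motif occurs within the last tail_length residues."""
    tail = protein.upper()[-tail_length:]
    return LPXTG_PATTERN.search(tail) is not None


# Toy sequences (not real proteins).
anchored = "M" + "A" * 100 + "LPQTGEESS" + "A" * 20
not_anchored = "LPQTG" + "A" * 100

print(has_c_terminal_lpxtg(anchored), has_c_terminal_lpxtg(not_anchored))
```

A motif at the N-terminal end, as in the second toy sequence, is ignored because only the C-terminal tail is searched.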

Figure 4. Gene clusters encoding cell surface proteins in L. rhamnosus ATCC 53103.

(A) WxL clusters. (B) Putative glycosylated cell-surface protein clusters. Genes and their orientations are depicted with arrows. Gray bars indicate orthologous regions between L. rhamnosus ATCC 53103 and L. paracasei ATCC 334.

https://doi.org/10.1371/journal.pone.0075073.g004

Table 2. Putative cell surface adherence proteins of L. rhamnosus ATCC 53103.

https://doi.org/10.1371/journal.pone.0075073.t002

Conclusions

We determined the complete genome sequences of L. paracasei JCM 8130 and L. casei ATCC 393, and the draft genome sequence of L. paracasei COM0101. Furthermore, we re-annotated the genome of L. rhamnosus ATCC 53103. We confirmed that L. casei ATCC 393 is distinct from strains previously described as L. paracasei. Comparative genome analysis revealed 1,682 core genes and genome-wide synteny in the L. casei group. Chromosomes of the L. casei group contained GIs, many of which were found at the same loci, suggesting that these chromosomes contain several hypervariable regions that may contribute to adaptation to each ecological niche. The spaCBA pilus gene cluster, which was first identified in L. rhamnosus GG, was also found in other strains of the L. casei group, but several L. paracasei strains including COM0101 contained a truncated spaC gene. L. rhamnosus ATCC 53103 encodes SpaCBA pili, proteins with the WxL domain, two glycosylated cell-surface adhesives, and several large proteins with the LPXTG motif. The complete genome sequences of L. rhamnosus, L. paracasei, and L. casei will provide a framework that will help understand the genomic differences between strains within the L. casei group.

Supporting Information

Figure S1.

Linear representations of the plasmids of L. casei ATCC 393 and of L. rhamnosus Lc 705. Genes and their orientations are depicted with arrows. Lines connect orthologs with the following colors: red, genes sharing over 95% amino acid identity; orange, genes sharing 70–95% amino acid identity; blue, transposase genes; and green, partially conserved genes.

https://doi.org/10.1371/journal.pone.0075073.s001

(EPS)

Figure S2.

Genetic relationships among L. paracasei strains as defined by multilocus sequence typing. (A) Concatenated sequences of five MLST loci (ftsZ, metRS, mutL, pgm, and polA) were analyzed as described previously [24]. (B) Venn diagram comparing the gene inventories of four L. paracasei strains. Data resulted from reciprocal BLASTP analysis. The numbers of shared and unique genes are shown.

https://doi.org/10.1371/journal.pone.0075073.s002

(EPS)

Figure S3.

COG classification of dispensable protein-coding genes of the L. casei group.

https://doi.org/10.1371/journal.pone.0075073.s003

(EPS)

Figure S4.

Synteny between the chromosomes in the L. casei group. Each plot point represents reciprocal best matches by BLASTP comparisons between orthologs.

https://doi.org/10.1371/journal.pone.0075073.s004

(EPS)

Figure S5.

The spaCBA pili cluster arrangement. Genes and their orientations are depicted with arrows.

https://doi.org/10.1371/journal.pone.0075073.s005

(EPS)

Table S1.

General genomic features of strains sequenced in this study.

https://doi.org/10.1371/journal.pone.0075073.s006

(PDF)

Acknowledgments

We thank K. Furuya, C. Shindo, H. Inaba, K. Motomura, and Y. Hattori (The University of Tokyo), and A. Tamura and N. Itoh (Kitasato University) for technical assistance, and Dr. H. Zhang for supplying L. paracasei Zhang.

Author Contributions

Conceived and designed the experiments: HM. Performed the experiments: AN MT TT HN SI. Analyzed the data: HT KO MM MH HM. Contributed reagents/materials/analysis tools: AN. Wrote the paper: HT HM.

References

  1. Felis GE, Dellaglio F (2007) Taxonomy of Lactobacilli and Bifidobacteria. Curr Issues Intest Microbiol 8: 44–61.
  2. Dicks LM, Du Plessis EM, Dellaglio F, Lauer E (1996) Reclassification of Lactobacillus casei subsp. casei ATCC 393 and Lactobacillus rhamnosus ATCC 15820 as Lactobacillus zeae nom. rev., designation of ATCC 334 as the neotype of L. casei subsp. casei, and rejection of the name Lactobacillus paracasei. Int J Syst Bacteriol 46: 337–340.
  3. Felis GE, Dellaglio F, Mizzi L, Torriani S (2001) Comparative sequence analysis of a recA gene fragment brings new evidence for a change in the taxonomy of the Lactobacillus casei group. Int J Syst Evol Microbiol 51: 2113–2117.
  4. Dellaglio F, Felis GE, Torriani S (2002) The status of the species Lactobacillus casei (Orla-Jensen 1916) Hansen and Lessel 1971 and Lactobacillus paracasei Collins et al. 1989. Request for an opinion. Int J Syst Evol Microbiol 52: 285–287.
  5. Acedo-Félix E, Pérez-Martínez G (2003) Significant differences between Lactobacillus casei subsp. casei ATCC 393T and a commonly used plasmid-cured derivative revealed by a polyphasic study. Int J Syst Evol Microbiol 53: 67–75.
  6. Diancourt L, Passet V, Chervaux C, Garault P, Smokvina T, et al. (2007) Multilocus sequence typing of Lactobacillus casei reveals a clonal population structure with low levels of homologous recombination. Appl Environ Microbiol 73: 6601–6611.
  7. Judicial Commission of the International Committee on Systematics of Bacteria (2008) The type strain of Lactobacillus casei is ATCC 393, ATCC 334 cannot serve as the type because it represents a different taxon, the name Lactobacillus paracasei and its subspecies names are not rejected and the revival of the name ‘Lactobacillus zeae’ contravenes Rules 51b (1) and (2) of the International Code of Nomenclature of Bacteria. Opinion 82. Int J Syst Evol Microbiol 58: 1764–1765.
  8. Makarova K, Slesarev A, Wolf Y, Sorokin A, Mirkin B, et al. (2006) Comparative genomics of the lactic acid bacteria. Proc Natl Acad Sci U S A 103: 15611–15616.
  9. Kankainen M, Paulin L, Tynkkynen S, von Ossowski I, Reunanen J, et al. (2009) Comparative genomic analysis of Lactobacillus rhamnosus GG reveals pili containing a human-mucus binding protein. Proc Natl Acad Sci U S A 106: 17193–17198.
  10. Mazé A, Boël G, Zúñiga M, Bourand A, Loux V, et al. (2010) Complete genome sequence of the probiotic Lactobacillus casei strain BL23. J Bacteriol 192: 2647–2648.
  11. Zhang W, Yu D, Sun Z, Wu R, Chen X, et al. (2010) Complete genome sequence of Lactobacillus casei Zhang, a new probiotic strain isolated from traditional homemade koumiss in Inner Mongolia, China. J Bacteriol 192: 5268–5269.
  12. Ai L, Chen C, Zhou F, Wang L, Zhang H, et al. (2011) Complete genome sequence of the probiotic strain Lactobacillus casei BD-II. J Bacteriol 193: 3160–3161.
  13. Chen C, Ai L, Zhou F, Wang L, Zhang H, et al. (2011) Complete genome sequence of the probiotic bacterium Lactobacillus casei LC2W. J Bacteriol 193: 3419–3420.
  14. Pittet V, Ewen E, Bushell BR, Ziola B (2012) Genome sequence of Lactobacillus rhamnosus ATCC 8530. J Bacteriol 194: 726.
  15. Morita H, Toh H, Oshima K, Murakami M, Taylor TD, et al. (2009) Complete genome sequence of the probiotic Lactobacillus rhamnosus ATCC 53103. J Bacteriol 191: 7630–7631.
  16. Doron S, Snydman DR, Gorbach SL (2005) Lactobacillus GG: bacteriology and clinical applications. Gastroenterol Clin North Am 34: 483–498.
  17. Mahenthiralingam E, Marchbank A, Drevinek P, Garaiova I, Plummer S (2009) Use of colony-based bacterial strain typing for tracking the fate of Lactobacillus strains during human consumption. BMC Microbiol 9: 251.
  18. Morita H, Kuwahara T, Ohshima K, Sasamoto H, Itoh K, et al. (2007) An improved DNA isolation method for metagenomic analysis of the microbial flora of the human intestine. Microbes Environ 22: 214–222.
  19. Delcher AL, Harmon D, Kasif S, White O, Salzberg SL (1999) Improved microbial gene identification with GLIMMER. Nucleic Acids Res 27: 4636–4641.
  20. Lowe TM, Eddy SR (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955–964.
  21. Petersen TN, Brunak S, von Heijne G, Nielsen H (2011) SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat Methods 8: 785–786.
  22. Grissa I, Vergnaud G, Pourcel C (2007) CRISPRFinder: a web tool to identify clustered regularly interspaced short palindromic repeats. Nucleic Acids Res 35: W52–57.
  23. Rasko DA, Myers GS, Ravel J (2005) Visualization of comparative genomic analyses by BLAST score ratio. BMC Bioinformatics 6: 2.
  24. Cai H, Rodríguez BT, Zhang W, Broadbent JR, Steele JL (2007) Genotypic and phenotypic characterization of Lactobacillus casei strains isolated from different ecological niches suggests frequent recombination and niche specificity. Microbiology 153: 2655–2665.
  25. Broadbent JR, Neeno-Eckwall EC, Stahl B, Tandee K, Cai H, et al. (2012) Analysis of the Lactobacillus casei supragenome and its influence in species evolution and lifestyle adaptation. BMC Genomics 13: 533.
  26. Horvath P, Coûté-Monvoisin AC, Romero DA, Boyaval P, Fremaux C, et al. (2009) Comparative analysis of CRISPR loci in lactic acid bacteria genomes. Int J Food Microbiol 131: 62–70.
  27. Cai H, Thompson R, Budinich MF, Broadbent JR, Steele JL (2009) Genome sequence and comparative genome analysis of Lactobacillus casei: Insights into their niche-associated evolution. Genome Biol Evol 1: 239–257.
  28. Makarova KS, Koonin EV (2007) Evolutionary genomics of lactic acid bacteria. J Bacteriol 189: 1199–1208.
  29. Schell MA, Karmirantzou M, Snel B, Vilanova D, Berger B, et al. (2002) The genome sequence of Bifidobacterium longum reflects its adaptation to the human gastrointestinal tract. Proc Natl Acad Sci U S A 99: 14422–14427.
  30. Reunanen J, von Ossowski I, Hendrickx AP, Palva A, de Vos WM (2012) Characterization of the SpaCBA pilus fibers in the probiotic Lactobacillus rhamnosus GG. Appl Environ Microbiol 78: 2337–2344.
  31. Douillard FP, Ribbera A, Järvinen HM, Kant R, Pietilä TE, et al. (2013) Comparative genomic and functional analysis of Lactobacillus casei and Lactobacillus rhamnosus strains marketed as probiotics. Appl Environ Microbiol 79: 1923–1933.
  32. Lee YK, Lim CY, Teng WL, Ouwehand AC, Tuomola EM (2000) Quantitative approach in the study of adhesion of lactic acid bacteria to intestinal cells and their competition with enterobacteria. Appl Environ Microbiol 66: 3692–3697.
  33. Kleerebezem M, Boekhorst J, van Kranenburg R, Molenaar D, Kuipers OP, et al. (2003) Complete genome sequence of Lactobacillus plantarum WCFS1. Proc Natl Acad Sci U S A 100: 1990–1995.
  34. Brinster S, Furlan S, Serror P (2007) C-terminal WxL domain mediates cell wall binding in Enterococcus faecalis and other gram-positive bacteria. J Bacteriol 189: 1244–1253.
  35. Siezen R, Boekhorst J, Muscariello L, Molenaar D, Renckens B, et al. (2006) Lactobacillus plantarum gene clusters encoding putative cell-surface protein complexes for carbohydrate utilization are conserved in specific gram-positive bacteria. BMC Genomics 7: 126.
  36. Tettelin H, Nelson KE, Paulsen IT, Eisen JA, Read TD, et al. (2001) Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293: 498–506.
  37. Pridmore RD, Berger B, Desiere F, Vilanova D, Barretto C, et al. (2004) The genome sequence of the probiotic intestinal bacterium Lactobacillus johnsonii NCC 533. Proc Natl Acad Sci U S A 101: 2512–2517.
  38. Vélez MP, Petrova MI, Lebeer S, Verhoeven TL, Claes I, et al. (2010) Characterization of MabA, a modulator of Lactobacillus rhamnosus GG adhesion and biofilm formation. FEMS Immunol Med Microbiol 59: 386–398.