Gene set meta-analysis with Quantitative Set Analysis for Gene Expression (QuSAGE)

Hailong Meng; Gur Yaari; Christopher R. Bolen; Stefan Avey; Steven H. Kleinstein

doi:10.1371/journal.pcbi.1006899

Abstract

Small sample sizes combined with high person-to-person variability can make it difficult to detect significant gene expression changes from transcriptional profiling studies. Subtle, but coordinated, gene expression changes may be detected using gene set analysis approaches. Meta-analysis is another approach to increase the power to detect biologically relevant changes by integrating information from multiple studies. Here, we present a framework that combines both approaches and allows for meta-analysis of gene sets. QuSAGE meta-analysis extends our previously published QuSAGE framework, which offers several advantages for gene set analysis, including fully accounting for gene-gene correlations and quantifying gene set activity as a full probability density function. Application of QuSAGE meta-analysis to influenza vaccination response shows it can detect significant activity that is not apparent in individual studies.

Citation: Meng H, Yaari G, Bolen CR, Avey S, Kleinstein SH (2019) Gene set meta-analysis with Quantitative Set Analysis for Gene Expression (QuSAGE). PLoS Comput Biol 15(4): e1006899. https://doi.org/10.1371/journal.pcbi.1006899

Editor: Mihaela Pertea, Johns Hopkins University, UNITED STATES

Received: July 17, 2018; Accepted: February 24, 2019; Published: April 2, 2019

Copyright: © 2019 Meng et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The data and R code can be found at: https://bitbucket.org/kleinstein/qusage.

Funding: This work has been supported by National Institutes of Health (NIH) grant U19AI117873 (grant website: https://www.nih.gov/grants-funding; PI: Steven H. Kleinstein) and United States–Israel Binational Science Foundation grant 2013395 (grant website: http://www.bsf.org.il/BSFPublic/Default.aspx; PIs: Steven H. Kleinstein and Gur Yaari). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: I have read the journal’s policy and the authors of this manuscript have the following competing interests: S.A. received personal fees from Janssen R&D while writing this manuscript. C.R.B. reports employment and equity ownership for Genentech.

This is a PLOS Computational Biology Software paper.

Introduction

Whole-genome transcriptional profiling, using DNA microarray technology or next-generation sequencing (RNA-seq), is widely used to gain insights into disease pathophysiology and response to therapy. While it is important to identify individual genetic associations, the high level of variation between individuals due to genetic and phenotypic heterogeneity can result in inconsistent biological insights [1]. With the availability of biological annotation for known genes [2–5], the focus of gene analysis has shifted from individual genes to gene sets. Gene set analysis can be used to detect and compare the activity of pre-defined lists of genes that can be related directly to the underlying biological processes. Compared to differential expression (DE) analysis of individual genes, gene set analysis examines the cumulative effect of multiple related genes, and thus offers the possibility to detect more subtle, but coordinated, expression changes [6–10]. Despite this increased power, gene set analysis can still be limited by the small sample sizes of many current studies. Combining multiple related studies through meta-analysis offers the possibility of increased power and improved reproducibility [11]. Such studies can leverage the large and growing number of transcriptional profiling data sets available in public repositories, such as GEO [12]. However, combining information from multiple studies and performing meta-analysis at the gene set level remains challenging. Meta-Analysis of Pathway Enrichment (MAPE), including MAPE-P, MAPE-G, and MAPE-I, uses maximum, minimum, or Fisher’s statistics to combine P values from each individual study for meta-analysis [13]. Instead of combining P values, MetaPath leverages a Bayesian model and was developed to perform gene set meta-analysis by simultaneously modeling gene expression data and gene set information from multiple studies [14]. Recently, Lu et al. developed iGSEA, which uses an adaptive testing method to choose either a random effects (RE) or a fixed effects (FE) model to integrate gene set analysis from multiple studies [15].

We previously proposed Quantitative Set Analysis for Gene Expression (QuSAGE) [16] as a computational framework for gene set analysis. QuSAGE quantifies gene set activity with a complete probability density function (PDF), and improves power by accounting for gene-gene correlations. The QuSAGE R package is available on Bioconductor [17], and is widely used with 1554 downloads from distinct IPs in 2017. In 2015, Turner et al. extended the applicability of QuSAGE to longitudinal studies by adding functionality for general linear mixed models [18]. In this study, we further extend the applicability of QuSAGE to include meta-analysis of gene sets. QuSAGE meta-analysis was adopted by the NIH/NIAID Human Immunology Project Consortium (HIPC)–Center for Human Immunology (CHI) Signature Project Team to successfully detect baseline transcriptional predictors of influenza vaccination responses from multiple studies [19].

As an alternative gene set meta-analysis method, QuSAGE meta-analysis has several advantages: 1) It is a natural extension of QuSAGE, so it facilitates gene set meta-analysis for the large number of existing QuSAGE users, 2) QuSAGE improves power by accounting for gene-gene correlations and QuSAGE meta-analysis inherits this advantage, and 3) Since QuSAGE quantifies a gene set activity with a PDF, it is capable of performing complicated post hoc comparisons that other gene set meta-analysis methods cannot achieve easily, as we demonstrate in our case study.

Design & implementation

QuSAGE quantifies gene set activity with a complete probability density function (PDF). The QuSAGE meta-analysis pipeline proceeds in three steps (Fig 1).

Fig 1. Overview of the QuSAGE meta-analysis pipeline.

Gene expression data of each study is first analyzed separately by QuSAGE to produce gene set activity PDFs. Next, meta-analysis is performed through the function combinePDFs, where PDFs from each individual study are combined into a single PDF using a weighted numeric convolution algorithm. The results of QuSAGE meta-analysis can then be visualized by the function plotCombinedPDF.

https://doi.org/10.1371/journal.pcbi.1006899.g001

First, gene set analysis is performed separately on the gene expression data of each individual study using QuSAGE. The differential expression of each individual gene is quantified by a full PDF rather than a single P value. The PDFs of all genes within the gene set of interest are then combined into a single gene set activity PDF using numerical convolution. The variance of the combined PDF is corrected for gene-gene correlation by calculating a variance inflation factor (VIF).
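The role of the VIF can be illustrated with a small sketch. It uses only the general identity for the variance of a mean of correlated variables, Var(mean) = (1/m^2) * sum_ij(sd_i * sd_j * rho_ij), and defines the VIF as the ratio of that variance to its value under independence. The function name, the three-gene example, and its numbers are hypothetical, and the estimator used inside the QuSAGE package may differ in detail.

```python
import math

def variance_inflation_factor(sds, corr):
    """Variance of the gene-set mean with correlations, divided by the
    variance of the same mean if all genes were independent."""
    m = len(sds)
    correlated = sum(sds[i] * sds[j] * corr[i][j]
                     for i in range(m) for j in range(m))
    independent = sum(s * s for s in sds)
    return correlated / independent

# Hypothetical gene set: three genes, equal SDs, pairwise correlation 0.5.
sds = [1.0, 1.0, 1.0]
corr = [[1.0, 0.5, 0.5],
        [0.5, 1.0, 0.5],
        [0.5, 0.5, 1.0]]

vif = variance_inflation_factor(sds, corr)

# SD of the gene-set mean if correlations are ignored, then corrected.
naive_sd = math.sqrt(sum(s * s for s in sds)) / len(sds)
corrected_sd = naive_sd * math.sqrt(vif)
print(f"VIF = {vif:.2f}, naive SD = {naive_sd:.3f}, corrected SD = {corrected_sd:.3f}")
```

For equal variances this ratio reduces to 1 + (m - 1) * (mean pairwise correlation), so positively correlated genes widen the gene set PDF and make the test more conservative.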

Next, the meta-analysis is performed through the function combinePDFs (Table 1). To carry out a meta-analysis of S studies, the PDFs from the individual studies are combined into a single PDF using a weighted numeric convolution algorithm [20], with the sample size of each study serving as its weight. In short, the continuous PDFs are sampled within an interval that spans their individual ranges. Each PDF is sampled at a finite number of points that is proportional to its weight. These discretized PDFs are then convolved, and the result is resampled and transformed back to the initial interval. P values and confidence intervals can be easily extracted from the resulting combined PDF.
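The idea of a weighted numeric convolution can be sketched in a few lines of Python. This is an illustration only, not the combinePDFs implementation: it assumes the combined activity is the sample-size-weighted mean of independent per-study activities, uses hypothetical normal PDFs in place of real QuSAGE output, and the helper names (discretize, combine_weighted, mean_sd) are invented for the example. Only the sample sizes (7, 13, and 15 subjects) come from the case study.

```python
from collections import defaultdict
from statistics import NormalDist

def discretize(pdf, lo, hi, step):
    """Sample a PDF on a regular grid; return {grid index: probability mass}."""
    k_lo, k_hi = round(lo / step), round(hi / step)
    mass = {k: pdf(k * step) for k in range(k_lo, k_hi + 1)}
    total = sum(mass.values())
    return {k: v / total for k, v in mass.items()}

def combine_weighted(pmfs, sizes, step):
    """Distribution of sum(n_i * X_i) / sum(n_i) for independent X_i.
    Scaling study i by its integer weight n_i keeps values on the grid."""
    combined = {0: 1.0}
    for pmf, n in zip(pmfs, sizes):
        nxt = defaultdict(float)
        for a, pa in combined.items():
            for k, pk in pmf.items():
                nxt[a + n * k] += pa * pk      # discrete convolution step
        combined = nxt
    total_n = sum(sizes)
    # Transform back to the original activity scale.
    return {idx * step / total_n: p for idx, p in sorted(combined.items())}

def mean_sd(dist):
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    return mu, var ** 0.5

STEP = 0.02
sizes = [7, 13, 15]                                  # subjects per study
params = [(0.3, 0.20), (0.2, 0.15), (0.4, 0.15)]     # hypothetical (mean, SD)
pmfs = [discretize(NormalDist(m, s).pdf, -1.5, 2.0, STEP) for m, s in params]

meta = combine_weighted(pmfs, sizes, STEP)
meta_mean, meta_sd = mean_sd(meta)
print(f"combined mean = {meta_mean:.3f}, combined SD = {meta_sd:.3f}")
```

Under these assumptions the combined PDF is narrower than any individual study PDF, which is where the gain in power comes from; larger studies pull the combined mean toward their own estimate.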

Table 1. Pseudocode for QuSAGE meta-analysis.

https://doi.org/10.1371/journal.pcbi.1006899.t001

Finally, the results of QuSAGE meta-analysis can be visualized by the function plotCombinedPDF.

Results

To illustrate how QuSAGE meta-analysis works, we analyzed three influenza vaccination transcriptional profiling studies of young adults [21]. The data from these studies are available in GEO (GSE59635, GSE59654, and GSE59743) and ImmPort (SDY63, SDY404, and SDY400). The goal of the analysis was to detect gene sets associated with successful (i.e., high) antibody responses using the transcriptional response data measured from blood samples taken pre- and 7 days post-vaccination. Subjects were categorized as high-responders (HR) and low-responders (LR) based on their adjusted maximum fold change (adjMFC) from hemagglutination inhibition assay (HAI) measurements taken pre- and 28 days post-vaccination [22]. GSE59635 (SDY63) included 7 young subjects (3 LR and 4 HR); GSE59654 (SDY404) contained 13 young subjects (7 LR and 6 HR); GSE59743 (SDY400) had 15 young subjects (7 LR and 8 HR). The data and R code of this case study can be found at: https://bitbucket.org/kleinstein/qusage.

The analysis consisted of two major steps:

1. Identify candidate vaccination response gene sets. First, the set of 346 blood transcription modules (BTMs) described in Li et al. [4] was filtered to a smaller list of “response” sets that showed significant activity following influenza vaccination in the set of HR subjects. To define these response gene sets, QuSAGE meta-analysis was used to compare day 7 post-vaccination with pre-vaccination transcriptional profiles in HR subjects across all three studies. This analysis identified 62 response gene sets with a Benjamini-Hochberg false discovery rate (FDR) cutoff of 5%.
2. Detect gene sets associated with successful antibody responses. For each response gene set selected in step 1, QuSAGE was first used to carry out a two-way comparison on each study independently. A PDF reflecting the response difference between HR and LR was quantified by calculating the difference of two PDFs, one representing the temporal gene set activity in HR (day 7 vs. pre-vaccination) and the other representing LR (day 7 vs. pre-vaccination). Next, QuSAGE meta-analysis was used to combine the PDFs from the three studies into one single PDF. Statistical significance of the meta-analysis was calculated by testing whether the central tendency of the final PDF is zero using a two-sided test with a 15% FDR cutoff.
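The two-way comparison in step 2 can be sketched as follows. The PDF of the difference between two independent activities is the convolution of one discretized PDF with the mirror image of the other, and a two-sided P value can be read off as twice the smaller tail mass on either side of zero. The normal PDFs, the grid, and the function names below are hypothetical stand-ins for QuSAGE output, and the exact P value definition used by the package may differ.

```python
from collections import defaultdict
from statistics import NormalDist

STEP = 0.01

def discretize(pdf, lo, hi, step=STEP):
    """Sample a PDF on a regular grid; return {grid index: probability mass}."""
    k_lo, k_hi = round(lo / step), round(hi / step)
    mass = {k: pdf(k * step) for k in range(k_lo, k_hi + 1)}
    total = sum(mass.values())
    return {k: v / total for k, v in mass.items()}

def difference(pmf_a, pmf_b):
    """PMF of A - B for independent A and B sampled on the same grid."""
    out = defaultdict(float)
    for i, pa in pmf_a.items():
        for j, pb in pmf_b.items():
            out[i - j] += pa * pb
    return dict(out)

def two_sided_p(pmf):
    """Twice the smaller tail mass around zero (mass at zero split evenly)."""
    below = sum(p for k, p in pmf.items() if k < 0)
    above = sum(p for k, p in pmf.items() if k > 0)
    at_zero = pmf.get(0, 0.0)
    return min(1.0, 2.0 * (min(below, above) + 0.5 * at_zero))

# Hypothetical temporal activity (day 7 vs. pre-vaccination) in one study.
hr = discretize(NormalDist(0.5, 0.15).pdf, -1.0, 1.5)   # high responders
lr = discretize(NormalDist(0.1, 0.15).pdf, -1.0, 1.5)   # low responders

hr_minus_lr = difference(hr, lr)
diff_mean = sum(k * STEP * p for k, p in hr_minus_lr.items())
p_value = two_sided_p(hr_minus_lr)
print(f"mean difference = {diff_mean:.3f}, two-sided P = {p_value:.4f}")
```

Because the result is itself a PDF on the same grid, it can be passed on to a meta-analysis step exactly like a single-comparison PDF, which is what makes this kind of post hoc comparison straightforward.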

As expected from the known biology, "plasma cells, immunoglobulins (M156.1)" was one of the top-ranked gene sets from QuSAGE meta-analysis (Fig 2), and was significantly more up-regulated (day 7 vs. pre-vaccination) in HR compared to LR. In total, QuSAGE meta-analysis identified 11 gene sets associated with a successful antibody response (Table 2). In most cases (8 of 11; 73%), the QuSAGE meta-analysis of these gene sets yielded a lower P value compared with the individual studies.

Fig 2. QuSAGE meta-analysis of gene set “plasma cells, immunoglobulins (M156.1)”.

The differential response between HR and LR subjects was first calculated for each individual study (colored lines). QuSAGE meta-analysis was then used to combine these individual PDFs into a single meta-analysis PDF (black line).

https://doi.org/10.1371/journal.pcbi.1006899.g002

Table 2. Nominal P values for individual studies and meta-analyses of gene sets significantly associated with successful influenza vaccination responses (FDR < 15%).

https://doi.org/10.1371/journal.pcbi.1006899.t002

We next compared QuSAGE meta-analysis with other meta-analysis approaches. Existing gene set meta-analysis methods were designed to perform pairwise comparisons between two phenotypes/conditions and cannot be easily applied to the four-way comparison in our case study. For our comparative analysis, we first used Fisher’s method [23] and Stouffer’s method [24] to combine P values from QuSAGE single gene set analysis from each study and compared the results with QuSAGE meta-analysis. Using the same FDR cutoff of 15%, Fisher’s method and Stouffer’s method identified fewer gene sets than QuSAGE meta-analysis: 4 and 1 significant gene sets, respectively, including only a single gene set not found by QuSAGE (Fig 3A, Table 2). It is possible that QuSAGE meta-analysis was more sensitive than Fisher’s method or Stouffer’s method, and thus identified additional significant gene sets, at the cost of decreased specificity. To investigate the specificity of QuSAGE meta-analysis, we permuted the labels of LR and HR individuals 2000 times and applied the same meta-analyses using all three approaches. With the same FDR cutoff of 15% applied to each permutation, only 134 out of 2000 permutations generated even a single false positive gene set result using QuSAGE meta-analysis, while 380 and 384 permutations produced false positives when using Fisher’s and Stouffer’s method, respectively (Fig 3B). These results suggest that QuSAGE meta-analysis is conservative and that the increased number of significant gene sets identified by QuSAGE in the real data was not due to QuSAGE simply generating lower P values (i.e., QuSAGE meta-analysis is not trading off specificity for sensitivity).
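For readers unfamiliar with the two P value combination methods, a minimal sketch follows. Fisher’s method refers -2 * sum(ln p_i) to a chi-square distribution with 2k degrees of freedom (which has a closed-form tail for even degrees of freedom), and Stouffer’s method sums weighted z-scores. The function names and the example P values are hypothetical; the text does not state whether sample sizes or their square roots were used as Stouffer weights, so the weights below are an assumption.

```python
import math
from statistics import NormalDist

def fisher_combine(pvals):
    """Fisher's method: upper tail of chi-square(2k) at -2 * sum(ln p)."""
    k = len(pvals)
    half = -sum(math.log(p) for p in pvals)      # half of the test statistic
    # Closed-form survival function for even degrees of freedom:
    # exp(-x/2) * sum_{j<k} (x/2)^j / j!
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half / j
        total += term
    return math.exp(-half) * total

def stouffer_combine(pvals, weights=None):
    """Stouffer's weighted z-score method; P values must lie strictly in (0, 1)."""
    std = NormalDist()
    if weights is None:
        weights = [1.0] * len(pvals)
    z = sum(w * std.inv_cdf(1.0 - p) for w, p in zip(weights, pvals))
    z /= math.sqrt(sum(w * w for w in weights))
    return 1.0 - std.cdf(z)

# Hypothetical per-study P values for one gene set.
pvals = [0.04, 0.20, 0.03]
sizes = [7, 13, 15]          # sample sizes used as weights (an assumption)

p_fisher = fisher_combine(pvals)
p_stouffer = stouffer_combine(pvals, weights=sizes)
print(f"Fisher P = {p_fisher:.4f}, Stouffer P = {p_stouffer:.4f}")
```

Both functions see only one number per study, so neither has access to the direction or the width of the underlying activity PDF.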

Fig 3. Comparison of QuSAGE with Fisher’s method and Stouffer’s method.

A) Significant gene sets identified by QuSAGE meta-analysis, Fisher’s method and Stouffer’s method. Using the same FDR cutoff of 15%, QuSAGE meta-analysis, Fisher’s method and Stouffer’s method identified 11, 4 and 1 significant gene sets, respectively. B) Permutation analysis demonstrates that QuSAGE meta-analysis has higher specificity than Fisher’s method and Stouffer’s method. The labels of LR and HR subjects were permuted 2000 times, and meta-analysis was carried out for each of these permuted data sets. For each permutation, the number of false positive gene sets (defined at FDR < 15%) was determined for QuSAGE meta-analysis, Fisher’s method and Stouffer’s method (left, middle and right panels, respectively). The counts of permutations with and without any false positive results are indicated in the pie charts.

https://doi.org/10.1371/journal.pcbi.1006899.g003

However, a limitation of Fisher’s method and Stouffer’s method is that neither accounts for the direction of gene set activity (e.g., higher in HR vs. higher in LR); each simply combines the resulting P values from the individual studies. As a consequence, low P values may be produced when the changes in the individual studies are significant but in different directions, leading to false positives. To account for the directionality of gene set activity differences when applying Fisher’s method and Stouffer’s method, we carried out a three-step analysis, which we refer to as the directional Fisher’s method and the directional Stouffer’s method. First, separate one-tailed tests were carried out for each study to test for (1) higher gene set activity in HR, and (2) higher gene set activity in LR. In this way, lower P values in each type of one-tailed test have a consistent meaning. Second, in the meta-analysis, Fisher’s method or Stouffer’s method was applied to the set of P values from each type of one-tailed test to generate a combined P value. Third, the final P value of the meta-analysis was the smaller of the two combined P values from the one-tailed tests, corrected by multiplying by 2. We also tested another popular meta-analysis method in which effect sizes (Hedges’ g) are calculated for every gene set in each study separately and then combined using linear (mixed-effects) models (implemented in the rma() function from the metafor R package, and hereafter referred to as the “effect-size” method) [25]. Using the same FDR cutoff of 15%, directional Fisher’s method, directional Stouffer’s method and the effect-size method identified 16, 27 and 40 significant gene sets, respectively (S1 Table). All 11 gene sets detected by QuSAGE meta-analysis were found by directional Fisher’s method and directional Stouffer’s method, and 10 of the 11 gene sets were found by the effect-size method, suggesting a high level of confidence in the QuSAGE results (Fig 4A). To quantify the specificity of these approaches, we permuted the labels of LR and HR individuals 2000 times and applied the same meta-analyses on each permuted data set. With the same FDR cutoff of 15% applied to each permutation, QuSAGE meta-analysis generated false positive results in only 8% (159 out of 2000) of the permutations (Fig 4B). In contrast, directional Fisher’s method, directional Stouffer’s method and the effect-size method generated at least one false positive gene set in 17%, 14% and 63% (337, 280 and 1267 out of 2000) of the permutations, respectively (Fig 4B). This higher false positive rate may account, at least partially, for the additional gene sets identified by directional Fisher’s method, directional Stouffer’s method and the effect-size method. Overall, the results of this case study show that QuSAGE meta-analysis is comparable with existing methods, but has better specificity.
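The three-step directional procedure can be sketched as below, here with Fisher’s method as the combiner. The sketch assumes a continuous test statistic, so that the one-tailed P value for “higher in LR” is one minus the one-tailed P value for “higher in HR”. The function names and the example P values are hypothetical.

```python
import math

def fisher_combine(pvals):
    """Fisher's method via the closed-form chi-square(2k) upper tail."""
    half = -sum(math.log(p) for p in pvals)
    term, total = 1.0, 1.0
    for j in range(1, len(pvals)):
        term *= half / j
        total += term
    return math.exp(-half) * total

def directional_combine(p_higher_in_hr, combine=fisher_combine):
    """Combine each direction separately, keep the smaller result, double it.
    Inputs are one-tailed P values strictly between 0 and 1."""
    p_higher_in_lr = [1.0 - p for p in p_higher_in_hr]   # opposite tail
    p_up = combine(p_higher_in_hr)
    p_down = combine(p_higher_in_lr)
    return min(1.0, 2.0 * min(p_up, p_down))

def two_sided(p_one_tailed):
    return 2.0 * min(p_one_tailed, 1.0 - p_one_tailed)

# Two hypothetical studies that are both "significant" but disagree in sign.
conflicting = [0.01, 0.99]
plain = fisher_combine([two_sided(p) for p in conflicting])
directional = directional_combine(conflicting)
print(f"conflicting studies: plain Fisher P = {plain:.4f}, "
      f"directional P = {directional:.4f}")

# Two hypothetical studies that agree in direction.
consistent = [0.01, 0.02]
print(f"consistent studies: directional P = {directional_combine(consistent):.4f}")
```

With conflicting directions the plain combination still returns a small P value, whereas the directional version does not, which is the failure mode the procedure is meant to avoid.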

Fig 4. Comparison of QuSAGE with directional Fisher’s method, directional Stouffer’s method and the effect-size method.

A) Significant gene sets identified by QuSAGE meta-analysis, directional Fisher’s method, directional Stouffer’s method and the effect-size method. Using the same FDR cutoff of 15%, QuSAGE meta-analysis, directional Fisher’s method, directional Stouffer’s method and the effect-size method identified 11, 16, 27 and 40 significant gene sets, respectively. B) Permutation analysis demonstrates that QuSAGE meta-analysis has higher specificity than directional Fisher’s method, directional Stouffer’s method and the effect-size method. The labels of LR and HR subjects were permuted 2000 times, and meta-analysis was carried out for each of these permuted data sets. For each permutation, the number of false positive gene sets (defined at FDR < 15%) was determined for QuSAGE meta-analysis, directional Fisher’s method, directional Stouffer’s method and the effect-size method. The counts of permutations with and without any false positive results are indicated in the pie charts.

https://doi.org/10.1371/journal.pcbi.1006899.g004

In this study, we describe an extension of QuSAGE to enable meta-analysis of gene sets. Instead of summarizing P values, QuSAGE integrates gene set activity and estimates a full PDF of activity across multiple studies, thus easing the process of post hoc comparisons. Furthermore, by integrating information from a larger pool of samples, QuSAGE meta-analysis increases the power of analysis, and allows detection of biologically relevant gene sets that would not be detectable in single studies. Existing common meta-analysis methods, such as Fisher’s method, Stouffer’s method, or the effect-size method, are limited by the fact that the gene set activity from each study is represented by a single P value (Stouffer’s method weights P values by the sample size of each study) or a single statistic (effect size). In contrast, QuSAGE describes the gene set activity using a PDF, and QuSAGE meta-analysis takes full advantage of the richer information provided by PDFs. QuSAGE meta-analysis combines PDFs from multiple studies using a weighted numeric convolution algorithm, and thus implicitly considers not only the differences but also the directions and confidence intervals of gene set activities, leading to a more accurate estimation of combined gene set activity. The QuSAGE algorithm is also computationally efficient. It took only 4 minutes in total to run the whole case study in our manuscript on a single PC with a 2.80 GHz Intel Core i7 CPU and 16 GB of memory. Our case study suggests that QuSAGE is comparable to or better than the commonly used Fisher and Stouffer methods. In the future, performing comparisons of QuSAGE with other existing meta-analysis methods [13–15, 26] would be desirable.

Availability and Future Directions

The QuSAGE R package is available in Bioconductor and can be accessed at: http://bioconductor.org/packages/release/bioc/html/qusage.html. QuSAGE meta-analysis is included in version 2.12.0 or later. The data and R code of this case study can be found at: https://bitbucket.org/kleinstein/qusage.

Supporting information

S1 Table. Nominal P values of gene sets significantly associated with successful influenza vaccination responses from four meta-analysis approaches.

https://doi.org/10.1371/journal.pcbi.1006899.s001

(DOCX)

References

1. Thomassen M, Tan Q, Kruse TA. Gene expression meta-analysis identifies metastatic pathways and transcription factors in breast cancer. BMC cancer. 2008;8:394. Epub 2009/01/01. pmid:19116006.
2. Kanehisa M, Sato Y, Kawashima M, Furumichi M, Tanabe M. KEGG as a reference resource for gene and protein annotation. Nucleic acids research. 2016;44(D1):D457–62. Epub 2015/10/18. pmid:26476454.
3. Croft D, Mundo AF, Haw R, Milacic M, Weiser J, Wu G, et al. The Reactome pathway knowledgebase. Nucleic acids research. 2014;42(D1):D472–D7.
4. Li S, Rouphael N, Duraisingham S, Romero-Steiner S, Presnell S, Davis C, et al. Molecular signatures of antibody responses derived from a systems biology study of five human vaccines. Nature immunology. 2014;15(2):195–204. Epub 2013/12/18. pmid:24336226.
5. Liberzon A, Subramanian A, Pinchback R, Thorvaldsdottir H, Tamayo P, Mesirov JP. Molecular signatures database (MSigDB) 3.0. Bioinformatics (Oxford, England). 2011;27(12):1739–40. Epub 2011/05/07. pmid:21546393.
6. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proceedings of the National Academy of Sciences of the United States of America. 2005;102(43):15545–50. Epub 2005/10/04. pmid:16199517.
7. Barry WT, Nobel AB, Wright FA. Significance analysis of functional categories in gene expression studies: a structured permutation approach. Bioinformatics (Oxford, England). 2005;21(9):1943–9. Epub 2005/01/14. pmid:15647293.
8. Huang da W, Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nature protocols. 2009;4(1):44–57. Epub 2009/01/10. pmid:19131956.
9. Ackermann M, Strimmer K. A general modular framework for gene set enrichment analysis. BMC bioinformatics. 2009;10:47. Epub 2009/02/05. pmid:19192285.
10. Goeman JJ, Buhlmann P. Analyzing gene expression data in terms of gene sets: methodological issues. Bioinformatics (Oxford, England). 2007;23(8):980–7. Epub 2007/02/17. pmid:17303618.
11. Sweeney TE, Haynes WA, Vallania F, Ioannidis JP, Khatri P. Methods to increase reproducibility in differential gene expression via meta-analysis. Nucleic acids research. 2017;45(1):e1. Epub 2016/09/17. pmid:27634930.
12. Edgar R, Domrachev M, Lash AE. Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic acids research. 2002;30(1):207–10. Epub 2001/12/26. pmid:11752295.
13. Shen K, Tseng GC. Meta-analysis for pathway enrichment analysis when combining multiple genomic studies. Bioinformatics (Oxford, England). 2010;26(10):1316–23. Epub 2010/04/23. pmid:20410053.
14. Chen M, Zang M, Wang X, Xiao G. A powerful Bayesian meta-analysis method to integrate multiple gene set enrichment studies. Bioinformatics (Oxford, England). 2013;29(7):862–9. Epub 2013/02/19. pmid:23418184.
15. Lu W, Wang X, Zhan X, Gazdar A. Meta-analysis approaches to combine multiple gene set enrichment studies. Statistics in medicine. 2018;37(4):659–72. Epub 2017/10/21. pmid:29052247.
16. Yaari G, Bolen CR, Thakar J, Kleinstein SH. Quantitative set analysis for gene expression: a method to quantify gene set differential expression including gene-gene correlations. Nucleic acids research. 2013;41(18):e170. Epub 2013/08/08. pmid:23921631.
17. Huber W, Carey VJ, Gentleman R, Anders S, Carlson M, Carvalho BS, et al. Orchestrating high-throughput genomic analysis with Bioconductor. Nature methods. 2015;12(2):115–21. Epub 2015/01/31. pmid:25633503.
18. Turner JA, Bolen CR, Blankenship DM. Quantitative gene set analysis generalized for repeated measures, confounder adjustment, and continuous covariates. BMC bioinformatics. 2015;16:272. Epub 2015/09/01. pmid:26316107.
19. HIPC-CHI Signatures Project Team, HIPC-I Consortium. Multicohort analysis reveals baseline transcriptional predictors of influenza vaccination responses. Science immunology. 2017;2(14). Epub 2017/08/27. pmid:28842433.
20. Yaari G, Uduman M, Kleinstein SH. Quantifying selection in high-throughput Immunoglobulin sequencing data sets. Nucleic acids research. 2012;40(17):e134. Epub 2012/05/30. pmid:22641856.
21. Thakar J, Mohanty S, West AP, Joshi SR, Ueda I, Wilson J, et al. Aging-dependent alterations in gene expression and a mitochondrial signature of responsiveness to human influenza vaccination. Aging. 2015;7(1):38–52. Epub 2015/01/19. pmid:25596819.
22. Tsang JS, Schwartzberg PL, Kotliarov Y, Biancotto A, Xie Z, Germain RN, et al. Global analyses of human immune variation reveal baseline predictors of postvaccination responses. Cell. 2014;157(2):499–513. Epub 2014/04/15. pmid:24725414.
23. Mosteller F, Fisher R. Questions and answers #14. The American Statistician. 1948;2(5):30–1.
24. Stouffer S, Suchman E, DeVinney L, Star S, Williams R. Adjustment during Army Life. The American Soldier. 1949;1.
25. Viechtbauer W. Conducting Meta-Analyses in R with the metafor Package. J Stat Softw. 2010;36:1–48.
26. Li J, Tseng GC. An adaptively weighted statistic for detecting differential gene expression when combining multiple transcriptomic studies. The Annals of Applied Statistics. 2011;5:994–1019.
