GWAS of Follicular Lymphoma Reveals Allelic Heterogeneity at 6p21.32 and Suggests Shared Genetic Susceptibility with Diffuse Large B-cell Lymphoma

  • Karin E. Smedby ,

    Contributed equally to this work with: Karin E. Smedby, Jia Nee Foo

    Affiliation Department of Medicine, Clinical Epidemiology Unit, Karolinska Institutet, Stockholm, Sweden

  • Jia Nee Foo ,

    Contributed equally to this work with: Karin E. Smedby, Jia Nee Foo

    Affiliation Human Genetics, Genome Institute of Singapore, A*STAR, Singapore, Singapore

  • Christine F. Skibola ,

    ‡ These authors also contributed equally to this work.

    Affiliation Division of Environmental Health Sciences, School of Public Health, University of California Berkeley, Berkeley, California, United States of America

  • Hatef Darabi ,

    ‡ These authors also contributed equally to this work.

    Affiliation Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden

  • Lucia Conde,

    Affiliation Division of Environmental Health Sciences, School of Public Health, University of California Berkeley, Berkeley, California, United States of America

  • Henrik Hjalgrim,

    Affiliation Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark

  • Vikrant Kumar,

    Affiliation Human Genetics, Genome Institute of Singapore, A*STAR, Singapore, Singapore

  • Ellen T. Chang,

    Affiliations Cancer Prevention Institute of California, Fremont, California, United States of America, Division of Epidemiology, Department of Health Research and Policy, Stanford University School of Medicine, Stanford, California, United States of America

  • Nathaniel Rothman,

    Affiliation Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America

  • James R. Cerhan,

    Affiliation College of Medicine, Mayo Clinic, Rochester, Minnesota, United States of America

  • Angela R. Brooks-Wilson,

    Affiliations British Columbia Cancer Research Center, British Columbia Cancer Agency, Vancouver, Canada, Department of Biomedical Physiology and Kinesiology, Simon Fraser University, Burnaby, Canada

  • Emil Rehnberg,

    Affiliation Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden

  • Ishak D. Irwan,

    Affiliation Human Genetics, Genome Institute of Singapore, A*STAR, Singapore, Singapore

  • Lars P. Ryder,

    Affiliation Department of Clinical Immunology, University Hospital of Copenhagen, Copenhagen, Denmark

  • Peter N. Brown,

    Affiliation Department of Haematology, Copenhagen University Hospital, Copenhagen, Denmark

  • Paige M. Bracci,

    Affiliation Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California, United States of America

  • Luz Agana,

    Affiliation Division of Environmental Health Sciences, School of Public Health, University of California Berkeley, Berkeley, California, United States of America

  • Jacques Riby,

    Affiliation Division of Environmental Health Sciences, School of Public Health, University of California Berkeley, Berkeley, California, United States of America

  • Wendy Cozen,

    Affiliation Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America

  • Scott Davis,

    Affiliations Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America, University of Washington, Seattle, Washington, United States of America

  • Patricia Hartge,

    Affiliation Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America

  • Lindsay M. Morton,

    Affiliation Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America

  • Richard K. Severson,

    Affiliations Department of Family Medicine and Public Health Sciences, Wayne State University, Detroit, Michigan, United States of America, Karmanos Cancer Institute, Detroit, Michigan, United States of America

  • Sophia S. Wang,

    Affiliations Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America, Division of Etiology, Beckman Research Institute and the City of Hope, Duarte, California, United States of America

  • Susan L. Slager,

    Affiliation College of Medicine, Mayo Clinic, Rochester, Minnesota, United States of America

  • Zachary S. Fredericksen,

    Affiliation College of Medicine, Mayo Clinic, Rochester, Minnesota, United States of America

  • Anne J. Novak,

    Affiliation College of Medicine, Mayo Clinic, Rochester, Minnesota, United States of America

  • Neil E. Kay,

    Affiliation College of Medicine, Mayo Clinic, Rochester, Minnesota, United States of America

  • Thomas M. Habermann,

    Affiliation College of Medicine, Mayo Clinic, Rochester, Minnesota, United States of America

  • Bruce Armstrong,

    Affiliation Sydney School of Public Health, The University of Sydney, Sydney, Australia

  • Anne Kricker,

    Affiliation Sydney School of Public Health, The University of Sydney, Sydney, Australia

  • Sam Milliken,

    Affiliation Department of Haematology, St. Vincent's Hospital, Sydney, Australia

  • Mark P. Purdue,

    Affiliation Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America

  • Claire M. Vajdic,

    Affiliation University of New South Wales Cancer Research Center, Prince of Wales Clinical School, Sydney, Australia

  • Peter Boyle,

    Affiliation International Prevention Research Institute, Lyon, France

  • Qing Lan,

    Affiliation Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America

  • Shelia H. Zahm,

    Affiliation Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America

  • Yawei Zhang,

    Affiliation School of Public Health, Yale University, New Haven, Connecticut, United States of America

  • Tongzhang Zheng,

    Affiliation School of Public Health, Yale University, New Haven, Connecticut, United States of America

  • Stephen Leach,

    Affiliation Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, Canada

  • John J. Spinelli,

    Affiliations British Columbia Cancer Research Center, British Columbia Cancer Agency, Vancouver, Canada, School of Population and Public Health, University of British Columbia, Vancouver, Canada

  • Martyn T. Smith,

    Affiliation Division of Environmental Health Sciences, School of Public Health, University of California Berkeley, Berkeley, California, United States of America

  • Stephen J. Chanock,

    Affiliation Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America

  • Leonid Padyukov,

    Affiliation Rheumatology Unit, Department of Medicine, Karolinska Institutet and Karolinska University Hospital Solna, Stockholm, Sweden

  • Lars Alfredsson,

    Affiliation Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden

  • Lars Klareskog,

    Affiliation Rheumatology Unit, Department of Medicine, Karolinska Institutet and Karolinska University Hospital Solna, Stockholm, Sweden

  • Bengt Glimelius,

    Affiliations Department of Pathology and Oncology, Karolinska Institutet, Stockholm, Sweden, Department of Radiology, Oncology, and Radiation Sciences, Uppsala University, Uppsala, Sweden

  • Mads Melbye,

    Affiliation Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark

  • Edison T. Liu,

    Affiliation Human Genetics, Genome Institute of Singapore, A*STAR, Singapore, Singapore

  • Hans-Olov Adami,

    Affiliations Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden, Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts, United States of America

  • Keith Humphreys ,

    These authors also contributed equally to this work.

    Affiliation Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden

  • Jianjun Liu

    liuj3@gis.a-star.edu.sg

    These authors also contributed equally to this work.

    Affiliation Human Genetics, Genome Institute of Singapore, A*STAR, Singapore, Singapore

Abstract

Non-Hodgkin lymphoma (NHL) represents a diverse group of hematological malignancies, of which follicular lymphoma (FL) is a prevalent subtype. A previous genome-wide association study has established a marker, rs10484561, in the human leukocyte antigen (HLA) class II region on 6p21.32, associated with increased FL risk. Here, in a three-stage genome-wide association study, starting with a genome-wide scan of 379 FL cases and 791 controls followed by validation in 1,049 cases and 5,790 controls, we identified a second independent FL–associated locus on 6p21.32, rs2647012 (ORcombined = 0.64, Pcombined = 2×10−21) located 962 bp away from rs10484561 (r2<0.1 in controls). After mutual adjustment, the associations at the two SNPs remained genome-wide significant (rs2647012:ORadjusted = 0.70, Padjusted = 4×10−12; rs10484561:ORadjusted = 1.64, Padjusted = 5×10−15). Haplotype and coalescence analyses indicated that rs2647012 arose on an evolutionarily distinct haplotype from that of rs10484561 and tags a novel allele with an opposite (protective) effect on FL risk. Moreover, in a follow-up analysis of the top 6 FL–associated SNPs in 4,449 cases of other NHL subtypes, rs10484561 was associated with risk of diffuse large B-cell lymphoma (ORcombined = 1.36, Pcombined = 1.4×10−7). Our results reveal the presence of allelic heterogeneity within the HLA class II region influencing FL susceptibility and indicate a possible shared genetic etiology with diffuse large B-cell lymphoma. These findings suggest that the HLA class II region plays a complex yet important role in NHL.

Author Summary

Earlier studies have established a marker, rs10484561, in the HLA class II region on 6p21.32, associated with increased follicular lymphoma (FL) risk. Here, in a three-stage genome-wide association study of 1,428 FL cases and 6,581 controls, we identified a second independent FL–associated marker on 6p21.32, rs2647012, located 962 bp away from rs10484561. The associations at the two SNPs remained genome-wide significant after mutual adjustment. Haplotype and coalescence analyses indicated that rs2647012 arose on an evolutionarily distinct lineage from that of rs10484561 and tags a novel allele with an opposite, protective effect on FL risk. Moreover, in an analysis of the top 6 FL–associated SNPs in 4,449 cases of other NHL subtypes, rs10484561 was associated with risk of diffuse large B-cell lymphoma. Our results reveal the presence of allelic heterogeneity at 6p21.32 in FL risk and suggest a shared genetic etiology with the common diffuse large B-cell lymphoma subtype.

Introduction

Non-Hodgkin lymphoma (NHL) represents a diverse group of B- and T-cell malignancies of lymphatic origin. The most common subtypes are of B-cell origin and are further classified on the basis of their resemblance to normal stages of B-cell differentiation [1]. Epidemiological studies indicate that these may have different environmental and genetic risk factors, although some etiological factors may also be shared [2]. Familial studies provide substantial evidence for a genetic influence on susceptibility to the major mature B-cell neoplasms, including diffuse large B-cell lymphoma (DLBCL), follicular lymphoma (FL) and chronic lymphocytic leukemia/small lymphocytic lymphoma (CLL/SLL) [3], [4]. Recent genome-wide association studies (GWAS) of the FL subtype of NHL identified associations with two variants within the human leukocyte antigen (HLA) region, one at 6p21.33 (rs6457327) [5] and the other at 6p21.32 (rs10484561) [6]. Additional true associations, particularly in the HLA region, may have been missed because a limited number of samples were used in the initial genome-wide screens, and the selection of a few top single nucleotide polymorphisms (SNPs) for validation is further subject to chance. In this study, we conducted a larger independent genome-wide scan of FL using 379 cases and 791 controls from the Scandinavian Lymphoma Etiology (SCALE) study of Sweden and Denmark, which was used in the validation of the previous GWAS [6]. This scan was followed by two stages of validation in European-ancestry cases of FL and other common B-cell NHL subtypes and controls from the US, Canada and Australia (Table 1, Table S1, Table S2, Figure 1).

Figure 1. Schematic representation of the three-stage study design.

Summary of contributing studies and number of samples per case/control status. Abbreviations: FL: follicular lymphoma, NHL: non-Hodgkin lymphoma, DLBCL: diffuse large B-cell lymphoma, CLL/SLL: chronic lymphocytic leukemia/small lymphocytic lymphoma, SNP: single nucleotide polymorphism, GWAS: genome-wide association study, SCALE: Scandinavian lymphoma etiology, SF: San Francisco, BC: British Columbia, NCI-SEER: National Cancer Institute-Surveillance, Epidemiology and End Results, NSW: New South Wales, Yale: Yale University, Mayo: Mayo Clinic. The complete list of the number of other NHL subtypes in each study is detailed in Table S1.

https://doi.org/10.1371/journal.pgen.1001378.g001

Table 1. Summary of contributing studies, genotyping methods, and number of samples per case/control status.

https://doi.org/10.1371/journal.pgen.1001378.t001

Results

In total, 298,168 SNPs were analyzed in Stage 1 (λ = 1.028; λ1000 = 1.055 [7]), in which we observed suggestive associations (adjusted trend P-value<10−5) at 4q32.3, 6p21.32 and 10q25.3 (Table S3), with the strongest at rs2647012 (odds ratio (OR) = 0.58, PPCAadjusted = 1.59×10−7) within the HLA class II region on 6p21.32. Sixteen SNPs in close proximity to the HLA-DQ genes showed association with adjusted P-values<10−4, including the previously reported rs10484561 (Figure 2, Table S4) [6]. The previously reported HLA class I associated SNP rs6457327 [5] was modestly associated with FL risk (OR = 0.82, P = 0.03) in Stage 1, and was not in linkage disequilibrium (LD; r2 = 0) with any of the top 100 SNPs.
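The relationship between the two inflation factors quoted above can be sketched as follows. This is a minimal illustration, assuming that λ is taken as the ratio of the observed median 1-df chi-square statistic to its expected value, and that λ1000 uses the commonly applied rescaling to a study of 1000 cases and 1000 controls; whether reference [7] uses exactly this form is an assumption. The function names are hypothetical.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Genomic inflation factor: observed median chi-square over expected median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def lambda_1000(lam, n_cases, n_controls):
    """Rescale lambda to an equivalent study of 1000 cases and 1000 controls."""
    scale = (1 / n_cases + 1 / n_controls) / (1 / 1000 + 1 / 1000)
    return 1 + (lam - 1) * scale


# Stage 1 values quoted in the text: lambda = 1.028, 379 cases, 791 controls.
print(round(lambda_1000(1.028, 379, 791), 3))  # prints 1.055
```

Under these assumptions the rescaled value agrees with the λ1000 reported for Stage 1.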

In Stage 2, we carried out an in silico validation of the top 40 SNPs from Stage 1 (Table S5) in 213 FL cases and 750 controls from the San Francisco Bay Area, USA (Table 1), the study that reported an association at 6p21.32 [6]. Among 38 out of 40 SNPs, seven showed association (P<0.05) in Stage 2 (Table S5), six of which were located within the 6p21.32 region. We tested the independence of multiple association signals in 6p21.32 using a stepwise logistic regression analysis (entering SNPs based on a criterion of likelihood ratio test p-value<0.05) and found that with rs2647012 (the top SNP within the region) forced in the model, only the addition of rs10484561 contributed significantly to the association with increased risk of FL. The OR for this SNP, adjusted for rs2647012, was 1.43, P = 0.006 (Table S6).

Figure 2. Recombination plot showing associations in 6p21.32 in Stage 1.

Plot showing the pattern of associations in Stage 1, the recombination rate (build 36, HapMap CEU) and genes located in the region. The two SNPs showing independent association and their respective P-values are labeled (blue: rs2647012, green: rs10484561); other SNPs are color-coded according to their LD with rs2647012 (red r2>0.8, orange 0.5–0.8, grey 0.2–0.5, white <0.2).

https://doi.org/10.1371/journal.pgen.1001378.g002

After excluding previously identified and non-independent association signals, we selected rs2647012 and an additional four top SNPs to be taken forward to a third stage (Table S7, S8), wherein these were genotyped in 836 FL cases and 3202 controls from the Mayo Clinic (US) [8], National Cancer Institute-Surveillance, Epidemiology and End Results (NCI-SEER, US) [9], Yale University (US) [10], New South Wales (NSW, Australia) [11] and British Columbia (BC, Canada) [12] studies. The association of rs2647012 with FL was validated, showing consistent associations with similar ORs (no heterogeneity, P = 0.32) across all independent studies and reaching genome-wide significance in both the combined analysis of the validation samples (P = 3×10−15) and the combined analysis of all three stages (1428 FL cases, 4743 controls; OR = 0.64, P = 2×10−21) (Table 2, Figure 3). After adjustment for rs10484561, the association at rs2647012 remained genome-wide significant with minimal change in magnitude (ORadjusted = 0.70, Padjusted = 4×10−12). The LD between the two SNPs is low (r2<0.1 in the SCALE controls and HapMap CEU [Utah residents with northern and western European ancestry] samples release 27). Taken together, our results suggest that the association at rs2647012 is independent of rs10484561 and tags a different disease-predisposing variant. We also found suggestive evidence for an association at rs6536942 on 4q32.3 (OR = 1.36, P = 2×10−5) (Table 2, Figure S1A).
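The pairwise LD measure r2 used above to argue for independence of the two SNPs can be illustrated with a short sketch. It assumes phased haplotype counts are already available (in practice they are usually estimated from unphased genotypes); the function name and the example counts are hypothetical and are not taken from the study data.

```python
def ld_r2(n11, n12, n21, n22):
    """r^2 between two biallelic SNPs from the four phased haplotype counts.

    n11: haplotypes carrying allele 1 at both SNPs
    n12: allele 1 at SNP A, allele 2 at SNP B
    n21: allele 2 at SNP A, allele 1 at SNP B
    n22: allele 2 at both SNPs
    """
    total = n11 + n12 + n21 + n22
    p_a = (n11 + n12) / total          # frequency of allele 1 at SNP A
    p_b = (n11 + n21) / total          # frequency of allele 1 at SNP B
    d = n11 / total - p_a * p_b        # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denom


# Hypothetical counts only: alleles in complete LD versus independent alleles.
print(ld_r2(50, 0, 0, 50))    # complete LD
print(ld_r2(25, 25, 25, 25))  # no LD
```

A value near 0, as reported for rs2647012 and rs10484561, means that the allele at one SNP carries almost no information about the allele at the other.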

Figure 3. Forest plots of main associations with risk of FL.

Forest plots showing the associations in each study (ORs and P-values) at rs2647012 before adjustment (Pheterogeneity = 0.32), and at rs2647012 (Pheterogeneity = 0.67) and rs10484561 (Pheterogeneity = 0.54) after mutual adjustment. Squares indicate the odds ratios, with the size proportional to the weight of the study in the meta-analysis. Abbreviations: CI: confidence interval, SCALE: Scandinavian lymphoma etiology, SF: San Francisco, BC: British Columbia, NCI: National Cancer Institute-Surveillance, Epidemiology and End Results, NSW: New South Wales, YALE: Yale University, MAYO: Mayo Clinic.

https://doi.org/10.1371/journal.pgen.1001378.g003

Table 2. Summary of main findings in genome-wide association study (GWAS) and validation stages in risk* of follicular lymphoma (FL), diffuse large B-cell lymphoma (DLBCL), or marginal zone lymphoma (MZL), per study and combined.

https://doi.org/10.1371/journal.pgen.1001378.t002

To fine-map the association signals in the HLA class II region, we imputed 10,639 SNPs within 600 kb surrounding the top SNP rs2647012 using data from the 1000 Genomes (1000G, 60 CEU subjects, August 2009) and HapMap projects (HapMapII release 22, CEU) in Stage 1. Among the imputed SNPs, 258 SNPs located in a strong LD block of 236 kb (r2>0.8) showed stronger evidence of association than all the genotyped SNPs within the region (Figure S2). Since a moderate discordance of reference genotypes was observed between 1000G and HapMapII, we analyzed only SNPs showing a concordance of >95% in the two datasets and identified the strongest association at rs9378212 (OR = 1.66, P = 3.21×10−8), located 219 kb upstream of rs2647012 (r2 = 0.56 in controls). We subsequently confirmed the imputed genotypes by Taqman genotyping in 345 of the FL case subjects used in Stage 1 and found a 99.4% concordance with the imputed genotypes, demonstrating high confidence in the results of the imputation.

Next, we performed a haplotype analysis using rs2647012, rs10484561 and an additional 12 adjacent genotyped SNPs located within a block of minimal recombination. Out of the eight haplotypes identified, three were neutral (OR = 0.9–1.1), three increased risk (ORs>1.2; strongest risk haplotype tagged by rs10484561) and two were protective (OR≤0.8; both tagged by rs2647012) (Table S9), suggesting the presence of at least two susceptibility alleles within the region. Coalescence analysis of the eight haplotypes indicated that rs2647012 and rs10484561 arose on two distal branches of the ancestral recombination graph [13] (Figure S3), which was also supported by the analysis of median-joining network [14] using seven SNPs without any recombination (Figure 4). Further haplotype analysis of the seven genotyped SNPs (Table S9) and the imputed SNP rs9378212 indicated that the two alleles of rs9378212 tag the two different evolutionary lineages (Figure 4), each harboring either rs2647012 or rs10484561. Thus, the associations at the two SNPs are likely due to two distinct susceptibility variants, instead of a single risk allele, that arose independently on different haplotype backgrounds.

Figure 4. Coalescence analysis of rs2647012 and rs10484561.

Median-joining network [14] of haplotypes constructed using seven SNPs (Table S9). Circles represent haplotypes with area proportional to their frequency. SNPs are shown on the links (black lines). SNPs and haplotypes associated with increased or decreased FL risk are labeled in red or blue, respectively. The percentage of alleles of the imputed SNP rs9378212 (C/T) phased on each haplotype are shown in bold.

https://doi.org/10.1371/journal.pgen.1001378.g004

The FL-associated SNP, rs10484561, was previously found to tag the extended haplotype HLA-DQA1*0101-HLA-DQB1*0501-HLA-DRB1*0101 [6]. Here, to test whether any HLA class II alleles may also be responsible for the observed association at rs2647012, we imputed known HLA tag SNPs [15], [16] using data from the 1000G and HapMapII European datasets. We confirmed the association of the HLA-DRB1*0101-HLA-DQA1*0101-HLA-DQB1*0501 extended haplotype, tagged by rs10484561. The association at rs2647012 remained significant after adjustment for these three HLA alleles (OR = 0.64, P = 8.11×10−6), suggesting that these are not driving the association at rs2647012. Furthermore, rs2647012 was not in strong LD (r2<0.8 in HapMap CEU or SCALE controls) with any other known HLA tags [15], including those tagging FL-associated alleles previously reported [17], [18] (r2<0.39 with the six HLA-DRB1*13 tag SNPs [rs2395173, rs2157051, rs4434496, rs6901541, rs424232, rs2050191] [17] and r2<0.25 with the three HLA-B*0801 and HLA-DRB1*0301 tag SNPs [rs6457374, rs2844535, rs2040410] [15]). Of the other 17 HLA class II alleles (∼39% of all the class II alleles) that could be imputed, none showed significant association or were found to be responsible for the association at rs2647012 (Table S10). Detailed HLA allelotyping on large numbers of cases and controls is needed to determine if particular HLA class II alleles are responsible for the observed association at rs2647012.

To assess whether the FL-associated SNPs may be involved in the development of other NHL subtypes, we genotyped the five SNPs selected for Stage 3 together with rs10484561 in a total of 1592 DLBCL, 1075 CLL/SLL, 336 marginal zone lymphoma (MZL), 262 mantle cell lymphoma, 306 T-cell lymphoma and 878 rare or unspecified NHL cases and 5220 controls from the SCALE2, SF2, BC, Mayo, NCI-SEER, Yale and NSW studies (Table 1, Table S1, Figure 1). Among these SNPs, rs10484561 showed evidence of association with DLBCL (OR = 1.36, P = 1.41×10−7) (Figure S1B) and all NHL (OR = 1.23, P = 6.81×10−7). ORs were consistent across the seven studies. There was also a suggestive association for rs2647012 with MZL (OR = 1.32, P = 6.34×10−4) (Table 3), consistent across six studies.

Finally, we investigated the possibility of additional susceptibility loci for FL outside of the HLA region by performing a joint analysis of the top 41 to 1000 variants of our scan and the previously published GWAS of follicular lymphoma [6]. From this combined analysis, we did not find any additional markers with a strong association (P<10−6) with FL that were not in LD with our top 5 markers taken forward to stage 3 (data not shown).

Table 3. Meta-analysis of associations between rs10484561, the top 5 markers, and non-Hodgkin lymphoma (NHL) subtypes (including follicular lymphoma [FL], diffuse large B-cell lymphoma [DLBCL], chronic lymphocytic leukemia/small lymphocytic lymphoma [CLL/SLL], and others) and overall (All NHL).

https://doi.org/10.1371/journal.pgen.1001378.t003

Discussion

Through the identification of a second variant, rs2647012, that is independent of the previously identified risk variant rs10484561 [6] within the 6p21.32 region, our findings substantiate a major link between HLA class II loci and genetic susceptibility to FL. In addition, our study revealed evidence that rs10484561 is associated with DLBCL risk, suggesting some shared biological mechanisms of susceptibility between these two common NHL subtypes. The association of rs2647012 with FL risk was not detected in earlier GWAS [5], [6], and the previously reported association of rs10484561 with DLBCL risk was only marginal [6], perhaps because of the smaller sample sizes in Stage 1. The number of FL cases scanned in this study was almost double that of the previous individual GWAS [6].

HLA class II molecules are expressed in antigen presenting cells such as B-lymphocytes, and act to present exogenous antigens to CD4+ helper T-cells. Efficiency of antigen presentation may influence lymphomagenesis through effects on anti-tumor immunity or on immune response to infections that are directly or indirectly oncogenic (e.g., through viral genome insertion or nonspecific chronic antigenic stimulation) [19]. Allelic variants in coding regions may affect the structure of the peptide binding groove of the class II molecules, leading to differences in the efficiency of oncogenic peptide binding or T-cell recognition. Coding sequence variation in the molecules encoded by the extended HLA-DRB1*0101-HLA-DQA1*0101-HLA-DQB1*0501 haplotype may be responsible for the association at rs10484561 [6].

Alternatively, variants in the regulatory sequences may influence the expression level of the HLA molecules and consequently the efficiency of antigen presentation. We note that rs2647012 is strongly associated with the average expression levels of HLA-DRB4 (β = 0.78, P = 3.4×10−22) and HLA-DQA1 (β = -0.58, P = 5.1×10−13) probes in Epstein-Barr virus-transfected lymphoblastoid cell lines (mRNA by SNP browser) [20], and rs10484561 is also associated with the expression levels of HLA-DQA1 probes (β = -0.884, P = 1.6×10−10). We speculate that this may be an alternative mechanism underlying the observed associations, especially at rs2647012.

Interestingly, SNPs within the same LD block harboring rs2647012 (r2>0.7 in HapMap CEU) have previously been associated with rheumatoid arthritis with the same direction of effect [21]. Since autoimmune disorders such as rheumatoid arthritis and Sjögren syndrome are associated with increased risk of NHL, in particular with DLBCL but also with FL [22], our finding may suggest a molecular link between these diseases, although their associations within this region of high LD could also be due to different causal variants.

Previously, large-scale candidate gene studies have pointed to susceptibility loci in the HLA class III region, mainly an association between the TNF variant 308G->A (rs1800629) and risk of DLBCL [23], [24]. We provide novel evidence of association of DLBCL with an independent HLA marker in the class II region (rs10484561; r2 = 0), 1.1 Mb away from rs1800629, strongly suggesting that alleles in the HLA class II region may play an important role in the pathogenesis of this subtype as well. The weaker association of rs10484561 with DLBCL (OR 1.36) than with FL (OR 1.95) [6] could imply that the DLBCL-association is confined to a subset of DLBCL tumors with specific morphological or molecular features more closely related to FL, such as the germinal center-like B-cell phenotype [25]. However, the observed effects could also be due to modification of other concurrent DLBCL-specific susceptibility variants, or rs10484561 could tag a more strongly associated marker in this region of high LD.

Moreover, we found suggestive evidence of association at rs6536942 on 4q32.3, located within an intron of the tolloid-like 1 (TLL1) gene, with FL risk. However, larger studies are needed to validate this finding. Although the strongest associations so far have been observed in the HLA region, and extended pooling of available scan data failed to identify additional loci outside of HLA, we expect that future larger meta-GWAS efforts will more robustly identify additional loci in other regions.

In conclusion, our results strongly suggest that future genetic and functional work focused on the HLA class II region will provide important insight into the disease pathology of FL, DLBCL and other subtypes of NHL. In addition, further studies of this region, of its potential interaction with environmental factors in NHL risk, and of NHL prognosis are warranted.

Methods

Ethics statement

The studies described in this manuscript have been approved by the ethics committees of the respective institutions: Karolinska Institutet (Sweden), Scientific Ethics Committee system (Denmark), University of California, Berkeley (US), National Cancer Institute, National Institutes of Health (US), Mayo Clinic (US), University of British Columbia (Canada), Yale University (US), University of Sydney (Australia).

Study subjects

The SCALE study is a population-based study of the etiology of NHL carried out in all of Denmark and Sweden during 1999 to 2002 [26]. NHL subtype diagnoses were reviewed and reclassified according to the World Health Organization (WHO) classification [1] as previously described [26]. For this GWAS (SCALE1) we used DNA from 400 cases with follicular lymphoma (FL; 150 from Denmark and 250 from Sweden) and from 150 Danish controls, individually matched to the Danish FL cases by sex and age at study inclusion. We also used material collected from 673 control subjects in a separate Swedish population-based case-control study of rheumatoid arthritis (the Eira study) [21], [27]. The latter was conducted during 1996 to 2005 among residents 18 to 70 years of age in the southern and central parts of Sweden (including 90% of Swedish residents). Hence, the population controls recruited in this study were considered to represent the same study population as the Swedish component of the SCALE study with regard to genetic variation. Genotyping completion rates were similar between cases and controls; out of 400 cases and 823 controls genotyped, 379 cases (95%) and 791 controls (96%) were included in the final analysis. Study subjects used in Stages 2, 3 and validation in other NHL subtypes (Table 1, Table S1, S2) have been previously described [6], [8]–[12], and details are available as supporting text (Text S1). For the SCALE2 NHL subtype validation study, we used the rest of the lymphoma cases with blood samples originally recruited in SCALE (n = 1869), Danish control subjects not included in the GWAS (n = 556), a second set of control subjects from the Eira study (n = 742) and a third group of controls recruited in a national population-based case-control study of breast cancer, the Cancer and Hormones Replacement in Sweden (CAHRES) study [28] (n = 720). The control subjects from this study were randomly selected from the Swedish general population to match the expected age distribution of the participating breast cancer cases (50 to 74 years).

Genotyping

Stage 1 genotyping of 317,503 single nucleotide polymorphisms (SNPs) was done on the HumanHap300 (version 1.0) array. Validation genotyping was done using Sequenom iPlex; SNPs in the human leukocyte antigen (HLA) region that failed primer design for Sequenom assays were genotyped using Taqman (Applied Biosystems).

Genome-wide association study

The scan included 317,503 SNPs from the HumanHap300 (version 1.0) array. The datasets were filtered on the basis of SNP genotyping call rates (≥>95% completeness), sample completion rate (≥90%), minor allele frequency (MAF; all subjects as well as cases and controls separately ≥0.03) and non-deviation from Hardy-Weinberg equilibrium (HWE; p<10−6). We also excluded SNPs with cluster plot problems, and those on the X and Y chromosomes. Study subjects with gender discrepancies and/or labelling errors were removed. We also removed individual samples with evidence of cryptic family relationships (identified using the–genome command in PLINK). To detect outliers in terms of population stratification, we performed principal component (PC) analysis using the EIGENSTRAT software (Figure S4). A subset of linkage disequilibrium (LD) thinned SNPs was selected such that all pair-wise associations had r2<0.2, and long-range regions of high LD, reported to potentially confound genome scans, were removed [29]. Twenty-five samples were removed as population outliers on the basis of their values on the first three PCs. To adjust for possible stratification in our association analyses we adjusted the regression analyses using the first three PCs; the number of PCs used for adjustment was determined by plotting the eigenvalues and locating the position of the “elbow” on the scree plot (Figure S5). Wald tests, treating minor allele counts as continuous covariates were used to test for association. The genomic inflation factor (λ) was calculated to be 1.0283 after adjusting for the first three PCs, suggesting the presence of minimal stratification. Quantile-quantile plots for the associations before and after adjustment are shown in Figure S6. Finally, we assessed associations of age and sex with main genotypes among the control subjects to address the possibility of confounding by these factors (Table S11). 
As there was no evidence of associations of age or sex with genotypes among the controls, we did not adjust for them in the final main effects analyses of genotypes.
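
The SNP-level filters and the genomic inflation factor described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function names are hypothetical, the HWE check uses a Pearson chi-square test with 1 degree of freedom as a stand-in for whatever test the genotype QC software applied, and the filters are shown for a single group of subjects rather than for all subjects, cases and controls separately. The inflation factor is computed in the usual way, as the median of the observed 1-df chi-square association statistics divided by the median of the 1-df chi-square distribution (about 0.455).

```python
import math
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """Pearson chi-square HWE test (1 df) from genotype counts (an assumed stand-in test)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a 1-df chi-square variable.
    return math.erfc(math.sqrt(chi2 / 2))


def passes_snp_qc(n_aa, n_ab, n_bb, n_missing,
                  min_call_rate=0.95, min_maf=0.03, hwe_p_cutoff=1e-6):
    """Apply call-rate, MAF and HWE filters to one SNP (thresholds as in Stage 1)."""
    n_called = n_aa + n_ab + n_bb
    if n_called == 0:
        return False
    call_rate = n_called / (n_called + n_missing)
    freq_a = (2 * n_aa + n_ab) / (2 * n_called)
    maf = min(freq_a, 1 - freq_a)
    if call_rate < min_call_rate or maf < min_maf:
        return False
    return hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= hwe_p_cutoff


def genomic_inflation(chi2_stats):
    """Lambda: median observed 1-df chi-square statistic over its expected median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN
```

A value of λ close to 1, such as the 1.0283 reported here, indicates that the bulk of the test statistics follow the null distribution.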

Validation and meta-analysis

In Stage 2, similar quality control measures were applied as in Stage 1, including genotyping call rate ≥95%, sample completion rate ≥90%, and MAF ≥0.05. We tested each validation study for association using trend tests. For meta-analyses across studies and NHL subtypes, we used the Cochran-Mantel-Haenszel method to calculate the combined odds ratio and P-value, and χ2 tests for heterogeneity. Multivariate logistic regression was used to test for independence of SNP effects. For validation among other NHL subtypes, the control subjects were the same as those in Stages 2 and 3 for validation in FL for all studies except SCALE2. Only European-ancestry subjects were included, and the possibility of population stratification affecting the results has been thoroughly explored and found to be low in earlier investigations in the same populations [6], [8].
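
As an illustration of the Cochran-Mantel-Haenszel approach to combining studies, the following Python sketch pools one 2×2 allele-count table per study. It is a simplified sketch rather than the analysis code used here: the function name and table layout are assumptions, no continuity correction is applied, and the heterogeneity test is not shown.

```python
import math


def cmh_meta(tables):
    """Combine per-study 2x2 allele-count tables with the Mantel-Haenszel method.

    Each table is (a, b, c, d), where (hypothetical layout):
      a = test-allele count in cases,    b = other-allele count in cases,
      c = test-allele count in controls, d = other-allele count in controls.
    Returns (pooled odds ratio, CMH chi-square, P-value on 1 df).
    """
    num = den = 0.0
    observed = expected = variance = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        # Mantel-Haenszel pooled odds ratio components.
        num += a * d / n
        den += b * c / n
        # Observed, expected and variance of cell a under the null hypothesis.
        observed += a
        expected += (a + b) * (a + c) / n
        variance += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    odds_ratio = num / den
    chi2 = (observed - expected) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value
```

Because each study contributes its own stratum, differences in allele frequency between study populations do not bias the pooled estimate in the way that simply summing counts across studies could.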

Imputation

We used IMPUTEv1 to impute SNPs from the 1000 Genomes pilot 1 CEU data (August 2009 release) and the HapMap Phase II release 22 CEU data. We set a strict threshold for imputation, using only SNPs with confidence scores ≥0.9, call rates ≥90%, no deviation from Hardy-Weinberg equilibrium (P>0.001) and MAF >0.01. The imputation was done on the Stage 1 samples separately for each of the two reference datasets, and SNPs showing a discordance of >5% between the genotypes imputed with the two datasets were excluded from further analysis. The data were then merged using HapMap II as the master dataset to which additional imputed SNPs from the 1000 Genomes dataset were added. HLA alleles were imputed by identifying tag SNPs [15] from the genotyped and imputed SNP dataset. We used PLINK for haplotype imputation with the tag SNPs and downstream association analyses. Only haplotypes with call rates >90%, MAF >1% and probability thresholds >0.8 were analyzed.
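
The concordance filter and merge of the two imputed datasets can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the code used in the study: genotypes are represented as lists of allele counts (0, 1, 2, or None for a missing call) keyed by SNP identifier, and the function names and SNP identifiers in the usage example are hypothetical.

```python
def discordance_rate(calls_a, calls_b):
    """Fraction of samples with differing calls, among samples called in both datasets."""
    pairs = [(x, y) for x, y in zip(calls_a, calls_b)
             if x is not None and y is not None]
    if not pairs:
        return None
    return sum(x != y for x, y in pairs) / len(pairs)


def merge_imputed(hapmap, kgenomes, max_discordance=0.05):
    """Merge two imputed datasets, treating the HapMap II imputation as the master.

    SNPs imputed from both references are dropped when their discordance
    exceeds max_discordance; otherwise the HapMap II calls are kept.
    SNPs imputed only from 1000 Genomes are then added.
    """
    merged = {}
    for snp, calls in hapmap.items():
        if snp in kgenomes:
            rate = discordance_rate(calls, kgenomes[snp])
            if rate is None or rate > max_discordance:
                continue  # discordant between references: exclude entirely
        merged[snp] = calls
    for snp, calls in kgenomes.items():
        if snp not in hapmap:
            merged[snp] = calls
    return merged


# Usage with made-up SNP identifiers and genotypes:
hapmap_calls = {"snpA": [0, 1, 2, 0], "snpB": [0, 0, 1, 1]}
kgenomes_calls = {"snpA": [0, 1, 2, 0], "snpB": [1, 0, 1, 1], "snpC": [2, 2, 1, 0]}
merged_calls = merge_imputed(hapmap_calls, kgenomes_calls)
```

In this toy example one SNP disagrees between the two imputations in one of four samples (25% discordance) and is therefore excluded, while the SNP imputed only from 1000 Genomes is added to the master dataset.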

Haplotype and coalescence analyses

For coalescence analysis, all 12 SNPs (genotyped in this study and within a region of ∼177 Kb) adjacent to the two SNPs associated with FL risk were used to construct haplotypes. These were phased using the PHASE program [30] and tested for association using PLINK. The ancestral haplotype was constructed from the chimpanzee (PanTro2) allele whenever possible, and otherwise from the macaque allele. An ancestral recombination graph was constructed using the program Beagle [13], [31], which allows recombination under an infinite-sites mutation model. After identifying the first recombination event, the haplotype segment before the recombination point was used to construct a median-joining network using the Network program [14]. The alleles of the imputed SNP rs9378212 were then phased on each haplotype segment using the PHASE program.
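
The rule for assigning the ancestral haplotype can be sketched in Python as below. This is only an illustration: the function names are hypothetical, and it assumes that "whenever possible" means the chimpanzee allele is known and matches one of the two human alleles at the site, with the macaque allele used as a fallback under the same condition.

```python
def ancestral_allele(human_alleles, chimp, macaque):
    """Return the ancestral allele at one SNP, preferring chimpanzee over macaque.

    An outgroup allele is used only if it matches one of the human alleles
    (an assumption of this sketch); otherwise the ancestral state is unknown.
    """
    for outgroup in (chimp, macaque):
        if outgroup in human_alleles:
            return outgroup
    return None


def ancestral_haplotype(sites):
    """Build the ancestral haplotype from (human_alleles, chimp, macaque) tuples per SNP."""
    return [ancestral_allele(alleles, chimp, macaque)
            for alleles, chimp, macaque in sites]
```

The resulting haplotype serves as the root when the recombination graph and the median-joining network are constructed.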

The URLs for the data and analytic approaches presented herein are as follows:

1000 Genomes http://1000genomes.org

HapMapII http://www.hapmap.org

IMPUTEv1 https://mathgen.stats.ox.ac.uk/impute/impute_v1.html

mRNA by SNP browser http://www.sph.umich.edu/csg/liang/asthma/

R script for recombination plot http://www.broadinstitute.org/science/projects/diabetes-genetics-initiative/plotting-genome-wide-association-results

Supporting Information

Figure S1.

Forest plots of main associations with risk of follicular lymphoma (FL) and diffuse large B-cell lymphoma (DLBCL).

https://doi.org/10.1371/journal.pgen.1001378.s001

(0.18 MB PDF)

Figure S2.

Association results for imputed SNPs and genotyped SNPs.

https://doi.org/10.1371/journal.pgen.1001378.s002

(0.03 MB PDF)

Figure S3.

Ancestral reconstruction graph based on the 14 SNPs in the Stage 1 samples.

https://doi.org/10.1371/journal.pgen.1001378.s003

(0.05 MB PDF)

Figure S4.

Testing of population structure using principal components analysis.

https://doi.org/10.1371/journal.pgen.1001378.s004

(0.11 MB PDF)

Figure S5.

Principal components analysis scree plot.

https://doi.org/10.1371/journal.pgen.1001378.s005

(0.02 MB PDF)

Figure S6.

Quantile-quantile plots before and after genomic control correction.

https://doi.org/10.1371/journal.pgen.1001378.s006

(0.07 MB PDF)

Table S1.

Number of patients with Non-Hodgkin lymphoma subtypes other than follicular lymphoma.

https://doi.org/10.1371/journal.pgen.1001378.s007

(0.01 MB PDF)

Table S2.

Overlap of samples from the current genome-wide association study and the previous GWAS reporting association between 6p21.32 and follicular lymphoma risk.

https://doi.org/10.1371/journal.pgen.1001378.s008

(0.01 MB PDF)

Table S3.

Top 40 SNPs taken forward to Stage 2, sorted by significance level (trend P-value) of association with risk of follicular lymphoma.

https://doi.org/10.1371/journal.pgen.1001378.s009

(0.02 MB PDF)

Table S4.

SNPs on chromosome 6p21.32 that showed genome-wide per allele P-values < 1E-04 in association with risk of follicular lymphoma in Stage 1, sorted by position.

https://doi.org/10.1371/journal.pgen.1001378.s010

(0.01 MB PDF)

Table S5.

Summary statistics for associations with risk of follicular lymphoma in Stages 1 and 2 with combined P-values.

https://doi.org/10.1371/journal.pgen.1001378.s011

(0.02 MB PDF)

Table S6.

Crude and adjusted logistic regression analyses of the six SNPs in 6p21.32 showing significant association with risk of follicular lymphoma in Stages 1 and 2.

https://doi.org/10.1371/journal.pgen.1001378.s012

(0.01 MB PDF)

Table S7.

Individual study results for associations between the 5 SNPs taken forward to Stage 3 and risk of follicular lymphoma in Stage 3.

https://doi.org/10.1371/journal.pgen.1001378.s013

(0.01 MB PDF)

Table S8.

Genotype counts of main SNPs per Cases/Controls, per study and in total.

https://doi.org/10.1371/journal.pgen.1001378.s014

(0.01 MB PDF)

Table S9.

Associations with risk of follicular lymphoma for haplotypes phased with 14 SNPs or 7 SNPs based on genotyped SNPs in Stage 1.

https://doi.org/10.1371/journal.pgen.1001378.s015

(0.01 MB PDF)

Table S10.

Imputation of HLA class II alleles and risk of follicular lymphoma.

https://doi.org/10.1371/journal.pgen.1001378.s016

(0.01 MB PDF)

Table S11.

Trend p-value of associations of age and sex with main genotypes among control subjects per study.

https://doi.org/10.1371/journal.pgen.1001378.s017

(0.02 MB PDF)

Text S1.

Additional description of validation study subjects.

https://doi.org/10.1371/journal.pgen.1001378.s018

(0.04 MB PDF)

Acknowledgments

We are grateful to X. Y. Chen, H. B. Toh, K. K. Heng, W. Y. Meah, C. H. Wong, and H. Q. Low from the Genome Institute of Singapore for their support in genotyping analyses and data processing for the SCALE study. The control samples from the CAHRES study were provided by Per Hall and Kamila Czene.

Author Contributions

Conceived and designed the experiments: KES ETL HOA JL. Performed the experiments: IDI. Analyzed the data: KES JNF CFS HD LC VK ER KH JL. Contributed reagents/materials/analysis tools: KES CFS LC HH ETC NR JRC LPR PNB PMB LA JR WC SD PH LMM RKS SSW SLS ZSF AJN NEK TMH BA AK SM MPP CMV PB QL SHZ YZ TZ SL JJS MTS SJC LP LA LK BG MM ETL HOA JL. Wrote the paper: KES JNF KH JL. Critical revision of manuscript: CFS HD LC VK ETC IDI SJC. Genotyped samples and provided data from the BC study: ARBW. Provided data from the SF1 and SF2 GWAS studies: JR. Supervised the experiments: JL JNF.

References

  1. Jaffe ES, Harris NL, Stein H, Vardiman J (2001) World Health Organization classification of tumours pathology and genetics, tumours of hematopoietic and lymphoid tissues. Lyon: IARC Press.
  2. Morton LM, Wang SS, Cozen W, Linet MS, Chatterjee N, et al. (2008) Etiologic heterogeneity among non-Hodgkin lymphoma subtypes. Blood 112: 5150–5160.
  3. Altieri A, Bermejo JL, Hemminki K (2005) Familial risk for non-Hodgkin lymphoma and other lymphoproliferative malignancies by histopathologic subtype: the Swedish Family-Cancer Database. Blood 106: 668–672.
  4. Chang ET, Smedby KE, Hjalgrim H, Glimelius B, Adami HO (2006) Reliability of self-reported family history of cancer in a large case-control study of lymphoma. J Natl Cancer Inst 98: 61–68.
  5. Skibola CF, Bracci PM, Halperin E, Conde L, Craig DW, et al. (2009) Genetic variants at 6p21.33 are associated with susceptibility to follicular lymphoma. Nat Genet 41: 873–875.
  6. Conde L, Halperin E, Brown KM, Smedby KE, Rothman N, et al. (2010) Genome-wide association study of follicular lymphoma identifies a risk locus at 6p21.32. Nat Genet 42: 661–664.
  7. de Bakker PI, Ferreira MA, Jia X, Neale BM, Raychaudhuri S, et al. (2008) Practical aspects of imputation-driven meta-analysis of genome-wide association studies. Hum Mol Genet 17: R122–128.
  8. Cerhan JR, Ansell SM, Fredericksen ZS, Kay NE, Liebow M, et al. (2007) Genetic variation in 1253 immune and inflammation genes and risk of non-Hodgkin lymphoma. Blood 110: 4455–4463.
  9. Wang SS, Cerhan JR, Hartge P, Davis S, Cozen W, et al. (2006) Common genetic variants in proinflammatory and other immunoregulatory genes and risk for non-Hodgkin lymphoma. Cancer Res 66: 9771–9780.
  10. Zhang Y, Holford TR, Leaderer B, Boyle P, Zahm SH, et al. (2004) Hair-coloring product use and risk of non-Hodgkin's lymphoma: a population-based case-control study in Connecticut. Am J Epidemiol 159: 148–154.
  11. Hughes AM, Armstrong BK, Vajdic CM, Turner J, Grulich A, et al. (2004) Pigmentary characteristics, sun sensitivity and non-Hodgkin lymphoma. Int J Cancer 110: 429–434.
  12. Spinelli JJ, Ng CH, Weber JP, Connors JM, Gascoyne RD, et al. (2007) Organochlorines and risk of non-Hodgkin lymphoma. Int J Cancer 121: 2767–2775.
  13. Song YS, Hein J (2005) Constructing minimal ancestral recombination graphs. J Comput Biol 12: 147–169.
  14. Bandelt HJ, Forster P, Rohl A (1999) Median-joining networks for inferring intraspecific phylogenies. Mol Biol Evol 16: 37–48.
  15. de Bakker PI, McVean G, Sabeti PC, Miretti MM, Green T, et al. (2006) A high-resolution HLA and SNP haplotype map for disease association studies in the extended human MHC. Nat Genet 38: 1166–1172.
  16. Leslie S, Donnelly P, McVean G (2008) A statistical method for predicting classical HLA alleles from SNP data. Am J Hum Genet 82: 48–56.
  17. Wang SS, Abdou AM, Morton LM, Thomas R, Cerhan JR, et al. (2010) Human leukocyte antigen class I and II alleles in non-Hodgkin lymphoma etiology. Blood 115: 4820–4823.
  18. Abdou AM, Gao X, Cozen W, Cerhan JR, Rothman N, et al. (2010) Human leukocyte antigen (HLA) A1-B8-DR3 (8.1) haplotype, tumor necrosis factor (TNF) G-308A, and risk of non-Hodgkin lymphoma. Leukemia 24: 1055–1058.
  19. Bateman AC, Howell WM (1999) Human leukocyte antigens and cancer: is it in our genes? J Pathol 188: 231–236.
  20. Dixon AL, Liang L, Moffatt MF, Chen W, Heath S, et al. (2007) A genome-wide association study of global gene expression. Nat Genet 39: 1202–1207.
  21. Plenge RM, Seielstad M, Padyukov L, Lee AT, Remmers EF, et al. (2007) TRAF1-C5 as a risk locus for rheumatoid arthritis–a genomewide study. N Engl J Med 357: 1199–1209.
  22. Baecklund E, Backlin C, Iliadou A, Granath F, Ekbom A, et al. (2006) Characteristics of diffuse large B cell lymphomas in rheumatoid arthritis. Arthritis Rheum 54: 3774–3781.
  23. Rothman N, Skibola CF, Wang SS, Morgan G, Lan Q, et al. (2006) Genetic variation in TNF and IL10 and risk of non-Hodgkin lymphoma: a report from the InterLymph Consortium. Lancet Oncol 7: 27–38.
  24. Skibola CF, Bracci PM, Nieters A, Brooks-Wilson A, de Sanjose S, et al. (2010) Tumor necrosis factor (TNF) and lymphotoxin-alpha (LTA) polymorphisms and risk of non-Hodgkin lymphoma in the InterLymph Consortium. Am J Epidemiol 171: 267–276.
  25. Alizadeh AA, Eisen MB, Davis RE, Ma C, Lossos IS, et al. (2000) Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 403: 503–511.
  26. Smedby KE, Hjalgrim H, Melbye M, Torrang A, Rostgaard K, et al. (2005) Ultraviolet radiation exposure and risk of malignant lymphomas. J Natl Cancer Inst 97: 199–209.
  27. Plenge RM, Padyukov L, Remmers EF, Purcell S, Lee AT, et al. (2005) Replication of putative candidate-gene associations with rheumatoid arthritis in >4,000 samples from North America and Sweden: association of susceptibility with PTPN22, CTLA4, and PADI4. Am J Hum Genet 77: 1044–1060.
  28. Magnusson C, Baron J, Persson I, Wolk A, Bergstrom R, et al. (1998) Body size in different periods of life and breast cancer risk in post-menopausal women. Int J Cancer 76: 29–34.
  29. Price AL, Weale ME, Patterson N, Myers SR, Need AC, et al. (2008) Long-range LD can confound genome scans in admixed populations. Am J Hum Genet 83: 132–135; author reply 135–139.
  30. Stephens M, Smith NJ, Donnelly P (2001) A new statistical method for haplotype reconstruction from population data. Am J Hum Genet 68: 978–989.
  31. Lyngsø R, Song Y, Hein J (2005) Minimum recombination histories by branch and bound, proceedings of workshop on algorithms in bioinformatics. Lect Notes Comput Sci 3692: 239–250.