Nature. 2019 Jun;570(7762):514-518.
doi: 10.1038/s41586-019-1310-4. Epub 2019 Jun 19.

Genetic analyses of diverse populations improves discovery for complex traits

Genevieve L Wojcik, Mariaelisa Graff, Katherine K Nishimura, Ran Tao, Jeffrey Haessler, Christopher R Gignoux, Heather M Highland, Yesha M Patel, Elena P Sorokin, Christy L Avery, Gillian M Belbin, Stephanie A Bien, Iona Cheng, Sinead Cullina, Chani J Hodonsky, Yao Hu, Laura M Huckins, Janina Jeff, Anne E Justice, Jonathan M Kocarnik, Unhee Lim, Bridget M Lin, Yingchang Lu, Sarah C Nelson, Sung-Shim L Park, Hannah Poisner, Michael H Preuss, Melissa A Richard, Claudia Schurmann, Veronica W Setiawan, Alexandra Sockell, Karan Vahi, Marie Verbanck, Abhishek Vishnu, Ryan W Walker, Kristin L Young, Niha Zubair, Victor Acuña-Alonso, Jose Luis Ambite, Kathleen C Barnes, Eric Boerwinkle, Erwin P Bottinger, Carlos D Bustamante, Christian Caberto, Samuel Canizales-Quinteros, Matthew P Conomos, Ewa Deelman, Ron Do, Kimberly Doheny, Lindsay Fernández-Rhodes, Myriam Fornage, Benyam Hailu, Gerardo Heiss, Brenna M Henn, Lucia A Hindorff, Rebecca D Jackson, Cecelia A Laurie, Cathy C Laurie, Yuqing Li, Dan-Yu Lin, Andres Moreno-Estrada, Girish Nadkarni, Paul J Norman, Loreall C Pooler, Alexander P Reiner, Jane Romm, Chiara Sabatti, Karla Sandoval, Xin Sheng, Eli A Stahl, Daniel O Stram, Timothy A Thornton, Christina L Wassel, Lynne R Wilkens, Cheryl A Winkler, Sachi Yoneyama, Steven Buyske, Christopher A Haiman, Charles Kooperberg, Loic Le Marchand, Ruth J F Loos, Tara C Matise, Kari E North, Ulrike Peters, Eimear E Kenny, Christopher S Carlson
Genevieve L Wojcik et al. Nature. 2019 Jun.

Abstract

Genome-wide association studies (GWAS) have laid the foundation for investigations into the biology of complex traits, drug development and clinical guidelines. However, the majority of discovery efforts are based on data from populations of European ancestry [1–3]. In light of the differential genetic architecture that is known to exist between populations, bias in representation can exacerbate existing disease and healthcare disparities. Critical variants may be missed if they have a low frequency or are completely absent in European populations, especially as the field shifts its attention towards rare variants, which are more likely to be population-specific [4–10]. Additionally, effect sizes estimated in one population, and the risk prediction scores derived from them, may not accurately extrapolate to other populations [11,12]. Here we demonstrate the value of diverse, multi-ethnic participants in large-scale genomic studies. The Population Architecture using Genomics and Epidemiology (PAGE) study conducted a GWAS of 26 clinical and behavioural phenotypes in 49,839 non-European individuals. Using strategies tailored for analysis of multi-ethnic and admixed populations, we describe a framework for analysing diverse populations, identify 27 novel loci and 38 secondary signals at known loci, and replicate 1,444 GWAS catalogue associations across these traits. Our data show evidence of effect-size heterogeneity across ancestries for published GWAS associations, substantial benefits for fine-mapping using diverse cohorts and insights into clinical implications. In the United States, where minority populations have a disproportionately higher burden of chronic conditions [13], the lack of representation of diverse populations in genetic research will result in inequitable access to precision medicine for those with the highest burden of disease. We strongly advocate for continued, large genome-wide efforts in diverse populations to maximize genetic discovery and reduce health disparities.

Figures

Extended Data Fig. 1 | Number of unique participants in the GWAS Catalog from 2006 to 2017 (inclusive).
Although the number of unique participants (in millions) in the GWAS Catalog has grown substantially over the past decade, the relative proportion of participants of non-European descent has remained constant, with the majority of progress within Asian populations.
Extended Data Fig. 2 | Correlation between SNP genotype and PC1–PC10.
a, The correlation (r²) for novel and residual loci was calculated by obtaining the individual-level data for all PAGE participants and correlating the SNP genotype with each of the ten PCs. The correlation between each locus and each of the ten PCs is plotted on the y axis; novel loci are plotted in grey and residual loci in yellow. We observed an especially high correlation between a novel locus and PC4, which represents Native Hawaiian/Pacific Islander ancestry. b, The individual-level data for all PAGE participants were obtained and plotted in a parallel coordinates plot, such that each PAGE individual is represented by a set of line segments connecting their eigenvalues. This allows us to see which race/ethnicity groups are differentiated at each PC. For example, we see predominantly green lines as outliers for PC4, which indicates that this vector represents a continuum of Native Hawaiian/Pacific Islander ancestry.
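The per-locus calculation described in panel a can be illustrated with a minimal sketch. It assumes a plain squared Pearson correlation between additively coded genotypes (0, 1 or 2) and each PC; the function name `snp_pc_r2` and the toy data are hypothetical and are not taken from the PAGE analysis.

```python
import numpy as np

def snp_pc_r2(genotype, pcs):
    """Squared Pearson correlation between one SNP and each PC column."""
    g = np.asarray(genotype, dtype=float)
    pcs = np.asarray(pcs, dtype=float)
    g_centered = g - g.mean()
    pcs_centered = pcs - pcs.mean(axis=0)
    # Covariance numerator for every PC at once, then normalise.
    numerator = g_centered @ pcs_centered
    denominator = np.sqrt((g_centered ** 2).sum() * (pcs_centered ** 2).sum(axis=0))
    return (numerator / denominator) ** 2

# Synthetic example: 500 individuals, 10 PCs, one SNP that tracks PC4.
rng = np.random.default_rng(0)
pcs = rng.normal(size=(500, 10))
noise = rng.normal(scale=0.5, size=500)
genotype = np.clip(np.round(1 + 0.8 * pcs[:, 3] + noise), 0, 2)

r2 = snp_pc_r2(genotype, pcs)
print(int(np.argmax(r2)) + 1)  # index of the PC most correlated with the SNP
```

In this toy setting the SNP is constructed to follow the fourth PC, so that PC should stand out in the same way the novel locus stands out against PC4 in the figure.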
Fig. 1 | Inclusion of multi-ethnic samples enables discovery and replication in GWAS.
a, The population substructure present in the multi-ethnic sample of PAGE (n = 49,839) revealed complex patterns preventing meaningful stratification. Here we show that PC1 and PC2 show major patterns of variation, stratified by self-identified race/ethnicity. Individuals denoted by orange self-identified as ‘Other’. b, There are 8,979 previously reported trait–variant pairs, of which 1,444 replicated at a by-trait Bonferroni-adjusted significance level for P values estimated from a Wald test in SUGEN. In addition, we found 27 novel trait–variant pairs and 38 secondary signal pairs that remained after adjusting for known variants. BMI, body-mass index; eGFR, estimated glomerular filtration rate; HbA1c, glycated haemoglobin; HDL, high-density lipoprotein; LDL, low-density lipoprotein; MCHC, mean corpuscular haemoglobin concentration; WHR, waist-to-hip ratio.
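As a rough illustration of the replication criterion in panel b, the sketch below applies a by-trait Bonferroni threshold. It assumes the threshold is 0.05 divided by the number of previously reported variants tested for that trait, which is one reading of "by-trait Bonferroni-adjusted" and is not confirmed by the caption. The function name, variant labels and P values are hypothetical.

```python
from collections import Counter

def replicated_pairs(pairs, alpha=0.05):
    """Return (trait, variant) pairs whose P value passes a per-trait Bonferroni cut-off.

    pairs: iterable of (trait, variant, p_value) tuples.
    """
    pairs = list(pairs)
    # Number of tests carried out for each trait (assumed correction factor).
    tests_per_trait = Counter(trait for trait, _, _ in pairs)
    return [
        (trait, variant)
        for trait, variant, p_value in pairs
        if p_value < alpha / tests_per_trait[trait]
    ]

# Hypothetical trait–variant pairs with made-up P values.
toy_pairs = [
    ("height", "rsA", 0.001),
    ("height", "rsB", 0.03),
    ("BMI", "rsC", 0.04),
]
result = replicated_pairs(toy_pairs)
print(result)
```

With two tests for height the cut-off is 0.025, so only the first height pair passes; the single BMI pair is compared against 0.05.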
Fig. 2 | Weaker effect sizes of previously published trait–variant associations in non-European populations exacerbate disparity in PVE.
a, Standardized effect sizes for the two largest self-reported subsets of the PAGE population are weaker than the effect sizes originally reported in the NHGRI-EBI GWAS Catalog, markedly so in African Americans (z′_PAGE = 0.54 × z′_prior; yellow) and less so in Hispanic/Latino participants (z′_PAGE = 0.86 × z′_prior; red). Here z′ is the z-score from the trait–variant association standardized by the sample size in PAGE or in the ‘prior’ publication from the NHGRI-EBI GWAS Catalog. Grey shading indicates the 95% confidence interval around the slope estimate. b, After identifying the SNP with the smallest P value in each locus, the PVE of height was calculated using the estimated effect size from this set of tag SNPs (left, GIANT-only GWAS; middle, UKB50k + GIANT meta-analysis; right, PAGE + GIANT meta-analysis). PVE was estimated independently in the UKB50k (White British) and PAGE (multi-ethnic) samples. The gap in PVE with previously reported loci from GIANT (8.14%) is exacerbated with the inclusion of 50,000 more individuals of European descent, to 11.19%. However, it narrows markedly with the inclusion of 50,000 multi-ethnic samples, to 3.91%.
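The slope comparison in panel a can be sketched as follows. Two assumptions are made that the caption does not spell out: that "standardized by the sample size" means dividing the z-score by the square root of n, and that the slope is fitted by least squares through the origin (the reported equations have no intercept). All function names and numbers are hypothetical.

```python
import numpy as np

def standardized_z(beta, se, n):
    """z-score (beta / se) divided by sqrt(n); an assumed form of z′."""
    z = np.asarray(beta, dtype=float) / np.asarray(se, dtype=float)
    return z / np.sqrt(n)

def slope_through_origin(x, y):
    """Least-squares slope b for the no-intercept model y = b * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum(x * y) / np.sum(x * x))

# Synthetic effect estimates for three variants in a "prior" and a "new" study.
z_prior = standardized_z([0.10, -0.20, 0.30], [0.01, 0.02, 0.03], n=10_000)
z_new = standardized_z([0.05, -0.10, 0.15], [0.02, 0.04, 0.06], n=2_500)

slope = slope_through_origin(z_prior, z_new)
print(round(slope, 3))
```

A slope below 1 means the standardized effects in the new sample are weaker than those originally reported, which is the pattern the figure describes for both PAGE subsets.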
Fig. 3 | Fine-mapping with multi-ethnic PAGE versus homogeneous UK Biobank samples for height.
a, Comparison of 95% credible sets for height, comparing GIANT alone (n = 253,288) to UKB50k + GIANT (n = 303,288; paired-sample t-test P = 0.37) and PAGE + GIANT (n = 303,069; paired-sample t-test P = 0.01). Box plots show the median as the line in the notch, with the top and bottom of the box indicating the interquartile range. Whiskers extend to either the minimum value or 1.5 × the interquartile range. Notches indicate the 95% confidence interval of the medians. b, Top posterior probability from each 95% credible set for height, comparing GIANT (n = 253,288) to UKB50k + GIANT (n = 303,288) and PAGE + GIANT (n = 303,069). c, Example of results for a height locus from GWAS (rs11880992) in UKB50k + GIANT (n = 303,288) and PAGE + GIANT (n = 303,069), with linkage disequilibrium from the weighted matrix from the meta-analysis. d, Posterior probabilities for this signal, with the credible set indicated by the diamond shapes. e, Linkage disequilibrium (r²) for the original 95% credible set from GIANT results, stratified by population. The index association SNP (rs11880992) with the highest posterior probability is denoted in bold.
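For readers unfamiliar with credible sets, the sketch below shows the usual construction: rank the variants at a locus by posterior probability and keep the smallest group whose cumulative probability reaches 95%. This is a generic illustration under that assumption, not the fine-mapping software used in the study; the function name and probabilities are hypothetical.

```python
import numpy as np

def credible_set(posteriors, coverage=0.95):
    """Indices of the smallest set of variants whose posteriors sum to >= coverage."""
    p = np.asarray(posteriors, dtype=float)
    p = p / p.sum()                     # normalise within the locus
    order = np.argsort(p)[::-1]         # most probable variant first
    cumulative = np.cumsum(p[order])
    # First position at which the running total reaches the coverage level.
    size = int(np.searchsorted(cumulative, coverage)) + 1
    return order[:size].tolist()

# Made-up posterior probabilities for four variants at one locus.
broad = credible_set([0.04, 0.60, 0.06, 0.30])
narrow = credible_set([0.97, 0.02, 0.01])
print(len(broad), len(narrow))
```

A smaller set with a higher top posterior probability corresponds to better resolution, which is the comparison panels a and b make between GIANT alone, UKB50k + GIANT and PAGE + GIANT.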

References

    1. Need AC & Goldstein DB. Next generation disparities in human genomics: concerns and remedies. Trends Genet. 25, 489–494 (2009).
    2. Bustamante CD, Burchard EG & De La Vega FM. Genomics for the world. Nature 475, 163–165 (2011).
    3. Popejoy AB & Fullerton SM. Genomics is failing on diversity. Nature 538, 161–164 (2016).
    4. Gravel S et al. Demographic history and rare allele sharing among human populations. Proc. Natl Acad. Sci. USA 108, 11983–11988 (2011).
    5. The SIGMA Type 2 Diabetes Consortium. Association of a low-frequency variant in HNF1A with type 2 diabetes in a Latino population. J. Am. Med. Assoc. 311, 2305–2314 (2014).
