Analysis of family- and population-based samples in cohort genome-wide association studies

Manichaikul, Ani; Chen, Wei-Min; Williams, Kayleen; Wong, Quenna; Sale, Michèle M.; Pankow, James S.; Tsai, Michael Y.; Rotter, Jerome I.; Rich, Stephen S.; Mychaleckyj, Josyf C.

doi:10.1007/s00439-011-1071-0

Original Investigation
Published: 30 July 2011

Human Genetics, volume 131, pages 275–287 (2012)

Ani Manichaikul¹,²
Wei-Min Chen¹,²
Kayleen Williams³
Quenna Wong³
Michèle M. Sale¹,⁴
James S. Pankow⁵
Michael Y. Tsai⁶
Jerome I. Rotter⁷
Stephen S. Rich¹
Josyf C. Mychaleckyj¹

Abstract

Cohort studies typically sample unrelated individuals from a population, although family members of index cases may also be recruited to investigate shared familial risk factors. Recruitment of family members may be incomplete or ancillary to the main cohort, resulting in a mixed sample of independent family units, including unrelated singletons and multiplex families. Multiple methods are available to perform genome-wide association (GWA) analysis of binary or continuous traits in families, but it is unclear whether methods known to perform well on ascertained pedigrees, sibships, or trios are appropriate for analysis of a mixed unrelated cohort and family sample. We present simulation studies based on Multi-Ethnic Study of Atherosclerosis (MESA) pedigree structures to compare the performance of several popular methods of GWA analysis for both quantitative and dichotomous traits in cohort studies. We evaluate approaches suitable for analysis of families, and combine the best-performing methods with population-based samples either by meta-analysis or by pooled analysis of family- and population-based samples (mega-analysis), comparing type I error and power. We further assess practical considerations, such as availability of software and ability to incorporate covariates in statistical modeling, and demonstrate our recommended approaches through quantitative and binary trait analysis of HDL cholesterol (HDL-C) in 2,553 MESA family- and population-based African-American samples. Our results suggest that linear modeling approaches that accommodate family-induced phenotypic correlation (e.g., a variance-component model for quantitative traits or generalized estimating equations for dichotomous traits) perform best in the context of combined family- and population-based cohort GWAS.
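
To make the meta-analysis option concrete, the following is a minimal sketch of one common way to combine a per-SNP effect estimate from the family-based subsample with one from the population-based subsample: fixed-effects inverse-variance weighting. The abstract does not state which weighting scheme the authors used, so the scheme, the function name `inverse_variance_meta`, and the input numbers are all assumptions for illustration, not results from the paper. A mega-analysis would instead fit a single model to the pooled individual-level data.

```python
import math


def inverse_variance_meta(estimates):
    """Combine (beta, se) pairs with fixed-effects inverse-variance weights.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if not estimates:
        raise ValueError("need at least one (beta, se) pair")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se * se) for _, se in estimates]
    total_weight = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    # Two-sided p-value under a standard normal reference distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Hypothetical per-SNP results (effect estimate, standard error).
family_result = (0.12, 0.05)      # e.g. from a model fitted to the families
population_result = (0.08, 0.04)  # e.g. from regression in unrelated samples

beta, se, z, p = inverse_variance_meta([family_result, population_result])
print(f"beta={beta:.4f} se={se:.4f} z={z:.2f} p={p:.3g}")
```

The subsample with the smaller standard error receives the larger weight, so the combined estimate sits closer to it. This relies on the two subsamples being independent, which matches the abstract's description of independent family units.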



Acknowledgments

MESA and the MESA SHARe project are conducted and supported by contracts N01-HC-95159 through N01-HC-95169 and RR-024156 from the National Heart, Lung, and Blood Institute (NHLBI). Funding for MESA SHARe genotyping was provided by NHLBI Contract N02-HL-6-4278. MESA Family is conducted and supported in collaboration with MESA investigators; support is provided by grants and contracts R01HL071051, R01HL071205, R01HL071250, R01HL071251, R01HL071252, R01HL071258, and R01HL071259. The authors thank the participants of the MESA study, the Coordinating Center, MESA investigators, and study staff for their valuable contributions. A full list of participating MESA investigators and institutions can be found at http://www.mesa-nhlbi.org.

Author information

Authors and Affiliations

Center for Public Health Genomics, University of Virginia, West Complex, 6th Fl, Suite 6111, P.O. Box 800717, Charlottesville, VA, 22908, USA
Ani Manichaikul, Wei-Min Chen, Michèle M. Sale, Stephen S. Rich & Josyf C. Mychaleckyj
Department of Public Health Sciences, Division of Biostatistics and Epidemiology, University of Virginia, Charlottesville, VA, USA
Ani Manichaikul & Wei-Min Chen
Collaborative Health Studies Coordinating Center, University of Washington, Seattle, WA, USA
Kayleen Williams & Quenna Wong
Department of Medicine, University of Virginia, Charlottesville, VA, USA
Michèle M. Sale
Division of Epidemiology and Community Health, University of Minnesota, Minneapolis, MN, USA
James S. Pankow
Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN, USA
Michael Y. Tsai
Medical Genetics Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
Jerome I. Rotter

Corresponding author

Correspondence to Josyf C. Mychaleckyj.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOC 1572 kb)

About this article

Cite this article

Manichaikul, A., Chen, WM., Williams, K. et al. Analysis of family- and population-based samples in cohort genome-wide association studies. Hum Genet 131, 275–287 (2012). https://doi.org/10.1007/s00439-011-1071-0

Received: 23 February 2011
Accepted: 09 July 2011
Published: 30 July 2011
Issue Date: February 2012
DOI: https://doi.org/10.1007/s00439-011-1071-0
