Research Perspective Volume 2, Issue 9 pp 612—620

Joint influence of small-effect genetic variants on human longevity

Anatoliy I. Yashin1,2, Deqing Wu1, Konstantin G. Arbeev1, Svetlana V. Ukraintseva1,2

  • 1 Center for Population Health and Aging, Duke University, Durham, NC 27708-0408, USA
  • 2 Duke Comprehensive Cancer Center, Duke University, Durham, NC 27708-0408, USA

Received: August 9, 2010       Accepted: August 25, 2010       Published: August 26, 2010      

https://doi.org/10.18632/aging.100191
Copyright: © 2010 Yashin et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract

The results of genome-wide association studies of complex traits, such as life span or age at onset of chronic disease, suggest that such traits are typically affected by a large number of small-effect alleles. Individually, such alleles have little predictive value and have therefore usually been excluded from further analyses. The results of our study strongly suggest that alleles with small individual effects on longevity may jointly influence life span, and that the resulting influence can be both substantial and significant. We show that this joint influence can be described by a relatively simple “genetic dose - phenotypic response” relationship.

Genome-wide association studies (GWAS) were introduced to perform exhaustive analyses of genetic influence on complex traits. A number of recent publications emphasize that the approach did not entirely meet expectations: although GWAS provided important insights into the genetics of particular disorders [1], they failed to detect a major portion of the genetic influence on traits of interest [1-5]. In most cases, the genetic variants found in GWAS cannot explain the heritability estimates calculated for such traits in the pre-genomic era. An important conclusion that emerged from many such studies is that complex traits are typically affected by a large number of common alleles, each of little predictive value and with a small or statistically non-significant effect [1-5]. A recent suggestion to focus on the search for rare alleles with significant phenotypic effects in small population subgroups [6] requires new SNP data with minor allele frequencies (MAF) below 1%. (Traditional GWAS deal with MAF >1%.) More results could be obtained by sequencing selected areas of the genome [7,8].

In this paper we show that an extended approach to GWAS makes it possible to address the issue of lost genetic influence on complex traits by analysing regularities in the joint action of many small-effect, low-significance alleles. Using longevity as an example, we show that the results of our analyses bring important insights into the mechanisms of genetic regulation of this trait. We hypothesized that the value of a complex trait (life span) depends on the number of small-effect “longevity” alleles contained in an individual's genome, and tested this hypothesis using genome-wide data on 550K SNPs from the original cohort of the Framingham Heart Study (FHS). The results show that the joint influence of small-effect alleles on life span is both significant and substantial and can be described as a “genetic dose - phenotypic response” relationship. The existence of such a relationship brings a new perspective to GWAS of complex traits and can at least partly justify the sizable efforts and resources that have recently been invested in GWAS.

We evaluated associations between 550,000 SNPs and life span in 1,173 genotyped participants of the Framingham Heart Study (FHS) original cohort. After performing a standard quality control procedure [9] (call rate ≥80%; MAF >1%; HWE >10⁻⁷), for each SNP we estimated the parameters of a linear regression model that treats an individual's life span as a function of SNP genotype (categorical variable), coded “0” for homozygotes for the major allele, “1” for heterozygotes, and “2” for homozygotes for the minor allele. SAS PROC REG (© SAS Institute, Inc.) was used for this purpose. The SNPs for which the estimate of the slope parameter was positive and had p≤10⁻⁶ were selected as “longevity” SNPs. Note that this threshold is larger than the 10⁻⁷ used in traditional GWAS with correction for multiple comparisons in data samples of similar size. This procedure resulted in the selection of 169 “longevity” SNPs.
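As a rough illustration of the selection step described above, the following Python sketch fits a simple linear regression of life span on the 0/1/2 genotype code for each SNP and keeps the SNPs with a positive slope and a p-value at or below the chosen threshold. It is not the SAS PROC REG analysis used in the study: the function names, the dictionary layout of the genotype data, and the normal approximation to the p-value (reasonable only for large samples) are all assumptions made for the example.

```python
import math


def slope_and_p(genotypes, life_spans):
    """OLS slope of life span on genotype code (0, 1, 2) and a two-sided p-value.

    Assumption: the p-value uses a normal approximation to the t statistic,
    which is only adequate when the sample is large.
    """
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(life_spans) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    if sxx == 0:
        return 0.0, 1.0  # monomorphic SNP: no slope can be estimated
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, life_spans))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_g
    rss = sum((y - intercept - slope * g) ** 2 for g, y in zip(genotypes, life_spans))
    se = math.sqrt(rss / (n - 2) / sxx)
    if se == 0:
        return slope, 0.0  # perfect fit
    z = slope / se
    return slope, math.erfc(abs(z) / math.sqrt(2))


def select_longevity_snps(genotype_table, life_spans, p_threshold=1e-6):
    """Keep SNPs whose slope is positive and whose p-value is <= p_threshold.

    genotype_table is a hypothetical layout: SNP id -> list of 0/1/2 codes,
    one entry per person, in the same order as life_spans.
    """
    selected = []
    for snp_id, genotypes in genotype_table.items():
        slope, p = slope_and_p(genotypes, life_spans)
        if slope > 0 and p <= p_threshold:
            selected.append(snp_id)
    return selected
```

With real data, the genotype table would hold one list per SNP that passed quality control, and the threshold would be set to the value used in the text.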

To evaluate the joint effect of genetic variants on life span, we calculated the number of longevity SNPs (from the selected set of 169 SNPs) contained in the genome of each individual in the study and performed regression analyses treating life span as a linear function of the number of longevity SNPs contained in a person's genome. The estimates of both the intercept and the slope were positive and highly statistically significant (Figure 1).
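The “genetic dose” regression can be sketched in a few lines of Python. The sketch assumes that the dose is the sum of the 0/1/2 genotype codes over the selected SNPs, which is one possible reading of “number of longevity SNPs contained in a person's genome”; the function names and data layout are hypothetical, and plain ordinary least squares is used without the heteroscedasticity correction mentioned in the figure captions.

```python
def genetic_dose(genotype_table, longevity_snps):
    """Per-person sum of genotype codes (0, 1, 2) over the selected SNPs.

    genotype_table: SNP id -> list of codes, one entry per person (assumed layout).
    """
    n_people = len(next(iter(genotype_table.values())))
    return [
        sum(genotype_table[snp][i] for snp in longevity_snps)
        for i in range(n_people)
    ]


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (intercept, slope, r_squared), where r_squared is the share of
    variance in y explained by the fitted line.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    tss = sum((yi - mean_y) ** 2 for yi in y)
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, 1.0 - rss / tss
```

Here `r_squared` plays the role of the “explained variance” quoted in the text, under the stated assumptions.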

Figure 1. The “genetic dose - phenotypic response” relationship between the numbers of selected 169 longevity alleles contained in individuals' genome and mean life span obtained in the analyses of 550K SNP data on participants of the original FHS cohort. Regression analyses were performed using SAS PROC REG (© SAS Institute, Inc.) with correction for heteroscedasticity.

The estimated dependence explained 21% of the variance in life span. This estimate seems reasonable given that the narrow-sense heritability of life span is estimated at about 25% [10]. The estimated relationship between life span and the number of longevity SNPs shown in Figure 1 is the main result of this paper. It shows that in studies of the genetic determinants of longevity, the joint influence of many small-effect genetic variants may be substantial. We suggest that a similar “genetic dose” - “phenotypic response” relationship is likely to characterize genetic influence on many other complex traits.

Two aspects of the analyses require additional testing. The first is the use of data on all genotyped individuals from the original FHS cohort, which includes first-degree relatives from 618 families. The second is the fact that the two procedures, (i) selection of longevity SNPs and (ii) testing for their joint influence on life span, used data on the same individuals. To check whether excluding relatives from the list of study subjects modifies the results, we randomly selected 618 individuals, one from each family, identified a set of “longevity” SNPs using the procedure described above, and estimated the dependence of life span on the number of selected longevity SNPs in these individuals. To diminish the effect of sampling, we repeated this procedure 10 times. In each such analysis, the estimates of the slope and intercept were positive and highly statistically significant, with p≤10⁻¹⁹. These results suggest that the conclusion about the joint influence of longevity SNPs on life span does not depend on the presence or absence of relatives among the study subjects. To take into account the variants selected in each experiment, we took the union of the sets of longevity SNPs selected in the 10 experiments. This procedure resulted in a set of 70 genetic variants. Note that the reduction in the number of study subjects (because of excluding genetically dependent individuals) increases the chances of selecting false-positive variants. To diminish the number of such variants, we intersected the set of 70 SNPs with the set of 169 SNPs selected earlier using data on the entire FHS cohort. This procedure resulted in 39 longevity SNPs.
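The resampling logic of this paragraph (one random person per family, repeated selection, union across repeats, intersection with the full-cohort set) can be sketched as follows. This is a minimal illustration rather than the code used in the study: `select` is a hypothetical callable that takes a list of person ids and returns the longevity SNP ids selected on those people, and the mapping `family_of` and the seed are likewise assumptions.

```python
import random


def one_per_family(family_of, rng):
    """Pick one person at random from each family.

    family_of: person id -> family id (assumed layout).
    """
    members = {}
    for person, family in sorted(family_of.items()):
        members.setdefault(family, []).append(person)
    return [rng.choice(people) for _, people in sorted(members.items())]


def consensus_snps(select, family_of, full_cohort_snps, repeats=10, seed=0):
    """Union of SNPs selected over repeated unrelated samples,
    intersected with the set selected on the full cohort."""
    rng = random.Random(seed)
    union = set()
    for _ in range(repeats):
        sample = one_per_family(family_of, rng)
        union |= set(select(sample))  # accumulate SNPs found in any repeat
    return union & set(full_cohort_snps)
```

In the terms of the text, the union corresponds to the set of 70 variants and the returned intersection to the 39 longevity SNPs.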

This set of 39 SNPs was then used in regression analyses in which life span was treated as a linear function of the number of longevity SNPs contained in a person's genome. The result is shown in Figure 2.

Figure 2. The “genetic dose - phenotypic response” relationship between the numbers of selected 39 longevity alleles contained in individuals' genome and mean life span obtained in the analyses of 550K SNP data on participants of the original FHS cohort. Regression analyses were performed using SAS PROC REG (© SAS Institute, Inc.) with correction for heteroscedasticity.

One can see from this figure that the estimates of both the intercept and the slope are statistically significant. Figure 3 shows no dependence of life span on the number of SNPs taken randomly from the pool of SNPs excluding the 39 selected longevity SNPs.

Figure 3. The absence of dependence between the numbers of randomly selected 39 genetic variants contained in individuals' genome and life span. These genetic variants were randomly selected from the same pool of SNPs excluding longevity alleles. Regression analyses were performed using SAS PROC REG (© SAS Institute, Inc.) with correction for heteroscedasticity.
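A negative control of this kind amounts to drawing the same number of SNPs at random from the pool that excludes the selected longevity SNPs. The short Python sketch below illustrates only the sampling step; the function name, the list-based inputs, and the fixed seed are assumptions, not details taken from the study.

```python
import random


def random_control_snps(all_snps, longevity_snps, k, seed=0):
    """Draw k distinct SNP ids at random from the pool without longevity SNPs."""
    pool = sorted(set(all_snps) - set(longevity_snps))
    return random.Random(seed).sample(pool, k)
```

The resulting control set can then be passed through the same dose-response regression as the longevity SNPs to check that no comparable slope appears.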

The analyses showed that the estimates of both the intercept and the slope are highly statistically significant. The estimated dependence of life span on genes explains 19% of the variance in life span, which is close to the 21% estimated earlier. Thus, the presence of relatives in the population used for selecting longevity SNPs does not affect the conclusion about the presence of a “genetic dose” - “phenotypic response” relationship. The fact that the 39 selected SNPs explained almost the same percentage of life span variance as the 169 SNPs selected earlier (19% vs 21%) indicates that this set of SNPs deserves further analysis. Table 1 shows how the selected SNPs are related to known genes.

Table 1.

Summary characteristics of the 39 SNPs revealed in the study and gene/protein functions for closest genes (known or suggested).

SNP rs# | Chr # | Position | Ancestral allele | Type | Distance to gene | Closest gene | Gene full name | Gene/protein function
rs2031577 | 10 | 4050003 | G | INTERGENIC | -17129 | RP11-433J20.2 | H. sapiens chr 10 clone RP11-433J20 |
rs6489785 | 12 | 121363724 | C | INTERGENIC | -52622 | HNF1A (TCF1) | HNF1 homeobox A | liver transcription factor
rs3847687 | 12 | 131525053 | T | INTRONIC | 0 | GPR133 | G protein-coupled receptor 133 | transmembrane signal transducer; activates G proteins within cell
rs4891159 | 18 | 74101941 | G | INTRONIC | 0 | ZNF516 | zinc finger protein 516 | the part of transcription factors
rs10445407 | 17 | 79261809 | A | INTRONIC | 0 | SLC38A10 | solute carrier family 38, member 10 | amino acid transporter
rs4745062 | 9 | 73784264 | C | INTRONIC | 0 | TRPM3 | transient receptor potential channel | mediates calcium entry potentiated by calcium store depletion
rs2024714 | 20 | 60212494 | C | INTRONIC | 0 | CDH4 | R-cadherin (retinal) | calcium-dependent cell-cell adhesion
rs7315621 | 12 | 132085196 | G | INTERGENIC | -60412 | AC117500.2 | |
rs16975963 | 19 | 38325536 | G | NON CODING GENE | 0 | AC016582.2 | |
rs4732038 | 7 | 134250322 | C | INTRONIC | 0 | AKR1B15 | aldo-keto reductase family 1, member B15 | superfamily of reductases that reduce aldehydes and ketones to alcohols
rs2516739 | 16 | 2097158 | N/A | INTRONIC | 0 | NTHL1 | nth endonuclease III-like 1 | base excision repair; DNA N-glycosylase of the endonuclease III family
rs7874142 | 9 | 137704782 | A | INTRONIC | 0 | COL5A1 | collagen, type V, alpha 1 | regulates the assembly of heterotypic fibers in tissues
rs4468878 | 20 | 59928237 | C | INTRONIC | 0 | AL365229.1 | near CDH4 | possibly cell-cell adhesion
rs13008689 | 2 | 8530256 | G | INTERGENIC | -153466 | AC011747.3 | |
rs2273 | 4 | 76889388 | C | INTRONIC | 0 | SDAD1 | SDA1 domain containing protein | preferentially expressed in fetal tissues
rs2882281 | 13 | 90622455 | C | INTERGENIC | -21630 | RP11-388D4.1 | | locus tag for a pseudogene
rs2282032 | 14 | 90758891 | G | INTRONIC | 0 | C14orf102 | chromosome 14 open reading frame 102 |
rs9876781 | 3 | 48487338 | A | NON CODING GENE | 0 | RP11-24C3.2 | |
rs6568433 | 6 | 106829537 | C | INTERGENIC | -39044 | AL109920.3 | |
rs9517320 | 13 | 99126303 | A | INTRONIC | 0 | STK24 | serine/threonine kinase 24 | participates in the mitogen-activated protein kinase (MAPK) cascade
rs4148546 | 13 | 95680285 | G | INTRONIC | 0 | ABCC4 | ATP-binding cassette, sub-family C (CFTR/MRP) | ATP-binding cassette (ABC) transporter
rs9592783 | 13 | 71883214 | G | INTERGENIC | -128884 | DACH1 | dachshund homolog 1 (Drosophila) | a chromatin-associated protein that regulates gene expression and cell fate; highly conserved
rs739401 | 11 | 3036324 | T | INTRONIC | 0 | CARS | cysteinyl-tRNA synthetase | catalyzes the aminoacylation of a tRNA
rs10256972 | 7 | 1039003 | C | INTRONIC | 0 | C7orf50 | chromosome 7 open reading frame 50 |
rs3212335 | 15 | 27012141 | C | INTRONIC | 0 | GABRB3 | GABA A receptor, beta | ionic channel family that serves as the receptor for GABA; may be associated with memory
rs6915183 | 6 | 166706169 | G | INTERGENIC | -12999 | PRR18 | proline rich 18 |
rs4721135 | 7 | 1912222 | G | INTRONIC | 0 | MAD1L1 | MAD1 mitotic arrest deficient-like 1 | component of the mitotic spindle-assembly checkpoint
rs3106598 | 13 | 61678912 | G | INTERGENIC | -304909 | PCDH20 | protocadherin 20 | transmembrane receptor, a role in specific cell-cell connections in the brain
rs1356888 | 2 | 50516018 | C | INTRONIC | 0 | NRXN1 | | cell adhesion in nervous system
rs9616906 | 22 | 51104680 | G | UPSTREAM | -3552 | AC000050.2 | |
rs13053175 | 22 | 37613309 | T | UPSTREAM | -7992 | RAC2 | ras-related C3 botulinum toxin substrate 2 | GTPase of the RAS superfamily regulating cell growth, cytoskeleton, and protein kinase activation
rs5766691 | 22 | 47532396 | G | INTRONIC | 0 | TBC1D22A | TBC1 domain family |
rs13118159 | 4 | 1365127 | N/A | INTRONIC | 0 | RP11-1244E8.1 | |
rs7168365 | 15 | 53805825 | C | DOWNSTREAM | -113 | WDR72 | WD repeat domain 72 |
rs7493138 | 14 | 29021928 | C | INTERGENIC | -213122 | FOXG1 | forkhead box G1 | transcription factors
rs432203 | 2 | 70764688 | A | INTRONIC | 0 | TGFA | transforming growth factor, alpha | competes with EGF for binding to the EGF receptor
rs6813479 | 4 | 137660383 | A | INTERGENIC | -57494 | RP11-138I17.1 | |
rs1327533 | 9 | 113131163 | T | INTRONIC | 0 | SVEP1 | EGF and pentraxin domain containing 1 |
rs2826891 | 21 | 22910116 | T | INTRONIC | 0 | NCAM2 | neural cell adhesion molecule 2 | brain protein, superfamily of the immunoglobulin
*Enrichment with genes related to cell-cell adhesion can be noticed. Since cell-cell adhesion proteins play a crucial role in cell sensitivity to contact inhibition, and because insensitivity to contact inhibition is critical for cancer development, especially for the manifestation of invasion and metastasis, we speculate that this enrichment might be linked to a higher resistance to cancer among long-living individuals.

The second aspect mentioned above deals with prediction and replication. If the procedures described above do select longevity variants, and if the detected pattern of joint influence of such variants on life span is a property of a biological mechanism, then genetic variants selected using data on one population should be able to predict life spans in another, genetically independent population of individuals who experienced similar environmental and living conditions. To test this, we randomly divided all 618 families into two groups. Data on individuals from the first 309 families, plus data on 162 individuals with missing family identities, were used for selecting SNPs having an effect on life span. Then, for each individual in the second (genetically independent) group, we identified the number of such SNPs contained in the person's genome. We estimated the parameters of the linear regression model treating life span as a function of the number of longevity variants contained in the genomes of individuals from the same (first) group and from the second (independent) group. To replicate the result, longevity SNPs selected from data on the second population were used for evaluating the linear “genetic dose” - “life span response” relationship on the same population, as well as on the first population of individuals, which is genetically independent of the second one. To reduce the sampling effect, the procedure of randomly dividing the 618 families into two groups, selecting longevity variants, and estimating the regression coefficients in the “genetic dose - phenotypic response” relationship was repeated 10 times. The results are shown in Table 2.

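# | N1 | N2 | N1SNP | N2SNP | α1 | α1* | α2 | α2*
1 | 661 | 512 | 52 | 8 | 0.30 | 0.26 | 0.14 | 0.16
2 | 689 | 484 | 40 | 9 | 0.42 | 0.35 | 0.21 | 0.23
3 | 627 | 546 | 18 | 43 | 0.87 | 0.80 | 0.28 | 0.22
4 | 677 | 496 | 20 | 25 | 0.67 | 0.63 | 0.56 | 0.49
5 | 680 | 493 | 34 | 16 | 0.47 | 0.41 | 0.33 | 0.31
6 | 630 | 543 | 32 | 22 | 0.48 | 0.39 | 0.46 | 0.48
7 | 631 | 542 | 43 | 15 | 0.40 | 0.31 | 0.22 | 0.27
8 | 658 | 515 | 14 | 39 | 0.86 | 0.99 | 0.33 | 0.25
9 | 647 | 526 | 31 | 18 | 0.48 | 0.38 | 0.38 | 0.42
10 | 672 | 501 | 37 | 10 | 0.44 | 0.37 | 0.24 | 0.27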
Table 2. The results of 10 experiments in which genetic variants individually affecting life span (longevity SNPs) were selected twice using data on two populations representing genetically independent genotyped individuals in the original Framingham Heart Study (FHS) cohort for whom life span data are available. The longevity SNPs selected from data on the first population were used for evaluating linear “genetic dose” - “life span response” relationship on the same population, as well as on the second population of individuals. In turn, longevity SNPs selected from data on the second population were used for evaluating linear “genetic dose” - “life span response” relationship on the same population, as well as on the first population of individuals. Column “#” shows experiment's number. Columns N1 and N2 show the number of individuals in the first and in the second (genetically independent) populations. Columns N1SNP and N2SNP show the number of longevity SNPs selected using data on the first (original) and on the second (rest) populations respectively. Column α1 shows the estimate of the slope of the regression line describing dependence of life span on the number of longevity SNPs contained in the genomes of individuals from the first population. Column α1* shows the estimate of the slope of the regression line describing dependence between life span and the number of selected longevity SNPs contained in genomes of individuals from the second (independent) population. The estimates α1 and α1* use SNPs selected in the analyses of connection between SNPs and life span in the first (original) population. Column α2 shows the estimate of the slope of the regression line describing dependence of life span on the number of longevity SNPs contained in the genomes of individuals from the second population. Column α2* shows the estimate of the slope of the regression line describing dependence between life span and the number of selected longevity SNPs contained in genomes of individuals from the first population. The estimates α2 and α2* use SNPs selected in the analyses of connection between genes and life span in the second (rest) population. All four estimates are highly significant (p<1×10⁻¹⁰).
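The split-sample design behind the four slope estimates can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: `select(people)` and `slope(people, snps)` are hypothetical callables standing in for SNP selection and for the dose-response regression slope, `family_of` maps person ids to family ids, and placing people with a missing family id (`None`) in the first group follows the description in the text.

```python
import random


def split_families(family_of, rng):
    """Randomly divide families into two halves.

    People whose family id is None (missing) are assigned to the first group.
    """
    families = sorted({f for f in family_of.values() if f is not None})
    rng.shuffle(families)
    first_families = set(families[: len(families) // 2])
    first, second = [], []
    for person, family in sorted(family_of.items()):
        if family is None or family in first_families:
            first.append(person)
        else:
            second.append(person)
    return first, second


def cross_validate(select, slope, family_of, repeats=10, seed=0):
    """Repeat the split, select SNPs in each group, and estimate slopes
    both in the group used for selection and in the independent group."""
    rng = random.Random(seed)
    rows = []
    for _ in range(repeats):
        first, second = split_families(family_of, rng)
        snps_1 = select(first)   # SNPs selected on the first group
        snps_2 = select(second)  # SNPs selected on the second group
        rows.append({
            "alpha1": slope(first, snps_1),
            "alpha1_star": slope(second, snps_1),
            "alpha2": slope(second, snps_2),
            "alpha2_star": slope(first, snps_2),
        })
    return rows
```

Each returned row corresponds to one experiment, with the starred entries being the out-of-sample slopes.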

One can see from this table that the effect of the number of selected “longevity” SNPs on life span is significant in both groups. These analyses show that the developed approach has predictive power, and that the joint influence of longevity SNPs on life span can be replicated in populations of genetically independent individuals.

Some recent studies provide arguments that the Bonferroni corrections for multiple comparisons traditionally used in GWAS are too rigid and should be relaxed [11]. The results of this study support this view: many genetic variants involved in the “genetic dose - phenotypic response” relationship would not be selected by traditional GWAS methods. We found that relaxing the procedure for selecting longevity alleles (using the selection threshold p≤10⁻⁶ instead of p≤10⁻⁷) increases the number of selected longevity SNPs having small effects and improves the fit of the life span data by the “number of longevity SNPs - life span” curve. This suggests that taking the effect size of alleles into account may help reveal additional features of genetic influence on complex traits.

The possibility of using a straight line to approximate the “number of longevity SNPs - life span” relationship indicates the presence of a substantial additive component in the genetic contribution to longevity. It is relevant to note that additive genetic effects were the subject of numerous studies in quantitative genetics of the pre-genomic era. Many genetic calculations (e.g., estimates of narrow-sense heritability of complex traits) were based on the assumption that the genetic component of phenotypic variation is additive. The availability of genome-wide data now allows such effects to be evaluated directly. Moreover, evaluating non-additive (non-linear) joint genetic influence (epistasis) also becomes possible with the use of more sophisticated patterns of the “dose - response” relationship.

While the replication of findings has become a standard requirement in GWAS, the results of our analyses suggest that in studying the joint effect of many alleles this practice needs to be revised. Our analyses show that one should not expect exactly the same sets of genetic variants to contribute to the “genetic dose - phenotypic response” relationship evaluated using data on another population. One reason for this may be gene-environment interaction: differences in populations' exposure to external conditions are likely to produce differences in the genetic regulation of the trait in these populations. Identification of genetic variants “sensitive” to specific external signals will open new opportunities for studying the role of genetic and non-genetic factors in complex traits.

Acknowledgments

The FHS project is conducted and supported by the NHLBI in collaboration with Boston University (N01 HC25195). The FHS data used for the analyses were obtained through dbGaP (phs000007.v3.p2). The authors acknowledge the investigators that contributed the phenotype and genotype data for this study. This manuscript was not prepared in collaboration with investigators of the FHS and does not necessarily reflect the opinions or views of the FHS, Boston University, or the NHLBI. This work was partly supported by NIH/NIA grant R01AG030612.

Conflicts of Interest

The authors of this manuscript have no conflicts of interest to declare.

References

  • 1. Hardy J and Singleton A. Genomewide Association Studies and Human Disease. New Engl J Med. 2009; 360:1759-1768.
  • 2. Manolio TA, Collins FS, Cox NJ, Goldstein DB, Hindorff LA, Hunter DJ, McCarthy MI, Ramos EM, Cardon LR, Chakravarti A, Cho JH, Guttmacher AE, Kong A, Kruglyak L, Mardis E, Rotimi CN, Slatkin M, Valle D, Whittemore AS, Boehnke M, Clark AG, Eichler EE, Gibson G, Haines JL, Mackay TFC, McCarroll SA, Visscher PM. Finding the missing heritability of complex diseases. Nature. 2009; 461:747-753.
  • 3. Slatkin M. Epigenetic Inheritance and the Missing Heritability Problem. Genetics. 2009; 182:845-850.
  • 4. Visscher PM, Hill WG, Wray NR. Heritability in the genomics era - concepts and misconceptions. Nat Rev Genet. 2008; 9:255-266.
  • 5. Maher B. Personal genomes: The case of the missing heritability. Nature. 2008; 456:18-21.
  • 6. Goldstein DB. Common Genetic Variation and Human Traits. New Engl J Med. 2009; 360:1696-1698.
  • 7. Gravina S, Lescai F, Hurteau G, Brock GJ, Saramaki A, Salvioli S, Franceschi C, Roninson IB. Identification of single nucleotide polymorphisms in the p21 (CDKN1A) gene and correlations with longevity in the Italian population. Aging. 2009; 5:470-80.
  • 8. Vijg J. SNP'ing for longevity. Aging. 2009; 5:442-443.
  • 9. Lunetta KL, D'Agostino RB Sr, Karasik D, Benjamin EJ, Guo C-Y, Govindaraju R, Kiel DP, Kelly-Hayes M, Massaro JM, Pencina MJ, Seshadri S, Murabito JM. Genetic correlates of longevity and selected age-related phenotypes: a genome-wide association study in the Framingham Study. BMC Med Genet. 2007; 8:S13.
  • 10. Herskind AM, McGue M, Holm NV, Sorensen TIA, Harvald B, Vaupel JW. The heritability of human longevity: A population-based study of 2872 Danish twin pairs born 1870‒1900. Hum Genet. 1996; 97:319-323.
  • 11. Nyholt DR. A simple correction for multiple testing for single-nucleotide polymorphisms in linkage disequilibrium with each other. Am J Hum Genet. 2004; 74:765-769.