The hypothesis that genes involved in type 2 diabetes evolved under changing selective pressures and may carry the signature of natural selection motivated us to describe patterns of sequence variation at CAPN10 and GPR35 in human populations. An initial population genetics study reported an unusually large difference in allele frequencies between African and non-African populations for the CAPN10 variants shown to influence risk of type 2 diabetes (Horikawa et al. 2000; Fullerton et al. 2002); it was proposed that this pattern reflects the impact of population-specific selective pressures.
Here, we follow up on this initial observation by conducting a full resequencing survey of CAPN10 and GPR35, and we use additional aspects of genetic variation to make inferences about the evolutionary history of these genes. The effects of evolutionary forces on patterns of variation are complex; each aspect of variation is shaped by the stochastic effects of drift over time, the demographic history of the population, and natural selection acting on specific loci. As a result, observations made at any one locus cannot disentangle the effects of demography from the effects of natural selection (Hamblin et al. 2002; Akey et al. 2004; Hammer et al. 2004; Stajich and Hahn 2005). To characterize the effects of demography on patterns of sequence variation, we previously resequenced 50 unlinked noncoding regions in three population samples (Hausa of Cameroon, Italians, and Chinese) (Frisse et al. 2001; L.M.F. and A.D., unpublished data). This multilocus data set represents an empirical null distribution that captures the effects of each population's unique demographic history and therefore can point toward unusual observations that may represent the signature of natural selection. Furthermore, these data showed that the sample from Cameroon fits the expectations of a model of long-term constant population size and random mating, but the same model did not fit the non-African samples for multiple aspects of the data (Frisse et al. 2001; Pluzhnikov et al. 2002); these results are in agreement with other multilocus studies of human populations (Akey et al. 2004; Hammer et al. 2004; Stajich and Hahn 2005). Thus, in addition to an empirical comparison, we use the multilocus data set to estimate the parameters of the neutral equilibrium model for the Hausa (i.e., the population mutation rate, θ [= 4Neμ], and the population recombination rate, ρ [= 4Ner]), and we use these estimates to run coalescent simulations aimed at assessing the fit of the CAPN10 data to the model.
Here, we investigate the same population samples, as well as a sample of a native Mexican population (Mazatecans), to show that patterns of variation at the CAPN10 gene do not fit the expectations of the standard neutral model.
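The logic of testing a locus against the neutral equilibrium model by coalescent simulation can be illustrated with a minimal sketch. It is not the simulation procedure used in this study: it assumes no recombination (ρ is ignored), summarizes the data only by the number of segregating sites, and all function names and parameter values are hypothetical. Time is measured in units of 2Ne generations, so that mutations fall on each lineage at rate θ/2.

```python
import random

def poisson(mean, rng):
    """Draw a Poisson count by counting unit-rate arrivals in [0, mean]."""
    count = 0
    t = rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count

def simulate_segregating_sites(n, theta, rng):
    """One neutral coalescent replicate for n chromosomes (no recombination).

    Returns the number of segregating sites under the infinite-sites model.
    """
    total = 0
    for k in range(n, 1, -1):
        # While k lineages remain, the time to the next coalescence is
        # exponential with rate k(k-1)/2 (time in units of 2Ne generations).
        t_k = rng.expovariate(k * (k - 1) / 2.0)
        # Mutations accumulate on k lineages at rate theta/2 per lineage.
        total += poisson(theta / 2.0 * k * t_k, rng)
    return total

def lower_tail_p(observed_s, n, theta, reps=10000, seed=1):
    """Fraction of neutral replicates with as few or fewer segregating sites."""
    rng = random.Random(seed)
    hits = sum(
        simulate_segregating_sites(n, theta, rng) <= observed_s
        for _ in range(reps)
    )
    return hits / reps
```

For example, `lower_tail_p(observed_s=3, n=20, theta=5.0)` would estimate how often a neutral sample of 20 chromosomes shows three or fewer segregating sites when θ = 5; these numbers are purely illustrative and are not taken from the data. A small value would indicate a deficit of polymorphism relative to the neutral expectation.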
Material and Methods
Discussion
Our analysis of two genes, CAPN10 and GPR35, within a positional candidate region for type 2 diabetes identified several interesting patterns that may result from the action of positive natural selection. First, in CAPN10, the haplotype class defined by the derived allele at SNP44, a polymorphism associated with increased diabetes risk, has a significant deficit of polymorphism, as expected if recent positive selection rapidly drove this haplotype class to high frequency. Second, a region of markedly high polymorphism and decay of LD was identified in intron 13 of CAPN10, which is consistent with a model of balancing selection involving multiple alleles. Although additional studies will be necessary to assess whether these findings support the thrifty genotype hypothesis, it is likely that selective pressures acting on this genomic region have changed over the time scale of human evolution and have shaped its patterns of variation.
Of the two potential signatures of selection mentioned above, the low variability associated with the haplotypes carrying the derived allele at SNP44 may be most easily reconciled with selection on variation affecting diabetes risk. Recent meta-analyses have confirmed a role for this variant (or one in perfect LD with it) in diabetes risk. A meta-analysis of >7,000 cases and controls supported an association of the ancestral (C) allele at SNP44 with increased risk of type 2 diabetes, with an odds ratio (OR) of 1.17 and a P value of .0007 (Weedon et al. 2003). A separate meta-analysis of CAPN10 variation and type 2 diabetes reported a significant undertransmission of the derived (T) allele at SNP44 to affected offspring in the pooled sample from three family-based studies, with a pooled OR of 0.66 and a P value of .004 (Song et al. 2004).
Functional studies also suggest a role for SNP44. On the basis of in vitro assays, it was proposed that SNP44 is located within an enhancer element and influences its activity (Horikawa et al. 2000). Although binding assays of nuclear extracts in HepG2 cells showed that the SNP44 polymorphism did not affect binding, a reporter gene assay showed that SNP44, in addition to SNP43, may modulate transcription of CAPN10. These results are compatible with the notion that SNP44 itself was the target of positive selection and one of the causative disease variants; however, the possibility that the signature of selection and the disease-association signal are both due to a polymorphic site in strong LD with SNP44 cannot be excluded. In fact, our data show that the derived allele at SNP44 is part of a long-range haplotype containing several other variants, including the derived allele at Thr504Ala (SNP110), an amino acid replacement in domain III of calpain-10. It is possible that Thr504Ala, alone or in combination with SNP44, was the target of positive natural selection and drove the association in disease-mapping studies. However, Thr504Ala is not the only polymorphism in strong LD with SNP44 that might be functional. For example, in all four population samples, SNP44 is in perfect LD with two polymorphisms in the 5′ UTR, SNP134 (position 17749) and SNP135 (position 17841) (table 2). Notably, at these two sites, it is the ancestral allele that is associated with the derived allele at SNP44. Given the rapid breakdown of LD in intron 13, it is unlikely that the target of selection resides in the unsurveyed region 3′ to CAPN10. Independent data show that LD breaks down (r² ≤ 0.04) in populations from the major ethnic groups between SNP131 (position 4061), in the RNPEPL1 gene 5′ to CAPN10, and SNP66 (position 10676) (L. del Bosque-Plata and G. Hayes, unpublished data); this suggests that at least part of the RNPEPL1 gene can be excluded as the target of selection.
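The LD statistic referred to above, r², can be computed from phased haplotypes as in the following minimal sketch. The function name and the toy haplotypes are hypothetical and are not the study data; alleles are coded 0 (ancestral) and 1 (derived). The example shows that r² equals 1 both when derived alleles travel together and when, as at the 5′ UTR sites, the ancestral allele at one site is always paired with the derived allele at the other.

```python
def ld_r2(haplotypes, i, j):
    """Squared allelic correlation (r^2) between biallelic sites i and j.

    `haplotypes` is a list of phased haplotypes, each a sequence of 0/1
    alleles (0 = ancestral, 1 = derived).
    """
    n = len(haplotypes)
    p_i = sum(h[i] for h in haplotypes) / n   # derived-allele frequency, site i
    p_j = sum(h[j] for h in haplotypes) / n   # derived-allele frequency, site j
    p_ij = sum(1 for h in haplotypes if h[i] == 1 and h[j] == 1) / n
    d = p_ij - p_i * p_j                      # coefficient of disequilibrium D
    denom = p_i * (1 - p_i) * p_j * (1 - p_j)
    if denom == 0:
        raise ValueError("both sites must be polymorphic in the sample")
    return d * d / denom

# Toy data (hypothetical): derived at site 0 always pairs with ancestral
# at site 1, so the two sites are in perfect LD.
toy = [(1, 0), (1, 0), (0, 1), (0, 1), (0, 1)]
perfect = ld_r2(toy, 0, 1)
```

Because r² depends only on the squared disequilibrium, the direction of the association (derived with derived, or derived with ancestral) does not affect its value.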