From the Department of Biological Sciences, Tehran University, Tehran, Iran; the National Institute for Genetic Engineering and Biotechnology, Tehran, Iran; the Stanford Genome Technology Center, Stanford University, Palo Alto, California; Children's Hospital Medical Center, GI Division, Tehran University, Tehran, Iran; Silicon Genetics, South San Francisco, California; Shahid Beheshti University, School of Medicine, Tehran, Iran; Esfahan University of Medical Sciences, Esfahan, Iran; and Amir Kola Children's Hospital, Babol Medical University, Babol, Iran
| Abstract |
|---|
ΔF508 (p.F508del), represented only 16% of the expected mutated alleles. The next most frequent mutations were c.1677del2 (p.515fs) at 7.5%, c.4041C>G (p.N1303K) at 5.6%, c.2183AA>G (p.684fs) at 5%, and c.3661A>T (p.K1177X) at 2.5%. Three of the five most frequent Iranian mutations are not included in a commonly used panel of CF mutations, underscoring the importance of identifying geographic-specific mutations in this population.

| Introduction |
|---|
CF is the most common life-threatening autosomal recessive disease in many Caucasian populations, including those of Europe and the United States. Approximately one in 2000 to 3000 newborns in populations of European ancestry is affected, and the average carrier frequency is about 1:25.2, 3 The median survival age in economically developed countries is about 40 years but is significantly lower in other countries, such as Iran, where the incidence of CF has not been critically assessed.
Publication of the CFTR gene sequence in 1989 gave scientists the first opportunity to understand the molecular basis of cystic fibrosis (CFTR; MIM no. 602421). The p.F508del (ΔF508) allele of the gene is the most common mutation observed worldwide and is probably very old, dating to pre-Neolithic times.2, 4 In Europe, its frequency exhibits a distinct northwest to southeast gradient, ranging from a high of 88% in Denmark to a low of 24% in Turkey.5, 6, 7, 8, 9
There are two competing theories about the origin of this gradient. The first theory suggests that the location with the highest frequency of ΔF508 is where it was first introduced into Europe. The second theory suggests the mutation may confer an advantage to heterozygous carriers, who are less vulnerable to bacterial infections because the CFTR gene product is used as a receptor.10, 11 In this scenario, the mutation would be propagated to different degrees depending on the amount of selective pressure. By analyzing the core haplotype background of the mutation in different geographic regions, it should be possible to distinguish between these two competing theories and dissect the origin and evolutionary history of a significant disease-causing mutation.
In addition to the ΔF508 mutation, more than 1000 other mutations in the CFTR gene have been identified (CF Genetic Analysis Consortium http://www.genet.sickkids.on.ca/cftr/). These mutations vary greatly in their frequency and distribution, but most are very rare. Only four (p.G542X, p.N1303K, p.G551D, and p.W1282X) have overall frequencies greater than 1%.12 Intriguingly, p.G542X and p.N1303K are found on the same haplotype background as ΔF508, suggesting that they arose in the same population.13
Two previous reports of CFTR mutations in Iran have been published. One reported the allele frequency of seven common mutations in exons 4 and 7 in 37 CF patients.14 The second reported the allele frequency of the ΔF508 mutation in 24 CF patients.15 A relatively low frequency of 25% was reported, consistent with the frequencies of the mutation in neighboring countries. In the present study, all 27 of the CFTR exons of 60 unrelated Iranian CF patients (including the 24 previously studied) were sequenced with the aim of obtaining a profile of CFTR mutations in the Iranian population. In addition, the haplotypes associated with the mutations were assessed, and the carrier frequency was calculated based on the frequency of heterozygous patients.
Here, we describe 11 core haplotypes at the CFTR locus that are defined by 6 ancient bi-allelic polymorphisms. We believe these haplotypes will be useful for cross-comparison with other populations. We also describe the CFTR mutations found in the unique Iranian population. In ancient human history, Iran's auspicious location made it a crossroads for travelers moving between Africa, Europe, India, and beyond. As such, it has a rich and valuable genetic legacy. By following disease-causing mutations in this population, we may come closer to tracking the origin of the most common disease-causing mutations outside of Africa. Finally, we hope that knowledge of the unique mutation spectrum and the mutation frequency in this population will improve medical care for CF patients in Iran.
| Materials and Methods |
|---|
Mutation Analysis
Each of the 27 exons of the CFTR gene, together with its flanking intronic sequences, was amplified by the polymerase chain reaction (PCR) as a single product, except exon 13, which, because of its large size, was amplified in two overlapping segments. Multiplexing was not performed. Sequences of most of the primers used have been reported.18, 19
Newly designed primers were used for amplification of exons 1, 2, 23, and 24 (1F, 5'-AAGGAGGAGAGGAGGAAGGA-3'; 1R, 5'-ACCCACATTTTCTTTCAAAACA-3'; 2F, 5'-GCCTGTAAGAGATGAAGCCTG-3'; 2R, 5'-TCAAACTCCTGGTCTCAAGCA-3'; 23F, 5'-TTAGAGTCTACCCCATGGTTGA-3'; 23R, 5'-AAAGCTGGATGGCTGTATGA-3'; 24F, 5'-ATTTTCCTTTGAGCCTGTGC-3'; and 24R, 5'-CATCCTTGTTTTCTGAGGCA-3'). The primers were synthesized by Operon (Qiagen Co., Alameda, CA). All PCR products were sequenced in both the forward and reverse directions using the same primers used in the PCR reactions. Sequencing was done using the ABI Big Dye Terminator system and ABI Prism 3700 and 3730 DNA sequencers (Applied Biosystems, Foster City, CA). Sequences were analyzed using the Sequencher software (Gene Codes Corporation, Ann Arbor, MI).
Mutations and numbering were determined by comparison with the cDNA reference sequence for the CFTR gene (GenBank accession no. NM_000492), the protein sequence (NP_000483), and the genomic sequences AC000111 (first 18 exons) and AC000061 (last 9 exons). Every mutation was confirmed by sequencing at least two independent PCR amplification products. Predicted effects on splicing of mutations located in exons were determined by comparison with known exon splicing enhancer motifs (http://rulai.cshl.edu/tools/ESE/). Effects of intron mutations were determined by comparison with known canonical splice site motifs (http://www.fruitfly.org/seq_tools/splice.html).
Haplotype Analysis
Haplotypes were first defined by the eight most common intragenic polymorphisms noted among the 60 patients during the sequence analysis: c.86733del4 (six or seven repeats of TTGA; rs4148700), c.1001+11 (C/T; rs1800503), c.1525-61 (A/G; rs number not designated), c.1540 (A/G; Met/Val; rs213950), c.3041-92 (G/A; rs number not designated), c.3601-65 (C/A; rs213989), c.4006-200 (G/A; rs214164), and c.4521 (G/A; Q1463Q; rs1800136). The frequencies of the minor alleles ranged from 14 to 45% of the CF chromosomes analyzed. Subsequently, six polymorphisms (all of the above except c.1525-61A>G and c.3041-92G>A) were shown to be sufficient to describe the haplotype diversity in this population and to compare it with a previously analyzed European cohort.
Four of the polymorphisms (c.1525-61A>G, c.1540A>G, c.3601-65C>A, and c.4006-200A>G) were previously reported in the German population.20 Haplotype analysis did not include some common markers found in other populations, such as c.2694T>G found in Korea.21 The ancestral alleles of all markers were determined based on which allele was present in both the mouse and chimpanzee reference sequences (http://genome.ucsc.edu).
The phase of haplotypes in 85% of individuals was unambiguous because they were either homozygous at all marker sites or heterozygous at only one of them. The remaining haplotypes were deduced by maximum parsimony and by the computer program Varia (Silicon Genetics, South San Francisco, CA). The haplotypes predicted by both methods were consistent with one another. Wherever possible, the phase of an ambiguous haplotype was assumed to be that remaining after subtracting one of the two most common haplotypes. In addition, the most likely haplotypes were assumed to be those with a minimum number of mutations or recombination events from a known unambiguous haplotype.
For analysis with the computer program Varia, text files containing unphased genotypes for eight markers were imported after conversion into internal Silicon Genetics format. All variations found in build 119 of the National Institutes of Health single nucleotide polymorphism (SNP) database (http://www.ncbi.nlm.nih.gov) were given their corresponding rs numbers; all newly discovered ones were assigned internal identifiers. After calculation of unambiguous phases, an EM-type algorithm was applied to estimate a unique haplotype for the ambiguous phases.22 Next, the frequencies of the deduced haplotypes were used to construct the haplotype map.23, 24 Finally, a graphical representation of blocks along the chromosome was generated.
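The EM step for resolving ambiguous phases can be illustrated with a minimal Python sketch. This is not the Varia implementation, whose internals are not described here; the genotype coding (0 or 2 for the two homozygous states, 1 for a heterozygote), the function names, and the fixed iteration count are all assumptions made for illustration. Like any EM phasing of this kind, the sketch assumes random mating and Hardy-Weinberg proportions.

```python
from itertools import product
from collections import defaultdict


def compatible_pairs(genotype):
    """List every unordered haplotype pair consistent with an unphased genotype.

    genotype: one entry per marker, 0 or 2 for the two homozygous states
    and 1 for a heterozygote (hypothetical coding chosen for this sketch).
    """
    het_sites = [i for i, g in enumerate(genotype) if g == 1]
    base = [g // 2 for g in genotype]  # allele carried at homozygous sites
    pairs = set()
    for bits in product((0, 1), repeat=len(het_sites)):
        h1, h2 = list(base), list(base)
        for site, b in zip(het_sites, bits):
            h1[site], h2[site] = b, 1 - b
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


def em_haplotype_frequencies(genotypes, n_iter=50):
    """Estimate haplotype frequencies from unphased genotypes by EM."""
    options = [compatible_pairs(g) for g in genotypes]
    haps = sorted({h for opts in options for pair in opts for h in pair})
    freq = {h: 1.0 / len(haps) for h in haps}  # uniform starting point
    n_chrom = 2 * len(genotypes)
    for _ in range(n_iter):
        counts = defaultdict(float)
        for opts in options:
            # E-step: weight each possible phase by its Hardy-Weinberg probability.
            weights = [freq[a] * freq[b] * (1 if a == b else 2) for a, b in opts]
            total = sum(weights)
            for (a, b), w in zip(opts, weights):
                counts[a] += w / total
                counts[b] += w / total
        # M-step: re-estimate frequencies from the expected haplotype counts.
        freq = {h: counts[h] / n_chrom for h in haps}
    return freq
```

For example, calling `em_haplotype_frequencies([(0, 0)] * 4 + [(2, 2)] * 4 + [(1, 1)] * 2)` assigns the two double heterozygotes to the haplotypes already seen in the homozygous individuals, which is the behavior one would expect from the parsimony rule as well.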
The haplotype blocks are the regions where the observed heterozygosity computed from haplotypes (HETobs) is less than the estimated heterozygosity computed from individual variations (HETest).25 The calculation of each block starts with a window (block) consisting of several consecutive variations. The score of the block is calculated with the equation S = HETobs/HETest. Next, the window slides along the chromosome and the score (S) is calculated for each new block until the block with the best (smallest) score is found. Finally, to identify the final block, variations are added and removed at each end of the window, stopping when the best score is achieved. After the best block is found, the remaining variations are processed in a similar fashion, and more blocks are found.
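The sliding-window score can be sketched in Python as follows. The text does not spell out how HETest is computed, so this sketch assumes it is the heterozygosity expected if the markers in the window were independent, that is, one minus the product of per-site homozygosities. The function names and the string encoding of haplotypes are hypothetical, and the final step of growing and shrinking the window at each end is omitted.

```python
from collections import Counter


def block_score(haplotypes, start, end):
    """Score S = HETobs / HETest for markers start..end-1 (half-open window)."""
    window = [h[start:end] for h in haplotypes]
    n = len(window)
    # HETobs: heterozygosity computed from haplotype frequencies in the window.
    het_obs = 1.0 - sum((c / n) ** 2 for c in Counter(window).values())
    # HETest (assumed definition): heterozygosity expected if sites were independent.
    homozygosity = 1.0
    for i in range(end - start):
        allele_counts = Counter(h[i] for h in window)
        homozygosity *= sum((c / n) ** 2 for c in allele_counts.values())
    het_est = 1.0 - homozygosity
    # A window with no variation cannot be scored; rank it last.
    return het_obs / het_est if het_est > 0 else float("inf")


def best_window(haplotypes, width):
    """Slide a fixed-width window along the markers and return (score, start)."""
    n_markers = len(haplotypes[0])
    scored = [(block_score(haplotypes, s, s + width), s)
              for s in range(n_markers - width + 1)]
    return min(scored)
```

A score well below 1 means the markers in the window travel together on fewer haplotypes than independent sites would produce, which is the property the block definition above relies on.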
The haplotype structure of the Iranian CF population was compared with a cohort of 88 unaffected individuals of European descent. The genotypes for the control group were downloaded from the National Center for Biotechnology Information Single Nucleotide Polymorphism database (http://www.ncbi.nlm.nih.gov/SNP) from the data submitted by the International HapMap Project Consortium (http://www.hapmap.org) with the data ID designation "HapMap_chr7_CEU_BROAD_BEADARRAY". Only four of the six polymorphisms used to define our six-marker haplotypes were represented in the dataset. These included c.1540 (A/G; Met/Val; rs213950), c.3601-65 (C/A; rs213989), c.4006-200 (G/A; rs214164), and c.4521 (G/A; Q1463Q; rs1800136).
Calculation of CF Carrier Frequency
The proportion of the patient population whose disease is not due to consanguinity can be used to calculate the frequency of CF-causing alleles in the population. The increase in incidence of disease due to consanguinity is expressed as X = (q + fp)/q, where q is the frequency of all mutated alleles, p is the frequency of normal alleles in the gene pool, and f is the average inbreeding coefficient.9
The maximum possible value for X is M/(h + H), where M is the number of patients in whom at least one mutation was found, h is the number of patients heterozygous for their mutation, and H is the number of patients homozygous for their mutation whose homozygosity is not due to consanguinity. Consanguinity in Iran is estimated at 25%.26
Assuming that one-half of consanguineous marriages are between first cousins (inbreeding coefficient 1/16) and one-half are between second cousins (inbreeding coefficient 1/64), the average inbreeding coefficient was calculated to be f = 0.25 × (1/2 × 1/16 + 1/2 × 1/64) ≈ 0.0098.
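For readers who want the intermediate algebra, the relation X = (q + fp)/q can be rearranged for q once X and f are known. The short derivation below is a sketch that uses only the symbols defined in this section, with p = 1 − q; it is not taken from the original text.

```latex
\begin{align*}
X &= \frac{q + f p}{q} = 1 + \frac{f\,(1 - q)}{q} \\
q\,(X - 1) &= f\,(1 - q) \\
q &= \frac{f}{X - 1 + f}, \qquad \text{carrier frequency} = 2pq = 2q\,(1 - q)
\end{align*}
```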
| Results |
|---|
ΔF508) was in frame and the other two caused a frameshift. One of the frameshift mutations (c.2183AA>G) also created an AG dinucleotide that could potentially act as a splice site. Two of the mutations were only recently reported, one causing an amino acid substitution, c.3170C>T (p.P1013L), and another creating a stop codon, c.3661A>T (p.K1177X). They were reported in a Turkish and a Saudi Arabian population, respectively.28, 29, 30
ΔF508 was the most common mutation, representing 16% of the 120 CF alleles that were completely analyzed (19 of 120), but only 14% of the 140 CF alleles that were at least sequenced for exon 10 (19 of 140). This frequency is far lower than in populations of European descent (approximately 60%) but comparable with several countries of West Asia and North Africa (Algeria, 20%; Lebanon, 35%; Tunisia, 18%; Pakistanis in the United Kingdom, 19%; Turkey, 24%; and Saudi Arabia, 15%).6, 9, 28, 30 The frequencies of the ΔF508 mutation in Iran and Saudi Arabia were the lowest yet reported for any country.
The next most frequent mutations were c.1677del2 (p.515fs) at 7.5%, c.4041C>G (p.N1303K) at 5.6%, c.2183AA>G (p.684fs) at 5%, and c.3661A>T (p.K1177X) at 2.5%. The two most frequent mutations were both in exon 10. These five most frequent mutations accounted for 37% of the mutated alleles in our patients. Five of the remaining detected mutant alleles were each found in two chromosomes and 10 were observed in only one chromosome.
Novel Polymorphisms
Twenty-one putative polymorphisms were detected in the CFTR gene of the Iranian patients (Table 2). The minor allele frequencies of eight polymorphisms were 14% or greater, and five were observed in only one chromosome. Eighteen were bi-allelic single-nucleotide polymorphisms, one (c.400690delC) was a single-basepair deletion, another (c.4005+121delTT) was a variation in the number of a single-nucleotide repeat (8T/6T), and a third (c.87631delTTGA) was a variation in the number of a four-nucleotide repeat (7TTGA/6TTGA). Six were exonic polymorphisms that were either silent or demonstrated to have little or no effect on protein function (p.E92E, p.I148T, p.M470V, p.E1194A, p.P1290P, and p.Q1463Q). Two of the variants that altered an amino acid (p.M470V and p.E1194A) were observed in at least one patient with two other mutations, adding credence to their assignment as neutral polymorphisms (however, see below). The CF chromosome carrying p.E1194A also carried the common p.515fs mutation, because the patient was homozygous for the latter. Furthermore, the amino acid at position 1194 is not conserved in chimpanzee or macaque. A third variant (p.I148T) was confirmed as a neutral polymorphism by a recent report.34
Four novel polymorphisms were observed in the Iranian CF patients, including c.29775C>A, c.408A>G (p.E92E), c.1716+8A>G, and c.3713A>C (p.E1194A). The polymorphism c.1716+8A>G is located near a splice junction and was observed in only one patient in whom no other disease-causing mutation was found; however, it did not alter the canonical splice consensus sequence and so is unlikely to affect splicing.
Haplotype Analysis
Haplotype analysis with respect to putative polymorphisms found within the CFTR gene of our patients was done to determine the extent of variation in the genetic background of different individuals carrying the same mutation and also to obtain an assessment of identity by descent in patients homozygous for a mutation. Figure 1 illustrates the haplotype map of the major haplotypes observed.9
All patients homozygous for the ΔF508 mutation also carried six copies of the TTGA repeat in intron 6a on both their chromosomes, consistent with its association with haplotype H5. One patient heterozygous for ΔF508 had seven copies of the repeat on both his chromosomes, probably indicating a recombination event. One of the haplotypes associated with the mutation p.K1177X was also probably recombined.
All patients who were homozygous for their putative disease-causing mutation were also homozygous for a haplotype defined by the six polymorphic markers, consistent with their alleles being identical by descent. However, three of these patients were heterozygous for rare polymorphisms, so their homozygosity for the mutation should not be attributed to identity by descent. Therefore, of the 20 homozygous patients, probably only 17 were homozygous because of identity by descent.
The haplotypes listed in Table 3 were defined using six high-frequency bi-allelic markers, including one repeat polymorphism and five SNPs. All of the markers are widely distributed across human populations in different parts of the world and are believed to be very old. Consequently, they define the core haplotype framework in this gene. Because the Iranian samples had a high frequency of homozygosity, we were able to unambiguously define 11 SNP haplotypes at the CFTR gene, representing 85% of the chromosomes analyzed. Seven of the 11 haplotypes formed a step-wise parsimony network of single-mutation events originating from an ancestral haplotype (H0), and four were best explained by recombination. The Varia program predicted only 9 of the 11 unambiguous haplotypes. Furthermore, the breakpoints predicted by the recombinant haplotypes did not correspond with the "haplotype blocks" as defined by the Varia program. Possibly, the discrepancy is due to the assumption of the EM algorithm that the examined population has had random mating and exhibits Hardy-Weinberg equilibrium, neither of which was a valid assumption for the Iranian CF samples.
CF Carrier Frequency in Iran
The proportion of the patient population whose disease was not due to consanguinity was used to calculate the frequency of CF-causing alleles in the population. The average inbreeding coefficient in Iran was estimated to be f = 0.0098. The maximum value of X in our sample is 39/(19 + 3), where 39 is the number of patients in whom at least one mutation was found, 19 is the number of patients heterozygous for their mutations, and 3 is the number whose homozygosity was probably not due to consanguinity. Solving X = (q + fp)/q with these values yields a q value of 0.0125 and a carrier frequency (2pq) of 2.5 in 100, or 1 in 40. Even if all individuals homozygous for any mutation were considered to be from consanguineous marriages, the carrier frequency would only drop to 1.8 in 100, or 1 in 55.
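The arithmetic behind these figures can be reproduced with a short Python sketch. It assumes the counts 39, 19, and 3 play the roles of M, h, and H in X = M/(h + H), that f is built from 25% consanguinity split evenly between first cousins (1/16) and second cousins (1/64), and that X = (q + fp)/q is solved for q with p = 1 − q. The function name is hypothetical.

```python
def carrier_frequency(M, h, H, f):
    """Return (q, 2pq) from patient counts and the average inbreeding coefficient.

    M: patients with at least one mutation found
    h: patients heterozygous for their mutation
    H: homozygous patients whose homozygosity is not due to consanguinity
    """
    X = M / (h + H)          # increase in incidence due to consanguinity
    q = f / (X - 1 + f)      # from X = (q + f*(1 - q)) / q
    return q, 2 * q * (1 - q)


# Average inbreeding coefficient under the stated assumptions.
f = 0.25 * (0.5 * (1 / 16) + 0.5 * (1 / 64))

q, carriers = carrier_frequency(39, 19, 3, f)
q_all, carriers_all = carrier_frequency(39, 19, 0, f)

print(f"f = {f:.4f}")
print(f"q = {q:.4f}, carriers = {carriers:.3f} (about 1 in {1 / carriers:.0f})")
print(f"all homozygotes consanguineous: carriers = {carriers_all:.3f} "
      f"(about 1 in {1 / carriers_all:.0f})")
```

The unrounded carrier frequencies come out near 0.025 and 0.018, in line with the rounded values of 2.5 and 1.8 in 100 quoted above.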
| Discussion |
|---|
The five most common mutations in Iran constituted 37% of the alleles of the patients studied. Each of these was found at a frequency of 2.5% or more. Assuming random mating, it is expected that 60% of Iranian CF patients would carry at least one of these mutations. Therefore a molecular assay that could identify these five may be an appropriate preliminary diagnostic tool for the Iranian patients. As in Iran, a large spectrum of CFTR mutations has also been found in Spain, Bulgaria, Greece, and Turkey, all of which served as "historic gateways" into Europe.6
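The 60% figure follows from the combined allele frequency of the five mutations under random mating. A minimal Python sketch of that calculation is shown here; the frequencies are the ones reported in this article, and the dictionary and function names are illustrative only.

```python
# Reported allele frequencies of the five most common mutations.
PANEL = {
    "p.F508del": 0.16,
    "c.1677del2": 0.075,
    "p.N1303K": 0.056,
    "c.2183AA>G": 0.05,
    "p.K1177X": 0.025,
}


def fraction_with_at_least_one(panel_freqs):
    """Share of patients carrying a panel mutation on at least one CF allele."""
    covered = sum(panel_freqs.values())   # about 0.37 of mutated alleles
    return 1 - (1 - covered) ** 2         # complement of "neither allele covered"


def fraction_with_both(panel_freqs):
    """Share of patients whose two CF alleles are both panel mutations."""
    return sum(panel_freqs.values()) ** 2


print(f"at least one allele detected: {fraction_with_at_least_one(PANEL):.2f}")
print(f"both alleles detected: {fraction_with_both(PANEL):.2f}")
```

Under the same random-mating assumption, both alleles would be identified in only about 13% of patients, which is why such a panel is better suited to a preliminary test than to a complete genotype.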
Although the frequency of ΔF508 among the Iranian patients (14 to 16%) was lower than its frequency in European countries, it was still the most frequent mutation. The ΔF508 mutation most likely originated outside of Europe before it was introduced by migration, possibly from the Middle East.4, 45 As for patients of European, Bashkortostanian, and Turkish descent, the vast majority (18 of 19) of the ΔF508 alleles were associated with allele 6 of the TTGA repeat in intron 6a.9, 46, 47 All ΔF508 mutations were found on the same SNP haplotype background, consistent with a report that ΔF508 had extensive allele sharing of STR haplotypes in a French population48 and its complete linkage with a single variant of the intron 9 splice site polymorphism.49
All other CFTR mutations found on more than one chromosome, except p.Y563X, could be assigned to a single haplotype (Table 1). One of these (c.1677del2) was also associated with a single haplotype in Turkey, but another (c.2183AA>G) was associated with three haplotypes in that country.9 It is interesting that the c.1821C>A (p.Y563X) mutation was associated with two haplotypes. Two of the most common mutations among the Iranians were each associated with a single haplotype, suggesting a relatively recent introduction into the population followed by rapid expansion.
Because we defined haplotypes using SNP markers, we were only able to make limited comparisons with previously reported results. Simple tandem repeat (STR) haplotypes at the CFTR locus, based on analysis of microsatellite variation, have been defined in many different human populations.20, 43 STR markers mutate at a rate 1000-fold greater than SNP markers50 and generate multiple mutations at a single site. Thus, STR polymorphisms have generally occurred more recently and are more diverse than SNPs. Consequently, the 11 SNP haplotypes described here can be used to visualize population substructure farther in the past than is possible using STR haplotypes, and we hope they will be used for future comparisons of CFTR haplotypes across different populations.
The mutation spectrum in Iran had a significant overlap with that observed in Turkey and was less similar to that observed in Europe. Specifically, the four most common Turkish mutations were found in Iran, including ΔF508, c.1677delTA, p.G542X, and c.2183AA>G.9, 29 The p.G542X "Mediterranean mutation," purported to be of Phoenician origin, was found on only one Iranian chromosome, whereas it was relatively frequent (3.6%) among the Turkish CF chromosomes.6, 51 Another common mutation in Iran, p.K1177X, was not found in Turkey but was reported in Bahrain.39 In contrast, three mutations commonly found in Europe, including p.W1282X, p.G551D, and c.1717-1G>A,12 were not found in Iran. Two common mutations previously reported in Arab populations, including c.3601-111G>C52 and c.3120+1kbdel8.6kb,53 were not assayed in this study because they were outside the genomic regions amplified by PCR. Six additional mutations reported in at least two other Arab populations, including c.711+1G>A, p.R75X, c.1548delG, c.3120+1G>A, c.3199del6, and p.S549R,30, 42, 53, 54, 55, 56 could have been detected in our assay but were not found in Iran.
The calculated frequency of carriers of disease-causing CFTR mutations in Iranians (1:40) is comparable to that of European populations (1:25), although there is greater allele diversity in Iran. Consequently, it is believed that the incidence of cystic fibrosis is also similarly high, and that the low incidence commonly believed to be associated with this non-European population is likely to be due to under-diagnosis. Furthermore, the frequency of CF may be higher in some isolated populations due to consanguinity, as has been previously reported in an isolated population in Israel.57 The head (A.K.) of the CF reference center in Iran, based on the history of patient referrals, corroborates that under-diagnosis of CF in Iran is notable. The same assessment has been proposed for populations of Turkey and Saudi Arabia.9, 28, 30 Therefore, in addition to Europe and the United States, CF is likely to be prevalent in many other countries as well, although the mutation spectra may be different. We hope to ascertain CFTR mutation carrier frequencies and CF incidence among Iranians as soon as possible. A large cohort of patients is being collected for mutation analysis, and pilot trials for the screening of some variants in the general population have started.
Our mutation detection efficiency was a rather low 53%. We cannot be sure how much misdiagnosis contributed to this. Alternative methods of determining sweat chloride levels may be more accurate than those used in this study.58 Unfortunately, it was not possible to divide the patients into groups based on clinical and laboratory criteria because sufficient and uniform data on the patients were not available. Such a division might have identified groups that were responsible for the overall low mutation detection. Some heterozygous point mutations may have gone undetected by both the Sequencher software and visual inspection of the sequence chromatograms. Use of denaturing high performance liquid chromatography has been shown to improve detection of CFTR mutations.59, 60 However, large heterozygous deletions and mutations in intronic and control regions not amplified would have been missed by either method. Finally, it is possible that genes other than the CFTR gene caused CF in some of the patients. It is also possible that the state of modifier genes in combination with the genotype of particular polymorphisms would result in the disease phenotype. In any case, even with the low detection efficiency, a very notable level of heterogeneity in the CFTR mutation profile was found. In the similarly heterogeneous population of Turkey, detection efficiency was also a relatively low 75%.9
We hope that the recognition of a probable high incidence of CF in Iran will attract increased attention to the diagnosis of this disease. We also hope that the identification of five relatively frequent mutations will assist in the development of a clinically appropriate assay for their detection as a preliminary test for diagnosis.
| Acknowledgments |
|---|
| Footnotes |
|---|
Supported by the Research Council of the Faculty of Sciences of Tehran University (Iran), The National Research Center for Genetic Engineering and Biotechnology (Iran), and National Institutes of Health grant 5P01HG00205.
Accepted for publication September 30, 2005.
| References |
|---|
ΔF508 mutation amongst Iranian cystic fibrosis patients and the detection of carriers of this mutation using the ARMS-PCR protocol. Med J Islam Repub Iran 1999, 16:278-286
ΔF508) in European populations. Nat Genet 1994, 7:169-175
ΔF508 mutation: implications for parental diagnosis and mutation origin. Am J Hum Genet 1991, 48:223-226
G). Eur Respir J 1999, 13:100-102