Published online before print December 28, 2007
Review Article


From the Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts; the Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, Massachusetts; and the Department of Internal Medicine, Baylor University Medical Center, Dallas, Texas
Abstract
Molecular classification of colorectal cancer is evolving. As our understanding of colorectal carcinogenesis improves, we are incorporating new knowledge into the classification system. In particular, global genomic status [microsatellite instability (MSI) status and chromosomal instability (CIN) status] and epigenomic status [CpG island methylator phenotype (CIMP) status] play a significant role in determining clinical, pathological and biological characteristics of colorectal cancer. In this review, we discuss molecular classification and molecular correlates based on MSI status and CIMP status in colorectal cancer. Studying molecular correlates is important in cancer research because it can 1) provide clues to pathogenesis, 2) propose or support the existence of a new molecular subtype, 3) alert investigators to be aware of potential confounding factors in association studies, and 4) suggest surrogate markers in clinical or research settings.
Why Molecular Classification?
Colorectal cancer (CRC) is not a single disease. Rather, CRC encompasses a heterogeneous complex of diseases, and each CRC patient has a unique disease that has been caused by a distinctive genetic/epigenetic background. Theoretically, every CRC arises and behaves in a unique fashion that is unlikely to be exactly recapitulated by any other CRC. Nonetheless, we believe that tumors with similar characteristics most likely arise or behave in a similar way. The purpose of tumor classification is to find similar characteristics among individual tumors so as to predict empirically the pathogenesis and biological behavior of a particular tumor.
Tumor classification is historically based on various clinical (eg, proximal versus distal), pathological (eg, mucinous versus nonmucinous; well-moderate versus poorly differentiated), and/or molecular features [eg, microsatellite instability status (MSI)-high versus microsatellite stable (MSS)].1, 2, 3 Molecular classification is important because it reflects underlying mechanisms of carcinogenesis. In other cancers, for example, molecular classification of leukemias and lymphomas has considerably advanced the field over the past couple of decades.4, 5, 6, 7, 8 Although clinical and pathological classifications are largely phenotypic, clinicopathological features are nonetheless very important because some of the features are associated with particular underlying molecular defects and are thus useful in estimating the likelihood of a particular molecular subtype.
There are different ways of determining molecular classification. One can theoretically divide tumors into different groups by the presence or absence of any molecular event(s). However, as a primary discriminator in classification, emphasis should be put on molecular classification based on global cellular events [such as chromosomal instability (CIN), MSI, and CpG island methylator phenotype (CIMP)]. Nonetheless, single molecular events are also useful classifiers, in particular, for predicting response to targeted therapies against those molecules.
Why Molecular Correlates?
CRCs arise through a multistep carcinogenic process in which genetic and epigenetic alterations accumulate in a sequential manner.9, 10 Although these molecular alterations may occur in a stochastic fashion in many different cells, these alterations accumulate in a nonrandom fashion in a tumor, probably caused by selection advantages or disadvantages of many of these alterations. This nonrandom accumulation of molecular alterations creates an association between one alteration and another in tumors. Therefore, studying molecular correlates in tumors helps decipher nonrandomness in the multistep carcinogenic process. Molecular correlates in synchronous colorectal neoplasias support a nonrandom process of epigenetic alterations during carcinogenesis.11
The ultimate goal of studying molecular correlates is to identify clinically useful biomarkers, which correlate with patient survival or treatment response or help in treatment decision making or genetic counseling. Molecular correlates also advance further research; simple reports of some molecular correlates may give clues to other investigators.
The purposes of discovering molecular correlates in cancer research are manifold; four major purposes are as follows.
Molecular Correlates Can Provide Clues to Pathogenesis
Positive correlations can support a direct cause-effect relationship, and negative correlations can support the existence of mutually exclusive pathways of carcinogenesis. For example, an inverse association between CIN and MSI suggests the hypothesis that CIN and MSI represent mutually exclusive pathways of tumorigenesis. A mutually exclusive relationship between KRAS and BRAF activating mutations in CRC is consistent with the theory that KRAS and BRAF are present in the same signal transduction pathway (see Table 1 for a full list of gene names and abbreviations). In contrast, a PIK3CA mutation in colorectal cancer tends to coexist with KRAS or BRAF mutation,12 supporting the parallel pathways of RAS-RAF-MAPK and PI3K-AKT and also suggesting synergistic effects of both pathways on the downstream mTOR (FRAP1) pathway. As another example, a strong correlation between CIN and Aurora kinase A amplification supports the hypothesis that Aurora kinase A amplification is one of the causes of CIN.13
Molecular Correlates Can Propose or Support the Existence of a New Molecular Subtype
The existence of a new molecular subtype can be supported by the presence of unique molecular correlates. For example, an association between CIMP-high and BRAF mutation supports the existence of CIMP-high as a distinct phenotype in CRC. An association between CIMP-low and KRAS mutation can propose CIMP-low as a new molecular subtype in CRC.14
Molecular Correlates Can Alert Investigators to Be Aware of Potential Confounding Factors in Association Studies
The presence of potential confounding factors is always a concern in any association study. For example, an investigator who has found that MLH1 methylation is associated with resistance to 5-fluorouracil (5-FU)-based chemotherapy may conclude that MLH1 methylation confers chemoresistance.15 However, MLH1 methylation is associated with MSI-high; if chemoresistance is caused by MSI-high rather than by MLH1 methylation per se, an association between MLH1 methylation and chemoresistance would still be observed. An investigator aware of the association between MLH1 methylation and MSI-high could avoid this wrong conclusion.
As another example, investigators have found that an infiltrate of a subset of lymphocytes in CRC is associated with patient survival and concluded that a lymphocytic infiltrate kills tumor cells and helps patients live longer.16 However, in this study, longer survival may be due to the effect of MSI, which is associated with lymphocytic infiltrate. Alternatively, the known association between MSI and longer survival may be due to lymphocytic infiltrate. Unless these investigators perform multivariate analysis, including MSI data and lymphocytic infiltrate, results cannot support a causal link between lymphocytes and longer survival.
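The confounding scenario above can be made concrete with a stratified analysis. The sketch below uses invented counts (not data from any study cited here) and a Mantel-Haenszel pooled odds ratio as a simple stand-in for the multivariate analysis the text calls for; a real survival analysis would more likely fit a regression model to time-to-event data. The function names and the 2x2 table layout are assumptions made for illustration.

```python
# Each stratum is a 2x2 table (a, b, c, d):
#   a = infiltrate present, survived    b = infiltrate present, died
#   c = infiltrate absent,  survived    d = infiltrate absent,  died
# All counts are hypothetical and chosen only to illustrate confounding.
STRATA = {
    "MSI-H": (80, 20, 16, 4),
    "MSI-L/MSS": (10, 10, 100, 100),
}


def odds_ratio(a, b, c, d):
    """Odds ratio of survival for infiltrate present versus absent."""
    return (a * d) / (b * c)


def collapse(tables):
    """Sum the tables cell by cell, ignoring the stratifying variable."""
    return tuple(sum(cells) for cells in zip(*tables))


def mantel_haenszel_or(tables):
    """Pooled odds ratio across strata (Mantel-Haenszel estimator)."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den


tables = list(STRATA.values())
crude_or = odds_ratio(*collapse(tables))          # MSI status ignored
adjusted_or = mantel_haenszel_or(tables)          # MSI status accounted for
per_stratum_or = {name: odds_ratio(*t) for name, t in STRATA.items()}
```

With these invented counts the crude odds ratio is about 2.7, yet the odds ratio is 1.0 within each MSI stratum and after pooling, so the apparent survival advantage of lymphocytic infiltrate is carried entirely by MSI status.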
Molecular Correlates Can Suggest Surrogate Markers in Clinical or Research Settings
Surrogate markers are useful when it is difficult or impossible to test exactly what molecular alterations are present in an individual patient. There are many examples of good surrogate markers.
p53 immunohistochemistry is performed as a surrogate marker for the presence of p53 (TP53) mutation, because p53 positivity by immunohistochemistry correlates well with the presence of TP53 mutation and functional loss.17 Because mutations are distributed diversely in the TP53 gene, one may need to sequence the entire TP53 gene to achieve high sensitivity for mutation detection; however, it is often not practical to sequence the entire TP53 gene in a large-scale clinical study or clinical setting.
MSI markers (whether the 5 markers recommended by the National Cancer Institute (NCI) or an expanded panel containing 10 or more markers) are good surrogate markers for global microsatellite instability level and underlying mismatch repair defect. Although it is not possible at this time to test all microsatellites in the human genome, testing 5 to 10 markers can certainly predict a defect of the mismatch repair system and global microsatellite instability in CRC.
Molecular Classification of CRC
Global Molecular Classifiers: CIN, MSI, and CIMP
To study correlates with molecular events in CRC or patient outcomes, it is important to classify CRCs according to global genomic or epigenomic status. We herein discuss global molecular classifiers: CIN, MSI, and CIMP. Nonetheless, single molecular events are also useful classifiers, in particular for predicting response to targeted therapies against those molecules.
CIN
CIN appears to be a distinct phenotype in colorectal cancer, and tumors with CIN show frequent karyotypic abnormalities and chromosomal gains and losses.18 Allelic losses are quite common in CRC,19 and CIN is considered to promote carcinogenesis through loss of tumor suppressors and copy number gains of oncogenes. Although the occurrences of chromosomal abnormalities may be more stochastic than nonrandom, a selection process can produce a nonrandom pattern of chromosomal aberrations in tumor cells. CIN and MSI tend to be mutually exclusive in CRC.20
CIN has been commonly assessed by DNA ploidy analysis or loss of heterozygosity (LOH) analyses of microsatellite markers. For LOH analyses, markers in the 18q region have been shown to be generally more sensitive, compared with markers in other chromosomal regions such as 1p, 2p, 3p, 5q, 8p, and 17p.21, 22, 23, 24, 25 However, markers and criteria for CIN have not been standardized. In addition, LOH analyses are prone to have false-positive results (due to PCR bias or allele dropout), false-negative results (due to PCR bias or contamination of non-neoplastic cells), and uninformative results (due to homozygosity in a particular marker or unavailability of normal germline DNA).
Recently, array-based comparative genomic hybridization (array-CGH) and single nucleotide polymorphism (SNP) arrays have been used to study copy number gains/losses and LOH. Compared with conventional CGH, both techniques have higher resolution in genome-wide analysis of DNA copy number gains and losses. Major issues are assay cost and a requirement of high quality DNA. Although technically challenging, application to paraffin-embedded tissue is possible.
CIN may represent a heterogeneous phenomenon. CRC can have multiple reciprocal translocations with little change in allele copy numbers or DNA content.26 Such a CRC would be misclassified as CIN negative by copy number variation assays including LOH or array-CGH. The existence of different mechanisms of CIN (whole chromosomal LOH, mitotic recombination, and mitotic gene conversion) is also suggested by a comprehensive study using array-CGH, SNP arrays, and multicolor fluorescence in situ hybridization (FISH).24
Causes of CIN are also likely heterogeneous. Mutations in genes encoding mitotic checkpoint proteins such as BUB1 and BUB1B (BUBR1) may cause CIN in a subset of CRCs.27 In addition, abnormal centrosome number and function have been a candidate mechanism for CIN. Amplification of AURKA (Aurora kinase A, STK15/BTAK), a centrosome-associated serine/threonine kinase, has been found in a colon cancer cell line28 and is correlated with CIN in colon cancer.13 Overexpression of Aurora kinase A can induce aneuploidy in various cell lines.29 Other candidate causes of CIN include APC,30, 31 TP53,32 FBXW7 (CDC4 ubiquitin ligase),33 CHFR34, 35 (for a contrasting view, see Ref. 36), and JC virus37, 38 (for a contrasting view, see Ref. 39).
MSI
MSI refers to altered lengths ("instability") of short nucleotide repeat sequences ("microsatellites") in tumor DNA compared with normal DNA.40 It has also been referred to as RER (replication error), mutator phenotype, and microsatellite instability (MIN); however, MSI has become the most commonly used term. MSI has been suggested as a carcinogenic mechanism alternative to the CIN pathway.40 Mutations of coding mononucleotide repeats in tumor suppressor genes such as the transforming growth factor (TGF)-β receptor type 2 (TGFBR2) and BAX have been shown to be important in carcinogenesis.41, 42
A high degree of MSI (MSI-H) has been shown to be due to defects in the DNA mismatch repair system. Functional loss of MLH1 due to promoter methylation and gene silencing is the most common cause of MSI, particularly in sporadic MSI-H cancer. In contrast, in the setting of hereditary nonpolyposis colorectal cancer (HNPCC)/Lynch syndrome, mutations in any of the mismatch repair genes, MSH2, MLH1, MSH6 and PMS2, can cause MSI.43
MSI-H is present as a distinct phenotype in approximately 15% of CRCs.18 A number of pathological features have previously been linked with MSI-H, such as mucinous differentiation, signet ring cell morphology, Crohn's-like lymphoid reaction, abundant tumor-infiltrating lymphocytes, tumor necrosis, and poor differentiation.2, 44, 45, 46, 47
MSI is typically assessed by analyzing five microsatellite markers (D2S123, D5S346, D17S250, BAT25, and BAT26) referred to as the NCI consensus panel,48 but additional microsatellite markers are commonly tested to increase the accuracy of classification. In clinical settings, MSI testing has been performed as a screening test for the identification of HNPCC/Lynch syndrome and sometimes as a prognostic marker (generally MSI-H implying a better prognosis) or a marker for predicting efficacy of chemotherapy (generally MSI-H implying resistance) (see Molecular Classification and Patient Outcomes).
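As a minimal sketch of how panel results translate into an MSI call, the function below counts unstable markers and applies a fraction cutoff. The marker names are those of the NCI consensus panel listed above; the 30% cutoff for MSI-H is a commonly used convention that this article does not itself state, so it is exposed as a parameter and should be treated as an assumption. The function name and input format are hypothetical.

```python
NCI_PANEL = ("D2S123", "D5S346", "D17S250", "BAT25", "BAT26")


def classify_msi(unstable_markers, tested_markers=NCI_PANEL, high_fraction=0.3):
    """Return 'MSI-H', 'MSI-L', or 'MSS' from per-marker instability calls.

    unstable_markers: names of markers scored as unstable in the tumor.
    tested_markers:   names of all informative markers that were tested.
    high_fraction:    assumed cutoff; at or above it the tumor is MSI-H.
    """
    tested = set(tested_markers)
    if not tested:
        raise ValueError("at least one marker must be tested")
    n_unstable = len(set(unstable_markers) & tested)
    if n_unstable == 0:
        return "MSS"
    if n_unstable / len(tested) >= high_fraction:
        return "MSI-H"
    return "MSI-L"


example_high = classify_msi({"BAT25", "BAT26"})   # 2 of 5 unstable
example_low = classify_msi({"D5S346"})            # 1 of 5 unstable
example_stable = classify_msi(set())              # 0 of 5 unstable
```

With the five-marker panel this cutoff reproduces the usual rule of two or more unstable markers for MSI-H and exactly one for MSI-L; passing a longer `tested_markers` tuple models the expanded panels mentioned in the text.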
MSI-Low versus MSS
Whether MSI-low (MSI-L) exists as a distinct phenotype from MSS has been controversial.49, 50 A study has shown that virtually all CRCs show some degree of microsatellite instability when a large number of markers are tested.49 The study concluded that a difference between MSI-L and MSS is merely quantitative and that it is unlikely that there are qualitatively different genetic pathways to MSI-L tumors and MSS tumors.49 This study demonstrates that microsatellites may not be the best markers to identify MSI-L because there is no discrete phenotype or genotype associated with MSI-L determined by microsatellite markers.49 Thus, a newer MSI marker panel has been designed to separate a substantial number of MSI-L tumors into MSS and MSI-H.51, 52
In contrast, there is evidence that supports the existence of MSI-L. MSI-L has been associated with shorter survival in Stage C colon cancer, compared with MSS tumors.53 A cDNA microarray expression study has also supported MSI-L as a distinct phenotype from MSS and MSI-H.54 MSI-L has been associated with MGMT methylation and loss.55 The association between MSI-L and MGMT methylation/loss is particularly strong among CIMP-low tumors.56 Among CIMP-low tumors, frequency of MGMT methylation and loss is much higher in MSI-L tumors than in MSI-H and MSS tumors; thus, MSI-L cannot be a mixture of MSI-H and MSS tumors.56 These data collectively support differences between MSI-L and MSS in colorectal cancer, although more studies are necessary to establish definitively the existence of MSI-L as a distinct phenotype. If MSI-L exists, additional studies are necessary to find underlying molecular defects for MSI-L and better biomarkers for MSI-L (maybe markers other than microsatellites).
CIMP
Transcriptional inactivation by cytosine methylation at promoter CpG islands of tumor suppressor genes is an important mechanism in human carcinogenesis, and a number of tumor suppressor genes have been shown to be silenced by promoter methylation in CRC.57, 58, 59, 60, 61, 62 In fact, a subset of CRCs have been shown to exhibit widespread promoter CpG island methylation, which is referred to as the CIMP.57, 63, 64 CIMP has been established as a unique epigenetic phenotype in colorectal cancer, and CIMP-high colorectal tumors have a distinct clinical, pathological, and molecular profile, such as associations with proximal tumor location, female sex, poor differentiation, MSI, and high BRAF and low TP53 mutation rates.65, 66, 67, 68, 69, 70, 71, 72, 73
Even within MSI-H tumors and within MSI-L/MSS tumors, CIMP-high has been associated with proximal location,14 poor differentiation,46 BRAF mutation,71, 72 and loss of nuclear p27 [cyclin-dependent kinase inhibitor 1B (CDKN1B)]74 and inversely associated with TP53 aberrations,75 loss of p21 (CDKN1A),75 overexpression of cyclooxygenase-2 (PTGS2),76 and cytoplasmic mislocalization of p27.77 Within MSI-H tumors, CIMP-high has been associated with TGFBR2 mononucleotide mutation.78
Using analyses on a large number of CpG island methylation markers, CIMP-high tumors form a distinct group by an unsupervised cluster analysis.73 These data collectively contradict the claims that CIMP does not represent a distinct phenotype in CRC and that characteristics of CIMP merely reflect those of MSI-H tumors.79, 80
The serrated pathway of tumorigenesis has been suggested in the development of CIMP-high colorectal cancer,81, 82, 83, 84 whereas flat-type adenomas do not appear to show frequent promoter methylation.85
At the present time, the panel of methylation markers and the method of assessment of CIMP are not standardized, although recent studies have found a fairly sensitive and specific identification of CIMP-high using MethyLight technology and evaluation of new panels of four to eight CpG islands (CACNA1G, CDKN2A, CRABP1, IGF2, MLH1, NEUROG1, RUNX3, and SOCS1).73, 86 Any of these panels is a useful tool at this time to examine CRCs to diagnose CIMP-high versus non-CIMP-high.
Another question that has yet to be resolved is whether there are any sporadic MSI-H tumors that exhibit neither MLH1 methylation nor the CIMP phenomenon. Again, a recent study suggested that all sporadic MSI-H tumors were explainable by CIMP and MLH1 methylation,73 whereas other studies have suggested that there may be a subset of sporadic MSI-H non-CIMP-high tumors.71, 86 The frequency of HNPCC/Lynch syndrome in the general population is estimated to be 1 to 3%.87 A large population-based study has suggested that the frequency of MSI-H non-CIMP-high tumors is 5%86; thus, it is likely that nearly one-half or more of MSI-H non-CIMP-high tumors do not arise through either HNPCC/Lynch syndrome or the CIMP-high pathway. Thus, the absence of CIMP-high in MSI-H tumors does not necessarily indicate HNPCC/Lynch syndrome, although it increases the likelihood of HNPCC/Lynch syndrome.
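The "nearly one-half or more" estimate follows from simple arithmetic on the two frequencies quoted above. The sketch below assumes, for illustration only, that every HNPCC/Lynch-associated tumor falls within the MSI-H non-CIMP-high group, which gives the largest possible Lynch share of that group; the function name is hypothetical.

```python
def non_lynch_share(msi_h_non_cimp_high_pct, lynch_pct):
    """Fraction of MSI-H non-CIMP-high tumors not explained by Lynch syndrome.

    Both arguments are percentages of all CRCs. Assumes every Lynch-associated
    tumor is MSI-H and non-CIMP-high (an illustrative simplification).
    """
    if msi_h_non_cimp_high_pct <= 0:
        raise ValueError("group frequency must be positive")
    if not 0 <= lynch_pct <= msi_h_non_cimp_high_pct:
        raise ValueError("Lynch frequency must lie within the group frequency")
    return (msi_h_non_cimp_high_pct - lynch_pct) / msi_h_non_cimp_high_pct


share_if_lynch_3pct = non_lynch_share(5, 3)   # (5 - 3) / 5
share_if_lynch_1pct = non_lynch_share(5, 1)   # (5 - 1) / 5
```

Under this assumption, between 40% (Lynch frequency 3%) and 80% (Lynch frequency 1%) of MSI-H non-CIMP-high tumors would be unrelated to HNPCC/Lynch syndrome, which matches the range described in the text.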
CIMP-High versus CIMP-Low versus CIMP-0
The original report of CIMP in 1999 used type C (cancer specific) methylated in tumor (MINT) clones as markers for CIMP, and methylation in MINT markers is correlated with CDKN2A (p16) methylation.63 The subsequent study demonstrated that CIMP determined by MINT markers exhibited distinct genetic features, including associations with KRAS mutation and wild-type TP53.65 Studies by other investigators basically used a similar panel of markers including MINT1, MINT2, and MINT31.66, 67, 68, 69, 71 In these studies, although the inverse association between CIMP and TP53 mutation appears to be consistent, the association between CIMP and KRAS mutation is not consistent; some studies have shown a positive correlation,65, 71 whereas another study showed a negative correlation.66, 88
Since BRAF mutation in CRC was first discovered, BRAF mutation has consistently been associated with CIMP or CIMP-high.69, 70, 71, 72, 73 It has been shown that MINT markers are not highly specific for CIMP-high tumors with BRAF mutation.73 Thus, the existence of CIMP-low that is separate from CIMP-high and CIMP-negative has recently been hypothesized.72 Using a panel of methylation markers that are specific for BRAF-mutated CIMP-high tumors, we have shown that CIMP-low is associated with KRAS mutation and male sex, whereas CIMP-high is associated with BRAF mutation and female sex, and CIMP-0 (CIMP-negative) is associated with wild-type BRAF/KRAS.14 Because the frequency of KRAS mutation in CIMP-low tumors is higher than in CIMP-high and CIMP-0 tumors, CIMP-low cannot be a mixture of misdiagnosed CIMP-high and CIMP-0.14 These findings also explain why previous studies using MINT markers and lower cutoffs for CIMP showed the positive correlation between CIMP and KRAS mutation. CIMP-positive tumors determined by MINT markers probably represent a heterogeneous group of tumors including substantial numbers of CIMP-low tumors with a high frequency of KRAS mutation.
A difference between CIMP-low and CIMP-0 is not as clear-cut as that between CIMP-low and CIMP-high, probably because methylation markers (CACNA1G, CDKN2A, CRABP1, IGF2, MLH1, NEUROG1, RUNX3, and SOCS1) are specific for CIMP-high but are not ideal for the identification of CIMP-low.14, 86 In future studies, markers that are sensitive and specific for CIMP-low need to be determined for the identification of CIMP-low, if CIMP-low really exists. Nonetheless, a difference between CIMP-low and CIMP-0 is more striking in terms of MGMT methylation and loss of expression.56 Among CIMP-low tumors, MSI-L tumors showed a significantly higher frequency of MGMT methylation/loss than MSI-H and MSS tumors, but no such relationship was observed in CIMP-0 tumors.56
Figure 1 represents current knowledge on different CIMP subtypes in CRC. The term "CIMP-0" is used to avoid confusion; "CIMP-negative" has been used for either "CIMP-low/0" or "CIMP-0".
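To illustrate how a marker panel yields the three CIMP categories, the sketch below counts methylated promoters among the eight CpG islands named in this section. The cutoffs (six or more of eight for CIMP-high, one to five for CIMP-low, none for CIMP-0) are not stated in this article; they are an assumption drawn from common practice with this panel and are exposed as a parameter. The function name and input format are hypothetical, and, as noted above, these markers are not ideal for identifying CIMP-low.

```python
CIMP_PANEL = (
    "CACNA1G", "CDKN2A", "CRABP1", "IGF2",
    "MLH1", "NEUROG1", "RUNX3", "SOCS1",
)


def classify_cimp(methylated_markers, panel=CIMP_PANEL, high_min=6):
    """Return 'CIMP-high', 'CIMP-low', or 'CIMP-0' from methylation calls.

    methylated_markers: names of panel markers scored as methylated.
    high_min:           assumed minimum count of methylated markers
                        for a CIMP-high call.
    """
    n_methylated = len(set(methylated_markers) & set(panel))
    if n_methylated >= high_min:
        return "CIMP-high"
    if n_methylated >= 1:
        return "CIMP-low"
    return "CIMP-0"


call_high = classify_cimp(CIMP_PANEL[:6])      # 6 of 8 methylated
call_low = classify_cimp({"CDKN2A", "IGF2"})   # 2 of 8 methylated
call_zero = classify_cimp(set())               # 0 of 8 methylated
```

Collapsing the last two results into a single "CIMP-low/0" category gives the binary CIMP-high versus non-CIMP-high call used elsewhere in this review.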
Group 1: MSI-H CIMP-High (10%86 )
This group of tumors commonly shows MLH1 methylation, BRAF mutation, CIN negative, wild-type TP53, intact p21 (CDKN1A) expression,75 loss of nuclear p27 (CDKN1B),74 poor differentiation, lymphocytic reactions, and mucinous and/or signet ring cell features. Clinically, this is generally known as sporadic MSI-H and is associated with good prognosis, elderly female, and proximal colon.
Group 2: MSI-H CIMP-Low/0 (5%86 )
This group of tumors includes HNPCC/Lynch syndrome (1 to 3%). However, because the frequency of HNPCC/Lynch syndrome is estimated to be 1 to 3% of CRCs in the general population,87 nearly one-half or even a majority of MSI-H CIMP-low/0 tumors are sporadic and unrelated to HNPCC/Lynch syndrome. Recently identified CIMP-high-specific markers have a high power of separating CIMP-low/0 from CIMP-high in MSI-H tumors with virtually no overlap between CIMP-high and CIMP-low/0.86 Thus, the 5% estimated frequency based on almost 900 tumors is quite accurate.86 This group (MSI-H CIMP-low/0) of tumors is associated with KRAS mutation, wild-type TP53, CIN negative, fatty acid synthase overexpression,89 proximal colon (compared with MSI-L/MSS CIMP-low/0),86 lymphocytic reactions, and mucinous features, but not with poor differentiation or signet ring cell features.46 Therefore, the presence of poor differentiation or signet ring cells perhaps by itself does not increase the likelihood of HNPCC/Lynch syndrome. Prognostic significance of sporadic MSI-H CIMP-low/0 needs further investigation.
Group 3: MSI-L/MSS CIMP-High (5 to 10%86 )
This group of tumors commonly shows BRAF mutation,71, 72 wild-type TP53,71, 75 CIN negative,90 poor differentiation, and signet ring cell features46 and is associated with poor prognosis,91 elderly female, and right colon.14, 71
Group 4: MSI-L CIMP-Low (5%56 )
This group of tumors is associated with high frequencies of MGMT methylation and KRAS mutation.56 Previous studies have shown an association between MSI-L and MGMT methylation,53, 55 which is due to the association between MSI-L and MGMT methylation among CIMP-low tumors.56
Group 5: MSS CIMP-Low (30 to 35%56 )
This group of tumors is associated with KRAS mutation,14 CIN negative,92 and male sex.14
Group 6: MSI-L/MSS CIMP-0 (40%56 )
This group of tumors is associated with CIN,90, 92 wild-type KRAS/BRAF,14 and distal colon14 and shows no sex predilection.14
Molecular Correlates in CRC
General Approach to Molecular Correlates Based on MSI/CIMP Classification
As mentioned previously, molecular classification based on global genomic/epigenomic aberrations, including CIN, MSI, and CIMP, is important. However, classification based on CIN is a challenge for several reasons. Currently, the methods/markers and criteria for CIN are far from uniform. LOH analysis has been known to have false positive and false negative results and substantial uninformative results. Thus, we illustrate how MSI/CIMP classification can decipher correlates with other molecular alterations.
According to MSI status, CRCs can be classified into two categories, MSI-H and MSI-L/MSS, because a distinction between MSI-L and MSS is subtle. According to CIMP status, CRCs can also be classified into two categories, CIMP-high and CIMP-low/0, because a distinction between CIMP-low and CIMP-0 is subtle. Considering the status of both MSI and CIMP, CRC can be classified into four major groups: MSI-H CIMP-high (10% of all CRCs; group 1), MSI-H CIMP-low/0 (5% of all CRCs; group 2), MSI-L/MSS CIMP-high (5 to 10% of all CRCs; group 3), and MSI-L/MSS CIMP-low/0 (75 to 80% of all CRCs; groups 4, 5, and 6) (Figure 3). Because each of MSI-H CIMP-low/0 and MSI-L/MSS CIMP-high constitutes only 5 to 10% of all CRCs, a large number of samples are required to properly dissect molecular correlates using the combined MSI and CIMP classification system.
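The combined classification amounts to a small lookup from an MSI call and a CIMP call to one of the six groups and the four major categories described in this review. The sketch below encodes that mapping; the function name, the string labels used as inputs, and the tuple return format are assumptions made for illustration.

```python
MSI_CALLS = ("MSI-H", "MSI-L", "MSS")
CIMP_CALLS = ("CIMP-high", "CIMP-low", "CIMP-0")


def combined_group(msi, cimp):
    """Map MSI and CIMP status to (group number 1-6, four-category label)."""
    if msi not in MSI_CALLS:
        raise ValueError("unknown MSI status: %r" % (msi,))
    if cimp not in CIMP_CALLS:
        raise ValueError("unknown CIMP status: %r" % (cimp,))

    if msi == "MSI-H":
        if cimp == "CIMP-high":
            return 1, "MSI-H CIMP-high"
        return 2, "MSI-H CIMP-low/0"

    # MSI-L and MSS are pooled except within CIMP-low (groups 4 and 5).
    if cimp == "CIMP-high":
        return 3, "MSI-L/MSS CIMP-high"
    if cimp == "CIMP-0":
        return 6, "MSI-L/MSS CIMP-low/0"
    return (4 if msi == "MSI-L" else 5), "MSI-L/MSS CIMP-low/0"


group_sporadic_msi_h = combined_group("MSI-H", "CIMP-high")
group_msi_l_cimp_low = combined_group("MSI-L", "CIMP-low")
```

Tallying tumors by the four-category label is the stratification used when mutation frequencies are compared across MSI and CIMP subtypes.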
The classification system based on combined MSI and CIMP status is useful in analyzing molecular correlates. For example, one can examine BRAF mutation frequencies in these four subtypes of colorectal cancer as in Figure 4A (data in Ref. 86). This figure shows much higher BRAF mutation frequencies in MSI-H CIMP-high and MSI-L/MSS CIMP-high tumors than in MSI-H CIMP-low/0 and MSI-L/MSS CIMP-low/0 tumors. It is evident that MSI status has no effect on BRAF mutation frequencies. These results indicate that BRAF mutation is positively correlated with CIMP-high, independent of MSI status. As another example, in Figure 4B (data in Ref. 76), MSI-H and CIMP-high exhibit a synergistic effect of lowering the frequency of p53 positivity (by immunohistochemistry); ie, both MSI-H and CIMP-high are inversely associated with p53 positivity. Because p21 (CDKN1A) is one of the major downstream effectors of p53, the interrelationship between p53, p21, and MSI (or CIMP) is examined in Figure 4C (data in Ref. 75). After CRCs are stratified by p53 and p21 status, p53 status exhibits very little effect on the frequencies of CIMP-high and MSI-H. In contrast, p21 loss is inversely correlated with CIMP-high and MSI-H regardless of p53 status.
Pathological Features and MSI/CIMP
Various pathological features have been associated with MSI-H, but few studies with a large sample size have examined the relationship between pathological features and MSI/CIMP.46, 71, 86 Pathological features have been used to assess risks of HNPCC/Lynch syndrome. It is widely thought that poor tumor differentiation increases the likelihood of HNPCC/Lynch syndrome because of the association of poor differentiation with MSI-H. However, recent data indicate that poor differentiation is associated with CIMP-high tumors, but not MSI-H CIMP-low/0 tumors (which include most HNPCCs).86 Thus, it is unlikely that poor differentiation alone increases HNPCC risk. On the other hand, mucinous features (50% mucinous), lymphocytic reactions, and proximal tumor location increase HNPCC risk because these features are associated with both MSI-H CIMP-high and MSI-H CIMP-low/0.46, 86
Molecular Classification and Patient Outcomes
There are numerous studies on individual molecular alterations (such as KRAS mutation, TP53 mutation, etc) and patient outcomes. Discussion in this article focuses on global genomic and epigenomic status and patient outcomes. A correlation of a single molecular event (such as TP53 mutation, BRAF mutation, etc) with patient outcomes needs to be interpreted with caution; the molecule examined might be associated with global genomic or epigenomic aberrations and improved or adverse outcomes might be caused by aberrations in other molecules.
With regard to MSI status and CRC patient survival, a systematic review of 32 studies that reported survival data on a total of 7642 colorectal cancer patients, including 1277 with MSI-H tumors, showed that MSI-H tumors were associated with better prognosis compared with MSS tumors [the combined hazard ratio estimate for overall survival associated with MSI-H was 0.65 (95% confidence interval, 0.59 to 0.71)].105 In a meta-analysis of six studies106, 107, 108, 109, 110, 111 that investigated overall survival stratified by MSI status in patients who received adjuvant 5-FU, patients with MSI-H tumors had a better prognosis (hazard ratio, 0.76; 95% confidence interval, 0.65 to 0.88).105 In addition, MSI-H tumors showed no benefit from adjuvant 5-FU.106, 109, 112 However, other studies have shown no predictive value of MSI status on overall survival of CRC patients treated with adjuvant 5-FU-based chemotherapy or on survival benefit from adjuvant 5-FU-based chemotherapy.113, 114, 115 Although evidence has been accumulating for MSI-H as a good prognostic indicator, additional investigation is needed to understand the mechanisms by which MSI influences colorectal cancer survival.
MSI-H tumors frequently show mutations in TGFBR2.116, 117, 118 TGFBR2 mediates signaling from TGF-β to its signal transducers, such as SMADs and further downstream targets, and functions as a tumor suppressor.18 Mutations in TGFBR2 observed in MSI-H cancers truncate and inactivate the TGFBR2 protein, abolishing its growth-regulating function.41 Among MSI-H cancers, TGFBR2 mutations have been associated with a significantly improved survival in one population of patients with stage III colon cancer.107 In a separate analysis, the survival benefit of TGFBR2 mutations in MSI-H tumors appeared to be particularly strong in the presence of coexistent BAX mutations.119 However, other studies have failed to confirm these data with one study finding worse survival associated with TGFBR2 mutations among 16 MSI-H tumors118 and another analysis finding no influence of TGFBR2 mutations on survival among 174 MSI-H tumors.117 Additional studies are necessary to assess whether survival benefit of MSI-H status is influenced by the presence of TGFBR2 or BAX mutation.
Why MSI-H tumors show better prognosis is currently unknown. A study has shown that MSI-H is not an independent predictor of survival in a multivariate model including DNA ploidy status.120 Although DNA ploidy may be a crude measure of CIN status, this study suggests that the association between MSI-H and better survival may be due to the confounding effect of CIN, which is a predictor of worse survival independent of MSI status.120 Another study has also shown that CIN is an independent predictor of worse survival, but survival is not correlated with MSI-H when CIN-positive tumors are excluded.121
With regard to the influence of CIMP on CRC patient survival, previous studies have been conflicting. Although some studies have found no significant relationship,68, 122 one study did demonstrate poor prognosis associated with CIMP in non-MSI-H tumors but not in MSI-H tumors,91 and another study has also shown that CIMP in advanced MSS tumors treated with chemotherapy predicts worse survival.123 BRAF mutations are associated with poor prognosis in non-MSI-H tumors (although no effect was seen on the good prognosis of MSI-H tumors).122 Because non-MSI-H tumors with BRAF mutations are most likely CIMP-high,71, 72 it is possible that the relationship between prognosis and CIMP-high is actually due to the relationship between prognosis and BRAF mutation.
It is also controversial whether CIMP confers survival benefit from chemotherapy in CRC. In one study of stage III colorectal cancer treated with surgery and adjuvant chemotherapy,124 patients with CIMP-positive tumors experienced a significant survival benefit from chemotherapy in contrast to those with CIMP-negative tumors, and this effect was independent of MSI and p53 mutation status; however, this study was not designed to randomly assign treatment groups. Therefore, an unidentified confounding factor cannot be excluded. Randomized trials are necessary to definitively assess treatment efficacy.
Limitations in Molecular Classification and Correlates
Lack of Gold Standard and Uniform Methods, Definition, and Criteria
Gold standard methods to assess global genomic and epigenomic changes are important in various clinical studies. Gold standard methods are also important to evaluate performance characteristics of any potential surrogate markers. However, defining such a standard is not a trivial task, particularly when one needs to analyze global genomic and epigenomic status in the tumor cell. The use of a limited set of markers (such as MSI, CIMP, and LOH marker panels) can serve as a gold standard with careful validation and correlative studies; however, it is less than ideal. The use of microarray technology may be a solution, but the validation and cost of implementing microarrays in clinical laboratory settings are currently difficult issues.
Lack of uniform markers, criteria, and definitions in the literature is also a problem. A validation study of markers and criteria requires a large number of markers and samples, which makes such a study difficult to perform. With regard to CIN and CIMP, for example, there are no uniform methods, criteria, or definitions in the literature, which makes a comparison between studies challenging. With regard to MSI, the implementation of the NCI marker panel48 has been successful, and a comparison between various studies has become easier.
False Positives and False Negatives
False-positive and false-negative results may obscure true associations. It is important to design a large enough study to have adequate statistical power, even with predictable frequencies of false-positive and false-negative results. False results may also lead to a false association if assay errors are systematic and nonrandom. For example, KRAS mutation and MSI have been shown to be inversely associated,125 but the inverse relation could be caused by abundant reactive lymphocytes commonly seen in MSI-H tumors, which can cause false-negative results in KRAS sequencing assays. The inverse association has been confirmed by a more sensitive KRAS pyrosequencing assay,126 particularly in CIMP-high tumors.14
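The way a systematic, nonrandom assay error could by itself produce an apparent inverse association can be sketched with a small calculation. The sketch assumes that the true mutation rate is identical in both tumor groups and that only the assay sensitivity differs. All numbers (group sizes, mutation rate, sensitivities) and the function names are hypothetical and are not taken from the cited studies.

```python
def observed_table(n_msi_h, n_non_msi_h, true_rate, sens_msi_h, sens_non_msi_h):
    """Expected 2x2 counts when the true mutation rate is the same in both
    groups but the assay detects mutations with different sensitivity."""
    a = n_msi_h * true_rate * sens_msi_h          # MSI-H, mutation detected
    b = n_msi_h - a                               # MSI-H, mutation not detected
    c = n_non_msi_h * true_rate * sens_non_msi_h  # non-MSI-H, mutation detected
    d = n_non_msi_h - c                           # non-MSI-H, mutation not detected
    return a, b, c, d


def odds_ratio(a, b, c, d):
    """Odds ratio of mutation detection in MSI-H versus non-MSI-H tumors."""
    return (a * d) / (b * c)


# Hypothetical scenario: lower sensitivity in MSI-H tumors (for example,
# because tumor DNA is diluted by lymphocyte DNA), no true association.
biased = odds_ratio(*observed_table(150, 850, 0.35, 0.70, 0.95))
# Same scenario with equal sensitivity in both groups.
unbiased = odds_ratio(*observed_table(150, 850, 0.35, 0.95, 0.95))
print(f"odds ratio with differential sensitivity: {biased:.2f}")
print(f"odds ratio with equal sensitivity: {unbiased:.2f}")
```

Under these assumptions the odds ratio falls below 1 only when sensitivity differs between the groups, which is why confirmation by a more sensitive assay is informative.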
As another example, poor-quality paraffin tissue samples that fail to react with one specific antibody (ie, false negatives) tend to fail to react with another antibody as well. Thus, negativity for one protein tends to coincide with negativity for another protein even in the absence of any true association. In other words, one should always be cautious when obtaining a positive correlation between overexpression of two proteins by immunohistochemistry. The presence of an internal control in immunohistochemically stained slides may solve this problem to some extent.
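The effect of shared staining failure can be illustrated in the same way. The sketch below assumes two proteins whose true expression is statistically independent, and a fraction of poor-quality samples that stain negative for both regardless of true status. The positivity rates and the failure fraction are hypothetical illustration values.

```python
def ihc_odds_ratio(p_a, p_b, fail_fraction):
    """Expected odds ratio between positivity of two independent proteins
    when a fraction of samples fails to stain for both antibodies."""
    ok = 1.0 - fail_fraction
    both_pos = ok * p_a * p_b
    a_only = ok * p_a * (1 - p_b)
    b_only = ok * (1 - p_a) * p_b
    # Failed samples are all scored as double negative.
    both_neg = ok * (1 - p_a) * (1 - p_b) + fail_fraction
    return (both_pos * both_neg) / (a_only * b_only)


# Hypothetical rates: 50% and 40% true positivity, 20% failed samples.
print(f"with 20% failed samples: {ihc_odds_ratio(0.5, 0.4, 0.2):.2f}")
print(f"with no failed samples: {ihc_odds_ratio(0.5, 0.4, 0.0):.2f}")
```

In this sketch the odds ratio exceeds 1 whenever some samples fail for both antibodies, although the two proteins are independent by construction. Excluding samples that lack a positive internal control removes the excess double negatives.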
Sampling Bias
Sampling bias is an inevitable problem when one can analyze only a finite number of cases. Thus, one should make the best effort to eliminate any source of bias. Compared with a single-hospital-based study, a population-based or multicenter study is designed to decrease the degree of sampling bias, but it is not always an option for most investigators. Academic hospitals may have more advanced or complicated cases than community hospitals. Racial or geographic bias can be present in any hospital setting. As an example of sampling bias, the frequency of HNPCC/Lynch syndrome has been reported to be 3 to 5% in many studies; however, in the general population, the frequency of HNPCC/Lynch syndrome is estimated to be 1 to 3% among all CRCs;87 the reported higher frequency of 3 to 5% is likely due to geographic and/or referral bias in the setting of academic hospitals.87
Small Sample Size
A small sample size is the cause of a number of problems. Studies with a small number of samples are more susceptible to the adverse effects of sampling bias and false positives/negatives, which may lead to erroneous conclusions. Even if the total number of cases is not small, the number of cases of a particular subtype (especially a rare subtype such as group 2, 3, or 4 in Figure 2) may become small. If the total number of CRCs in a study is 100, for example, the number of cases in group 2, 3, or 4 is at most 5 to 10 without significant sampling bias. A study has shown an inverse association between MGMT methylation and MLH1 methylation using 110 single-hospital-based CRCs (with only 13 cases showing MLH1 methylation).127 In contrast, a much larger study has shown a positive association between MGMT and MLH1 methylation using 920 population-based CRCs (with 115 cases showing MLH1 methylation).56
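The imprecision that comes with a small subgroup can be made concrete with a confidence interval for a proportion. The sketch below computes an approximate 95% Wilson score interval for a subgroup of 13 cases and for a subgroup of 115 cases, the MLH1-methylated subgroup sizes quoted above. The numbers of positive cases within each subgroup (5 and 44) are hypothetical and were chosen only to give a similar observed proportion in both subgroups.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Hypothetical counts: a similar observed proportion (about 38%) in a
# subgroup of 13 cases and in a subgroup of 115 cases.
small_lo, small_hi = wilson_interval(5, 13)
large_lo, large_hi = wilson_interval(44, 115)
print(f"n = 13:  {small_lo:.2f} to {small_hi:.2f}")
print(f"n = 115: {large_lo:.2f} to {large_hi:.2f}")
```

With the same observed proportion, the interval for the 13-case subgroup is more than twice as wide as that for the 115-case subgroup, so the smaller study is compatible with a much broader range of true frequencies.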
Issues in MSI Classification
Despite the presence of the recommended panel of markers (the NCI panel),48 markers used for studies on MSI are still not uniform. This is partly because the NCI panel has only five markers and increasing the number of markers increases the accuracy of classification, and partly because the five NCI markers may not be the best markers for the identification of MSI-H.
Another issue is MSI-L. It is still controversial whether MSI-L represents a distinct phenotype from MSI-H and MSS. We discussed the rationale for the existence of MSI-L (see above, MSI-L versus MSS). Yet, it is very clear that we need better markers for the identification of MSI-L if it really exists. Ideal MSI-L markers, if any, may not be microsatellite markers.
Issues in CIMP Classification
The definition of CIMP and the markers and criteria for CIMP diagnosis are still controversial issues. It has been repeatedly shown that BRAF mutation can be a good surrogate marker for CIMP-high independent of MSI status, and that the originally described MINT markers may not be ideal for identification of CIMP-high tumors with a high BRAF mutation rate.
In addition to methylation markers and criteria, methods of methylation detection are not uniform. Most previous studies on molecular correlates with CIMP in CRC used nonquantitative methylation-specific PCR, which may show positivity in tumors with biologically insignificant low-level methylation and thus overestimate the frequency of methylation in any marker as well as the frequency of CIMP. In contrast, quantitative methods such as quantitative methylation-specific PCR (MethyLight) are robust and can reproducibly differentiate high-level from low-level methylation in paraffin-embedded tissue, and low-level methylation is likely due to biological noise because it is not associated with gene silencing.128
The concept of CIMP-low is no less problematic than that of MSI-L. We discussed the rationale for the existence of CIMP-low (see above, CIMP-high versus CIMP-low versus CIMP-0). Yet, it is very clear that we need better markers for the identification of CIMP-low, if it really exists.
Issues in CIN Classification
Lack of standardized definition of CIN is a substantial issue. CIN has been evaluated by many different methods, including metaphase karyotyping, flow cytometric DNA ploidy study, microsatellite LOH analysis, fluorescence in situ hybridization, CGH, array-based CGH, SNP array, and spectral karyotyping. Thus, data in the literature have been based on a diverse array of methods, which makes cross-comparison between studies very difficult. Moreover, markers (chromosomal loci) used for evaluation are also diverse. Each method has advantages and disadvantages. In particular, microsatellite LOH analysis has been in widespread use because of its low cost and easy applicability to paraffin-embedded archival tissue samples. However, it requires normal DNA for comparison with tumor DNA, microsatellite markers are frequently uninformative because of homozygosity, and there are false positives (because of PCR bias and allele drop-out) and false negatives (because of PCR bias and normal cell contamination).
Interpretation of Molecular Correlates or Association Studies
When one has found a positive correlation between two molecules, it is often erroneously concluded that the two molecules are pathogenetically linked. However, one should be cautious in interpreting molecular correlates. A significant positive correlation is not synonymous with a causal or pathogenetic link. Unidentified potential confounding factors should also be kept in mind when interpreting molecular correlates.
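A confounding factor can be illustrated with a stratified analysis. In the sketch below, two molecular alterations A and B are unrelated within each stratum of a third factor (for example, a global status such as CIMP), yet the pooled data show a strong positive association. The counts are hypothetical, and the Mantel-Haenszel estimator is used here only as one common way to adjust for a stratifying variable.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table: a = A+B+, b = A+B-, c = A-B+, d = A-B-."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio across strata of a confounder."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


# Hypothetical counts (A+B+, A+B-, A-B+, A-B-) in two strata of a confounder.
strata = [
    (40, 10, 8, 2),   # confounder present: A and B are both common
    (2, 8, 10, 40),   # confounder absent: A and B are both rare
]
pooled = [sum(column) for column in zip(*strata)]

for table in strata:
    print(f"stratum-specific odds ratio: {odds_ratio(*table):.2f}")
print(f"crude pooled odds ratio: {odds_ratio(*pooled):.2f}")
print(f"Mantel-Haenszel odds ratio: {mantel_haenszel_or(strata):.2f}")
```

In this constructed example the crude association comes entirely from the third factor, which is why analyzing global genomic and epigenomic status alongside individual markers helps in interpreting molecular correlates.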
Future Directions
It should always be kept in mind that molecular classification and correlates in colorectal cancer are still evolving and that the current knowledge represents only our best understanding at present. Advances in technology in the field of cancer research are rapid and very promising. High-throughput technology for the detection of molecular alterations in tumors will enable us to evaluate molecular correlates in CRC in a much more comprehensive way. However, challenges still exist, including the cost of newer technologies, inevitable sampling bias, and the time required to assess patient outcomes. The latter two will persist even with state-of-the-art laboratory technologies. We can make great efforts to decrease the degree of sampling bias by organizing multicenter studies and population-based studies. Data obtained from molecular correlate studies will further our understanding of carcinogenesis and help implement cost-effective laboratory tests for various purposes: prognostication, treatment decision making, and genetic risk assessment for family members.
Conclusions
Molecular classification and molecular correlates will continue to be very important in colorectal cancer research. Studying molecular correlates can 1) provide clues to pathogenesis, 2) propose or support the existence of a new molecular subtype, 3) alert investigators to potential confounding factors in association studies, and 4) suggest surrogate markers in clinical or research settings. In molecular classification, global genomic or epigenomic classifiers such as MSI, CIN, and CIMP are increasingly important, and investigators should always try to analyze such global cellular status in colorectal cancer whenever feasible.
Notes Added in Proof
Recently, Shen et al (Ref. 129) have described "CIMP2" associated with KRAS mutation, separate from CIMP1 and CIMP-negative in colorectal cancer. This CIMP2 appears to have overlapping features with CIMP-low.14, 56, 86, 92
We have recently shown that the density of methylation in CIMP-high-specific promoters (CACNA1G, CDKN2A, CRABP1, IGF2, MLH1, NEUROG1, RUNX3, and SOCS1) is, on average, lower in CIMP-low tumors than in CIMP-high tumors, even after adjusting for the number of methylated loci, suggesting the possibility of different underlying methylation defects in these two types of tumors.130
Acknowledgments
We thank all of the investigators who contributed to this field. We truly regret that we could not cite all of the published papers in this field. S.O. thanks Charles Fuchs, Massimo Loda, Walter Willett, Sue Hankinson, Edward Giovannucci, Gregory Kirkner, Takako Kawasaki, and other laboratory members for their support and assistance in various aspects of this work.
Footnotes
Address reprint requests to Shuji Ogino, M.D., Ph.D., Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, 75 Francis St., Boston, MA 02115. E-mail: shuji_ogino{at}dfci.harvard.edu
S.O. was supported by National Institutes of Health grants P01 CA87969 and P01 CA55075 and in part by the Bennett Family Fund for Targeted Therapies Research and by the Entertainment Industry Foundation through the Entertainment Industry Foundation National Colorectal Cancer Research Alliance.
Accepted for publication August 20, 2007.
References