| HOME | HELP | FEEDBACK | SUBSCRIPTIONS | ARCHIVE | SEARCH | TABLE OF CONTENTS |
Published online before print August 9, 2007
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||








From the Departments of Hematopathology,
* Bioinformatics and Computational Biology,
Leukemia,
and Pathology,
The University of Texas M.D. Anderson Cancer Center, Houston, Texas
| Abstract |
|---|
|
|
|---|
| Introduction |
|---|
|
|
|---|
Because sequence analysis of the IgVH genes is expensive, technically demanding, and beyond the scope of most diagnostic laboratories, surrogate markers of IgVH mutation status that can be assessed by flow cytometry or immunohistochemistry have been sought. One initially promising marker was CD38, a transmembrane protein that can be assessed easily by flow cytometry. Expression of CD38 was originally shown to correlate with IgVH mutation status and to have independent prognostic value in multivariate analysis.5
However, CD38 expression and IgVH mutation status are less highly associated than initial reports suggested, and CD38 expression may change over the disease course.6, 7
Currently, the best surrogate marker of IgVH mutation status is ZAP70, a 70-kd T-cell receptor
-chain-associated protein tyrosine kinase; the expression of ZAP70 in CLL cells has been shown to correlate well with unmutated IgVH genes and poor clinical outcome.8, 9, 10, 11
ZAP70 expression can be assessed by a variety of techniques, usually flow cytometry but also Western blot analysis, quantitative real-time polymerase chain reaction (QRT-PCR) assay, and immunohistochemistry.8, 9, 12
Depending on the study, the concordance of ZAP70 expression with IgVH mutation status ranges from 75 to 90%,8, 9
and its expression seems to be stable over time.8, 13
A recent study suggests that ZAP70 expression, measured by flow cytometry, may be a stronger predictor of time to treatment than IgVH mutation status.13
However, the flow cytometry assay for ZAP70 expression remains to be standardized.
Gene expression profiling studies of CLL cells using microarray technology have examined differential gene expression with respect to a wide variety of prognostic factors, including IgVH mutation status, clinical stage, cytogenetic abnormalities, and others.10, 11, 14, 15, 16, 17, 18 These studies have generated long lists of potential prognostic biomarkers, including ZAP70. Regardless of the prognostic factor evaluated, the prognostic subtypes identified in each study seem to be distinguished by a small subset of a few dozen genes. This finding presents us with the opportunity to exploit a new medium-throughput technology, QRT-PCR assays performed using microfluidics cards (MF-QRT-PCR). We recently showed that MF-QRT-PCR is more reproducible and has a wider dynamic range than gene expression profiling using microarrays.19 We now demonstrate that we can construct a reliable and reproducible clinical assay to determine the IgVH mutation status in untreated CLL patients using MF-QRT-PCR and a limited panel of genes. To develop this panel, we re-analyzed the raw data from four previously published microarray studies of CLL that compared cases with mutated and unmutated IgVH genes.10, 11, 14, 15 Using newer statistical methods that we and others have developed, we identified 88 potential prognostic biomarkers linked to IgVH mutation status. We produced a MF-QRT-PCR array containing these candidate biomarkers, along with control genes, and applied it to samples from 29 previously untreated patients with CLL to test whether the differentially expressed genes identified in the original microarray studies could be confirmed in a new patient cohort. Next, we developed predictive models for IgVH mutation status using this biomarker panel and these 29 samples as a training set. Finally, we validated the predictive models in an independent test set of 20 previously untreated patients with CLL. Our data indicate that it is possible to classify cases with 95% accuracy based on the expression of as few as three genes.
| Materials and Methods |
|---|
|
|
|---|
Sequence Analysis of the Immunoglobulin VH Genes
Sequence analysis of the IgVH genes was performed as described previously, with minor modifications.20
The IgVH mutation status of each CLL sample was designated as unmutated if there were less than 2% mutations (>98% homology to germline sequences) or as mutated if there were 2% or more mutations (
98% homology to germline sequences) compared with the germline sequences (in VBASE; http://vbase.mrc-cpe.cam.ac.uk/), as described by Fais et al21
and by Damle et al.5
The VH family and percent homology to the germline sequence are listed in Supplemental Table 1 (at http://jmd.amjpathol.org).
Design and Production of the QRT-PCR Microfluidics Card
To identify candidate biomarkers predictive for IgVH mutation status, we reanalyzed the raw data from four microarray studies that compared the gene expression profiles of IgVH-mutated and -unmutated cases of CLL.10, 11, 14, 15
The first step in this process was our meta-analysis of the data from three CLL microarray studies.22
Based on the re-analysis, we identified 88 candidate biomarkers of IgVH mutation status, listed in Table 1
.
|
|
We printed 384-well custom microfluidics cards that contained gene-specific forward and reverse primers and TaqMan probes to detect the 88 candidate biomarkers listed in Table 1
and the eight candidate endogenous control genes listed above. Each card contained two identical halves. Two CLL samples were processed on each card: one sample added to ports 1 through 4, and the second sample added to ports 5 through 8. Thus, the expression of each gene was assessed in duplicate for both samples on a card.
Microfluidics Card Assay
Custom-printed TaqMan Low Density Array microfluidics cards contain gene-specific forward and reverse primers at a final concentration of 900 nmol/L and a TaqMan MGB probe (6-FAM dye-labeled) at 250 nmol/L final concentration in each well. The assays were designed to span an exon-exon junction. The microfluidics card assays were performed as described previously.19
In brief, 500 µl of cDNA from each sample (20 ng total input RNA/µl) was combined with an equal volume of TaqMan Universal PCR Master Mix (Applied Biosystems, Foster City, CA), mixed by inversion, and spun briefly in a microcentrifuge. After the cards reached room temperature, 100 µl of sample was loaded into each port and spun in a Sorvall Legend centrifuge (Kendro Scientific, Asheville, NC) with Sorvall/Haraeus Custom Buckets (Applied Biosystems) for 1 minute at 1200 rpm. Immediately after centrifugation, the cards were sealed with a Low-Density Array Sealer (Applied Biosystems) to prevent cross-contamination. The final volume in each well after centrifugation was less than 1.5 µl; thus, the final concentration was approximately 10 ng per reaction. The QRT-PCR amplifications were run on an ABI Prism 7900HT Sequence Detection System with a TaqMan Low Density Array Upgrade (P/N 4329012). Thermal cycling conditions were as follows: 2 minutes at 50°C (to activate UNG), 10 minutes at 95°C (activation), 40 cycles of denaturation at 95°C for 15 seconds, and annealing and extension at 60°C for 1 minute.
Statistical Methods
The QRT-PCR data were quantified using the SDS 2.1 software package (Applied Biosystems). Results from each card were quantified separately, using an automatic baseline and a manual threshold of 0.10 to record the cycle thresholds. Statistical analyses, including unsupervised hierarchical clustering, were performed in R, version 2.4, a freely available statistical software package (http://cran.r-project.org/).25
Linear mixed effects models for initial studies to select controls and estimate sources of variation were fit in R using the lme package. Univariate t-tests and tail-rank tests were computed in R using a package we developed for Object-Oriented Micro-array and Proteomic Analysis (http://bioinformatics.mdanderson.org/Software/OOMPA). We applied a variety of feature selection and classification methods to build predictive models of IgVH mutation status from the QRT-PCR data (Table 2)
; all methods were implemented in R by using or extending standard packages.
|
To confirm that the endogenous control genes were expressed at constant levels, we used a mixed-effects model that has been described previously.19
We found that the SD of the residual error was
= 0.4041, which was highly consistent with our estimate of
= 0.3907 in the original study.19
The variability of individual genes is represented in the model by a gene-specific variance,
j. For each gene, the ratio of
j:
should have an F-distribution with 7 and 2222 degrees of freedom. Using an F-test, we determined that one of the endogenous control genes (C3) varied significantly across the samples (F = 2.23, P = 0.029), and one (SKI) was marginally significant (F = 1.73, P = 0.097). Thus, these two genes were eliminated from the set of endogenous control genes. Using the method of Vandesompele et al,24
we found that normalizing to the mean of five endogenous control genes (PGK1, 18S rRNA, GUSB, ECE1, and GAPD) minimized the residual variance.
Sources of Variability in the QRT-PCR Microfluidics Card Assay
Next, we assessed the contributions of different sources of variability (the biological variability between samples, the card-to-card variability, and the well-to-well variability) to the total variability in the system. After normalization, we used separate models for each gene of the form Yikm = µ + Bm + Ci + Eikm, with random effects Cm
N(0,
C2), Bi
N(0,
B2), Eikm
N(0,
E2), where m ranges over cards, i ranges over samples, and k ranges over replicates within a card. In this analysis,
C represents card-to-card variability,
B represents sample-to-sample variability, and
E represents well-to-well variability. The median value of the
E was 0.173, and the 90th percentile was 0.48. These values are consistent with the estimate of
= 0.4041 in the earlier model. The median value of the ratio of
C:
E was 0.47, and the ratios for 83.8% of the genes were less than 1, suggesting that after normalization, the card-to-card variability is almost always less than or equal to the well-to-well variability. In other words, the normalization procedure correctly removed any systematic variability between cards. Finally, the median value of the ratio of
B:
E was 2.9, and the ratios for 86% of the genes were greater than 1. Thus, we found that the primary source of variability in the system arose from biological differences between the samples, and as expected, many of these genes varied significantly between CLL samples.
| Results |
|---|
|
|
|---|
cycle threshold values. Seven of the 29 samples were assayed in duplicate on two different cards. We performed hierarchical clustering based on the expression of all 86 candidate biomarkers that gave usable values, using average linkage and Pearson correlation to define the distance between samples. In the resulting dendrogram, six of the seven duplicate samples clustered as nearest neighbors, with the seventh pair included in the same major branch of the dendrogram (data not shown). This finding gave us greater confidence in the reproducibility of the assay. Duplicates were averaged before proceeding. Finally, we clustered all 29 samples using Pearson correlation and average linkage based on the expression of 86 genes (Figure 1)
|
Development and Validation of a Predictive Model of IgVH Mutation Status
Our next goal was to develop a predictive model of IgVH mutation status using the initial 29 samples as the training data. We constructed 16 different predictive models (Table 2)
and evaluated their performance on an independent test data set of 20 new CLL samples processed on microfluidics cards. All models were fully constructed before the test data were analyzed; ie, predictions were made from all 16 models, whereas the status of the test samples remained blinded.
Each model combined a feature selection method (to determine the genes to include in the model) and a classification method (to determine how to combine the measurements of those genes to make predictions). Classification methods included linear discriminant analysis,26 quadratic discriminant analysis,26 diagonal linear discriminant analysis,27 compound covariate predictors,27 k nearest neighbors,26 classification and regression trees,26 naïve Bayes,26 random forests,28 principal components regression,26 and an ensemble of nearest neighbor classifiers.29 We also used the method of molecular signs, which first uses the tail-rank test30, 31, 32 to identify and convert markers to binary variables (molecular signs) and then classifies samples based on the number of signs present.
Feature selection methods included filters and wrappers33
and methods that combine information from all genes. A wide variety of methods have been proposed for feature selection. The first class of methods ranks features (or genes) by some criterion and retains the top n features. Ranking may use a t-statistic, a P value from a univariate test, the Pearson or Spearman rank correlation of the feature with a desired outcome, the coefficient of determination from a regression, or mutual information. In the machine-learning literature, rank-based methods are often described as "filters," because they are usually used as a preprocessing step. The chief alternative to filters are "wrappers," which attempt to find subsets of features that work well together, evaluating them within the context of the performance of the chosen methods to optimize the parameters of a model classifier. Wrapper methods include greedy algorithms such as stepwise forward selection. Genetic algorithms have also been applied as wrapper methods for feature selection in microarray and proteomics studies. Some classification methods, such as classification and regression trees, embed feature selection as part of the model optimization process; thus, these behave more like wrappers than filters. Using the identification numbers in Table 2
, methods 6, 8, 10, 12, 13, and 14 include all genes in the final classifier. Methods 5, 11, 15, and 16 are wrappers, which use some form of cross-validation (within the training set) to select features based on their performance in a classifier. Methods 1, 2, 7, and 9 are filters, which rank genes based on univariate t-statistics for inclusion in the model, and methods 3 and 4 are filters that rank genes based on the tail-rank statistic.
Nine of the models (indicated by boldface type in Table 2
) had the same performance, correctly classifying 19 of the 20 test samples. Each method used different rules to select the genes included in the predictive model and, therefore, required a different number of genes. The models that used the fewest genes were quadratic discriminant analysis (three genes; ZBTB20, LDOC1, and SEPT10), linear discriminant analysis (four genes; ZBTB20, LDOC1, SEPT10, and LPL), and one of the molecular sign models (seven genes; COBLL1, CRY1, LDOC1, LPL, GFI1, SEPT10, and ZBTB20). A list of the genes included in each model is provided in Supplemental Table 2 (at http://jmd.amjpathol.org). All nine methods correctly classified 11 of 11 IgVH-mutated cases and eight of nine IgVH-unmutated cases, and all nine misclassified the same sample, CLL95. Interestingly, this sample was near the cutoff point for mutational status, with 98.1% homology (IgVH unmutated) to the germline IgVH sequence. All other IgVH-unmutated cases in this study exhibited at least 99.1% homology with the germline IgVH sequence. Seven of the models that we tested had poorer prediction accuracy on the test data, including two of the three tree-based methods.
| Discussion |
|---|
|
|
|---|
If prognosis can be determined from the expression levels of only a few dozen genes, then there is an opportunity to switch from gene expression profiling microarrays to QRT-PCR, a technology that is more reproducible, has a greater dynamic range, is more cost effective, and is already in use in many clinical molecular diagnostic laboratories. In the past, a major drawback of QRT-PCR was its low throughput. However, higher throughput versions of QRT-PCR based on microfluidics technology have overcome this limitation. We have established that this technology gives highly reproducible results over 7 orders of magnitude,19 which is consistent with previously published, conventional QRT-PCR data.34 Recently, similar technology has been used to determine the prognosis of newly diagnosed, early-stage patients with breast cancer. The Oncotype DX assay (Genomic Health, Redwood City, CA) assesses the expression of a panel of 21 genes by QRT-PCR to identify patients with a high risk of recurrence and to identify patients who may benefit from certain types of chemotherapy.35, 36
For our experiments, the 88 genes printed on the microfluidics cards were selected from four microarray studies.10, 11, 14, 15 Based on the results of t-tests on the MF-QRT-PCR data from the 29 training samples, we confirmed the differential expression of 37 of 88 (42%) genes. One possible explanation for the relatively low confirmation rate may be that the differences in expression were too small to be detected with this number of samples. We therefore repeated the t-tests on a joint data set combining the training and test samples (Supplemental Table 3 at http://jmd.amjpathol.org). Two of the genes (RIOK2 and ZBP1) no longer seemed to be significantly different in the joint data set. The remaining 35 genes remained significant, but the differential expression of 13 additional genes (DMD, APOBEC3G, CASP3, PFKP, CTSB, ZA20D3, GSS, ILK, KCNG1, HIST2H2AA, SNX4, ETV5, and CTPS) was confirmed in the joint data set. In most cases, the difference in mean threshold cycles between IgVH-mutated and -unmutated samples was similar in the training set and the joint data set, and in many cases, the difference became smaller in the larger data set, thus confirming our suspicion that the difference was too small to detect in the training set of 29 samples but was detectable in the joint set of 49 samples. Using the joint data, our final confirmation rate was 48 of a total of 88 genes (57%). A few of the remaining genes may still be differentially expressed, but most of them seem to have been false-positive results from the earlier microarray studies.
Using the MF-QRT-PCR assay, we have validated 18 of 31 (58%) genes from our own previous microarray study,14 15 of 16 (94%) genes from the study by Klein et al,15 18 of 41 (44%) from the study by Rosenwald et al,10 six of eight (75%) from the study by Wiestner et al,11 and four of six (67%) identified in other studies. (Because some genes were identified as differentially expressed in more than one microarray study, the sum of the genes contributed by these studies, 102, exceeds 88.) Of the 48 genes that we confirmed, 12 were reported in more than one study, 14 were reported only in our study, six were reported only by Klein et al,15 10 were reported only by Rosenwald et al,10 and six were reported only by Wiestner et al.11 Moreover, each of the three best performing predictive models selected features from two or three different studies. We have shown previously that multiple statistical methods may be required to get a complete list of the differentially expressed genes in a single microarray study.14 The results of the present study suggest that in the same way, multiple microarray studies may be required. In most cases, the lists produced from a microarray study are selected by controlling the false discovery rate. Keeping the false discovery rate small (or equivalently, increasing the specificity) can lead to many false negatives (by decreasing the sensitivity). Thus, it is reasonable to expect that combining gene lists from multiple microarray studies should have a better chance of producing a combination of genes with predictive value, as we found.
In our analysis, we trained 16 different models and evaluated all of the models on the same test data set. All 16 models performed better than chance, with prediction accuracies ranging from 75 to 95%. This finding, along with the unsupervised clustering in Figure 1
, provides strong evidence that the PCR data from the genes selected for inclusion on the microfluidics card contains substantial information that can be used to distinguish IgVH-mutated from -unmutated cases of CLL. Because we evaluated multiple models on the same data set, it is conceivable that the predictive accuracy of the best models (95%) might be overestimated, and the average accuracy of the tested models (90.6%) might be a better estimate of its performance on future data sets. With only 20 test samples, it can be difficult to draw strong conclusions about the relative performance of different models. Both classification and regression tree methods (15 and 16 in Table 2
) had relatively poor performance on the test set. This finding suggests that tree-based methods may be more prone to over-fitting accidental features of the training data that do not generalize well to new data sets.
We identified three models, each of which used at most seven genes to predict IgVH mutation status with at least 90% accuracy. The models that used the fewest genes were quadratic discriminant analysis (three genes; ZBTB20, LDOC1, and SEPT10), linear discriminant analysis (four genes; ZBTB20, LDOC1, SEPT10, and LPL), and the tail-rank test (seven genes; COBLL1, CRY1, LDOC1, LPL, GFI1, SEPT10, and ZBTB20). A total of nine models had 95% accuracy, and all nine misclassified the same sample, CLL95, which was near the cutoff point for mutational status, with 98.1% (IgVH unmutated, VH4–34) homology to the germline IgVH sequence. Of the 49 samples assessed, only three (CLL95, along with CLL75 and CLL119, both of which had 98% homology with germline) were near the cutoff point with respect to mutational status. The other cases classified by sequence analysis as unmutated had at least 99.1% homology to the germline IgVH sequence, and the cases classified as mutated had at most 97.7% homology. The 98% cutoff was established to exclude potential polymorphic variant sequences and to avoid sequence analysis of the corresponding germline sequence. However, there is evidence to suggest that low rates of intraclonal mutation may occur in CLL cases with
98% homology to IgVH germline sequences that would otherwise be classified as IgVH unmutated. Thus, it is possible that CLL95 may, in fact, represent a IgVH-mutated case with a slightly lower rate of somatic mutation.
Taken together, our best predictive models used at most seven genes that were differentially expressed with respect to IgVH mutation status. Genes that were relatively overexpressed in IgVH-unmutated compared with IgVH-mutated cases were CRY1, GFI1, LDOC1, LPL, and SEPT10. Genes that were relatively overexpressed in IgVH-mutated compared with IgVH-unmutated cases were COBLL1 and ZBTB20. In addition to the seven genes that were used to build the best models, we identified many other genes that are differentially expressed with statistical significance based on IgVH mutation status. For example, AICDA and ZAP70 were overexpressed in IgVH-unmutated compared with IgVH-mutated cases. However, our analysis was designed to identify the fewest number of genes required to reliably and reproducibly distinguish between the two groups.
Of the seven genes that we used to build the best models, two (LPL and SEPT10) have been reported previously as prognostic markers in CLL. In a recent study, Oppezzo et al37 demonstrated that expression of LPL is a strong predictor of IgVH mutation status and survival. This finding has been confirmed by several other groups. vant Veer et al38 assessed the prognostic significance of 10 genes known to be differentially expressed between IgVH-mutated and -unmutated cases by standard QRT-PCR assay. In a multivariate analysis, LPL, SEPT10, ZAP70, and ADAM29 were the best predictors of IgVH mutation status. LPL was the strongest predictor of IgVH mutation status in a univariate analysis and the strongest predictor for survival. Expression of SEPT10, which encodes a member of the septin family of cytoskeletal proteins with GTPase activity, was also highly predictive. Our study confirms that expression of LPL and SEPT10 are highly predictive of mutated IgVH genes.
Relatively little is known about the role of the other five genes in the pathophysiology of CLL. CRY1 is one of two cryptochrome genes (CRY1 and CRY2) that are core components of the mammalian circadian clock. Epidemiological studies have raised the possibility that disruption of the circadian clock may increase cancer risk or have an adverse effect on prognosis in cancer patients.39 Interestingly, mice with loss-of-function mutations in cryptochrome (Cry1–/–/Cry2–/–) are more resistant to the toxic effects of cyclophosphamide, including depression of peripheral blood B-cell counts, than their wild-type littermates.40 LDOC1 encodes a leucine zipper protein whose expression is decreased in pancreatic and gastric cancer cell lines.41 It directly induces apoptosis in some human cancer cell lines by interacting with MZF-1.42 COBLL1 is a gene that participates in directing midline development in the gastrulating embryo.43 GFI1 is a zinc finger repressor protein that serves as a cooperating oncogene in rodent models of T-cell lymphomagenesis and also restricts the proliferation of human hematopoietic stem cells.44 ZBTB20 yields a zinc finger protein that is widely expressed in hematopoietic cells. Its function is unknown, but it shows high sequence homology to BCL6.45
| Acknowledgments |
|---|
| Footnotes |
|---|
Supported in part by a grant from the Commonwealth Foundation for Cancer Research and Mr. and Mrs. William H. Goodwin, Jr., and by a grant from the CLL Global Research Foundation.
Supplemental material for this article can be found on http://jmd.amjpathol.org.
Accepted for publication June 26, 2007.
| References |
|---|
|
|
|---|
This article has been cited by other articles:
![]() |
E. J. Lewintre, C. R. Martin, C. G. Ballesteros, D. Montaner, R. F. Rivera, J. R. Mayans, and J. Garcia-Conde Cryptochrome-1 expression: a new prognostic marker in B-cell chronic lymphocytic leukemia Haematologica, February 1, 2009; 94(2): 280 - 284. [Abstract] [Full Text] [PDF] |
||||
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| HOME | HELP | FEEDBACK | SUBSCRIPTIONS | ARCHIVE | SEARCH | TABLE OF CONTENTS |