Technical Advances
From Veridex LLC, a Johnson and Johnson Company, Warren, New Jersey
Abstract
Gene expression signatures can serve in both prognostic and predictive capacities in patient management. The use of RNA as the starting material and the lability of this analyte, however, dictate that tissues must be snap-frozen or stored in a solution that can maintain the integrity of the RNA. We compared matched pairs of snap-frozen and RNAlater preservative-suspended tissue from 30 lymph node-negative breast tumors and 21 Dukes B colon tumors. We assessed the correlation of gene expression profiles and prediction of recurrence based on two prognostic algorithms. Tissues stored in RNAlater preservative generated expression profiles with excellent correlation (average Pearson correlation coefficients of 0.97 and 0.94 for the breast and colon tumor pairs, respectively) compared to those produced by tissues that were snap-frozen. The correlation in the prediction of recurrence between these two tissue handling protocols was 97% and 95% for the breast and colon tumor pairs, respectively. This novel finding demonstrates that prognostic signatures can be obtained from RNAlater preservative-suspended tissues, an important step in bringing gene expression signatures to the clinic.
DNA microarray-based gene expression profiling has emerged as a powerful tool for target gene discovery,1, 2 molecular tumor characterization,3, 4 patient prognosis,5, 6, 7, 8, 9, 10 and prediction of drug therapy response.11, 12 Significant advances have been made in the computational and statistical analysis of microarray datasets,1, 13, 14 novel fabrication methods for arrays,15, 16, 17 and in the target preparation (RNA amplification and labeling processes).18, 19, 20 Previous studies demonstrating the prognostic value of microarrays5, 6, 7, 9, 10 have used snap-frozen tissue, and the use of tissues stored in an RNA preservative in conjunction with a prognostic algorithm has not been examined. If microarrays and their associated protocols are to reach the clinical setting, researchers in this area will need to turn their attention to methods of sample acquisition and the effect these methods have on the prognostic and predictive power of microarray data.21
For example, microarray protocols will have to consider the fact that the patient sample often needs to be shared with the pathologist and that liquid nitrogen tanks are not available in most operating rooms. It is difficult to preserve the diagnostic and prognostic characteristics of surgical specimens while ensuring RNA integrity in these samples. At many clinical centers, tissue obtained from surgery may undergo variable periods of storage at ambient temperatures. As a result, the RNA obtained from such specimens may not be intact by the time it is frozen for subsequent array studies.22 One solution to this problem is storage of the tissue in a solution containing an RNA preservative until it can be frozen. RNAlater preservative (Ambion, Inc., Austin, TX) is an attractive option because it offers the advantage of preserving tissue integrity while preventing RNA degradation.
The effect of RNAlater preservative on gene expression levels has been investigated in previous studies. Grotzer and co-workers23 compared total RNA isolated from tumors stored for 7 days at ambient temperature in RNAlater preservative to RNA isolated from snap-frozen tissue. Florell and co-workers24 analyzed mRNA purified from RNAlater preservative-suspended tissues. Ellis and co-workers25 examined core needle breast biopsies that were snap-frozen or stored in RNAlater preservative. More recently, Alfredson and co-workers26 isolated RNA from tissues stored in RNAlater preservative, and Roos-van Groningen and co-workers27 examined the utility of RNAlater preservative in renal cortical tissue. These studies, however, did not systematically compare large numbers of paired tumors (stored in RNAlater and snap-frozen) and most used cDNA arrays of lower densities.
These five studies serve as a foundation on which we designed a more thorough investigation of the effect of RNAlater preservative on gene signatures. We have previously reported on prognostic algorithms developed using 74 Dukes B colon cancer patients and 286 lymph node-negative breast cancer patients.9, 10 In both cases, the algorithms were designed to identify patients with an increased probability of recurrence from the respective cancer. Our original studies used snap-frozen samples. For the reasons outlined above, we chose to investigate the performance of our algorithms in tissue samples that were suspended in RNAlater preservative. In contrast to the earlier RNAlater preservative studies discussed above, our study uses commercially available Affymetrix U133A arrays (Affymetrix, Santa Clara, CA), which are now in widespread use, and RNA amplification to assess a total of 51 matched pairs of tissues (snap-frozen and RNAlater preservative-suspended tissue from the same patient) from colon and breast tumors. We demonstrate the concordance between gene expression profiles obtained from tissues preserved in RNAlater preservative and those obtained from tissues that were snap-frozen. We also demonstrate the correlation in the output of both the breast cancer and colon cancer prognostic algorithms when tissues were snap-frozen or preserved in RNAlater preservative.
Materials and Methods
Tissue Samples and RNA Isolation
Matched pairs of snap-frozen and RNAlater preservative-suspended tissues were obtained from Genomics Collaborative Inc. (Cambridge, MA) and Proteogenex (Los Angeles, CA) after being prospectively collected at multiple institutions with institutional review board approval. Snap-frozen samples (but not RNAlater-preserved tissue) were verified by pathology to contain at least 70% tumor cell content. The RNAlater-preserved tissue was directly adjacent to the piece of tumor that was snap-frozen, and the section used for pathology verification was located between the two pieces of tissue, creating a pair in which the RNAlater-preserved tissue was a mirror of the piece of tumor that was snap-frozen. For tissues stored in RNAlater, the thinly sectioned tissue was placed in a 5-ml tube at a 10:0.5 (v:v) ratio of RNAlater to tissue. The suspension was mixed well, equilibrated overnight at 4°C, and frozen at −80°C the next morning before shipping. Frozen tissues were allowed to thaw before homogenization. Breast tumor (60 mg) or colon tumor (30 mg) was homogenized, and the homogenate was subjected to an RNeasy purification (Qiagen, Valencia, CA). Total RNA was isolated, and the yield and quality were determined via UV spectrophotometric (A260 and A280) reading and Agilent Bioanalyzer electropherogram (Agilent, Palo Alto, CA), respectively.
RNA Amplification, Labeling, Hybridization, and Detection
RNA was amplified and labeled using an Eberwine amplification method.28 Briefly, 2 µg of total RNA were converted to double-stranded cDNA followed by an in vitro transcription reaction. Biotinylated UTP was incorporated during the in vitro transcription. cRNA (15 µg) was fragmented and hybridized according to standard protocols (www.affymetrix.com). Detection used streptavidin-phycoerythrin on a GeneChip Fluidics Station 450, and images were scanned on a GeneChip Scanner 3000.
Data Analysis
Pearson correlation coefficients were determined for each tumor pair as follows. Each pair was sorted according to the absent, marginal, or present calls associated with the frozen sample of each pair. Genes that were present in at least one member of the pair were retained, and other genes were removed from the analysis. This filtering left an average of 11,700 comparisons for the 30 breast tumor Pearson analyses and 12,000 comparisons for the 21 colon tumor Pearson analyses. The average value for each array was then calculated by determining the arithmetic mean of all of the remaining probes. This average value was then scaled to a target intensity such that the average value for every array was equal to 200. To do this, every intensity on an array was multiplied by the scaling factor that brought that array's average to 200 (ie, a different multiplication factor was used for each array). These scaled intensity values were then used in the Pearson correlations. Log2 ratios of the fold change (M values)29 were generated using the same data set as used above. For each pair, M values were entered into MiniTab to generate the box plots. Microarray data have been submitted to the NCBI/GenBank GEO database (accession number: GSE3726; http://www.ncbi.nlm.nih.gov/projects/geo/).
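The per-pair calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the study: the function names, the list-based data layout, and the use of the single letter "P" for a present call are hypothetical, and the M value is taken as log2 of the snap-frozen signal over the RNAlater signal.

```python
import math


def scale_to_target(intensities, target=200.0):
    """Multiply every intensity by one per-array factor so the array mean equals target."""
    mean = sum(intensities) / len(intensities)
    factor = target / mean  # a different factor for each array
    return [x * factor for x in intensities]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def compare_pair(frozen, rnalater, frozen_calls, rnalater_calls):
    """Return (Pearson r, M values) for one matched pair of arrays."""
    # Keep probes called present ("P") in at least one member of the pair.
    keep = [i for i, (a, b) in enumerate(zip(frozen_calls, rnalater_calls))
            if "P" in (a, b)]
    f = scale_to_target([frozen[i] for i in keep])
    r = scale_to_target([rnalater[i] for i in keep])
    # M value: log2 ratio of snap-frozen to RNAlater signal (assumed orientation).
    m_values = [math.log2(a / b) for a, b in zip(f, r)]
    return pearson(f, r), m_values
```

Because each array is scaled after filtering, two arrays that differ only by a constant factor give a correlation of 1 and M values of 0, which is the behavior the scaling step is meant to guarantee.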
Clustering and Principal Component Analyses
To reduce background noise, genes with extremely low expression were removed from the dataset. Namely, a gene was retained if it had at least two present calls in the dataset. Before the clustering, the expression signal for each gene was divided by its median expression level in the samples. Hierarchical clustering was performed on both samples and genes using GeneSpring 6.1 software (Silicon Genetics, Redwood City, CA). Pearson correlation was used for the measurement of similarity. Principal component analysis was conducted using Partek Pro software (Partek, Inc., St. Louis, MO). Covariance was used as the measurement.
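The filtering and per-gene normalization that precede clustering can be illustrated as follows. This is a minimal sketch, assuming a dictionary layout and a "P" present-call code that are hypothetical; it does not reproduce GeneSpring's internal processing.

```python
from statistics import median


def filter_and_center(expression, calls, min_present=2):
    """Keep genes with at least min_present present calls and median-normalize them.

    expression: {gene: [signal per sample]}
    calls:      {gene: [detection call per sample]}, with "P" meaning present
    """
    centered = {}
    for gene, signals in expression.items():
        if calls[gene].count("P") < min_present:
            continue  # drop genes with extremely low expression
        med = median(signals)  # assumes strictly positive signals
        centered[gene] = [s / med for s in signals]
    return centered
```

After this step, a value of 1 means the gene is at its median level in that sample, so genes with very different absolute intensities contribute on a comparable scale.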
Relapse Hazard Score (RHS)
Expression values for each gene were calculated by using Affymetrix GeneChip analysis software MAS 5.0. The RHS was used to determine each patient's risk of recurrence. Details on the statistical method used to generate the RHS are provided in previous publications.9, 10 Briefly, the score was defined as the linear combination of weighted expression with the standardized Cox regression coefficient as the weight. For colon samples, patients whose scores were equal to or greater than 100 were classified in the high risk of relapse group whereas patients whose scores were less than 100 were classified in the low risk of relapse group. For breast samples, a cutoff of 0 was used to segregate the high risk of relapse group from the low risk of relapse group.
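The general shape of such a score can be sketched as a weighted sum followed by a threshold. The sketch below is an assumption-laden illustration only: the gene names and weights are invented, the published algorithms (references 9 and 10) may include additional terms or scaling that are not reproduced here, and treating a score exactly at the breast cutoff as high risk is an assumption.

```python
def relapse_hazard_score(log2_expression, weights):
    """Linear combination of log2 expression values.

    weights: {gene: standardized Cox regression coefficient} (hypothetical values)
    """
    return sum(w * log2_expression[gene] for gene, w in weights.items())


def classify(score, cutoff):
    """Scores at or above the cutoff are called high risk of relapse."""
    return "high risk" if score >= cutoff else "low risk"


COLON_CUTOFF = 100.0  # from the text: >= 100 is high risk
BREAST_CUTOFF = 0.0   # from the text: cutoff of 0

# Hypothetical two-gene signature, for illustration only.
weights = {"gene_a": 2.0, "gene_b": -1.0}
sample = {"gene_a": 3.0, "gene_b": 1.0}
score = relapse_hazard_score(sample, weights)
print(score, classify(score, BREAST_CUTOFF))
```

Because the score depends on log2 intensities, any systematic intensity shift between handling protocols would move a sample toward or away from the cutoff, which is why the intensity ratios are examined in the Results.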
Results
Gene Expression Profiles Generated from Snap-Frozen or RNAlater Preservative-Suspended Tissues Demonstrate Strong Correlations
We did not observe any significant differences in RNA yield or quality when we compared RNA isolated from snap-frozen or RNAlater-preserved tissues. For the colon tumors, the average yield (µg RNA/mg tissue) and 28S:18S ratio were 1.3 and 1.5, respectively, for the snap-frozen tissues and 1.4 and 1.5, respectively, for the RNAlater-preserved tissues. For the breast tumors, the average yield (µg RNA/mg tissue) and 28S:18S ratio were 0.8 and 1.6, respectively, for the snap-frozen tissues and 0.7 and 1.6, respectively, for the RNAlater-preserved tissues.
Because microarray data could provide a more detailed comparison, we determined the correlation in global expression levels in each matched pair (one tissue that was suspended in RNAlater preservative and a corresponding tissue from the same patient that was snap-frozen). We hybridized amplified, labeled cRNA from the 60 breast tumor RNAs and from the 42 colon tumor RNAs onto Affymetrix U133A arrays and determined the Pearson correlation coefficient for each of the 30 breast tumor pairs and for each of the 21 colon tumor pairs (Figure 1). To assess whether the Pearson correlation coefficients were in the same range as would be expected from independent sample preparations, target preparations, and hybridizations from the same snap-frozen tissue, we also obtained an independent set of six stage II colon tumor samples. We isolated RNA and prepared cRNA from each of these six samples in triplicate, each time starting with the tissue, for a total of 18 cRNA preparations. Each of the cRNA preparations was hybridized to an Affymetrix U133A array to assess the level of variability due to the entire process. In this manner, three pairwise comparisons could be made for each of the six samples. As seen in Figure 1, the colon tumor replicates demonstrate very strong correlations, with a median Pearson correlation coefficient of 0.99. The breast tumor pairs and the colon tumor pairs also generated very strong correlations, with median Pearson correlation coefficients of 0.98 and 0.96, respectively. We conclude that RNAlater preservative-suspended tissues generate gene expression profiles that demonstrate an excellent correlation to those produced by their snap-frozen counterparts and that the correlation is similar to what is observed when process variability is interrogated.
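The replicate design can be enumerated explicitly: three preparations per sample give three unordered pairs, so six samples yield 18 within-sample comparisons. The snippet below is an illustrative sketch, and the sample labels are hypothetical.

```python
from itertools import combinations


def replicate_pairs(sample_ids, n_replicates=3):
    """List every within-sample pair of replicate preparations."""
    pairs = []
    for sample in sample_ids:
        replicates = [(sample, k) for k in range(1, n_replicates + 1)]
        pairs.extend(combinations(replicates, 2))
    return pairs


pairs = replicate_pairs(["colon_%d" % i for i in range(1, 7)])
print(len(pairs))  # prints 18: 6 samples x 3 pairs each
```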
Because our prognostic algorithm uses the log2 of the hybridization intensity,9, 10 as do other methods,30 a significant difference in hybridization intensities between snap-frozen tissues and their RNAlater preservative-suspended counterparts could prevent us from using our algorithm in conjunction with tissues stored in such a preservative because inaccurate RHS would be generated. To address this concern, we calculated the ratio of the hybridization signal generated from the snap-frozen tissue to that generated from its RNAlater preservative-suspended counterpart for every gene in the breast tumor comparisons and in the colon tumor comparisons. We also calculated these ratios for the colon tumor replicate comparisons. The ratios were converted to a log2 scale and reported as the M value.29 As seen in Figure 2A, the replicate comparisons show ratios very close to unity (a value of 0 on the log2 scale). The median M value from the comparisons was essentially 0 (average of 0.002). These data are nearly identical to those presented in a recent study examining the reproducibility of the RNA amplification and hybridization processes.19
Lastly, the median M value in each of the colon tumor comparisons (Figure 2C) was essentially 0 (average of 0.02). We also repeated these comparisons for four matched colon tumor pairs (Figure 2C, comparisons 19 to 22). No significant differences were observed in the interquartile range in the M values of the repeats with the exception of comparison 2.
As a further test of how much additional variability the sample acquisition protocol introduces, we determined how many genes showed fold-changes greater than twofold. We found that, for the comparisons made from the colon tumor technical replicates, the comparisons made from the breast tumor-matched pairs, and the comparisons made from the colon tumor-matched pairs, a median of 5%, 6%, and 10%, respectively, of the genes showed fold-changes greater than twofold. Finally, we determined the overlap in differentially expressed genes when colon and breast tumors that were snap-frozen and the corresponding colon and breast tumors that were suspended in RNAlater preservative were used to generate lists of genes that were differentially expressed (between the two tissues). We performed this analysis for two pairs of colon and breast tumors that were snap-frozen and the corresponding two pairs of colon and breast tumors that were suspended in RNAlater preservative (total of four lists of differentially expressed genes). In the two comparisons, we found 70.1% and 70.4% of the genes that were twofold or more up- or down-regulated, respectively, in the tissues suspended in RNAlater preservative were also up- or down-regulated by twofold or more in the corresponding snap-frozen tissues. We conclude that hybridization signals and lists of differentially expressed genes generated from RNAlater preservative-suspended samples are in good agreement with those generated from their snap-frozen counterparts. Furthermore, multiple sample and target preparations and hybridizations from the same tissue generated similar concordance in hybridization signals and in the percentage of genes that were changed by more than twofold, demonstrating that the RNAlater preservative suspension does not introduce additional variation in expression levels.
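Both summary statistics in this paragraph are simple to compute once M values and gene lists are available. The following sketch is illustrative only; the function names are hypothetical, a change greater than twofold is expressed as |M| > 1 on the log2 scale, and the overlap is measured relative to the RNAlater-derived list, as in the text.

```python
def fraction_over_twofold(m_values, threshold=1.0):
    """Fraction of genes whose absolute log2 ratio exceeds the threshold.

    |M| > 1 corresponds to a fold change greater than twofold in either direction.
    """
    changed = sum(1 for m in m_values if abs(m) > threshold)
    return changed / len(m_values)


def overlap_fraction(reference_genes, other_genes):
    """Fraction of reference_genes that also appear in other_genes."""
    reference = set(reference_genes)
    return len(reference & set(other_genes)) / len(reference)
```

For the overlap analysis, `reference_genes` would hold the genes called differentially expressed in the RNAlater samples and `other_genes` the corresponding list from the snap-frozen samples, with up- and down-regulated lists handled separately.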
We next determined whether hierarchical clustering of the snap-frozen and RNAlater preservative-suspended samples would further underscore the correlation in expression profiles. We used only genes that showed at least two present calls throughout the entire data set (16,012 for breast and 15,484 for colon). The RNAlater preservative-suspended and snap-frozen samples did not form two clusters based on the sample handling protocol. As seen in Figure 3A, each pair of breast tumor samples (the snap-frozen sample and its RNAlater preservative-suspended counterpart) clusters together, with only a few exceptions (samples 5, 10, 13, and 17). As a further demonstration of the reproducibility of the sample preparation, target preparation, and hybridization, repetitions of samples were included in the clustering (using a smaller overall sample set), revealing that the repeats clustered adjacent to their counterparts (original) for tissues that were snap-frozen and suspended in RNAlater preservative (data not shown). The colon tumor pairs (Figure 3B) showed 13 (of 22) pairs of samples that clustered next to each other and also showed repeats that were located adjacent to their counterparts (original). We note, however, that nine pairs showed RNAlater preservative-suspended samples that clustered proximal, but not adjacent, to their matched, snap-frozen counterparts, demonstrating that differences within a matched pair of RNAlater preservative-suspended and snap-frozen colon tumor tissue can approach or exceed differences across pairs of samples. The dendrograms were created based on 1 − Pearson's correlation; therefore, the length of the branches represents the magnitude of the correlation (ie, the longer the length, the lower the correlation). The difference in length (ie, difference in correlation) between preservation methods for a single sample and between different tumors was small because the correlation was calculated based on global gene expression profiles of tens of thousands of genes. With that number of genes, the correlation between different tumor samples was 0.87 whereas the correlation between preservation methods for a single sample was 0.99. The difference of 0.12 appears to be small in plots that were based on a unit of 1 (ie, 10% difference in length).
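As a worked version of this arithmetic, assuming the branch length d is one minus the Pearson correlation r as described above, the two correlations quoted in the text translate into distances as follows:

```latex
d = 1 - r, \qquad
d_{\text{between tumors}} = 1 - 0.87 = 0.13, \qquad
d_{\text{within pair}} = 1 - 0.99 = 0.01, \qquad
\Delta d = 0.13 - 0.01 = 0.12
```

On an axis that runs from 0 to 1, a gap of 0.12 occupies roughly a tenth of the plot, which is why matched pairs and unrelated tumors can look nearly equidistant in the dendrogram even though their correlations differ.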
Prognostic Signatures Can Be Used with RNAlater Preservative-Suspended Tissues
The true test of the utility of RNAlater preservative is whether the prognostic algorithm and prediction of recurrence give the same answer in RNAlater preservative-suspended tissues compared to their snap-frozen counterparts. We therefore computed the RHS9, 10 for each of the breast and colon tumor samples. Because all of these samples were collected prospectively within the last year and recurrence may not be seen for several years,9, 10 we were not able to compare the accuracy of the prediction for the snap-frozen and RNAlater preservative-suspended tissues. However, we were able to compare the correlation in the RHS and in the prediction of recurrence generated from the two tissue types in each pair.
Table 1 shows the correlation for the 30 breast tumor pairs. We found a 97% correlation (29 of 30) in the prediction of recurrence between the RNAlater preservative-suspended and matched, snap-frozen tissues. For samples that were within 5 units of the cutoff, we performed three replicates for each sample type to eliminate variability in tissue sampling and assay reproducibility in the comparison. We then reported the mean value of the RHS in Table 1. Lastly, the Pearson correlation coefficient of the RHS was 0.90, underscoring the strong concordance in the quantitative output of the algorithm between the two sample types.
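The concordance figure is the fraction of pairs in which both members fall on the same side of the cutoff. The sketch below illustrates that calculation with invented scores; the function name is hypothetical, and treating a score equal to the cutoff as high risk is an assumption.

```python
def prediction_concordance(frozen_scores, rnalater_scores, cutoff):
    """Fraction of matched pairs assigned to the same risk group."""
    agree = sum((f >= cutoff) == (r >= cutoff)
                for f, r in zip(frozen_scores, rnalater_scores))
    return agree / len(frozen_scores)


# Invented scores: 30 pairs, one of which disagrees across the breast cutoff of 0.
frozen = [1.0] * 30
rnalater = [1.0] * 29 + [-1.0]
print(round(100 * prediction_concordance(frozen, rnalater, cutoff=0.0)))  # prints 97
```

With 29 of 30 pairs in agreement, the fraction is 0.967, which rounds to the 97% reported for the breast tumor pairs.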
Discussion
RNA Integrity and the Sample Handling Problem
There are three options for storage of biological specimens before obtaining RNA for microarray experiments. The first method, snap-freezing the tissue, is difficult due to the absence of liquid nitrogen tanks in the operating suite. Freezing the specimen at a later time point is not an option because several studies have reported the effects of ischemia on gene expression levels.31, 32 A second method is formalin fixation and paraffin embedding of the tissue. However, this method results in degradation of the RNA.33 Reverse transcriptase-polymerase chain reaction-based assays can generate data of sufficient quality using RNA isolated from formalin-fixed, paraffin-embedded tissues,34 but use of degraded RNA in microarray-based assays requires further protocol optimization.35, 36 A third method is the use of preservatives such as RNAlater preservative.23, 24, 25, 26, 27 We demonstrate, for the first time, the concordance in an RHS (the numerical output of a prognostic algorithm) and in the prediction of recurrence when using RNAlater preservative-suspended and snap-frozen tissues (Tables 1 and 2).
Gene Expression Profiles Generated from Snap-Frozen or RNAlater Preservative-Suspended Tissues Demonstrate Strong Correlations
We demonstrated that the ratios of hybridization signals produced by snap-frozen tissues to hybridization signals produced by RNAlater preservative-suspended tissues were within 1.4-fold (M value of 0.5) for the interquartile range of the data (Figure 2B and 2C). These data mirror data obtained from the same snap-frozen tissue (Figure 2A) and data obtained from other laboratories.19 These data demonstrate that the same level of concordance in the hybridization signals can be found when two different sample handling protocols (RNAlater preservative versus snap-freezing) are analyzed compared to when the overall process variability (starting from the same tissue) is assessed.
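The correspondence between an M value of 0.5 and a 1.4-fold ratio follows directly from the log2 definition. In the notation below, the intensity symbols are introduced here for illustration and are not taken from the original:

```latex
M = \log_2\!\left(\frac{I_{\text{snap-frozen}}}{I_{\text{RNAlater}}}\right), \qquad
|M| \le 0.5 \;\Longleftrightarrow\; 2^{-0.5} \le \frac{I_{\text{snap-frozen}}}{I_{\text{RNAlater}}} \le 2^{0.5}, \qquad
2^{0.5} \approx 1.41, \quad 2^{-0.5} \approx 0.71
```

An interquartile range within M = ±0.5 therefore means that the middle half of the genes differ by no more than about 1.4-fold in either direction between the two handling protocols.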
The similarity between snap-frozen tissues and their RNAlater preservative-suspended counterparts was underscored by the hierarchical clustering (Figure 3). This analysis demonstrated that RNAlater preservative-suspended and snap-frozen tissues from the same patient are, in the majority of cases, more highly correlated to each other than to tissues from other patients. Our findings that two pieces of a tumor from one patient should cluster together are consistent with those published by others. For example, profiling of breast tumors using cDNA arrays has demonstrated that gene expression patterns in two tumor samples from the same individual were almost always more similar to each other than either was to any other sample.37 Likewise, profiling of malignant melanoma demonstrated that two samples from the same patient have a greater similarity to each other than to any other samples in the set.38 The fact that these findings are identical, regardless of whether the pair is composed of two snap-frozen tissues from the same individual37, 38 or of a snap-frozen tissue and its RNAlater preservative-suspended counterpart (our study), generates confidence that the same conclusions that can be generated from snap-frozen tissue can also be generated from RNAlater preservative-suspended tissues.
Two Prognostic Algorithms Show Excellent Concordance in Tissues that Were Snap-Frozen or Suspended in RNAlater Preservative
We demonstrate the strong (96%) correlation in the prediction of recurrence when comparing tissues that were snap-frozen or suspended in RNAlater preservative (Tables 1 and 2). The sole discrepancy in the colon tumor predictions was a sample in which the RNAlater preservative-suspended and snap-frozen tissues were highly dissimilar, as judged by expression ratios and hierarchical clustering. Further studies will be necessary to compare the accuracy of the prognostic algorithm with both types of sample acquisition protocols. Nevertheless, this is the first demonstration that prognostic signatures can be used with RNAlater preservative-suspended tissues, representing an important step in bringing gene expression signatures to the clinic.
We note that, although the Pearson correlation coefficient for the RHS was higher for the breast cancer samples than for the colon cancer samples, the breast cancer samples, in some cases, showed a larger difference in the RHS (between the RNAlater preservative-suspended and snap-frozen tissues) than did the colon cancer samples. There could be several reasons for this finding. First, intratumor heterogeneity is a common problem in breast cancer,39, 40, 41 and it is possible that, although the RNAlater preservative-suspended and snap-frozen samples were derived from the same tissue, different clonal populations existed in the two pieces of tissue. Second, the breast cancer prognostic signature is composed of 76 genes whereas the colon cancer prognostic signature is composed of 23 genes, and it may be possible that subtle differences between the RNAlater preservative-suspended and snap-frozen samples are magnified when a larger number of genes are involved in the signature. Nevertheless, we have found a strong correlation in both prediction of recurrence and in RHS values for both the colon and breast tumors across samples that were snap-frozen or suspended in RNAlater preservative, demonstrating that RNAlater preservative-suspended tissues can be used in conjunction with prognostic algorithms. This finding will aid in bringing these genomic technologies to the clinical setting.
Acknowledgments
We thank Drs. John Backus and David Atkins for comments on the manuscript.
Footnotes
Address reprint requests to Abhijit Mazumder, 33 Technology Dr., P.O. Box 4920, Warren, NJ 07059. E-mail: amazumde{at}vrxus.jnj.com
Accepted for publication August 17, 2005.
References