











From the Molecular Profiling and Biomarker Discovery/Biomarker Laboratory* and Clinical Research and Development,** Wyeth Research, Collegeville, Pennsylvania; Molecular Profiling and Biomarker Discovery, Inflammation Research, Bioinformatics Client Services, Clinical Statistics,¶ and Translational Research, Wyeth Research, Cambridge, Massachusetts; and Biometrics Research,|| Wyeth Research, Pearl River, New York
| Abstract |
|---|
10% of patients remain classified as indeterminate inflammatory bowel disease even after invasive colonoscopy intended for diagnosis. A molecular diagnostic assay using a clinically accessible tissue would greatly assist in the classification of these diseases. In the present study we assessed transcriptional profiles in peripheral blood mononuclear cells from 42 healthy individuals, 59 CD patients, and 26 UC patients by hybridization to microarrays interrogating more than 22,000 sequences. Supervised analysis identified a set of 12 genes that distinguished UC and CD patient samples with high accuracy. The alterations in transcript levels observed by microarray were verified by real-time polymerase chain reaction. The results suggest that a peripheral blood mononuclear cell-based gene expression signature can provide a molecular biomarker that can complement the standard diagnosis of UC and CD.
| Introduction |
|---|
The ability to quantitate the global expression profiles at the level of RNA using oligonucleotide microarrays has recently been applied to investigate transcriptional signatures present in gastrointestinal tissue obtained from CD and UC patients.3, 4 These studies identified genes involved in inflammatory responses generally up-regulated in IBD and showed that the gastrointestinal tissue transcriptomes obtained from UC and CD patients were quite distinct, with gene sets identified that appear to distinguish UC tissue from CD tissue.
In contrast to biopsies, peripheral blood is a much more accessible tissue source of cells that might be used to distinguish between UC and CD. Circulating peripheral blood mononuclear cells (PBMCs) are responsible for the comprehensive surveillance of the body for signs of infection and disease. PBMCs may therefore serve as a surrogate tissue for evaluation of disease-induced gene expression as a biomarker of disease status or severity.5 Maas and colleagues6 identified PBMC profiles in patients with the autoimmune diseases rheumatoid arthritis, systemic lupus erythematosus, type I diabetes, and multiple sclerosis. We have shown7 that in the context of a nonautoimmune disease, PBMCs obtained from renal cell carcinoma patients also exhibit disease-associated transcriptomes distinct from those of healthy volunteers. Mannick and colleagues8 recently explored expression profiles of PBMCs from seven CD patients and five UC patients with a 2400-gene cDNA microarray and described several genes that appear differentially expressed between these diseases. In the present study, we used oligonucleotide arrays interrogating 22,000 sequences to investigate the transcriptional profiles of circulating PBMCs in a group of 42 healthy subjects and 85 IBD patients with clinical diagnoses of CD and UC. The results suggest that a molecular diagnosis of UC and CD by transcriptional profiling of PBMCs might be possible.
| Materials and Methods |
|---|
25 and/or a diarrhea rating of 25. Diagnosis of CD for at least 6 months was confirmed by radiological studies, endoscopy with histological examination, or surgical pathology; patients with a diagnosis of CD were included if the diagnosis was confirmed by a biopsy. UC patients had scores from the Physicians Global Assessment of the Mayo Ulcerative Colitis Scoring System ranging from mild to moderate (scores of 1 or 2). The diagnosis of left-sided UC was provided by endoscopy with biopsy, in addition to standard clinical criteria. Proportions of females to males were significantly different between the healthy and IBD populations, but not distinct between the two IBD populations. Neither race (Caucasian versus non-Caucasian) nor age differed significantly between healthy and IBD populations or between the two IBD populations. Investigation of concomitant medication usage between the two IBD populations indicated that neither 5-ASA nor any of the other less-frequently used drugs reported as concomitant medications confounded the comparisons in this study.
Blood Sampling and Processing
Blood samples (8 ml) were collected into Vacutainer cell purification tubes (Becton Dickinson, Franklin Lakes, NJ) at the clinical site and shipped overnight to a central processing lab for PBMC isolation according to the manufacturer's recommendations. All PBMCs analyzed in this study were processed within 24 hours after the blood draw. Before RNA purification, complete cell counts were performed on purified PBMCs using a Pentra 60 C+ hematology analyzer (ABX, Irvine, CA) to record absolute counts and percentages of neutrophils, lymphocytes, monocytes, eosinophils, and basophils. Cell counts for one PBMC sample from a UC patient were not performed, and this profile was therefore excluded from the analyses of covariance described below. Expression data from this patient were included when developing and testing prediction models. Total RNA was purified from PBMCs using the RNeasy mini column protocol (Qiagen, Valencia, CA).
Oligonucleotide Array Hybridization and Data Reduction
Total RNA (2 µg) was converted to biotinylated cRNA according to the Affymetrix protocol (Affymetrix, Santa Clara, CA). Labeled cRNA (10 µg) was fragmented and prepared for hybridization as previously described.7
Biotinylated cRNA was hybridized to the Affymetrix HG-U133A human GeneChip array as described in the Affymetrix technical manual. Eleven biotinylated control transcripts ranging in abundance from 1:300,000 (3 ppm) to 1:100 (100 ppm) were spiked into each sample before hybridization to function as a standard curve.9
GeneChip MAS 5.0 software was used to evaluate the specific hybridization intensity, compute a signal value for each probe set, and make an absent/present call. The signal value for each probe set was then converted to a frequency value representative of the number of transcripts present in 10^6 transcripts (parts per million, ppm) by reference to the standard curve.9
Each transcript was evaluated and included in the study following nonstringent criteria: called present and at or above a frequency value of 10 (10 ppm) in at least one of the samples (healthy, UC, or CD). Sequences (n = 7908) meeting these filtering criteria were used in the analysis.
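The nonstringent filter described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the data layout, the function names, and the reading that the present call and the 10 ppm threshold must be met in the same sample are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Each measurement is (frequency_ppm, call); "P" marks a present call.
# This layout is hypothetical and chosen only for the illustration.
Measurement = Tuple[float, str]


def passes_filter(measurements: List[Measurement], min_ppm: float = 10.0) -> bool:
    """Keep a probe set if any single sample is called present at >= min_ppm."""
    return any(call == "P" and freq >= min_ppm for freq, call in measurements)


def filter_probe_sets(data: Dict[str, List[Measurement]]) -> List[str]:
    """Return the IDs of probe sets that meet the nonstringent criterion."""
    return [probe for probe, values in data.items() if passes_filter(values)]


# Toy data: three probe sets measured in three samples.
toy = {
    "probe_a": [(4.0, "A"), (12.0, "P"), (3.0, "A")],  # present and >= 10 ppm in one sample
    "probe_b": [(25.0, "A"), (8.0, "P"), (9.0, "P")],  # never both conditions in one sample
    "probe_c": [(2.0, "A"), (1.0, "A"), (0.5, "A")],   # never called present
}
kept = filter_probe_sets(toy)
```

Applied to the full array, a filter of this kind is what reduces the more than 22,000 interrogated sequences to the 7908 used in the analysis.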
Analysis of Covariance
Analysis of covariance methods were used to adjust for differences in PBMC cell type composition when testing for differences in mean expression among disease groups. Separate analyses of covariance were run for each transcript, using log-transformed frequency as the response measure. The analysis of covariance model included terms for disease group, gender, neutrophil percent, monocyte percent, and eosinophil percent. In the analysis of covariance, a slope describing the linear relationship between the percentage of the cell type and the expression level for a particular gene was estimated for each cell type, and a t-test was done to determine whether the slope was significantly different from 0 (where a slope of 0 indicates that there is no linear relationship between cell type percent and expression level).
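For readers who want to see the shape of such a per-transcript model, the following is a minimal sketch in plain Python. It fits log frequency on disease-group indicators, gender, and three cell-type percentages by ordinary least squares and reports a t statistic for each coefficient. The solver, the column order, and the simulated data are assumptions made for the illustration; the software actually used for the analyses is not stated here.

```python
import math
import random
from typing import List, Sequence


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(x: Sequence[Sequence[float]], y: Sequence[float]):
    """Least-squares fit; returns coefficients and their t statistics."""
    n, p = len(x), len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(p)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(x, y)]
    mse = sum(e * e for e in resid) / (n - p)  # error term of the model
    t_stats = []
    for j in range(p):
        unit = [1.0 if i == j else 0.0 for i in range(p)]
        var_j = mse * solve(xtx, unit)[j]  # j-th diagonal of mse * (X'X)^-1
        t_stats.append(beta[j] / math.sqrt(var_j))
    return beta, t_stats


# Simulated data for one transcript (hypothetical values, fixed seed).
rng = random.Random(0)
rows, y = [], []
for i in range(60):
    group = ("healthy", "CD", "UC")[i % 3]
    female = float(rng.random() < 0.5)
    neut, mono, eos = rng.uniform(0, 10), rng.uniform(10, 30), rng.uniform(0, 5)
    # Columns: intercept, CD, UC, female, neutrophil %, monocyte %, eosinophil %
    rows.append([1.0, float(group == "CD"), float(group == "UC"), female, neut, mono, eos])
    # Simulated log frequency: CD effect of 1.0, monocyte slope of 0.05, small noise.
    y.append(3.0 + 1.0 * (group == "CD") + 0.05 * mono + rng.gauss(0, 0.05))

beta, t_stats = ols(rows, y)
```

With healthy subjects as the reference level, the t statistics for the CD and UC columns correspond to cell-type-adjusted pairwise comparisons against healthy, and the t statistics for the three percentage columns test whether each slope differs from 0.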
In addition to the overall tests for treatment group differences and cell type regression effects, pairwise comparisons of disease group means adjusted for differences in cell type percentages were performed using two-sided t-tests, with the denominator of the t-statistics derived from the analysis of covariance error term. Finally, because the relative distribution of females and males was also significantly distinct among the disease groups, we included gender in the analyses of covariance. No adjustments of the raw P values produced by the analyses described above were done to account for the large number of statistical tests performed. Instead, a fold change filter (1.5-fold) combined with a conservative significance level of α = 0.0001 was used to reduce the incidence of false-positive determinations.
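The combined filter is simple to express in code. The sketch below is illustrative only: the function name and the use of group means on the frequency (ppm) scale are assumptions. It also works through the arithmetic behind the 0.0001 threshold: with 7908 sequences tested and no true differences, roughly 0.8 false positives would be expected per pairwise comparison before the fold change filter is applied.

```python
def is_differential(mean_a: float, mean_b: float, p_value: float,
                    min_fold: float = 1.5, alpha: float = 0.0001) -> bool:
    """True if the groups differ by more than min_fold and P is below alpha."""
    fold = max(mean_a, mean_b) / min(mean_a, mean_b)
    return fold > min_fold and p_value < alpha


# Expected number of false positives per comparison if every null were true.
n_tests = 7908
expected_false_positives = n_tests * 0.0001

# Hypothetical examples (group means in ppm, unadjusted P values).
example_hit = is_differential(30.0, 15.0, 5e-5)        # 2-fold, P < 0.0001
example_small_fold = is_differential(30.0, 25.0, 5e-5)  # 1.2-fold only
example_weak_p = is_differential(30.0, 15.0, 1e-3)      # P too large
```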
Gene Selection and Supervised Class Prediction
Gene selection and supervised class prediction were performed using GeneCluster version 2.0 (http://www.broad.mit.edu/cancer/software/software.html).10
In these analyses only 4228 transcripts meeting a stringent data reduction filter (at least 50% present calls in Crohn's or UC samples and at least 50% of the Crohn's or UC samples with frequencies greater than 10 ppm) were used. Samples within each group were randomly selected for membership in a training set (75%) or a test set (25%) of profiles. Gene selection was performed using the training set of samples, and the classifier with the fewest genes that exhibited the highest overall accuracy of class assignment in the training set was identified by leave-one-out cross validation. The predictive classification model was then evaluated on samples in the test set, and the overall accuracy of class assignment for samples in the test set was reported.
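A within-group random split of this kind might look as follows. This is a sketch under stated assumptions: the sample IDs, the seed, and the rounding rule for the 75% cut are hypothetical, and the exact training and test set sizes used in the study are not given here.

```python
import random
from typing import Dict, List, Tuple


def stratified_split(labels: Dict[str, str], train_fraction: float = 0.75,
                     seed: int = 0) -> Tuple[List[str], List[str]]:
    """Randomly assign samples to training or test sets within each class."""
    rng = random.Random(seed)
    train: List[str] = []
    test: List[str] = []
    for cls in sorted(set(labels.values())):
        members = sorted(s for s, c in labels.items() if c == cls)
        rng.shuffle(members)
        cut = round(train_fraction * len(members))  # rounding rule is an assumption
        train += members[:cut]
        test += members[cut:]
    return train, test


# Group sizes from the study: 59 CD and 26 UC profiles (IDs are made up).
labels = {f"CD{i:02d}": "CD" for i in range(59)}
labels.update({f"UC{i:02d}": "UC" for i in range(26)})
train_ids, test_ids = stratified_split(labels)
```

Splitting within each group keeps the CD-to-UC ratio roughly the same in both sets, so that test accuracy is not distorted by a change in class balance.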
For gene selection all expression data in both the training set and test set were log-transformed before analysis. In the training set of data, models containing increasing numbers of features (transcript sequences) were built using a two-sided approach (equal numbers of features in each class) with a S2N similarity metric that used median values for the class estimate. PBMC profiles from CD patients and UC patients were compared using a binary approach. Predictive gene classifiers containing between 2 and 200 genes in steps of two were evaluated by leave-one-out cross validation to identify the smallest predictive model yielding the most accurate class assignments. Prediction of class membership was performed using a weighted voting algorithm.
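The gene ranking and voting steps can be illustrated with a short Python sketch. It assumes the signal-to-noise and weighted voting formulation commonly associated with this software family: S2N = (class 1 estimate − class 2 estimate) / (SD1 + SD2), with each selected gene voting by its S2N weight times the distance of the sample from the midpoint of the two class estimates. Medians are used as class estimates, as described above. The function names and toy values are hypothetical, and the details of the GeneCluster implementation may differ.

```python
from statistics import median, stdev
from typing import Dict, List, Tuple


def s2n(cd: List[float], uc: List[float]) -> float:
    """Signal-to-noise score with medians as class estimates."""
    return (median(cd) - median(uc)) / (stdev(cd) + stdev(uc))


def train(expr: Dict[str, Dict[str, List[float]]],
          n_per_class: int) -> Dict[str, Tuple[float, float]]:
    """Select n_per_class genes per direction; store (weight, midpoint) per gene."""
    scores = {g: s2n(v["CD"], v["UC"]) for g, v in expr.items()}
    ranked = sorted(scores, key=scores.get)
    chosen = ranked[-n_per_class:] + ranked[:n_per_class]  # CD-high, then UC-high
    return {g: (scores[g], (median(expr[g]["CD"]) + median(expr[g]["UC"])) / 2)
            for g in chosen}


def predict(model: Dict[str, Tuple[float, float]], sample: Dict[str, float]) -> str:
    """Weighted voting: a positive vote total favors CD, otherwise UC."""
    total = sum(w * (sample[g] - mid) for g, (w, mid) in model.items())
    return "CD" if total > 0 else "UC"


# Toy log-transformed training data (hypothetical values).
training = {
    "gene_hi_cd": {"CD": [5.0, 5.2, 4.9], "UC": [3.0, 3.1, 2.8]},
    "gene_hi_uc": {"CD": [2.0, 2.1, 1.9], "UC": [4.0, 4.2, 3.9]},
    "gene_flat": {"CD": [3.0, 3.5, 2.5], "UC": [3.1, 2.6, 3.4]},
}
model = train(training, n_per_class=1)
call_1 = predict(model, {"gene_hi_cd": 5.1, "gene_hi_uc": 2.0, "gene_flat": 3.0})
call_2 = predict(model, {"gene_hi_cd": 3.0, "gene_hi_uc": 4.1, "gene_flat: ".strip(": "): 3.0})
```

In a leave-one-out evaluation, `train` would be rerun with each training sample held out in turn and `predict` applied to the held-out sample, repeating this for each candidate classifier size.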
Ingenuity Pathway Analysis
The Ingenuity pathway analysis (IPA) tool (Ingenuity, Mountain View, CA) was used to annotate the disease-associated genes obtained from analyses of covariance. Annotations on canonical pathways and functional categories were retrieved for these gene lists from the Gene-By-Gene View and/or using the Search IPKB feature.
Real-Time Polymerase Chain Reaction (PCR) Confirmation of Microarray Results
A total of 45 ng of each PBMC RNA sample was reverse-transcribed in a 96-well plate in a 100-µl reaction using the High Capacity cDNA Archive kit (Applied Biosystems, San Diego, CA). The reaction was incubated at 25°C for 10 minutes and then 37°C for 2 hours and stored at −80°C until amplification. For relative quantitation of transcript levels, predesigned, gene-specific TaqMan probe and primer sets (TaqMan gene expression assays, Applied Biosystems) corresponding to the GenBank accession numbers for genes in the 12-gene classifier were used. Real-time PCR for each transcript of interest was performed in 96-well fast block optical reaction plates in a 25-µl reaction volume (containing 1x TaqMan Fast Universal Master Mix, 1x TaqMan gene expression assay, and 2.25 ng of cDNA) using an ABI 7900HT sequence detection system (Applied Biosystems, San Francisco, CA). Default 7900 fast block cycle conditions were as follows: 95°C for 20 seconds, followed by 40 cycles of 95°C for 1 second and 60°C for 20 seconds. Ct values for each amplification were recorded for each target gene and the housekeeping genes β2-microglobulin, β-actin, 18S, and GAPDH. The differences between cycle thresholds for target genes and each of the four reference genes in each of the samples were calculated (ΔCt), and the average fold change in expression between UC and CD was calculated by the following formula: average fold difference = 2 raised to the power of (ΔCtUC − ΔCtCD).
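As a worked illustration of this calculation, the sketch below computes ΔCt for each sample and then the average fold difference. It assumes that ΔCt is the target Ct minus the reference Ct, that the exponent is the difference between the mean ΔCt in UC and the mean ΔCt in CD, and that one reference gene is used at a time; the function names and Ct values are hypothetical.

```python
from statistics import mean
from typing import List


def delta_ct(ct_target: float, ct_reference: float) -> float:
    """Cycle threshold of the target gene relative to a reference gene."""
    return ct_target - ct_reference


def average_fold_difference(dct_uc: List[float], dct_cd: List[float]) -> float:
    """2 raised to the power of (mean delta-Ct in UC minus mean delta-Ct in CD)."""
    return 2.0 ** (mean(dct_uc) - mean(dct_cd))


# Hypothetical Ct values: (target, reference) pairs for three samples per group.
uc_pairs = [(26.0, 20.0), (26.4, 20.2), (25.6, 19.8)]
cd_pairs = [(25.0, 20.0), (25.2, 20.1), (24.8, 19.9)]

dct_uc = [delta_ct(t, r) for t, r in uc_pairs]
dct_cd = [delta_ct(t, r) for t, r in cd_pairs]
fold = average_fold_difference(dct_uc, dct_cd)
```

Because a higher ΔCt means a less abundant transcript, a value above 1 under these conventions indicates higher expression in CD than in UC, and a value below 1 indicates the reverse.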
| Results |
|---|
By analysis of covariance, the levels of 220 transcripts differed by more than 1.5-fold between Crohn's and healthy PBMCs with an unadjusted P value of less than 0.0001 in the pairwise comparison, and the levels of 120 transcripts were significantly different between UC and healthy PBMCs by the same criteria. Forty-five of these sequences were differentially expressed in both UC and CD, and these common transcripts changed in the same direction in both diseases relative to healthy levels (Table 3).
The canonical gene pathways bearing the greatest likelihood of significant overrepresentation are summarized for each comparison in Figure 1A. In this analysis transcripts involved in prostaglandin metabolism were significantly overrepresented in the CD gene signature, whereas transcripts encoding proteins involved in apoptosis and B-cell signaling appear overrepresented in the UC signature. Figure 1B summarizes the diverse functional categories encompassed by the transcripts differentially expressed in CD relative to healthy controls. Major functional categories up-regulated in CD PBMCs included enzymes involved in prostaglandin metabolism, transcription regulators, and transmembrane receptors including several integrin isoforms. Finally, Figure 1C summarizes the abundant overrepresentation of immunoglobulin constant regions that was unique to the UC PBMC expression signature.
Real-Time Reverse Transcriptase (RT)-PCR Confirmation of Microarray Observations
Despite the classifier's accuracy for nearest-neighbor-based class assignment in a test set of PBMC samples, the average fold changes of transcripts in the CD/UC classifiers were relatively low. We therefore performed real-time PCR to confirm the relative expression observed by Affymetrix microarray technology for CD and UC samples in this study. We used four separate housekeeping genes for normalization of the target genes (β2-microglobulin, β-actin, GAPDH, and 18S rRNA). All CD and UC RNA samples in the study were converted to cDNA using the same reverse-transcription cocktail and procedure. Comparisons of average fold changes calculated by microarray and by real-time PCR using β2-microglobulin are presented in Figure 3, and relative fold changes for all 12 genes using each of the four housekeeping genes as normalizers were highly concordant (Supplementary Table S3; http://jmd.amjpathol.org/). On the basis of these results, of the 12 transcripts originally identified as CD/UC discriminator genes, only the 28S rRNA fragment appears to have been significantly overestimated by microarray hybridization.
| Discussion |
|---|
The most highly expressed gene commonly elevated in both IBDs was the protease inhibitor SERPINB2 (also called PAI-2, plasminogen activator inhibitor, type II). Increased plasminogen activator levels have been reported in mucosal lesions of IBD patients,12 and increased PAI-1 was found in IBD patient plasma. Although distinct from PAI-1, PAI-2 shares specificity for u-PA and, to a lesser degree, t-PA, and elevated PAI-2 levels are reported in rheumatoid arthritis synovial fluid.13 These findings suggest that changes in components of the fibrinolytic and coagulation system may contribute to an increased risk for thromboembolic complications and possibly to the colitis and bleeding seen in IBD patients.14 A role for PAI-2 in IBD has not been reported, but our study suggests that elevated PAI-2 RNA levels in PBMCs are associated with disease.
Multiple functional classes of transcripts appear specifically up-regulated in PBMCs of CD patients, including prostaglandin-metabolizing enzymes, chemokines, and transcriptional regulators. The CD-specific gene profile exhibited a proinflammatory gene expression profile that was not apparent in the UC PBMC profile. Prostaglandin endoperoxide synthase 1 (PTGS1, cyclooxygenase 1) was significantly increased in PBMCs from CD patients, while prostaglandin D2 synthase (PTGDS) was decreased. These effects on the prostaglandin synthetic pathway would be expected to result in increased conversion of arachidonic acid into select prostaglandins. Although prostaglandin content is elevated in lesions of IBD patients,15 very recent evidence suggests that levels of at least one prostaglandin (PGE2) are actually decreased in mononuclear cells of patients with CD,16 and PGE2 is an important modulator of cytokine release from T lymphocytes derived from the gastrointestinal tract.17 Several chemokines (C-X-C ligands 4 and 7, platelet factor 4 variant 1) were up-regulated in CD. Overall there was surprisingly little overlap between transcripts identified as up-regulated in the present set of CD PBMCs and those reported as up-regulated in the seven CD patients analyzed by Mannick and colleagues.8 It is unknown whether this is attributable to the larger number of patients explored in the present study, the larger number of genes interrogated, differences in gene nomenclature, or some confounding factor between these studies. However, the most strongly up-regulated transcript in CD reported by Mannick and colleagues8 encoded a transforming growth factor (TGF)-β-inducible transcript. In this study TSC-22, a distinct TGF-β-inducible transcript, was also identified as up-regulated in CD PBMCs. These observations suggest that up-regulation of TGF-β signal transduction is evident in CD PBMCs. Constitutive elevation in this pathway could result in down-regulation of Smad-dependent pathways, which may inhibit the ability of TGF-β to terminate immune responses and in turn play a causal role in the pathogenesis of CD.8
It is possible that a portion of the Crohn's-associated disease signature may be platelet-derived. Recent evidence has demonstrated that platelets can participate in chronic intestinal inflammation,18 and platelets co-purified to a greater extent with the PBMCs isolated from CD patients in this study (data not shown). Thus, the detection of platelet factor 4 and platelet factor 4 variant 1 in the CD-associated signature could be attributable to elevated levels of co-purified platelets in isolated PBMCs. However, other transcripts among the top 10 nonmitochondrial transcripts reported in platelets19 do not appear in the present CD-associated list of transcripts, suggesting that these anucleate cells are not the sole source of these transcripts. All of the transcripts in the CD disease signature that have been previously associated with platelets are also expressed at significant levels in purified T cells, B cells, and/or monocytes (M.E. Burczynski et al, unpublished observation), which suggests that transcripts previously associated with platelets can originate from the mononuclear cells that were isolated and profiled in this study.
The UC-specific gene set was dominated by overexpression of immunoglobulin-encoding sequences, reminiscent of the active IgG plasma cell component observed in UC patients.20 This finding is consistent with studies on B-cell receptor gene usage that have demonstrated that infiltrating lymphocytes in UC mucosa are of peripheral rather than mucosal origin.21, 22 IgG1 and IgG4 antibodies predominate in UC, whereas IgG2 antibodies are increased in CD.23 The prevalence of the IgG1 type has recently been explored and shown to be specific to UC and lead to greater opsonization of mucosal bacteria and a feed-forward maintenance of the polymorphonuclear leukocyte respiratory burst in UC.24 One of the transcripts most significantly elevated in UC PBMCs in this study was annotated as immunoglobulin heavy constant gamma 3 (IgHG3). The region encompassed by this IgHG3 qualifier on the Affymetrix chip actually maps (ie, shares 100% nucleotide identity by BLAST) to several sequences ascribed to immunoglobulin heavy constant gamma 1 (G1m marker), and has been identified as a marker of inflamed UC gastrointestinal epithelium.3, 4 These results are consistent with the previous observation that IgG1 levels in serum are significantly increased in UC patients relative to serum levels of IgG1 in patients with CD.25
A significant subset of patients with IBD cannot be classified by current procedures and constitute cases of indeterminate IBD.26, 27, 28 Therefore, one of the main goals of the present study was to determine whether PBMC profiles in patients with UC and CD were sufficiently distinct to enable classification of these diseases. Results of class prediction analysis indicate that a gene signature in PBMCs can accurately discriminate UC and CD samples. Transcriptional differences are not attributable to cellular composition because cellular compositions of PBMCs from patients appear quite similar.
The disease-specific patterns, if prospectively validated in a larger population, may provide the basis for a molecular diagnosis of UC and CD and contribute to the diagnosis of patients classified as indeterminate IBD. It is quite possible that the proposed Th1 and Th2 natures of CD and UC, respectively, are mainly responsible for the differences in this study, and that other Th1- and Th2-based inflammatory diseases may bear similar signatures to those identified for CD and UC. Nonetheless the PBMC profile identified in this study appears to have clinical utility in IBD, because the gene classifier enables discrimination between these closely related disorders that are sometimes indistinguishable.
This study indicates that transcriptional profiles in the circulating monocytes, T cells, and B cells may serve as a sensitive monitor of the organism's physiological state in the context of IBD. As these cells traverse various tissues, a component of the cellular reaction to the microenvironment is a transcriptional response that can be quantitated through profiling. Expression patterns may reflect disease mechanisms that are primary or secondary responses to disease pathophysiology. PBMCs, due to their transit through the body, may serve as an accessible surrogate monitor of tissues and systems that are not easily surveyed by common medical practices. A key challenge of expression profiling studies conducted in PBMCs will be to extend pharmacogenomic discoveries to clinical application through the development of assays incorporating gene expression for diagnostic purposes.
| Footnotes |
|---|
Supplemental material for this article can be found on http://jmd.amjpathol.org/.
Current address of R.L.P.: Expression Profiling Department, Novartis Institute of Biomedical Research, Cambridge, MA.
Accepted for publication August 16, 2005.