|
|||||
|
|
||||||
Originally published as JCO Early Release 10.1200/JCO.2007.13.8545 on June 30 2008 © 2008 American Society of Clinical Oncology. Prediction of Survival in Multiple Myeloma Based on Gene Expression Profiles Reveals Cell Cycle and Chromosomal Instability Signatures in High-Risk Patients and Hyperdiploid Signatures in Low-Risk Patients: A Study of the Intergroupe Francophone du Myélome
From L'Institut National de la Santé et de la Recherche Médicale, U892, University of Nantes; University, Hospital, Hematology Laboratory, Nantes; Centre de Lutte contre le Cancer Nantes Atlantique, Nantes-Saint Herblain; and University, Hospital, Hematology Department, Toulouse, France Corresponding author: Stéphane Minvielle, MD, INSERM U892 Institute of Biology, 9 Quai Moncousu, Nantes 44093, France; e-mail: sminvielle{at}chu-nantes.fr
Purpose Survival of patients with multiple myeloma is highly heterogeneous, from periods of a few weeks to more than 10 years. We used gene expression profiles of myeloma cells obtained at diagnosis to identify broadly applicable prognostic markers. Patients and Methods In a training set of 182 patients, we used supervised methods to identify individual genes associated with length of survival. A survival model was built from these genes. The validity of our model was assessed in our test set of 68 patients and in three independent cohorts comprising 853 patients with multiple myeloma. Results The 15 strongest genes associated with the length of survival were used to calculate a risk score and to stratify patients into low-risk and high-risk groups. The survival-predictor score was significantly associated with survival in both the training and test sets and in the external validation cohorts. The Kaplan-Meier estimates of rates of survival at 3 years were 90.5% (95% CI, 85.6% to 95.3%) and 47.4% (95% CI, 33.5% to 60.1%), respectively, in our patients having a low risk or high risk independently of traditional prognostic factors. High-risk patients constituted a homogeneous biologic entity characterized by the overexpression of genes involved in cell cycle progression and its surveillance, whereas low-risk patients were heterogeneous and displayed hyperdiploid signatures. Conclusion Gene expression–based survival prediction and molecular features associated with high-risk patients may be useful for developing prognostic markers and may provide basis to treat these patients with new targeted antimitotics.
Despite recent progress in the management of patients, multiple myeloma (MM) remains an incurable disease, with a 5-year survival rate not exceeding 40%. However, this uniform evolution hides wide clinical and pathophysiologic heterogeneities. Patients experience survival periods of a few months to more than 10 years. Identifying risk groups with a high predictive power could contribute to select patients for personalized treatment. For 25 years, serum β2-microglobulin (Sβ2M) was considered as the single most powerful prognostic factor.1 Recently, a multivariate meta-analysis of clinical and laboratory data obtained from 10,750 previously untreated patients with myeloma validated Sβ2M as a major prognostic factor to define poor risk and identified serum albumin as the second dominant prognostic factor. Both parameters were combined in a novel staging system referred to as the International Staging System (ISS).2 However, the ISS was not powerful enough to recognize the highest-risk patients.2 The assessment of chromosomal abnormalities, such as t(4;14) or del(17p), has been shown to be useful in refining identification of these patients.3 Although powerful, genetic abnormalities targeted small subsets of patients and do not account for all the heterogeneity in the clinical outcome. Molecular signatures capturing gene expression data recently emerged as powerful tools to identify patients at high risk of death, as well as to define biologic entities associated with risk groups.4 Toward identification of low-risk and high-risk patients with MM and better understanding of their malignant plasma cell biology, we analyzed the gene expression profiles of 250 newly diagnosed patients with MM treated with high-dose therapy. We developed an expression-based multivariate model that predicts disease outcome. We found that myeloma cells from high-risk patients overexpressed genes involved in mitosis and its surveillance, whereas the profile of myeloma cells from very low-risk patients was more heterogeneous and included hyperdiploid MM gene signatures.
Patients Two hundred fifty patients with myeloma at diagnosis with at least 500,000 available bone marrow CD138+ plasma cells were identified from the 983 patients analyzed for del(13) by interphase fluorescence in situ hybridization (FISH) in the hematology department at University Hospital in Nantes, France, between April 2000 and October 2003. The absence of bias between genomic and nongenomic subsets was verified (Appendix Table A1 and Fig A1, online only). Approval for this study was obtained from medical ethical committees and institutional review boards of the University Hospitals of Nantes, Toulouse, and Grenoble. Informed consent was provided according to the Declaration of Helsinki. All patients received high-dose chemotherapy with stem-cell transplantation according to the IFM 99 clinical trials, conducted by the Intergroupe Francophone du Myélome (IFM). Fifty-five patients died because of their disease during follow-up. Overall survival (OS) was calculated from the diagnosis.
Sample Collection, Plasma Cell Purification, and Total RNA Extraction and Purification
cDNA Microarray Technology
Data Availability
Interphase Cytogenetics Analysis
Statistical Analysis Training-test mode was chosen for internal validation in a 3/4-1/4 manner to obtain sufficiently large groups. Training (182 patients) and test (68 patients) sets were stratified according to death and known confounders. Absence of significant difference between training and test sets for baseline characteristics was verified before gene determination to avoid confounding, which could occur if the main prognostic factors were not equally distributed among the two sets (Table 1).
Raw intensities of the microarrays data were transformed into log intensities before proceeding with statistical analysis. Univariate Cox analyses were conducted on the 7,508 gene probes that passed the normalization procedure. Development of the gene expression–based prognostic model is described in the Appendix. Univariate and multivariate Cox regression analyses were performed for each bioclinical variable and the IFM 15-gene model in the original data set. Bootstrap and permutation techniques were used to assess the statistical significance obtained with the original data sets. Details of this procedure are presented in the Appendix.
External Validation Data Sets
Gene Set Enrichment Analysis
Gene Expression–Based Model for Predicting Survival in MM Gene expression profiles were generated for 250 newly diagnosed patients enrolled in the IFM 99 trials (Table 1) using single-channel cDNA arrays. Univariate Cox analysis was performed on the 7,508 gene probes to identify genes associated with survival in the training set of 182 patients. The 15 most stable genes were used to construct a survival model (Appendix Table A2, online only). Principal component analysis was performed on the 15-gene list. The first principal component was computed and used as an expression risk score for each patient. The cases were ranked according to their score: quartiles 1 and 2 defined very low risk, quartiles 1, 2, and 3 defined low risk, and finally quartile 4 delineated high-risk patients. The IFM 15-gene model was highly predictive of survival in IFM data set (Figs 1A through 1C). The rates of survival at 3 years in low- and high-risk groups in our whole cohort were 90.5% (95% CI, 85.6% to 95.3%) and 47.4% (95% CI, 33.5% 60.1%), respectively. The 15-gene high-risk signature has been validated in three independent data sets comprising 853 newly diagnosed and relapsed myeloma patients (Figs 1D and 1F and Appendix). When competed with the UAMS 17-gene prognostic model,11 the IFM model does not remain a significant independent variable in the UAMS TT2 data sets and does not identify the early disease-related deaths in the UAMS TT3 data set (Table 2 and Appendix Table A3, online only)11; however, our 15-gene model remains a significant independent variable in the unrelated (non-IFM and non-UAMS) data sets examined (Table 2 and Appendix).12-14 To cross-validate the results, the UAMS 17-gene model was applied to the IFM data set. Despite differences in microarray platforms (Affymetrix v academic cDNA arrays) and in clinical trials (single center v multiple centers), univariate analysis revealed that the UAMS model successfully identifies high risk in patients with myeloma enrolled in IFM trials (P = .001; hazard ratio, 2.43; Appendix), and when adjusted to the IFM model, the UAMS signature is near statistical significance (P = .075).
Comparison of Our Gene Expression–Based Survival Predictor and the Most Important Prognostic Variables Univariate and multivariate Cox proportional hazards analyses were performed on the entire cohort to determine the relative prognostic values of known prognostic variables and our model for OS determination (Table 3). Other than the 15-gene risk categories, Sβ2M 5.5 mg/L, platelets less than 130 109/L, del(13), and t(4;14) were correlated with survival in univariate analysis.
In the multivariate analysis, estimation of hazard ratio for death was 6.06 (P < .001) in the high-risk group, indicating that the 15-gene model was by far the most powerful independent prognostic factor. Not surprisingly, the other variables retained in the analysis were Sβ2M 5.5 mg/L and t(4;14). Actually, Sβ2M value was recently confirmed as the strongest clinical prognostic variable that delineated high-risk group (ISS 3) by the ISS system2 and was also identified, with t(4;14), as an independent powerful prognostic factor in a survival model built with a large series of patients enrolled in the IFM clinical trials.3
Kaplan and Meier analysis of the 250 patients with myeloma according to 15-gene risk score showed that Sβ2M
Correlation Between Extreme-Risk Entities and FISH Abnormalities The high-risk group (15-gene model quartile 4) was significantly associated with deletion of 13q, deletion of 17p, gain of 1q, and translocation t(4;14), whereas the very low-risk group (15-gene model quartiles 1 and 2) was significantly correlated with hyperdiploid status determined by FISH3 (Appendix Table A5, online only).
Biology Associated With Extreme-Risk Entities The algorithm identified a large number of significant gene sets elevated in high-risk patients. We first selected the top 15 showing a normalized enrichment score of greater than 2, then we extracted the core genes (leading-edge subset) that contributed the most to the enrichment score in each gene set. We selected the genes shared by at least three leading-edge subsets to define a high-risk enriched signature (Fig 3).
The list of the genes contains a large array of cell-cycle regulated genes16 that are involved in the basic and essential cell cycle processes, such as cell-cycle control, DNA replication, DNA repair, DNA packaging, mitosis, and spindle-assembly checkpoint (SAC) (Appendix Table A7, online only). A striking feature of this signature was the overexpression of key regulators that normally maintain faithful segregation of chromosomes,17 such as MAD2 and BUBR1, two of the four key members of the mitotic checkpoint complex18,19; ZWINT, a linker between mitotic checkpoint signaling and the structural kinetochore20; survivin, a member of chromosome passenger complex required for chromatin-induced microtubule stabilization and spindle assembly21; UBCH10, an ubiquitin-conjugating enzyme involved in the termination of SAC signal22; PRC1, a microtubule-associated protein that targets PLK1 to the central spindle at the metaphase-anaphase transition23; PARP1, a multifunctional enzyme that catalyzes the formation of poly(ADP-ribose) polymers on acceptor proteins (including BUB3, a component of the mitotic checkpoint complex,24 and Aurora-B, a kinase of the chromosome passenger complex)25; STMN1/OP18, a microtubule-destabilizing protein inhibited by Aurora-B during the process of spindle assembly26; and SMC4, a member of the condensing complex that plays a critical role in mitotic chromosome segregation.27 These results suggest that malfunction of the mitotic network activity in MM cells from aggressive disease could maintain a moderate rate of chromosomal instability (CIN) and aneuploidy,28 a hallmark of MM.29 To test this hypothesis, we compared the high-risk enriched signature with a 70-gene signature of CIN associated with poor clinical outcome in six cancer types, including lymphoma, lung adenocarcinoma, mesothelioma, medulloblastoma, glioma, and breast cancer.30 There was significant overlap of 11 genes, including PRC1, MAD2L1 (which encodes MAD2), and RFC4, with known functional association with CIN (Appendix Table A7). In addition, CIN70 signature was a significant predictor of OS in our data set (P = .03) but lost its significance when adjusted to the 15-gene model (P = .71). These results support our hypothesis that CIN was associated with adverse prognosis in MM but was less powerful than the 15-gene signature for risk stratification. Because many of the high-risk enriched genes are cell-cycle regulated, we first compared functional categories represented in this signature with those found in previously described proliferation index (PI) signatures known to be associated with poor prognosis in various type of tumors,31 including MM.32 We found an overlap of 12 genes, with a vast majority of them (eight of 12) involved in DNA replication and maintenance, whereas only two of 12 are involved in the mitotic process and its control (Appendix Table A7). Then we evaluated the prognostic impact of PI signatures adjusted or not to the 15-gene signature in our cohort and in UAMS cohort. PI signatures were generally significant predictors of survival, but in both cohorts evaluated, the 15-gene signature remained significant in a Cox model given the PI signatures (Appendix and Appendix Table A8, online only). These data showed that our 15-gene model is not simply a surrogate measure of tumor cell proliferation index. In very low-risk patients, the Gene Set Enrichment Analysis algorithm revealed only three significant gene sets, thus reflecting their complex biology (Appendix Table A6). Interestingly, two of the three gene sets are related to hyperdiploidy in MM. The first set included 20 enriched genes of the top 50 overexpressed genes that categorized the hyperdiploid group among MM patients.32 Although the third set identified eight genes (including EEF2, EIF3S7, EIF4B, RPL5, RPL6, RPL17, RPL22, and RPS9) involved in protein biosynthesis, such signature was recently reported as a common feature among hyperdiploid MM patients.12 Our results suggest that protein biosynthesis gene expression signature predominates in patients with good prognosis.
Despite providing valuable prognostic information, ISS and cytogenetic abnormalities have a limited ability to predict individual patient outcomes. Using gene expression profiling in a top-down supervised approach, we developed a 15-gene model that accurately predicts survival in newly diagnosed patients with MM treated with high-dose therapy. High-risk patients have a 6.8-fold hazard ratio (95% CI, 3.92 to 11.73; P < .001) of death compared with low-risk patients.
The IFM 15-gene model improves ISS prognostication. High-risk patients according to ISS staging (representing 22.5% of the cases) had a median OS of 45 months, whereas high-risk patients according to the 15-gene model had a median OS of 31 months. The 15-gene model is also more discriminating than the FISH model3 for stratifying MM patients according to survival. Very low-risk patients (approximately 50% of the cases) defined either by our model or by the absence of t(4;14) and del(17p) with low Sβ2M level had a rate of survival at 3 years of 95% and 88%, respectively. Similarly, high-risk patients defined either by our model (25% of the cases) or defined by a 17p(
IFM 15-gene and UAMS 17-gene models, when applied to their respective data sets, are extremely powerful prognostic factors, with hazard ratio superior to 5 in high-risk patients. Despite difficulties in performing cross-validation because of our academic cDNA arrays that cover half of the genome and that generate data sets not readily available, we found that both gene signatures remain powerful prognostic factors with hazard ratio greater than 2; when compared with each other, they do not remain independent (IFM model, P = .26) or near independent (UAMS model, P = .07). Finally, when applied to independent data sets, both the 15-gene and the 17-gene models seem to be equivalent and independent in newly diagnosed (P = .02) and relapsed myeloma (P In addition to its strong prognostic interest, our model defines two subpopulations with extreme risks and thereby provides the opportunity to better understand the molecular mechanisms underlying resistance to treatment. The very low-risk group is a heterogeneous entity enriched in two hyperdiploid signatures11,12 that contain more than 50% of hyperdiploid patients (defined by three-probe FISH analysis). In contrast with the study of Chng et al,12 the protein biosynthesis-enriched signature is apparently not restricted to the predicted hyperdiploid low-risk patients; we are currently analyzing genome-wide genetic alterations in low-risk patients by using single nucleotide polymorphism arrays to clarify their genetic status (unpublished data). Patients with high-risk MM are characterized by the overexpression of genes involved in multiple phases of the entire cell cycle that is a common feature in cancer cells.31 The novelty of our findings is the specific enrichment of mitotic-phase members in patients with aggressive disease. Overexpression of these genes was described in various human cancers and was correlated with poor prognosis. The putative impact of their overexpression in plasma cell tumorigenesis is supported by a recent study reporting that MAD2 overexpression in mice leads to CIN (characterized by numerical and structural abnormalities) and tumorigenesis.33 However, because overexpressed genes of the high-risk enriched signature are implicated in cellular processes other than mitotic phase, such as apoptosis, DNA replication, DNA repair, transcription regulation, and cellular transport, one cannot exclude that these mechanisms could also contribute to tumor initiation and progression in MM. Our hypothesis that the SAC activity network is perturbed in plasma cells of high-risk patients, thereby maintaining mild CIN that will facilitate tumorigenesis and drug resistance, leads to a targeted therapeutic model in which SAC inactivation might be an efficient way to provoke plasma cell death. This hypothesis is reinforced by a recent study showing that an Aurora kinase inhibitor (VX-680) induced apoptosis of plasma cells from aggressive disease.34 In conclusion, this work provides a powerful tool for identifying high risk in patients with newly diagnosed and relapsed myeloma, as well as subgroups that do not benefit from new agent such as bortezomib, and a rational basis for treating these patients with targeted antimitotics.
The author(s) indicated no potential conflicts of interest.
Conception and design: Olivier Decaux, Florence Magrangeas, Pascal Jézéquel, Philippe Moreau, Régis Bataille, Loïc Campion, Hervé Avet-Loiseau, Stéphane Minvielle Provision of study materials or patients: Michel Attal, Jean-Luc Harousseau, Philippe Moreau, Hervé Avet-Loiseau Collection and assembly of data: Olivier Decaux, Laurence Lodé, Florence Magrangeas, Wilfried Gouraud, Stéphane Minvielle Data analysis and interpretation: Olivier Decaux, Laurence Lodé, Florence Magrangeas, Catherine Charbonnel, Régis Bataille, Loïc Campion, Hervé Avet-Loiseau, Stéphane Minvielle Manuscript writing: Loïc Campion, Hervé Avet-Loiseau, Stéphane Minvielle Final approval of manuscript: Olivier Decaux, Loïc Campion, Hervé Avet-Loiseau, Stéphane Minvielle
cDNA Microarray Technology Clone library. One-channel DNA Unité Mixte de Génomique du Cancer (UMGC) microarrays were constructed from 17,134 Expressed Sequence Tag cDNA clones representing 11,250 unique genes (based on Homo sapiens: UniGene Build #196, issued in October 2006). UMGC array represented a second generation of academic array that combined (1) the first generation containing approximately 5,400 EST clones (IMAGE cDNA clone collection) representing unique genes with known or suspected roles in processes important in immunology or cancer (Bertucci F, Nasser V, Granjeaud S, et al: Hum Mol Genet 11:863-872, 2002; Magrangeas et al6; Bertucci F, Finetti P, Rougemont J, et al: Cancer Res 264:8558-8565, 2004), (2) approximately 500 clones from IMAGE cDNA clone collection corresponding to transcripts that distinguished germinal center B-like diffuse large B-cell lymphoma from activated B-like diffuse large B-cell lymphoma (Alizadeh A, Eisen B, Davis B, et al: Nature 403:503-511, 2000; Rosenwald et al,4) and (3) approximately 16,000 nonredundant clones from Human UniGene Set –RZPD 3 in order to get a maximum coverage of the genome. Clones were purchased from RZPD/ImaGenes (http://www.rzpd.de); they have been selected from the IMAGE clone collection on very stringent criteria (see "Quality Control") and are bioinformatically linked to the representative oligonucleotides present on Affymetrix GeneChips. In addition, the link between this Set and the GenomMatrix Database (http://www.genome-matrix.org) that is based on Ensembl annotation provides a comprehensive collection of data available for the respective genes, and it is continuously updated. Spot size. The consistency of spot size was controlled in two previous technical studies from our previous laboratory: In the first study, Granjeaud S et al (Granjeaud S, Nguyen C, Rocha D, et al: Genet Anal 12:151-162 1996, 1996; Fig 10) showed that spot size distribution is narrow, although the range of intensities spans two orders of magnitude. In the second study, Bertucci et al9 (Fig 1) again showed that spot surface does not vary significantly over a wide range of hybridization intensity. Reproducibility. Obtaining valid and robust quantitative data from large-scale expression studies requires attention to a number of parameters. Routine quality control procedures were performed at various key checkpoints. RNA sample quality control: The integrity of RNA preparations were assessed with 2100 Bioanalyzer (Agilent Technologies Palo Alto, CA) using Agilent 2100 Expert software. The RNA Integrity Number algorithm calculated the RNA integrity for each sample. The average RNA Integrity Number was 9.1 (range, 6.9 to 10). cDNA clone quality control: For each clone: Vitality checked Cloning vector (pT7T3D or Lafmid BA with a modified polylinker) Host bacteria (Escherichia coli DH10B) Sequence-verified by RZPD or by Millegen Inc. (http://www.millegen.com) Clone position close to the 3' end Absence of detectable repeat sequence Clone length between 500 bp and 1.2 kbp Polymerase chain reaction (PCR) amplification cDNA clones in 96-well microtiter plates with upstream and downstream universal primers. Clones that did not show one band or with an unexpected size on the agarose gel electrophoresis were rejected. Array validation: Array quality is verified by hybridizing all membranes with an oligonucleotide (vector probe) corresponding to a polylinker sequence present in all of the PCR products spotted. Array image inspection: all vector probe images were visual inspected to verify absence of artifacts. Arrays with high/low intensity spots, scratches, high regional, or overall background were excluded. Quantification of the vector probe hybridization signal provided also a value corresponding to the amount of DNA fixed in each spot of the microarray. These data were used to correct the intensity values obtained with a complex probe (see "Normalization Procedure"). Target labeling: Bioanalyzer electropherogram after cRNA synthesis was used to estimate quantity and to display the nucleotides size distribution. The average size was approximately 1000 nt. Samples with nucleotides size inferior to 800 nt were excluded. Basic data analysis after sample hybridization: Visual array inspection Negative controls (The pT7T3D or Lafmid BA vector, poly(A) sequences and PCR reactions without template) Poly(A)RNA control (Arabidopsis thaliana cytochrome c554 clone that is devoid of similarities to human DNA sequences is used to monitor the success of the entire labeling process, independent from the starting material quality). Poly(A)RNA control should be positive. UMGC quality control (Appendix Table A9) Reproducibility: The reproducibility of complex probe hybridizations was verified in comparing the signal intensities for two arrays produced in different batches, hybridized with complex probes prepared with RNA samples from the same patient. The signal intensities for all spots on one array were plotted against those from the other array; four duplicates are shown in Figures A2A through A2D. Genome coverage. Our academic array had genome coverage of 43% and probes are equally distributed across the genome without any gap at the cytogenetic level (Table A10).
Spotting, probe synthesis, and hybridization.
The cDNA clones were amplified in 96-well microtiter plates with universal primers. PCR products were spotted onto two Hybond N+ filters (GE Healthcare Life Science; Chalfont St Giles, United Kingdom, and Uppsala, Sweden) using Microgrid II Biorobotics (Genomic Solutions, Huntingdon, United Kingdom). Probe synthesis and hybridization were conducted as follows: between 0.4 and 1 µg of total RNA was used as template to generate cDNA bearing T7 promoter, then antisense RNA (aRNA) was generated by in vitro transcription using MEGAscript technology according to the Ambion protocol (Ambion Inc, Austin, TX). An aliquot of 2 µg of aRNA was then primed with random hexaprimers and reverse transcribed with a mix of cold dNTPs and [ Normalization procedure. Normalization procedure is summarized in the Table A11. For each membrane, the data were normalized to retain for subsequent analysis of only well-measured spots. Statistical analysis: Formulation of a robust prognostic model. Univariate Cox analyses were conducted on the 7,508 gene probes that passed the normalization procedure (see above). As usual in microarrays data analysis, great stringency (P value < .001) was needed to establish criteria for gene selection, giving a 50-gene list. Then, to maximize reduction of overfitting, resampling (n = 1,000; 80% to 20%) and permutation (n = 1,000) were used in the training set. This confirmed P values less than .005 for 28 genes from the 50-gene list. Finally, to verify stability of this 28-gene list, we determined a survival predictor (high-risk v low-risk) by means of BRB-Arrays Tool for each of 100 random training/test sets. The 100 predictors gene lists were intersected and the 15 genes found in at least 50% of the predictors (mean, 75%; range, 56% to 97%) were kept. All these genes had individual false discovery rate less than 1.5% (mean, 0.9%; range, 0.0001% to 1.4%). Principal component analysis was performed to summarize with minimal loss the 15-gene list information. The first principal component was used to build the Intergroupe Francophone du Myélome (IFM) 15-gene model by the following equation: risk score = (UMGC_01969cr x 0.27578783) + (UMGC_08943cr x 0.26987655) + (UMGC_05764cr x 0.29530369) + (UMGC_01066cr x 0.31490195) – (UMGC_02996cr x 0.13137903) + (UMGC_07324cr x 0.17772804) + (UMGC_06566cr x 0.38697337) + (UMGC_06118cr x 0.30371178) + (UMGC_00460cr x 0.25043791) – (UMGC_17217cr x 0.29483393) + (UMGC_10992cr x 0.19243758) – (UMGC_11702cr x 0.2491429) – (UMGC_09916cr x 0.17822457) + (UMGC_11580cr x 0.21255699) + (UMGC_11582cr x 0.21956366). The "cr" represents the standardized value of each particular probe. This equation was used to calculate a risk score for each patient in the training group. Patients were ranked according to this score and divided into quartiles. The same procedure was used to calculate a risk score for each patient in the test group. These patients were then classified into risk groups according to cutoff values calculated from the training group only. Univariate and multivariate Cox regression analyses were performed for each bioclinical variable and the IFM 15-gene model in the original data set. To avoid overfitting, bootstrap and permutation techniques were used to assess the statistical significance obtained with the original data sets. First, each bioclinical parameter was tested on the 1,000 bootstrap samples randomly created and the number of bootstrap P values lower than .05 was counted. This process was then also applied to 1,000 permuted samples randomly created. Only parameters with at least 500 bootstrapped P values less than .05 and with permutation P value less than .05 were retained in multivariate analysis.
Validation of Our 15-Gene High-Risk Signature in Independent Data Sets Adjusted for the University of Arkansas School of Medical Sciences 17-Gene Model Univariate analysis of IFM risk model applied to TT2 and TT3 UAMS datasets showed that the high-risk group defined by our 15-gene model was significantly associated with inferior survival in TT2 and TT2 + TT3 groups but does not identify high-risk patients in TT3 group alone (Table A12), and multivariate analysis showed that our 15-gene model is not independent vis-à-vis 17-gene UAMS model (Table A13 and Table 2). Why the predictive ability of our model is different in TT2 and TT3 cohorts? Two differences could explain this observation: (1) the first is the treatment: patients enrolled in TT3 were treated with bortezomib, whereas patients in TT2 did not; (2) the second is more technical: the median follow-up period available in the data set is 40 months in TT2 cohort and 13 months in TT3 cohort. Given that our model was built to identify genes for which expression is linked to death during the follow-up period of the IFM cohort (median follow-up, 35 months) rather than selecting genes linked to early death (as Arkansas group did), in validation studies, our model is more powerful in data sets with median follow-up period of 3 years or more, as is the case for TT2 and Mayo Clinic cohorts. Presently, it is difficult to know whether lack of identification of early disease-related deaths is because our high-risk signature is overcome by bortezomib addition or due to the design of our predictive model or both. Reanalysis of the updated TT3 data set will provide a definitive answer to this important question. Mayo Clinic data set: In the Mayo Clinic data set (Chng WJ, Kumar S, Vanwier S, et al: Cancer Res 67:2982-2989, 2007; Chng WJ, Kuehl WM, Bergsagel PL, et al: Leukemia 2007 [Epub ahead of print]), available in Gene Expression Omnibus, data GEO accession number GSE6477, from 71 patients with newly diagnosed multiple myeloma treated with high-dose melphalan and stem-cell transplantation, relevant biologic and clinical information was available for 57 patients. Data set was performed on Affymetrix platform (U133A chip). Of our 15-gene model, we found matches for 12 genes on U133A, and of the UAMS 17-gene model, we found exact matches for 15 genes (Table A12). On the basis of the expression of both of these sets of genes, we calculated a log2 ratio score and identified a group of high-risk patients (16%) using either IFM or UAMS models. Kaplan-Meier analysis revealed that the high-risk group defined by the 12 genes originating from our 15-gene model was significantly associated with inferior survival (median survival, 12.2 months v 52.2 months) in the Mayo Clinic data set. Multivariate analysis revealed that when compared with the UAMS model, the IFM model remains a significant independent variable. Furthermore, the predictive power of both models is equivalent (Table A14 and Table 2). Patients Experiencing Relapse. We applied our model to a data set from Mulligan et al,14 available in Gene Expression Omnibus, data GEO accession number GSE9782 (December 6, 2007). Relevant biologic and clinical information was available for 264 cases of patients who experienced relapse treated either with single-agent bortezomib (188 samples) or high-dose dexamethasone (76 samples). Data set was performed on Affymetrix platform (U133A/B chip). Given that the 15 genes of our model were present on U133A/B chip (Table A12), we directly applied our model to the data set. Ranking and stratification procedures were identical to that performed in our test group. To apply the UAMS model to this data set, we calculated a risk score based on the log2 ratio of the expression values of the 16 genes that are in common within both chips (Table A12) and identified a group of high-risk cases (13.5%) similarly to study of Zhan et al (Zhan F, Barlogie B, Mulligan, et al: Blood 111:968-969, 2008). Our model identified a high-risk group (quartile 4) within patients who experienced relapse with significantly shorter survival times (P < .001; hazard ratio = 2.24). This model identified also high-risk disease in patients treated with bortezomib (P < .001; hazard ratio = 2.32) but was unsuccessful in identifying high risk in patients treated with high-dose dexamethasone (P = .189; hazard ratio = 1.61). This analysis validates the strong prognostic value of our gene expression signature in relapsed disease and predicts outcome for a specific novel therapy. The reason why our 15-gene signature is negated by addition of high-dose dexamethasone is unknown. When compared with the UAMS model, our 15-gene model remains a highly significant independent variable in the whole cohort and in the subgroup of patients treated with bortezomib (Tables A15, A16, and A17 and Table 2; Fig A3). Cross-Validation. Given that our academic cDNA arrays do not cover the whole genome, we found matches for 10 of 17 genes of the UAMS signature (Table A12). On the basis of the expression of these 10 genes, we calculated a log2 ratio score, ranked the patients according to their score, and identified a group of high-risk patients (25%) in the IFM cohort. Kaplan-Meier analysis showed that the high-risk group defined by the 10 genes originating from UAMS 17-gene model was significantly associated with inferior survival in IFM data set (Fig A4). Despite differences in the microarray platforms, the UAMS model is cross-validated in IFM data set. When compared with the IFM model, the derived UAMS model shows a tendency to remain an independent variable (Table A18). The absence of overlap among the genes of the two models may be explained by technical differences, composition of the microarrays used, and different statistical procedures used to build predictive models as it was previously reported in breast cancer (Van de Vijver MJ, Yudong HE, Van't Veer L, et al: N Engl J Med 347:1999-2009, 2002; Wang Y, Klijn JG, Zhang Y, et al: Lancet 365:671-679, 2005) and in diffuse large B-cell lymphoma (Rosenwald et al4 and Shipp MA, Ross KN, Tamayo P, et al: Nat Med 8:68-74, 2002).
Gene Set Enrichment Analysis
Four required data inputs were generated: (1) a complete table of log2-ratio expressed genes for all samples, (2) a mapping file for identification of our probes spotted on Nylon membrane, (3) a catalog of functional gene sets (c2) from Molecular Signature Database (MSigDB, version 2 January 2007 release, www.broad.mit.edu/gsea/msigdb/msigdb_index.html), and (4) a phenotype file, to organize classes between very low-risk patients (Q1 and Q2 according to risk score calculation) and high-risk patients (Q4) for comparison. Default parameters were used throughout with the exception that inclusion gene set size was set between 5 and 200 and the phenotype was permutated 1,000 times. Gene sets that met the FDR
Comparison of Our 15-Gene Model Versus GEP-Derived Proliferation Index Table A19 shows the number of genes shared by the three PI signatures. The PI signature score for each patient was calculated according to Bergsagel et al (Bergsagel P, Keuhl M, Zhan F, et al: Blood 106: 296-303, 2005).
IFM Members
We thank Magali Martin, Emilie Ollivier, and Nathalie Roi for excellent technical expertise.
published online ahead of print at www.jco.org on June 30, 2008 Supported in part by grants from the International Myeloma Foundation, the Ligue contre le Cancer (Equipe Labellisée), and the Cancéropôle Grand Ouest. L.C., H.A.L., and S.M. contributed equally to this work. Authors disclosures of potential conflicts of interest and author contributions are found at the end of this article.
1. Bataille R, Durie BG, Grenier J, et al: Prognostic factors and staging in multiple myeloma: A reappraisal. J Clin Oncol 4:80-87, 1986[Abstract] 2. Greipp PR, San Miguel J, Durie BG, et al: International staging system for multiple myeloma. J Clin Oncol 23:3412-3420, 2005 3. Avet-Loiseau H, Attal M, Moreau P, et al: Genetic abnormalities and survival in multiple myeloma: The experience of the Intergroupe Francophone du Myelome. Blood 109:3489-3495, 2007 4. Rosenwald A, Wright G, Chan WC, et al: The use of molecular profiling to predict survival after chemotherapy for diffuse large-B-cell lymphoma. N Engl J Med 346:1937-1947, 2002 5. Avet-Loiseau H, Facon T, Grosbois B, et al: Oncogenesis of multiple myeloma: 14q32 and 13q chromosomal abnormalities are not randomly distributed, but correlate with natural history, immunological features and clinical presentation. Blood 99:2185-2191, 2002 6. Magrangeas F, Nasser V, Avet-Loiseau H, et al: Gene expression profiling of multiple myeloma reveals molecular portraits in relation to the pathogenesis of the disease. Blood 101:4998-5006, 2003 7. Nguyen C, Rocha D, Granjeaud S, et al: Differential gene expression in the murine thymus assayed by quantitative hybridization of arrayed cDNA clones. Genomics 29:207-216, 1995[CrossRef][Medline] 8. Bernard K, Auphan N, Granjeaud S, et al: Multiplex messenger assay: Simultaneous, quantitative measurement of expression of many genes in the context of T cell activation. Nucleic Acids Res 24:1435-1442, 1996 9. Bertucci F, Bernard K, Loriod B, et al: Sensitivity issues in DNA array-based expression measurements and performance of nylon microarrays for small samples. Hum Mol Genet 8:1715-1722, 1999 10. Simon R, Peng A: BRB-ArrayTools. Bethesda, MD, 2003. http://linus.nci.nih.gov/BRB-ArrayTools.html 11. Shaughnessy JD Jr, Zhan F, Burington B, et al: A validated gene expression model of high-risk multiple myeloma is defined by deregulated expression of genes mapping to chromosome 1. Blood 109:2276-2284, 2007 12. Chng WJ, Kumar S, Vanwier S, et al: Molecular dissection of hyperdiploid multiple myeloma by gene expression profiling. Cancer Res 67:2982-2989, 2007 13. Chng WJ, Kuehl WM, Bergsagel PL, et al: Translocation t(4;14) retains prognostic significance even in the setting of high-risk molecular signature. Leukemia 22:459-461, 2008[CrossRef][Medline] 14. Mulligan G, Mitsiades C, Bryant B, et al: Gene expression profiling and correlation with outcome in clinical trials of the proteasome inhibitor bortezomib. Blood 109:3177-3188, 2007 15. Subramanian A, Tamayo P, Mootha VK, et al: Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 102:15545-15550, 2005 16. Whitfield ML, Sherlock G, Saldanha AJ, et al: Identification of genes periodically expressed in the human cell cycle and their expression in tumors. Mol Biol Cell 13:1977-2000, 2002 17. Musacchio A, Salmon ED: The spindle-assembly checkpoint in space and time. Nat Rev Mol Cell Biol 8:379-3793, 2007[CrossRef][Medline] 18. Li R, Murray AW: Feedback control of mitosis in budding yeast. Cell 66:519-531, 1991[CrossRef][Medline] 19. Hoyt MA, Totis L, Roberts BT: S. cerevisiae genes required for cell cycle arrest in response to loss of microtubule function. Cell 66:507-517, 1991[CrossRef][Medline] 20. Kops GJ, Kim Y, Weaver BA, et al: ZW10 links mitotic checkpoint signaling to the structural kinetochore. J Cell Biol 169:49-60, 2005 21. Sampath SC, Ohi R, Leismann O, et al: The chromosomal passenger complex is required for chromatin-induced microtubule stabilization and spindle assembly. Cell 118:187-202, 2004[CrossRef][Medline] 22. Reddy SK, Rape M, Margansky WA, et al: Ubiquitination by the anaphase-promoting complex drives spindle checkpoint inactivation. Nature 446:921-925, 2007[CrossRef][Medline] 23. Neef R, Gruneberg U, Kopajtich R, et al: Choice of Plk1 docking partners during mitosis and cytokinesis is controlled by the activation state of Cdk1. Nat Cell Biol 9:436-444, 2007[CrossRef][Medline] 24. Saxena A, Saffery R, Wong LH, et al: Centromere proteins Cenpa, Cenpb, and Bub3 interact with poly(ADP-ribose) polymerase-1 protein and are poly(ADP-ribosyl)ated. J Biol Chem 277:26921-26926, 2002 25. Monaco L, Kolthur-Seetharam U, Loury R, et al: Inhibition of Aurora-B kinase activity by poly(ADP-ribosyl)ation in response to DNA damage. Proc Natl Acad Sci U S A 102:14244-14248, 2005 26. Gadea BB, Ruderman JV: Aurora B is required for mitotic chromatin-induced phosphorylation of Op18/Stathmin. Proc Natl Acad Sci U S A 103:4493-4498, 2006 27. Losada A, Hirano T: Dynamic molecular linkers of the genome: The first decade of SMC proteins. Genes Dev 19:1269-1287, 2005 28. Kops GJ, Weaver BA, Cleveland DW: On the road to cancer: Aneuploidy and the mitotic checkpoint. Nat Rev Cancer 5:773-785, 2005[CrossRef][Medline] 29. Zandecki M, Lai JL, Facon T: Multiple myeloma: Almost all patients are cytogenetically abnormal. Br J Haematol 94:217-227, 1996[CrossRef][Medline] 30. Carter SL, Eklund AC, Kohane IS, et al: A signature of chromosomal instability inferred from gene expression profiles predicts clinical outcome in multiple human cancers. Nat Genet 38:1043-1048, 2006[CrossRef][Medline] 31. Whitfield ML, George LK, Grant GD, et al: Common markers of proliferation. Nat Rev Cancer 6:99-106, 2006[CrossRef][Medline] 32. Zhan F, Huang Y, Colla S, et al: The molecular classification of multiple myeloma. Blood 108:2020-2028, 2006 33. Sotillo R, Hernando E, Diaz-Rodriguez E, et al: Mad2 overexpression promotes aneuploidy and tumorigenesis in mice. Cancer Cell 11:9-23, 2007[CrossRef][Medline] 34. Shi Y, Reiman T, Li W, et al: Targeting aurora kinases as therapy in multiple myeloma. Blood 109:3915-3921, 2007 Submitted August 3, 2007; accepted April 24, 2008.
This article has been cited by other articles:
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|||||||||||
|
Copyright © 2008 by the American Society of Clinical Oncology, Online ISSN: 1527-7755. Print ISSN: 0732-183X
|