by Lisa M. Rimsza, Joseph M. Unger, Margaret E. Tome, Michael L. LeBlanc
Gene expression profiling yields quantitative data that are used to create prognostic models that accurately predict patient outcome in diffuse large B-cell lymphoma (DLBCL). Often, each gene is analyzed by dichotomizing its expression at the median level, so that samples fall either above or below it. We sought to determine whether examining multiple cut-points might be a more powerful technique for investigating the association of gene expression with outcome.
Methodology/Principal Findings
We explored gene expression profiling data using variable cut-point analysis for 36 genes with reported prognostic value in DLBCL. For each gene, we plotted two-group survival logrank test statistics against the corresponding cut-points of the gene expression level, together with smooth estimates of the hazard ratio of death versus gene expression level. To facilitate comparisons, we also standardized the expression of each gene by the fraction of patients that would be identified by any cut-point. A multiple-comparison-adjusted permutation p-value identified 3 different patterns of significance: 1) genes with significant cut-points below the median, whose loss is associated with poor outcome (e.g. HLA-DR); 2) genes with significant cut-points above the median, whose over-expression is associated with poor outcome (e.g. CCND2); and 3) genes with significant cut-points on either side of the median (e.g. extracellular molecules such as FN1).
Conclusions/Significance
Variable cut-point analysis with permutation p-value calculation can be used to identify significant genes that would not otherwise be identified with median cut-points and may suggest biological patterns of gene effects.
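
As an illustration of the variable cut-point procedure summarized under Methodology/Principal Findings, the following is a minimal Python sketch, not the authors' implementation. It computes a two-group logrank statistic at every candidate cut-point of a single gene, indexes each cut-point by the fraction of patients falling below it, and derives a permutation p-value from the maximum statistic so that the scan over many cut-points is accounted for. The function names, the restriction of cut-points to the 10% to 90% range, the use of the maximum statistic for the adjustment, and the synthetic data are all assumptions made for this example. The smooth hazard ratio estimates are not covered.

```python
import numpy as np

def logrank_chi2(time, event, group):
    """Two-group logrank chi-square statistic (1 degree of freedom)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    group = np.asarray(group, dtype=bool)
    obs_minus_exp, var = 0.0, 0.0
    for t in np.unique(time[event]):          # distinct event times
        at_risk = time >= t
        n, n1 = at_risk.sum(), (at_risk & group).sum()
        dead = event & (time == t)
        d, d1 = dead.sum(), (dead & group).sum()
        obs_minus_exp += d1 - d * n1 / n      # observed minus expected deaths
        if n > 1:                             # hypergeometric variance term
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / var if var > 0 else 0.0

def scan_cutpoints(time, event, expr, lo=0.1, hi=0.9):
    """Logrank statistic at each cut-point, indexed by fraction of patients below it.

    The lo/hi bounds (an assumption of this sketch) avoid very small groups.
    """
    expr = np.asarray(expr, dtype=float)
    results = []
    for c in np.unique(expr):
        high = expr >= c                      # "high expression" group
        frac_low = 1.0 - high.mean()          # standardized cut-point position
        if lo <= frac_low <= hi:
            results.append((frac_low, c, logrank_chi2(time, event, high)))
    return results

def permutation_pvalue(time, event, expr, n_perm=200, seed=0):
    """Permutation p-value for the maximum statistic over all cut-points."""
    rng = np.random.default_rng(seed)
    expr = np.asarray(expr, dtype=float)
    observed = max((s for _, _, s in scan_cutpoints(time, event, expr)), default=0.0)
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(expr)          # break the expression/outcome link
        null_max = max((s for _, _, s in scan_cutpoints(time, event, perm)), default=0.0)
        exceed += null_max >= observed
    return observed, (1 + exceed) / (1 + n_perm)

# Synthetic example (fabricated for illustration only): low expression
# of a hypothetical gene shortens survival; all patients have an event.
rng = np.random.default_rng(1)
expr = rng.normal(size=40)
time = rng.exponential(scale=np.where(expr < -0.5, 1.0, 3.0))
event = np.ones(40, dtype=bool)
stat, p = permutation_pvalue(time, event, expr, n_perm=100)
print(f"max logrank chi-square = {stat:.2f}, adjusted p = {p:.3f}")
```

Plotting the statistics returned by `scan_cutpoints` against the fraction of patients below each cut-point gives a curve comparable across genes. A peak to the left of 0.5 corresponds to a significant cut-point below the median, and a peak to the right corresponds to one above it.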