SurvExpress: an online biomarker validation tool and database for cancer gene expression data using survival analysis

PLoS One. 2013 Sep 16;8(9):e74250. doi: 10.1371/journal.pone.0074250. eCollection 2013.

Abstract

Validation of multi-gene biomarkers for clinical outcomes is one of the most important issues for cancer prognosis. An important source of information for virtual validation is the large number of available cancer datasets. Nevertheless, assessing the prognostic performance of a gene expression signature across datasets is a difficult task for biologists and physicians, and a time-consuming one for statisticians and bioinformaticians. Therefore, to facilitate performance comparisons and validations of survival biomarkers for cancer outcomes, we developed SurvExpress, a cancer-wide gene expression database with clinical outcomes and a web-based tool that provides survival analysis and risk assessment of cancer datasets. The only required input to SurvExpress is the biomarker gene list. We built a cancer database comprising more than 20,000 samples and 130 datasets with censored clinical information, covering tumors from more than 20 tissues. We implemented a web interface to perform biomarker validation and comparisons in this database, where a multivariate survival analysis can be completed in about one minute. We show the utility and simplicity of SurvExpress in two biomarker applications for breast and lung cancer. Compared to other tools, SurvExpress is the largest, most versatile, and quickest free tool available. The SurvExpress website can be accessed at http://bioinformatica.mty.itesm.mx/SurvExpress (a tutorial is included). The website was implemented in JSP, JavaScript, MySQL, and R.
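To make "survival analysis and risk assessment" with censored data more concrete, the following is a minimal Python sketch of one common approach: samples are split into low- and high-risk groups at the median of a risk score, and a Kaplan-Meier survival curve is estimated for each group. It is an illustration only and is not taken from SurvExpress. The function names (`kaplan_meier`, `split_by_median`), the median cutoff, and the toy scores, times, and event flags are all assumptions made for this example; in practice the risk score would typically come from a fitted multivariate model such as a Cox regression, which is not shown here.

```python
from statistics import median


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up time for each sample
    events -- 1 if the event (e.g. death) was observed, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    # Visit each distinct follow-up time in increasing order.
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # events plus censored
        if deaths:
            # Product-limit update: fraction of the at-risk set that survives t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def split_by_median(scores):
    """Label each sample 'high' or 'low' risk relative to the median score."""
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


# Hypothetical toy data: one risk score, follow-up time (months),
# and event flag per sample.
scores = [0.2, 1.5, 0.7, 2.1, 0.1, 1.9]
times = [60, 12, 48, 8, 72, 20]
events = [0, 1, 1, 1, 0, 1]

groups = split_by_median(scores)
for label in ("low", "high"):
    idx = [i for i, g in enumerate(groups) if g == label]
    curve = kaplan_meier([times[i] for i in idx], [events[i] for i in idx])
    print(label, curve)
```

Censored samples never lower the survival estimate directly; they only shrink the at-risk set for later time points, which is why censored clinical information can still be used in the analysis.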

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biomarkers / analysis*
  • Databases, Factual
  • Gene Expression Profiling
  • Humans
  • Internet*
  • Neoplasms / metabolism*
  • Neoplasms / mortality*
  • Survival Analysis

Substances

  • Biomarkers

Grants and funding

The authors are thankful for the financial support from Cátedra de Bioinformática CAT220 at ITESM (Tecnológico de Monterrey) and CONACyT grants 83929 and 140601. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.