The cBioPortal for Cancer Genomics: an open source platform for accessing and interpreting complex cancer genomics data in the era of precision medicine
Abstract
The cBioPortal for Cancer Genomics is an open-access portal (http://cbioportal.org) that enables interactive, exploratory analysis of large-scale cancer genomics data. It integrates genomic and clinical data and provides a suite of visualization and analysis options, including cohort and patient-level visualization, mutation visualization, survival analysis, enrichment analysis, and network analysis. The user interface is user-friendly and responsive, and it makes genomic data easily accessible to translational scientists, biologists, and clinicians.
The cBioPortal is a fully open source platform. All code is available on GitHub (https://github.com/cBioPortal/) under the GNU Affero General Public License. The code base is maintained by multiple groups, including Memorial Sloan Kettering Cancer Center, Dana-Farber Cancer Institute, Children’s Hospital of Philadelphia, Princess Margaret Cancer Centre, and The Hyve, an open source bioinformatics company based in the Netherlands. More than 30 academic centers, as well as multiple pharmaceutical and biotech companies, maintain private instances of the cBioPortal. These include the recently launched cBioPortal instance at the NCI Genomic Data Commons (https://cbioportal.gdc.nci.nih.gov/) and two large cBioPortal instances hosting genomic and clinical data at MSK and DFCI, which support the MSK-IMPACT and DFCI Profile projects, two of the largest clinical sequencing efforts in the world.
Our multi-institutional software team has accelerated both the evolution of the core architecture and the development of new features, to keep pace with the rapidly advancing fields of cancer genomics and precision cancer medicine. For example, we have integrated multi-platform genomics data with extensive clinical data, including patient demographics, treatment history, and survival data. We have also developed a patient-centric view that visualizes both clinical and genomic data with annotations from the OncoKB knowledge base. Over the next few years, the development team will focus on the following areas:
(1) Implementing major architectural changes to ensure future scalability and performance.
(2) Developing new features to support precision medicine, including (i) improved integration of knowledge base annotation, (ii) enhanced visualization of patient timelines, drug response, and tumor evolution, (iii) new patient similarity metrics, (iv) improved support for immunogenomics and immunotherapy, and (v) new visualization and analysis features for understanding response to therapy.
(3) Developing new analysis and target discovery features for large cohorts, including (i) support for user-defined virtual cohorts assembled by selecting samples from multiple studies, and (ii) comparison of genomic or clinical characteristics across two or more selected cohorts.
(4) Expanding community outreach, user support and training, and documentation.
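As an illustration of area (3), the idea of a virtual cohort and a cohort comparison can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the cBioPortal implementation or its API: the function names, the (study, sample) keying scheme, and all identifiers below are hypothetical, and the comparison shown is only the fraction of altered samples per cohort.

```python
from typing import Dict, FrozenSet, Set, Tuple

# Assumption: a sample is uniquely identified by (study_id, sample_id).
SampleKey = Tuple[str, str]


def make_virtual_cohort(selections: Dict[str, Set[str]]) -> FrozenSet[SampleKey]:
    """Combine per-study sample selections into a single cohort."""
    return frozenset(
        (study_id, sample_id)
        for study_id, sample_ids in selections.items()
        for sample_id in sample_ids
    )


def alteration_frequency(cohort: FrozenSet[SampleKey], altered: Set[SampleKey]) -> float:
    """Fraction of samples in the cohort that carry the alteration."""
    if not cohort:
        return 0.0
    return len(cohort & altered) / len(cohort)


def compare_cohorts(
    cohorts: Dict[str, FrozenSet[SampleKey]], altered: Set[SampleKey]
) -> Dict[str, float]:
    """Alteration frequency for each named cohort."""
    return {name: alteration_frequency(c, altered) for name, c in cohorts.items()}


if __name__ == "__main__":
    # Toy data: every study and sample identifier here is made up.
    cohort_a = make_virtual_cohort({"study_1": {"s1", "s2"}, "study_2": {"s3", "s4"}})
    cohort_b = make_virtual_cohort({"study_3": {"s5", "s6"}})
    altered = {("study_1", "s1"), ("study_2", "s3"), ("study_2", "s4"), ("study_3", "s5")}
    print(compare_cohorts({"A": cohort_a, "B": cohort_b}, altered))
```

Keying each sample by both study and sample identifier keeps samples from different studies distinct even when their sample identifiers collide, which is the main design concern when a cohort spans multiple studies.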