The Pediatric Brain Tumor Atlas: building an integrated, multi-platform data-rich ecosystem for collaborative discovery in the cloud
Pediatric brain tumors are the leading cause of disease-related death in children. However, despite large-scale data-driven efforts for pediatric cancers by the NIH (e.g. TARGET, Therapeutically Applicable Research To Generate Effective Treatments), public access to large-scale pediatric brain tumor genomic data remains limited. As a result, precision medicine initiatives and clinical trials in pediatric brain tumors are hampered by the absence of publicly available genomic resources that can dynamically inform novel discovery and the clinical implementation of genomic and molecular approaches for diagnostic and therapeutic purposes in affected children. The Pediatric Brain Tumor Atlas is a concerted multi-institution effort by the Children’s Brain Tumor Tissue Consortium (CBTTC) and the Pacific Pediatric Neuro-Oncology Consortium (PNOC) to characterize and deeply profile a newly defined cohort of >1600 brain tumor samples across diverse histopathologies via a combination of whole exome sequencing, whole genome sequencing, RNA sequencing and limited proteomic analysis. Importantly, the Atlas initiative provides for near real-time integration, dissemination, and sharing of the associated raw and analyzed data through an ecosystem of data discovery platforms. As data of this size and complexity require scalable computational power and storage, a new cloud-based collaborative scientific environment termed CAVATICA (cavatica.org) has been launched to support integrative analysis and open access to data alongside shared computation and algorithms that empower users to further integrate and analyze their own uploaded data. CAVATICA also provides dbGaP-approved users with portal access to TCGA and other NCI datasets hosted by the NCI’s Cancer Genomics Cloud. Additionally, it provides for scalable integration of these and additional disease-specific datasets on the platform via transdisciplinary analyses.
Currently, one of the biggest barriers to collaborative research on large datasets is the transfer and processing of ‘big data’. By committing to the rapid release of these large pediatric brain tumor data and their deposition in CAVATICA’s cloud-based environment, which supports shared pipelines, computation, and visualizations, PNOC and the CBTTC’s collaborating membership seek to provide a centralized, collaborative rapid-discovery environment in which researchers can pursue new discovery and data reuse. In addition to unprocessed genomic, whole genome and RNA sequencing data in CAVATICA, processed annotations and additional biospecimen querying are enabled for the Pediatric Brain Tumor Atlas via PedcBioPortal (PedcBioPortal.org), a data visualization and analysis application that further integrates additional public and deposited datasets. The combination of large-scale genomic data and integrative cloud-based analytic platforms with one of the largest genomic data cohorts to date for pediatric brain tumors serves to define a new paradigm for pediatric cancer research and collaborative discovery.