Building FHIR – A New Standard for Data Integrity and Sharing

In the fight against pediatric brain cancer, the most powerful weapon in our arsenal is data, and in vast amounts. Member sites across the Children’s Brain Tumor Network (CBTN) have prioritized the collection and processing of clinical, imaging, and multi-omic data obtained at numerous points along the timeline of a child’s cancer treatment. To date, they have gathered over 400 TB of data donated by more than 4,200 pediatric brain tumor patients and their families.

However, gathering, organizing, and sharing such large amounts of data present numerous challenges, both technical and practical. Often, data experts, clinical staff, and investigators expend considerable time and effort to manually process patient data for use in childhood brain tumor research. The process has been streamlined, but must be accelerated further.

Now, an exciting step forward has been made across the CBTN’s data ecosystem that will make it far easier to share data between clinics, labs, institutions, and research areas. Applying a new data exchange model known as FHIR (Fast Healthcare Interoperability Resources), experts are standardizing and automating the cumbersome task of organizing (or harmonizing) vast amounts of clinical and multi-omic data and expanding their availability across the childhood cancer research landscape.

FHIR also represents a broadening of horizons for the impact of patient data donated to CBTN, enabling the integration of pediatric cancer data into the adult medical research space and vice versa. Until now, the huge trove of data available for pediatric research through CBTN hadn’t been optimized for use as broadly as adult biomedical data. By standardizing these data, CBTN can make these incredibly valuable resources available to as many labs and researchers as possible, allowing a two-way exchange of priceless insights across pediatric and adult cancers.

When “Big Data” Becomes Too Big

When patient-donated tumor tissue and other biospecimens are molecularly sequenced, the amount of data returned is staggering: a single human genome contains roughly 2.9 billion base pairs of DNA.

Sequencing centers return huge amounts of raw data to the lab, which may represent not only whole genome sequences, but also gene variants, RNA sequences, miRNA, whole exome sequences, and more. CBTN data experts and research coordinators take the process a step further, pairing that multi-omic data with information collected at the bedside longitudinally, that is, across the entirety of a patient’s treatment journey. Much of the data collected during clinical practice, in particular, has to be processed and organized manually.

As childhood brain tumor research efforts through CBTN and elsewhere continue to grow, this manual process presents a major hindrance to accelerating the pace of research. It also introduces the potential for human error when gathering and harmonizing data from many different sources at multiple points along the treatment timeline.

Not All Data Models are Created Equal

Several effective data models are commonly used throughout the biomedical research space, and each has its own set of benefits and limitations. This variety is itself an obstacle to accelerated research: because the biomedical research community does not use a single standardized model for data representation and collection, it can be difficult to share resources easily and swiftly.

Beyond the vocabulary used in certain types of data collection (for example, how sex/gender or ethnicity is recorded), many of the categories or properties of data collection can differ greatly between research centers and institutions.

Fortunately, a solution has been developed.

Spreading FHIR to Improve Data Accessibility

Developed by Health Level Seven International (HL7) in 2014, FHIR is a framework for uniting some of the strongest data models for collecting and sharing data used in the biomedical field today. Building on the success of what came before it, FHIR gives experts across institutions, international borders, and research areas new opportunities to collaborate more effectively and drive world-changing discoveries.

This framework for drawing upon the very best methodologies in data processing, harmonization, and exchange is what drew data experts at the Children’s Hospital of Philadelphia’s Center for Data Driven Discovery in Biomedicine (D3b), Operations Center of the CBTN, to implement the FHIR data exchange model across the CBTN’s data ecosystem and beyond.

Initially, the data team at D3b worked to collate the most broadly utilized data models in order to implement a FHIR-based framework across each of the 30 datasets within the NIH Common Fund-supported Gabriella Miller Kids First Data Resource Center, of which CBTN’s Pediatric Brain Tumor Atlas (PBTA) is a part, and for which CHOP’s D3b also serves as Center of Operations.

“By using the FHIR data model, we hope to create an interoperable data exchange across institutions,” says Meen Chul Kim, PhD, the lead data expert on CBTN’s FHIR project and a key driver of its implementation across the D3b, CBTN, and Kids First data ecosystems. “By adopting a FHIR data exchange model, CBTN and D3b will be able to harmonize to a vastly expanded number of external datasets, in an interoperable and scalable manner. In order to speak the same language, we need a common data model and layers of data sharing.”
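To make the idea of a common data model more concrete, here is a minimal Python sketch of how two sites that record sex differently might be mapped onto the same FHIR Patient resource. The site names, local codes, field names, and the to_fhir_patient helper are all hypothetical and are not drawn from CBTN, D3b, or Kids First systems; the only part taken from the FHIR specification is the set of codes allowed for Patient.gender (male, female, other, unknown).

```python
import json

# Hypothetical local codings for sex at two imaginary sites.
# These are illustrative assumptions, not real CBTN vocabularies.
SITE_VOCABULARIES = {
    "site_a": {"M": "male", "F": "female", "U": "unknown"},
    "site_b": {"1": "male", "2": "female", "9": "unknown"},
}

# Codes permitted for the Patient.gender element in FHIR.
FHIR_GENDER_CODES = {"male", "female", "other", "unknown"}


def to_fhir_patient(site, record):
    """Map a site-specific record (hypothetical shape) to a FHIR Patient dict."""
    vocabulary = SITE_VOCABULARIES[site]
    local_code = str(record["sex"]).strip()
    # Anything the site vocabulary does not recognize falls back to "unknown".
    gender = vocabulary.get(local_code, "unknown")
    if gender not in FHIR_GENDER_CODES:
        raise ValueError(f"not a FHIR gender code: {gender}")
    return {
        "resourceType": "Patient",
        "id": record["participant_id"],
        "gender": gender,
    }


if __name__ == "__main__":
    a = to_fhir_patient("site_a", {"participant_id": "P-001", "sex": "F"})
    b = to_fhir_patient("site_b", {"participant_id": "P-002", "sex": 2})
    print(json.dumps([a, b], indent=2))
```

Both records end up with the same gender value even though the sites started from different codes, which is the sense in which a shared model lets datasets "speak the same language." A real pipeline would also need to handle identifiers, consent, and de-identification, none of which this sketch attempts.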

Working Toward a Unified Data Exchange Landscape

Working in concert with the National Cancer Institute’s Childhood Cancer Data Initiative (CCDI), CBTN/D3b is the first partnering entity to commit entirely to using a FHIR framework. This has led to the establishment of the very first NCI FHIR Server.

At present, the scope of data within the Gabriella Miller Kids First Data Resource is limited to de-identified data relating to the patient, diagnosis, phenotype, outcome, etc. Now, the D3b data team hopes to expand the CBTN data available to CCDI, using CHOP’s Electronic Health Record (EHR) system and tumor registry to provide a broader picture of each patient’s treatment journey.
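As a rough illustration of how de-identified patient and diagnosis data can travel together in FHIR, the sketch below builds a small Bundle containing a Patient and a Condition that points back to it. The build_bundle helper, the identifier, and the diagnosis text are made-up placeholders rather than real CBTN or Kids First records; the resource types and the subject reference pattern follow the FHIR specification.

```python
import json


def build_bundle(patient_id, diagnosis_text):
    """Return a FHIR Bundle (as a dict) linking one Patient to one Condition.

    Both arguments are hypothetical, de-identified placeholder values.
    """
    patient = {"resourceType": "Patient", "id": patient_id}
    condition = {
        "resourceType": "Condition",
        "id": f"{patient_id}-dx",
        # The Condition refers to its patient by a relative reference.
        "subject": {"reference": f"Patient/{patient_id}"},
        # Free-text diagnosis; a real dataset would likely add coded terms.
        "code": {"text": diagnosis_text},
    }
    return {
        "resourceType": "Bundle",
        "type": "collection",
        "entry": [{"resource": patient}, {"resource": condition}],
    }


if __name__ == "__main__":
    bundle = build_bundle("P-001", "Medulloblastoma")
    print(json.dumps(bundle, indent=2))
```

Because every resource follows the same structure, additional information such as phenotype or outcome data could, in principle, be added as further entries that reference the same patient.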

And because CBTN’s Pediatric Brain Tumor Atlas can now be harmonized to any other dataset via FHIR, researchers anywhere and in any research area can incorporate these data into their own studies. This is an important next step in the fight against childhood cancer, allowing us to take advantage of the latest technology to put valuable resources into the hands of as many researchers as possible while saving them precious time and effort.