Predictive Models for Transcriptome Variations

CBTN Data Used


National Institutes of Health

Laboratory start-up funds

About this Project


Exome sequencing (ES) is the most advanced standard-of-care genetic test for children with multiple congenital anomalies, yet its diagnostic rate is only 31% [1–3]. One challenge is that each test generates hundreds of thousands of variants, which a computational pipeline must prioritize down to a short list of candidates for manual review by genomics professionals. To increase efficiency, current clinical pipelines generally remove nearly all intronic and synonymous variants. However, many such variants cause Mendelian disorders, and removing them leads to missed diagnoses. Although algorithms that predict the pathogenicity of these non-coding variants exist, their clinical utility has been limited by high false-positive rates and a lack of functional interpretation.

Ask the scientists

What are the goals of this project?

The goal of this study is to map normal and disease-associated transcriptome variations, then build predictive models for those variations.

What is the impact of this project?

The aim is for this study to produce an interpretable algorithm for prioritizing general splicing variants, one that guides functional validation and identifies novel variants and genes for mechanistic evaluation in the pathogenesis of congenital anomalies. This algorithm will be generally applicable, significantly enhancing our ability to provide molecular diagnoses for all patients with suspected Mendelian disorders.

Specimen Data

The Children's Brain Tumor Network contributed to this project by providing access to the Pediatric Brain Tumor Atlas.

Explore the data in these informatics portals


Meet the Team


PI: Yoseph Barash, PhD - University of Pennsylvania


  • Elizabeth Bhoj, MD, PhD - Children's Hospital of Philadelphia
  • Jorge Vaquero (Bioinformatician) - University of Pennsylvania