Genetic Variability Across Major Histocompatibility Complex

The major histocompatibility complex (MHC) is involved in a variety of immunological processes; however, it remains poorly understood because strong linkage disequilibrium (LD) and highly polymorphic loci make the region difficult to analyze. To better understand this region of the genome and how it relates to disease, we will examine genic patterns as well as regulatory interactions within the region. Coding and regulatory elements tend to be more evolutionarily conserved because of their functional importance, so the study will look for patterns of genetic constraint to help identify functionally important regions. Whole genome data allow for analysis of rare variation in non-coding regulatory regions. This work requires high-quality whole genome sequence samples that can be phased effectively to characterize the underlying genomic variability. With whole genome trios and families, haplotypes can be inferred more accurately, whereas imputing phase without family data is difficult or impossible because of the LD structure and the highly polymorphic nature of the region. Phased whole genome sequence data will allow clearer determination of regulatory patterns of genic variation across the region.
Early-onset disorders are typically under strong evolutionary selective pressure, which means detrimental genetic variation may play a more significant role and may in turn be easier to detect. The study will quantify the burden of rare mutational load that is locally clustered in case individuals in order to make inferences about deleterious variation, focusing on the MHC. This, in conjunction with modeling regional constraint, will elucidate both regulatory patterns and disease associations. The methodological development will further the understanding of genetic variability and interactions and, in turn, the basis of early-onset genic disorders. The analysis will leverage public data sources to provide an integrated view of genic variation and of how and why the MHC is integral in different disease settings. The data from the Kids First dataset will only be used in research consistent with the data use limitations of this study. The corresponding results, summary analysis, and methodological advances will be published. Use of this data will not include the study of population origins or ancestry.
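
As a rough illustration of the locally clustered rare-variant burden mentioned above, the following minimal sketch tallies rare alleles in fixed-width windows separately for cases and controls. It is not the study's method: the function name `window_burden`, the input layout, the window size, and the toy positions are all hypothetical, and a real analysis would also normalize by sample counts and apply a formal association test.

```python
from collections import Counter


def window_burden(variants, window_size):
    """Tally rare alleles per fixed-width window, split by case status.

    variants: iterable of (position, is_case, rare_allele_count) tuples
              (a hypothetical layout chosen for this sketch).
    Returns {window_start: (case_count, control_count)}.
    """
    case_counts = Counter()
    control_counts = Counter()
    for position, is_case, n_alleles in variants:
        # Assign the variant to the window that contains its position.
        start = (position // window_size) * window_size
        if is_case:
            case_counts[start] += n_alleles
        else:
            control_counts[start] += n_alleles
    # Report every window observed in either group; Counter yields 0
    # for windows seen in only one of the two groups.
    windows = sorted(set(case_counts) | set(control_counts))
    return {w: (case_counts[w], control_counts[w]) for w in windows}


# Toy input with made-up coordinates, not real MHC positions.
toy_variants = [
    (1050, True, 1),
    (1900, True, 2),
    (1500, False, 1),
    (3200, False, 1),
    (3999, True, 1),
]
burden = window_burden(toy_variants, window_size=1000)
```

Windows in which the case tally clearly exceeds the control tally would be candidates for closer inspection alongside the regional constraint model.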