· Wei (Will) Yang, Ph.D.
· Bo Zhang, Ph.D.
· Sumithra Sankararaman, Ph.D.
· Ni Huang, Ph.D.
Former Fellows of the Program
· Oscar M. Harari, Ph.D.
· Sean D. Kristjansson, Ph.D.
· Ruth Huang Miller, Ph.D.
· Jiexun (Jessie) Wang, Ph.D.
· Xin Zhou, Ph.D. - completed a 3-year fellowship; awarded an additional year, 2014-2015.
Some Topics Trainees Are Working On
• Extending the ALIGATOR computational method to allow specification of regions of the chromosome/gene to be analyzed.
• Developing a Genomics Browser function to display transposable element data on a genome-wide scale.
• Developing a new statistical framework to investigate DNA methylation signatures of different cell types and of different individuals.
• Developing genetic data-analysis methods and tools to facilitate the analysis of Human Connectome Project and other genetic functional connectivity data sets.
• Dr. Zhang is working to improve the M&M statistical framework so that it becomes more dynamic and robust, supports comparisons among groups, and supports multiple types of biological data, e.g., histone modification ChIP-seq.
• Dr. Yang is focusing on developing and applying novel statistical and computational methods for analyzing high-dimensional data from genetic studies of complex diseases. This work aims to address the new challenges posed by the large number of rare variants in typical NGS (next-generation sequencing) studies of complex diseases, and to test family datasets.
• Dr. Huang’s project involves the extension of a program initially developed by his primary mentor, Dr. Conrad, to discover de novo CNVs in family-based genetic data.
• Dr. Sankararaman is investigating various mechanisms involved in gene regulation at the transcriptional and epigenetic levels, and is working to develop mathematical models and associated tools that might be applied in the design of new genetic research and therapeutic interventions.
Program Products: Websites or Other Internet Sites Involving Trainees
· WashU EpiGenomics Browser ... http://epigenomegateway.wustl.edu/browser/
Dr. Xin Zhou has built the WashU EpiGenomics Browser, a web portal of curated data sets and a visualization engine. Curated data sets include over 5000 sequencing experiments from the Roadmap Epigenomics and ENCODE projects. With the Browser, users can easily identify experiments of interest through facet browsing, navigate the genomics data with intuitive pan-and-zoom, and display dozens of experiments while staying informed through the integrated metadata visualization system. Users can choose from an array of applications to apply to the track data and analyze it from a different perspective, including "genomic juxtaposition" to focus on a subset of the genome, "genome snapshot" to view a global profile, "gene set view" to focus on a pathway, and "secondary panel" for dual-panel display. The Browser is adept at accessing and managing remote resources from the Internet, which allows users to visualize private data sets, bookmark track collections, and share resources with collaborators. The Browser is also able to visualize chromatin-interaction data, which captures the three-dimensional structure of chromatin in the cell nucleus. The Browser displays the interaction data in intuitive heatmap or arc styles, and is able to visualize inter-chromosomal interactions. An app is available from the Browser to look at genome-wide interaction patterns. In this way, the Browser integrates chromatin-interaction data with epigenomics data sets so that users can view and explore these two vastly different types of data in the same setting. Dr. Zhou’s ongoing work includes collaborating with the ENCODE 3 project and the Canadian Epigenome project, in which the Browser will be used as the visualization component in the consortium's informatics pipeline.
· The Repeat Browser ... http://epigenomegateway.wustl.edu/browser/repeat/
Dr. Xin Zhou has developed The Repeat Browser, which condenses hundreds of experiments into a heatmap and allows investigators to identify data patterns associated with sample, assay, or transposable element (TE) class by refactoring the heatmap. To focus on a specific TE subfamily, investigators can view all copies from that subfamily in a genome graph, and overlay one or more experimental assay results on this graph to identify TE copies with a special experimental profile. Investigators can observe the genomic context (e.g., nearby genes) of specific TE copies through integrated browsers. The TE copies can also be ranked and displayed in Gene Set View. The Repeat Browser presents a TE-centric perspective of genome-wide assay results, which is impossible to achieve via conventional genome browsers.
· The Complete Epigenome Display ... http://epigenomegateway.wustl.edu/browser/roadmap/
Dr. Xin Zhou also developed The Complete Epigenome Display, an open web service based on the WashU EpiGenome Browser. Epigenomics assay results for each human sample type are displayed in an individual Browser panel. Investigators can browse the catalog of human samples denoted as "complete epigenomes", and have multiple samples displayed simultaneously as multiple Browser panels. Investigators can scroll and tune each Browser panel independently, or synchronize the display range of all panels for easy navigation and comparison. Powerful data visualization techniques including gene set view, genome juxtaposition, and chromatin-interaction display are all available in the Complete Epigenome Display, making it an ideal tool for exploring the Roadmap Epigenomics data set.
· methylMnM ... http://epigenome.wustl.edu/MnM
Dr. Bo Zhang is currently working on a statistical framework to be applied to DNA methylation data. Dr. Zhang has worked with his mentor (Dr. Ting Wang) to develop a new statistical framework called “M&M” to integrate both MeDIP and MRE data. A modified t-statistic is applied to identify differentially methylated regions (DMRs) between two samples. This new framework has been published as “methylMnM” and is currently being applied to different projects, including a zebrafish embryonic development study, human cancer studies, and rat model addiction studies.
· Genotype-Tissue Expression (GTEx) ... http://commonfund.nih.gov/GTEx/
Dr. Ni Huang's project includes developing statistical methods adapted to data generated from single-cell DNA sequencing and RNA-seq of a variety of tissues from a large cohort of individuals, produced by the GTEx (Genotype-Tissue Expression) project. The method uses a tree-based Bayesian model to call de novo mutations and infer their mutation history. DenovoGear is a tool originally developed by Dr. Don Conrad for discovering germline de novo SNPs from trio sequencing data [Nat Genet, 2011. 43(7):712–714].