| Abstract: |
Gene regulation is a central biological process whose
disruption can lead to many diseases. This process is largely controlled
by a dynamic network of transcription factors interacting with specific
genes to control their expression. Time series microarray gene
expression experiments have become a widely used technique to study the
dynamics of this process. This thesis introduces new computational
methods designed to better utilize data from these experiments and to
integrate this data with static transcription factor-gene interaction
data to analyze and model the dynamics of gene regulation. The first
method, STEM (Short Time-series Expression Miner), is a clustering
algorithm and software specifically designed for short time series
expression experiments, which represent the substantial majority of
experiments in this domain. The second method, DREM (Dynamic Regulatory
Events Miner), integrates transcription factor-gene interactions with
time series expression data to model regulatory networks while taking
into account their dynamic nature. The method uses an Input-Output
Hidden Markov Model to identify bifurcation points in the time series
expression data. While the method can be readily applied to some species,
the coverage of experimentally determined transcription factor-gene
interactions in most species is limited. To address this, we introduce
two methods to improve the computational predictions of these
interactions. The first of these methods, SEREND (SEmi-supervised
REgulatory Network Discoverer), motivated by the species E. coli, is a
semi-supervised learning method that uses verified transcription
factor-gene interactions, DNA sequence binding motifs, and gene
expression data to predict new interactions. The second method,
motivated by human genomic data, combines motif information with a
probabilistic prior on transcription factor binding at each location in
the organism's genome, which it infers based on a diverse set of genomic
properties. We applied these methods to yeast, E. coli, and human cells.
Our methods successfully predicted interactions and pathways, many of
which have been experimentally validated. Our results indicate that, by
explicitly addressing the temporal nature of regulatory networks, we can
obtain accurate models of dynamic interaction networks in the cell.
Thesis Draft: http://www.cs.cmu.edu/~jernst/thesisdraft.htm
Thesis Committee:
Ziv Bar-Joseph (Chair), Zoubin Ghahramani, Eric Xing, Naftali Kaminski (PITT), Zoltan Oltvai (PITT) |