My research interests and projects
My interests lie primarily within the Artificial Intelligence (AI) field, and more particularly
MachineLearning methods. I personally view MachineLearning as the subdomain of AI when a machine
can be labelled as intelligent because it "learns" something, generally by observing human behavior or
collecting huge amounts of data. So, instead of implementing explicitly rules that will make your
machine behave like an human, you just make the machine look at what the human does, and then replicate it...
There are lots of possible applications of machinelearning. It has been succesfully used in speech processing,
sensornetworks, diagnostics (it is not often used in games AI yet, probably because the models would be too complex).
However, my own research focuses more on the theoretical side of machinelearning than its applications. I try to design new
efficient algorithms for performing inference on hard statistical networks. As such, I am in fact a researcher in the
probabilistic machine learning field.
Hard probabilistic problems have a strange but very direct connection to graph theory. It turns out that many probabilistic
problems can be expressed within a graphical framework; such models are called probabilistic graphical models. Once you have
modelled your problem, several various exact and/or approximate inference algorithms can be run on the created graph (which
very underlying structure depends on the independence/dependence assumptions of the original problem).
With Nando de Freitas, my supervisor, we are exploring better/faster ways of doing approximate inference. All our work is
heavily based on computational statistics, and since we are doing approximate inference, I am very familiar with a wide array
of Monte Carlo methods (both Markov Chain Monte Carlo and Sequential Monte Carlo).
Since it turns out that on trees, better inference algorithms are available, the ideas at the heart of my research is to
partition a general undirected graph into several trees. You then run inference algorithms on these trees, and try to somehow
combine the individual results of each of these algorithms into something giving you the marginals (or means, or whatever
estimator you are interested in) for the whole graph.
Latest update:
my Msc. thesis is completed!
This thesis contains most of my research results. Along with the thesis,
an inference toolbox has also been written in C++ with a Matlab interface.
I am maintaining
a page related to this software
(it contains a tutorial, the C++ sources and downloadable binaries for various platforms).
Below is a list of all the projects I'm currently involved in or recently completed.
In machinelearning, Markov Chain Monte Carlo (MCMC) strategies such as Gibbs sampling are important
approximate inference techniques. They use a Markov Chain mechanism to explore and sample the state
space of a target distribution. The generated samples are then used to approximate the target
distribution.
MCMC is mathematically guaranteed to converge with enough samples. Yet some complex graphical models
can cause it to converge very slowly to the true distribution of interest. Improving the quality
and efficiency of MCMC methods is an active topic of research in the probabilistic graphical
models field. One possible method is to “block” some parts of the graph together, sampling groups
of variables instead of single variables.
In this document, we concentrate on a particular blocking scheme known as tree sampling. Tree
sampling operates on groups of trees, and as such requires that the graph be partitioned
in a special way prior to inference. We present new algorithms to find tree partitions on
arbitrary graphs. This allows tree sampling to be used on any undirected probabilistic graphical model.
This is my thesis for the completion of my Msc. degree at UBC. It contains almost all my research
results in inference algorithms for undirected graphical models.
Useful links:
[This contains documentation about the software framework I wrote for my thesis, named PGM. This is an
inference toolbox written in C++, capable of running various inference algorithms described in my thesis.
Sources and binaries are available.]
(10/10/2005)
This is the report for the project I did for
Nando de Freitas
MachineLearning course, during the Winter 2003 term. Clustering data collections (such as text corpora) into
different groups (corresponding to different "styles") is a very frequent application of MachineLearning. Whereas
a simple multinomial model allows documents to belong to only one cluster or style, the LDA model allows an unique
document to exhibit a mixture of different styles, thus is much more flexible.
My project focused on understanding the LDA model and applying it to Markov chains documents (and not to simple
text documents, as it was first intended for). Markov chains can be applied in a wide range of applications
(notably, games); on this paper I used Markov chains as a very simple representation for musical data.
Useful links:
[The initial code I based my work upon]
(15/12/2003)
Here are links to my
publications
or to my
old (or very old) projects.
