Computational Methods for Paleogenomics and Comparative Genomics


Genome evolution

One of our favorite research problems, and historically the first research topic of our group, is the reconstruction of ancestral genome structures (genome maps based on synteny blocks, or ancestral gene orders) [MMB 2018]. Our main approach follows the comparative paradigm, which assumes that conserved features of related extant genomes indicate potential ancestral genome features. We work within two methodological frameworks for this problem: a local approach, which considers a single ancestral genome within a given species phylogeny [PLoS Comput Biol (2008), ANGES (2012)], and a global approach (also known as the small parsimony approach), which considers all ancestral genomes of a species phylogeny at once [PhySca (2017)].
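As a rough illustration of the comparative paradigm, the sketch below collects gene adjacencies that are conserved between extant genomes lying on different sides of an ancestor and reports them as candidate ancestral adjacencies. It is a deliberately simplified toy, not the method of any of the cited tools: genes are unsigned and unique, gene orders are linear, and the function names and the grouping of genomes into "sides" are assumptions made for this example.

```python
from itertools import combinations


def adjacencies(gene_order):
    """Return the set of unordered adjacencies of a linear gene order."""
    return {frozenset(pair) for pair in zip(gene_order, gene_order[1:])}


def candidate_ancestral_adjacencies(sides):
    """Return adjacencies seen on at least two different sides of an ancestor.

    `sides` is a list of groups of gene orders (hypothetical input format):
    for example the two child subtrees of the ancestor plus an outgroup.
    An adjacency observed on two different sides is conserved along a path
    through the ancestor, so it is kept as a candidate ancestral feature.
    """
    # Pool the adjacencies of all genomes within each side.
    side_adjs = [set().union(*(adjacencies(g) for g in side)) for side in sides]
    candidates = set()
    for first, second in combinations(side_adjs, 2):
        candidates |= first & second
    return candidates


# Toy data: one genome per side, genes named by single letters.
left = [["a", "b", "c", "d"]]
right = [["a", "b", "d", "c"]]
outgroup = [["b", "a", "c", "e"]]

candidates = candidate_ancestral_adjacencies([left, right, outgroup])
for adjacency in sorted(sorted(a) for a in candidates):
    print(adjacency)
```

On this toy input the adjacencies a-b and c-d are retained, since each appears on at least two sides. Real data sets add complications that this sketch ignores, such as gene orientation, duplicated genes, and conflicting adjacencies that cannot all belong to the same ancestral genome.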

Lately, we have aimed to extend these approaches to a model that accounts for gene family events such as gene duplication, loss, or transfer. In this direction, we developed a probabilistic method within the DeCo* algorithmic framework [DeCoSTAR (2017)]. This line of work also motivated a series of papers on the correction of gene trees [PolytomySolverNAD (2014)] and on the reconciliation between gene trees and species trees [ecceTERA (2016)].

Ancient DNA

The recent breakthroughs in ancient DNA (aDNA) sequencing naturally complement methods for reconstructing ancient genomes. This motivated our project to assemble recently sequenced historical samples of the pathogen Yersinia pestis, showing that it is possible to go beyond single nucleotide mutations when analyzing ancient pathogen data [FPSAC (2013), AGaPES (2017)].

Anopheles mosquito genomics

We are also currently applying our comparative genomics and genome rearrangement algorithms to a fascinating large-scale data set composed of roughly twenty Anopheles mosquito genomes [Science (2015)]. This in turn raises interesting questions on how to handle fragmented genome assemblies in genome rearrangement studies [ArtDeCo (2015)].

Pathogen genomics

We recently started a very active collaboration with the SFU Computational Epidemiology lab of Dr. Leonid Chindelevitch and with Dr. Will Hsiao (BC Centre for Disease Control), focusing on the development and application of novel bioinformatics tools for the analysis of whole-genome sequencing data of microbial pathogens. We recently published our first papers on this topic [Beaver fever (2018), MentaLiST (2018)].

Next-Generation Sequencing algorithms

In a recently started collaboration with Faraz Hach (UBC and Vancouver Prostate Centre), we are developing novel and efficient algorithms for processing large genomic and transcriptomic sequence data sets, mostly motivated by questions from cancer genomics. We are currently focusing on the analysis of Third Generation Sequencing data (PacBio and Nanopore) [CoLoRMap (2016), LRCstats (2017)].

Big data flow cytometry bioinformatics

This topic is relatively new in our lab. It stems from a new collaboration with Ryan Brinkman (BC Cancer Agency) and Sara Mostafavi (UBC), funded by Genome Canada. Our first paper was just released [flowLearn (2018)].

RNA secondary structures comparison

This is another question we investigated very early. Here we are mostly interested in developing accurate and efficient algorithms to compare pairs of RNA secondary structures. Among others, we have been involved in the development of the BRASERO benchmark, the analysis of the [BRALIBASE benchmark dent (2016)], and the design of efficient dynamic programming algorithms for comparing RNA secondary structures [AlCoB (2016), RNA-unchained (2015)].
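To make concrete what comparing two RNA secondary structures can mean, here is a minimal sketch of the base-pair distance between two structures written in dot-bracket notation. This is a much simpler measure than the dynamic programming algorithms of the papers cited above, and it only applies to structures of the same length; the function names are chosen for this example.

```python
def base_pairs(dot_bracket):
    """Return the set of base pairs (i, j) of a dot-bracket structure."""
    stack, pairs = [], set()
    for position, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(position)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure: unmatched ')'")
            pairs.add((stack.pop(), position))
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")
    return pairs


def base_pair_distance(structure_a, structure_b):
    """Count the base pairs present in exactly one of the two structures."""
    if len(structure_a) != len(structure_b):
        raise ValueError("structures must have the same length")
    return len(base_pairs(structure_a) ^ base_pairs(structure_b))


print(base_pair_distance("((..))", "(....)"))
```

A distance of zero means the two structures have identical base pairs. Comparing structures of different lengths, or allowing insertions and deletions, requires alignment or edit-distance models, which is where dynamic programming comes in.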