Computational Methods for Paleogenomics and Comparative Genomics


Current research

Algorithms for NGS data analysis

In a fruitful collaboration with Faraz Hach (UBC and Vancouver Prostate Centre), we are developing novel and efficient algorithms for processing large genomic and transcriptomic sequence data sets, mostly motivated by questions from cancer genomics. We are currently focusing on the analysis of third-generation sequencing data (PacBio and Nanopore) [TKSM (2024), scTagger (2022), Freddie (2022), Genion (2022), HASLR (2020)]. We are also working on data from other sequencing protocols such as barcoded short reads [Calib (2019)] and linked reads (collaboration with Rayan Chikhi) [WABI 2020].

Pathogen genomics, plasmid bioinformatics

We develop novel bioinformatics tools for the analysis of whole-genome sequencing data of microbial pathogens, with a focus on plasmid bioinformatics (in collaboration with Tomas Vinar and Brona Brejova) [PlasEval (2024), PlasBin-flow (2024), plASgraph2 (2023), HyAsP (2019), MentaLiST (2018), PathoGiST (2020)].

Genome evolution

One of our favorite research problems, and historically the first research topic of our group, is the reconstruction of ancestral genome structures (genome maps based on synteny blocks, or ancestral gene orders) [MMB 2024, MMB 2018]. We work within two methodological frameworks for this problem: a local approach, which considers a single ancestral genome within a given species phylogeny [PLoS Comput Biol (2008), ANGES (2012)], and a global approach (aka the small parsimony approach), which considers all ancestral genomes of a species phylogeny at once using either a whole-genome approach [SPP-DCJ (2021), PhySca (2017)] or a gene-adjacency-based approach [DeCoSTAR (2017)].

Phylogenomics and phylogenetics

Motivated by the problem of gene evolution within the context of genome evolution, we work on phylogenetic trees and networks [CEDAR (2025)] and reconciled gene trees [PGE (2020), SuGeT (2018)].

Combinatorics problems motivated by bioinformatics questions

We are also interested in applying enumerative combinatorics techniques to theoretical questions motivated by bioinformatics problems such as RNA secondary structure alignment [IJFCS 2018], RNA design [BCB 2019], gene tree counting [JMB 2020], and sequence alignment [PSC 2021].

Past research

Anopheles mosquito genomics and comparative scaffolding

We applied our comparative genomics and genome rearrangement algorithms to a fascinating, large-scale data set composed of roughly twenty Anopheles mosquito genomes [Science (2015)]. This in turn raised interesting questions on how to handle fragmented genome assemblies in genome rearrangement studies [ArtDeCo (2015), ADSeq (2018), BMC Biology (2020)].

Ancient DNA

The recent breakthroughs in ancient DNA (aDNA) sequencing naturally complement methods for reconstructing ancient genomes. This motivated our project to assemble recently sequenced historical samples of the pathogen Yersinia pestis, showing that it is possible to go beyond single-nucleotide mutations when analyzing ancient pathogen data [FPSAC (2013), AGaPES (2017)].

Big data flow cytometry bioinformatics

This topic stems from a collaboration with Ryan Brinkman (BC Cancer Agency) and Max Libbrecht (SFU), funded by Genome Canada and NIH [flowGraph (2019), flowLearn (2018)].