Current research
Algorithms for NGS data analysis
In a fruitful collaboration with Faraz Hach (UBC and Vancouver Prostate Center),
we are developing novel and efficient algorithms for processing large genomic and transcriptomic sequence
data sets, mostly motivated by questions from cancer genomics. We are currently focusing on the analysis of
Third Generation Sequencing data (PacBio and Nanopore)
[
TKSM (2024),
scTagger (2022),
Freddie (2022),
Genion (2022),
HASLR (2020)].
We are also working on data from other sequencing protocols such as barcoded short reads
[
Calib (2019)]
and linked reads (collaboration with Rayan Chikhi)
[
WABI 2020].
Pathogen genomics, plasmid bioinformatics
We develop novel bioinformatics tools for the analysis of whole-genome sequencing data of microbial pathogens, with a focus on plasmid bioinformatics (in collaboration with Tomas Vinar and Brona Brejova)
[
PlasEval (2024),
PlasBin-flow (2024),
plASgraph2 (2023),
HyAsP (2019),
MentaLiST (2018),
PathoGiST (2020)].
Genome evolution
One of our favorite research problems, and historically the initial research focus of our group,
is the reconstruction of ancestral genome structures (genome maps based on synteny blocks, or
ancestral gene orders)
[
MMB 2024,
MMB 2018].
We work within two methodological frameworks for this problem: a local approach that
considers a single ancestral genome within a given species phylogeny
[
PLoS Comput Biol (2008),
ANGES (2012)], and a global approach (aka small parsimony) that considers all
ancestral genomes of a species phylogeny at once using either a whole-genome approach
[
SPP-DCJ (2021),
PhySca (2017)]
or a gene-adjacency-based approach
[
DeCoSTAR (2017)].
Phylogenomics and phylogenetics
Motivated by the problem of gene evolution within the context of genome evolution, we work on phylogenetic trees and networks
[CEDAR (2025)]
and reconciled gene trees
[
PGE (2020),
SuGeT (2018)].
Combinatorial problems motivated by bioinformatics questions
We are also interested in applications of enumerative combinatorics techniques to theoretical questions motivated by
bioinformatics problems such as RNA secondary structure alignment [
IJFCS 2018],
RNA design [
BCB 2019],
gene tree counting [
JMB 2020], and sequence alignment
[
PSC 2021].
Past research
Anopheles mosquito genomics and comparative scaffolding
We applied our comparative genomics and genome rearrangement algorithms to
a fascinating, large-scale data set composed of roughly twenty
Anopheles mosquito genomes
[
Science (2015)].
This in turn raises interesting questions on how to handle fragmented genome assemblies in genome
rearrangement studies [
ArtDeCo (2015),
ADSeq (2018),
BMC Biology (2020)].
Ancient DNA
The recent breakthroughs in ancient DNA (aDNA) sequencing naturally
complement methods for reconstructing ancient genomes. This motivated our project to
assemble recently sequenced historical samples of the pathogen
Yersinia pestis,
showing that it is possible to go beyond single nucleotide mutations when analyzing ancient
pathogen data
[
FPSAC (2013),
AGaPES (2017)].
Big data flow cytometry bioinformatics
This topic stems from a collaboration with Ryan Brinkman (BC Cancer Agency) and Max Libbrecht
(SFU), funded by Genome Canada and NIH
[
flowGraph (2019),
flowLearn (2018)].