High-throughput technologies have generated a deluge of data describing living organisms at unprecedented resolution. The wealth and diversity of available biological data provide unique opportunities to understand the design principles underpinning cellular life. The democratisation of DNA synthesis allows us to use these design rules to engineer new designer systems ranging from genes to chromosomes.

Our long-term goal is to reverse-engineer biological systems and use what we learn to create generative algorithms for designing, building and testing biochemical agents that address healthcare and industrial biotechnology problems.

Synthetic genomics

Recent advances in DNA synthesis, high-throughput sequencing and computer-aided design (CAD) tools allow us to engineer the genomes of living cells and to tackle questions and problems that are intractable with standard technologies. We pioneered CAD software for synthetic genome engineering, which has been instrumental in designing the synthetic yeast genome (Saccharomyces cerevisiae 2.0, Sc2.0), the first synthetic eukaryotic genome ever built. Sc2.0 allows us to address a number of open questions in genome biology, including the identification of a minimal eukaryotic genome compatible with life.
Synthetic chromosomes also represent a flexible chassis for integrating synthetic pathways into existing expression systems. However, the design principles for building new, functional chromosomes are mostly unknown, and current technology limits the DNA molecules that can be synthesised. We are addressing these issues in two ways: by developing statistical models to learn how wild-type genomes change upon the integration of synthetic chromosomes, and by optimising the manufacturing of chromosome-scale molecules, repurposing mathematical programming methods we developed for electronic engineering.
Our lab works in close collaboration with the Edinburgh Genome Foundry to scale up our experimental work.

Synthetic human enzyme engineering

Enzymes are building blocks of cellular life and act as natural catalysts able to accelerate almost any reaction. Enzymatic deficiencies are usually associated with devastating rare diseases, which can only be treated by providing the defective enzyme through intravenous injections. However, enzymes lose catalytic activity in blood and often cause a severe immune response; moreover, current manufacturing technologies have low yields, which dramatically raises the cost of treatment.
Here, we are building on our expertise in machine learning and synthetic genomics to design and build human enzymes at scale. Our goal is to establish technologies to optimise the therapeutic properties of synthetic enzymes and to engineer expression systems for inexpensive production of these molecules, in order to provide new, sustainable treatments for patients with rare metabolic disorders.

Cancer genetics and genomics

Decades of research have shown that genomic mutations in key genic regions are responsible for the transformation of normal cells into cancer cells. However, while a causal role for somatic mutations has been shown for many common malignancies, the role of high-frequency inherited mutations has remained elusive. We are addressing this question by developing statistical learning methods to dissect the heritable risk of cancer at the gene level. Nonetheless, the polygenic architecture of cancer requires linking gene-level information into pathways. To do that, we are developing deep graph neural networks to integrate transcriptomic and proteomic data and infer aberrant pathways that are involved in cancer metabolism and affect response to therapy.
While causal somatic mutations have been identified for many cancers, the genomic landscape of rare tumours is mostly unknown. In collaboration with the Blood Cancer Research Group at the University of Ostrava, we are completing the sequencing of the genomes of rare blood cancers, such as multiple myeloma minimal residual disease and extramedullary myeloma.

Computational biology algorithms and software engineering

Computational methods are now a cornerstone of modern biology. The lab is committed to releasing high-quality, open-source tools that can be easily integrated into analysis workflows.
To do that, we adopt software engineering principles and methods that are standard in industry. Currently, our ecosystem relies on Python, Git, GitHub and GPU computing.
All our analyses are implemented using either the Nextflow or the Snakemake workflow management system. We also maintain a collection of Docker containers to facilitate the adoption of our tools. You can check out our growing suite of software on GitHub.