High-throughput technologies have generated a deluge of data describing living
organisms at unprecedented resolution. The wealth and diversity of biological
data available provide unique opportunities to understand the design principles
underpinning cellular life. The democratisation of DNA synthesis allows us to use
these design rules to engineer new designer systems ranging from genes to
Our long-term goal is to reverse-engineer biological systems to create
generative algorithms to design, build and test biochemical agents for addressing
healthcare and industrial biotechnology problems.
Recent advances in DNA synthesis, high-throughput sequencing and computer-aided
design (CAD) tools allow us to engineer the genomes of living cells and address
questions and tackle problems intractable with standard technologies.
We pioneered CAD software for synthetic genome engineering, which has been
instrumental in designing the synthetic yeast genome
(Saccharomyces cerevisiae 2.0, Sc2.0),
the first synthetic eukaryotic genome ever built.
Sc2.0 allows us to address a number of open questions in genome biology, including
the identification of a minimal eukaryotic genome compatible with life.
Synthetic chromosomes also represent a flexible chassis for integrating synthetic pathways
into existing expression systems. However, the design principles to build new,
functional chromosomes are mostly unknown, and the current technology limits the DNA
molecules that can be synthesized.
We are addressing these issues by developing statistical models to learn how wild-type
genomes change upon the integration of synthetic chromosomes, and by developing methods
to optimise the manufacturing of chromosome-scale molecules, repurposing mathematical
programming methods we developed for electronic engineering.
Our lab works in close collaboration with the Edinburgh Genome Foundry
to scale up our experimental work.
Synthetic human enzyme engineering
Enzymes are building blocks of cellular life and act as natural catalysts able
to accelerate almost any reaction. Enzymatic deficiencies are usually associated
with devastating rare diseases, which can only be treated by providing
the defective enzyme through intravenous injections. However, enzymes lose
catalytic activity in blood and often cause a severe immune response; moreover,
current manufacturing technologies have low yield, which dramatically raises the cost of
Here we are building on our expertise in machine learning and synthetic genomics
to design and build human enzymes at scale. Our goal is to establish technologies
to optimise the therapeutic properties of synthetic enzymes and to engineer
expression systems for inexpensive production of these molecules, in order to
provide new, sustainable treatments for patients with rare metabolic disorders.
Cancer genetics and genomics
Decades of research have shown that genomic mutations in key genic regions
are responsible for the transformation of normal cells into cancer cells.
However, while a causal role for somatic mutations has been shown for many
common malignancies, the role of high-frequency inherited mutations has
We are addressing this question by developing statistical learning methods
to dissect the heritable risk of cancer at the gene level.
Nonetheless, the polygenic architecture of cancer requires linking gene-level
information into pathways.
To do that, we are developing deep graph neural networks to integrate transcriptomic and proteomic
data and infer aberrant pathways involved in cancer metabolism and affecting response to therapy.
While causal somatic mutations have been identified for many cancers, the genomic
landscape of rare tumours is mostly unknown. In collaboration with the Blood Cancer Research Group
at the University of Ostrava, we are completing the sequencing of the genomes of rare blood cancers, such as
multiple myeloma minimal residual disease and extramedullary myeloma.
Computational biology algorithms and software engineering
Computational methods are now a cornerstone of modern biology. The
lab is committed to releasing high-quality, open-source tools that can be
easily integrated into analysis workflows.
To do that, we adopt software
engineering principles and methods that are standard in industry. Currently,
our ecosystem relies on Python, Git, GitHub and GPU computing.
All our analyses are implemented using either the Nextflow or the Snakemake
workflow management system. We also maintain a collection of Docker containers
to facilitate the adoption of our tools. You can check our growing suite of software on
- Genetics Society UK