Research

The convergence of AI and engineering biology is redefining how we approach complex genetic diseases such as lysosomal storage disorders (LSDs). These rare, devastating conditions, driven by enzyme deficiencies that cause toxic metabolite buildup, have remained difficult to cure through conventional drug discovery pipelines. Now, using AI, we can design optimised biotherapeutics with unprecedented precision and build new therapies, ranging from custom enzyme variants to cell therapies, that can restore lysosomal function. Together, these technologies are moving us from symptom management to true molecular repair. By integrating data-driven insights with wet-lab validation, researchers can iterate faster, reduce experimental uncertainty, and personalise interventions to each patient’s molecular signature. The result is an emerging paradigm in which machine learning does not just guide discovery but becomes an active collaborator in engineering biology. For lysosomal storage disorders, this synergy offers real hope: scalable, precise, and potentially curative therapies for all patients.

Our mission

Developing next-generation therapies for Lysosomal Storage Diseases

Enzymes are building blocks of cellular life and act as natural catalysts able to accelerate almost any reaction. Enzymatic deficiencies are usually associated with devastating rare diseases, which can only be treated by providing the defective enzyme through intravenous injections, a treatment known as enzyme replacement therapy (ERT). However, recombinant enzymes lose catalytic activity in blood and often cause a severe immune response. Moreover, current manufacturing technologies are suboptimal, which leads to a dramatically high cost of treatment.
We are building on our expertise in AI and synthetic genomics to design and build the next generation of enzyme replacement therapies. Our goal is to establish technologies to optimise the therapeutic properties of recombinant enzymes and to engineer expression systems for inexpensive production of these molecules, in order to provide new, sustainable treatments for patients with lysosomal storage diseases, with a particular focus on Fabry disease.

Our technologies

Generative Artificial Intelligence for biologics engineering

Designing biologics is a complex task, which requires identifying amino acid or nucleotide changes that maximise therapeutic properties or manufacturing yield. Since the number of possible changes easily outnumbers the total number of atoms in the Universe, our group takes advantage of the wealth of genomic, transcriptomic and proteomic data available to train generative AI models that sample this vast design space and identify candidates for downstream experimental screening.
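To give a sense of scale for this design space, the short sketch below counts protein sequence variants. It is only an illustration under stated assumptions: the figure of roughly 10^80 atoms is a commonly quoted order-of-magnitude estimate for the observable Universe, the 400-residue length is a hypothetical example rather than any specific enzyme, and the function names are ours.

```python
import math

AMINO_ACIDS = 20        # standard proteinogenic amino acids
LOG10_ATOMS = 80        # assumption: ~10^80 atoms, an order-of-magnitude estimate


def log10_sequence_space(length: int) -> float:
    """log10 of the number of distinct sequences of a given length (20^length)."""
    return length * math.log10(AMINO_ACIDS)


def variants_with_k_substitutions(length: int, k: int) -> int:
    """Number of variants differing from a reference at exactly k positions."""
    return math.comb(length, k) * (AMINO_ACIDS - 1) ** k


def shortest_length_exceeding(log10_target: float) -> int:
    """Smallest sequence length whose full sequence space exceeds 10^log10_target."""
    return math.floor(log10_target / math.log10(AMINO_ACIDS)) + 1


if __name__ == "__main__":
    length = 400  # hypothetical protein length, for illustration only
    print(f"log10 of full sequence space: {log10_sequence_space(length):.1f}")
    print(f"single substitutions: {variants_with_k_substitutions(length, 1)}")
    print(f"double substitutions: {variants_with_k_substitutions(length, 2)}")
    print(f"length at which 20^L passes 10^{LOG10_ATOMS}: "
          f"{shortest_length_exceeding(LOG10_ATOMS)}")
```

Even restricting the search to a handful of substitutions around a reference sequence quickly yields more variants than can be screened exhaustively, which is why sampling with a trained generative model is attractive.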
Our backbone framework is the variational autoencoder, which we optimise for training and design efficiency using our experience built across 20 years of developing methods for statistical learning, optimisation and biological data analysis.
Our methods build on robust engineering frameworks and industry standards, such as Python, Git, GitHub and GPU computing, and are deployed using the Nextflow workflow management system.

Genome engineering for biologics production

Recent advances in DNA synthesis, high-throughput sequencing and computer-aided design (CAD) tools allow us to engineer the genomes of living cells and to tackle questions and problems that are intractable with standard technologies. We pioneered CAD software for synthetic genome engineering, which has been instrumental in designing the synthetic yeast genome (Saccharomyces cerevisiae 2.0, Sc2.0), the first synthetic eukaryotic genome ever built. Sc2.0 allows us to address a number of open questions in genome biology, including the identification of a minimal eukaryotic genome compatible with life.
We are now building on our expertise in synthetic genomics and engineering biology to engineer microbial and mammalian cells to optimise biologics production. Our workhorse systems are Komagataella phaffii (aka Pichia pastoris) and Chinese Hamster Ovary (CHO) cell lines, primarily engineered to efficiently produce human lysosomal enzymes.

Automated high-throughput biology

Artificial Intelligence models can generate thousands of putatively functional biologics, but experimental testing remains paramount to make sure they are effective for therapeutic use. However, classical, low-throughput manual experimental protocols do not scale to an AI-driven drug discovery workflow.
To make sure our experimental workflow keeps up with our generative AI models, we are developing miniaturised and fully automated protocols for protein expression in both Komagataella phaffii (aka Pichia pastoris) and Chinese Hamster Ovary (CHO) cells, using high-throughput electroporation systems, liquid handling robots, and high-throughput assays.
Our lab works in close collaboration with the Edinburgh Genome Foundry to scale up our experimental work.

Other areas of interest

Cancer genetics and genomics

Decades of research have shown that genomic mutations in key genic regions are responsible for the transformation of normal cells into cancer cells. However, while a causal role for somatic mutations has been shown for many common malignancies, the role of high-frequency inherited mutations has remained elusive. We are addressing this question by developing statistical learning methods to dissect the heritable risk of cancer at the gene level. Gene-level analysis alone is not sufficient, however: the polygenic architecture of cancer requires linking gene-level information into pathways. To do that, we are developing deep graph neural networks to integrate transcriptomic and proteomic data and infer aberrant pathways that are involved in cancer metabolism and affect response to therapy.