Rationalising and accelerating enzyme engineering by quantitative sequence-function profiling in microdroplets


Abstract

Enzymes catalyse nearly all reactions in Nature's vast chemical repertoire with high precision and under sustainable conditions. To apply them in cost-effective and sustainable biocatalytic production processes in the chemical industry, enzymes first need to be modified or "engineered" to work with unnatural substrates and reaction conditions. Conventional methods have so far not succeeded in resolving this bottleneck: directed evolution, a laborious iterative cycle of mutagenesis and experimental testing, remains a hit-and-miss approach, while computational methods based on artificial intelligence (AI) are currently hampered by a lack of sequence-function data to learn from. In this thesis I introduce a new microfluidic sequence-function mapping workflow, long-read deep mutational scanning (lrDMS), which allows us to screen the combinatorial effects of tens of thousands of mutations within two weeks, at a scale, speed, and cost not feasible with robotic plate screening workflows. Obtaining mutagenic profiles via lrDMS changes the enzyme engineering paradigm from focusing on the effects of individual residues to performing a network analysis that covers the entire protein structure, enabling the protein engineer to navigate complex intra-gene epistasis. Using this workflow, we generate a large-scale sequence-function dataset describing an imine reductase (IRED), an enzyme class with broad application in industrial biocatalysis, and rationally engineer improved variants with up to an 11-fold improvement in catalytic efficiency. With machine learning (ML), we further enhance kcat up to 24-fold versus wild type, nearly one order of magnitude better than the best variant in the dataset. The improvement is driven by non-specific positive epistasis on catalytic efficiency via the mutation T241A, which interacts synergistically with a set of different mutations spread all over the enzyme in a "global" manner.
Using X-ray crystallography, NMR, and mechanistic enzymology, we identify a change in the dynamic equilibrium as the cause of the positive epistasis mediated by T241A. We also find that T241A mediates activity improvements in homologs with as little as 50% sequence identity, indicating that lrDMS profiles gathered with one enzyme can inform other enzyme engineering campaigns. We obtained lrDMS profiles for SrIRED with two more substrates, revealing the complex encoding of substrate specificity in the whole enzyme structure and the portability of single-mutation data to other substrates. To showcase that sequence-function profiling via lrDMS can accelerate the engineering of applied biocatalysts, we engineered an IRED from Penicillium camemberti for the synthesis of the drug Tecalcet, achieving a 3-fold improvement in conversion in one round, akin to what a comparable campaign with a similar enzyme achieved in three rounds of extensive directed evolution. This showcases how combining microfluidic sequence-function mapping with rational and ML-enabled in silico extrapolation substantially accelerates enzyme engineering. In the age of predictive biology, these maps will chart the way to improved function by exploiting the synergy between rapid experimental screening and ML evaluation and extrapolation.

Date

2025-03-30

Advisors

Hollfelder, Florian

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge

Rights and licensing

Except where otherwise noted, this item's license is described as All rights reserved

Sponsorship

Trinity College Benn W Levy studentship

Relationships

Is supplemented by: