Leveraging Data Science for Enhanced Expression and Production
Implementing Data and Models to Streamline the Process
11 November 2025 ALL TIMES WET (GMT/UTC)
The growing demand for recombinant proteins is driving the integration of data science with engineering strategies to optimize hosts and production workflows. This includes target gene verification, codon optimization, vector design, and clone/host selection in parallel with exploring high-throughput expression systems, data-driven design strategies, and workflow automation. Each requires careful analysis of complex variables. Cambridge Healthtech Institute’s 8th Annual Leveraging Data Science for Enhanced Expression and Production conference at PEGS Europe convenes protein and data scientists pioneering deep learning applications to enhance cell line engineering, protein expression, and scalable production strategies to streamline experiments and reduce time and costs.

Tuesday, 11 November

Registration and Morning Coffee

BUILDING AND LEVERAGING EXPRESSION PREDICTION MODELS

Chairperson's Welcome Remarks

Helena Maja Firczuk, PhD, Group Leader, Protein and Cellular Sciences, GSK

FEATURED PRESENTATION: FAIR Data to Predict Recombinant Protein Expression

Lovisa Holmberg Schiavone, PhD, Director, Protein Sciences, Structure & Biophysics, Discovery Sciences, R&D, AstraZeneca

Together with EMBL-EBI, we have leveraged internal recombinant protein production data that adhere to the FAIR guiding principles, along with large external datasets from the SGC, to build RP3Net (Recombinant Protein Production Network), a predictive model of E. coli-based protein production. The model has been tested on a set of 46 proteins that were curated from the human proteome, avoiding proteins with prior published evidence of successful expression.

Utilising Learnings from High-Throughput Protein Expression Platforms to Enhance Delivery of Fit-for-Purpose Reagents

Helena Maja Firczuk, PhD, Group Leader, Protein and Cellular Sciences, GSK

I will present an overview of GSK's high-throughput expression platforms and the advantages of employing them, such as streamlining the delivery of protein reagents and enabling multiparameter optimisation of complex reagent generation. They also enable the collection of vast amounts of well-curated data for human and machine learning. These data were used to build and parameterise a model to predict protein expression in various systems.

Closed Loop Autonomous Learning for Protein Engineering

James D. Love, PhD, Vice President, Cross Modality Workflows, Novo Nordisk AS

Closed loop autonomous learning for protein engineering is a vision of a possible future that may connect AI to physical hardware, resulting in experimental design, execution, and analysis, while minimising human input. This is attractive, as it offers the possibility of more rapid discovery and development of therapeutic lead candidates. This talk will present ongoing work by Novo Nordisk and our collaborators, along with some findings and future directions.

Grand Opening Coffee Break in the Exhibit Hall with Poster Viewing

INTEGRATING LEARNINGS FOR PROTEIN FORM AND FUNCTION

The SGC and Target 2035: Generating Proteins and Ligands to Enable Machine Learning

Nicola Burgess-Brown, PhD, Professorial Research Fellow, UCL, London; COO, Protein Sciences, Structural Genomics Consortium

The SGC, a global public-private partnership, uncovers novel human biology through structural genomics and chemical biology approaches. Target 2035 aims to develop tool molecules for every human protein by creating massive open datasets of high-quality protein-small molecule binding data, using DNA-encoded libraries and affinity selection mass spectrometry platforms. Models built from these data will allow prediction of new and more drug-like small molecule binders, which will be tested experimentally.

Severe Deviation in Protein Fold Prediction by Advanced AI: Case Studies

Jacinto López Sagaseta, PhD, Head, Protein Crystallography and Structural Immunology Unit, Navarrabiomed

Artificial intelligence and deep learning have significantly advanced structural biology, achieving unprecedented accuracy in modeling folds directly from amino acid sequences. Despite these advances, deviations from empirical structures are not uncommon, and experimental determination of protein folds remains vital for the advance of structural biology and biomedicine.

Luncheon in the Exhibit Hall with Poster Viewing

DECODING GENETIC RULES TO BOOST EXPRESSION

Chairperson's Remarks

Nicola Burgess-Brown, PhD, Professorial Research Fellow, UCL, London; COO, Protein Sciences, Structural Genomics Consortium

Sequence-to-Expression Optimisation with Machine Learning

Diego Oyarzún, PhD, Professor of Computational Biology, School of Informatics, University of Edinburgh

Thanks to progress in high-throughput DNA synthesis and sequencing, artificial intelligence and machine learning have emerged as leading approaches for building sequence-to-expression models for strain optimisation. In this talk, I will discuss our recent progress on using this technology for designing novel regulatory and coding sequences with improved expression phenotypes, using a combination of supervised learning and optimisation algorithms.

Decoding the Rules of Genetic Syntax to Improve Transgene Design

Jarrod Shilts, PhD, Group Leader, ExpressionEdits Ltd.

Despite recent advances in our understanding of genetic features that promote robust protein expression, transgenes in biotechnology have remained largely unchanged for decades. Natural human genes are rich in intron sequences that can drive these crucial expression benefits, but were previously difficult to replicate in artificial transgenes. At ExpressionEdits, we're changing this by deciphering ‘genetic syntax’ using high-throughput screening and machine learning to design intronised transgenes with improved protein expression.

A Genetic Cure to Cell Line Instability

Louise Lindbaek, PhD, Team Lead, CHO Cell Line Engineering, Enduro Genetics ApS

Manufacturing proteins in CHO and microbial cells faces challenges in maintaining high cellular productivity over many cell divisions. We have developed a plug-in gene technology that prevents cell line production instability. The plug-ins link cell growth to antibody secretion using biosensors that regulate essential genes in CHO cells and beyond. This technology enables stable production and supports continuous manufacturing, improving scalability and commercialisation of antibody therapies.

Selected Poster Presentation: AI In-Silico Screening Improves the Success Rate of Recombinant Protein Production

Evgeny Tankhilevich, Machine Learning Data Scientist, Industry Partnerships, EMBL-EBI

Refreshment Break in the Exhibit Hall with Poster Viewing

ALIGNING DATA AND BIOLOGY FOR INNOVATIVE R&D

Connecting Data, People, and Process: The Digital Transformation of Protein Production

Dominik Schneider, PhD, Senior Manager, R&D Enabling Technology, CSL Behring Innovation

Discover how CSL Behring’s digital ecosystem accelerates scientific progress in protein production. This presentation offers an overview of integrated digital tools and platforms that streamline workflows, enhance collaboration, and improve data management. Through real-world examples, learn how optimised processes drive key performance indicators, enabling more efficient, scalable, and innovative R&D efforts that support breakthrough therapies and advance biopharmaceutical development.

FEATURED PANEL DISCUSSION: Beyond the Bench: Making Data Work for Protein Science

Panel Moderator:

Nicola Burgess-Brown, PhD, Professorial Research Fellow, UCL, London; COO, Protein Sciences, Structural Genomics Consortium

Panelists:

Christopher Cooper, DPhil, Founder, Protein Sciences, Enzymogen Consulting

Lovisa Holmberg Schiavone, PhD, Director, Protein Sciences, Structure & Biophysics, Discovery Sciences, R&D, AstraZeneca

James D. Love, PhD, Vice President, Cross Modality Workflows, Novo Nordisk AS

Diego Oyarzún, PhD, Professor of Computational Biology, School of Informatics, University of Edinburgh

Welcome Reception in the Exhibit Hall with Poster Viewing

Close of Leveraging Data Science for Enhanced Expression and Production Conference


For more details on the conference, please contact:

Mary Ann Brown
Executive Director, Conferences
Cambridge Healthtech Institute
Phone: (+1) 781-697-7687
Email: mabrown@healthtech.com

For sponsorship information, please contact:

Companies A-K
Jason Gerardi
Sr. Manager, Business Development
Cambridge Healthtech Institute
Phone: (+1) 781-972-5452
Email: jgerardi@healthtech.com

Companies L-Z
Ashley Parsons
Manager, Business Development
Cambridge Healthtech Institute
Phone: (+1) 781-972-1340
Email: ashleyparsons@healthtech.com