Synthetic data-augmented machine learning approaches for tailor-made microbial conversion of methane to phytoene

  • Chang Keun Kang
  • Jihoon Shin
  • Min Sun Kim
  • Min Sun Choi
  • Yoon Kyung Cha
  • Yong Jun Choi

Research output: Contribution to journal › Article › peer-review

Abstract

Metabolic engineering has become a critical tool for biosynthesizing valuable compounds, yet its progress is frequently constrained by labor-intensive, trial-and-error methods. Here, a machine learning (ML)-assisted predictive framework enhanced with a synthetic data generation method was developed to systematically optimize the metabolic pathway responsible for the biosynthesis of phytoene from methane in the non-model methanotroph Methylocystis sp. MJC1. To balance metabolic flux and maximize phytoene biosynthesis, three key genes (dxs, crtE, and crtB) involved in the methylerythritol 4-phosphate (MEP) and carotenoid pathways were targeted for modulation. These genes were expressed under promoters with systematically varied strengths, creating a diverse experimental dataset used to train ML models. ML algorithms, including deep neural networks (DNN) and support vector machines (SVM), predicted optimal promoter-gene combinations to maximize phytoene production. To overcome the inherent data limitations of working with non-model organisms, conditional tabular generative adversarial networks (CTGAN) were employed to generate synthetic data that enhanced DNN prediction accuracy. Experimental validation confirmed that the ML-guided engineered strain exhibited a 2.2-fold improvement in phytoene production and a 1.5-fold increase in phytoene content compared to the base strain, demonstrating successful pathway optimization. This study showcases the effectiveness of integrating ML-driven predictive frameworks with metabolic engineering approaches, enabling rapid, efficient, and precise optimization of microbial bioconversion processes that use methane as a sustainable feedstock.
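
The design loop described in the abstract (vary promoter strength for dxs, crtE, and crtB, measure a subset of strains, then let a model rank the untested combinations) can be illustrated with a minimal sketch. Everything numeric here is an assumption made for illustration: the three relative promoter strengths, the `toy_titer` response surface, and the sample size are hypothetical and not taken from the study. A simple k-nearest-neighbor average stands in for the DNN and SVM models, and the CTGAN augmentation step is omitted entirely.

```python
import itertools
import random

GENES = ("dxs", "crtE", "crtB")
STRENGTHS = (0.1, 0.5, 1.0)  # hypothetical relative promoter strengths


def design_space():
    """Enumerate every promoter-strength assignment for the three genes."""
    return list(itertools.product(STRENGTHS, repeat=len(GENES)))


def toy_titer(design):
    """Made-up response surface standing in for a measured phytoene titer."""
    dxs, crt_e, crt_b = design
    # Arbitrary shape: rewards a balanced pathway, penalizes strong dxs expression.
    return min(dxs, crt_e, crt_b) + 0.5 * crt_b - 0.3 * dxs ** 2


def knn_predict(measured, design, k=3):
    """Predict a titer as the mean of the k nearest measured designs."""
    def dist(other):
        return sum((a - b) ** 2 for a, b in zip(design, other))
    nearest = sorted(measured, key=dist)[:k]
    return sum(measured[d] for d in nearest) / len(nearest)


def rank_untested(measured, k=3):
    """Rank designs that were not measured by predicted titer, best first."""
    untested = [d for d in design_space() if d not in measured]
    return sorted(untested, key=lambda d: knn_predict(measured, d, k), reverse=True)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
space = design_space()
# Pretend 10 of the 27 possible strains were built and assayed.
measured = {d: toy_titer(d) for d in rng.sample(space, 10)}
ranking = rank_untested(measured)
best = ranking[0]
print(dict(zip(GENES, best)))
```

The top-ranked design is only a candidate for the next round of strain construction; in the workflow the abstract describes, such predictions were confirmed by experimental validation.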

Original language: English
Article number: 133160
Journal: Bioresource Technology
Volume: 437
State: Published - Dec 2025

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 7 - Affordable and Clean Energy

Keywords

  • Generative adversarial networks
  • Machine learning
  • Metabolic engineering
  • Methane
  • Phytoene
