A collaborative research team led by experts from KAIST and UCSD has introduced DeepECtransformer, an artificial intelligence system designed to predict enzyme functions from protein sequences. The team used the AI to identify 464 types of enzymes among previously uncharacterized proteins in E. coli, demonstrating its potential for enzyme discovery.

Key Highlights:

  1. AI-Driven Enzyme Discovery:

    • E. coli, an extensively studied organism, still harbors proteins whose functions are unknown. The research team used artificial intelligence to identify 464 enzyme types among these uncharacterized proteins.
  2. Validation Through In Vitro Assay:

    • The team validated their AI predictions by conducting in vitro enzyme assays, confirming the accuracy of predictions for three identified proteins.
  3. DeepECtransformer Architecture:

    • DeepECtransformer employs deep learning and a protein homology analysis module for enzyme function prediction.
    • The transformer architecture, commonly used in natural language processing, improves feature extraction from protein sequences, supporting accurate predictions of EC (Enzyme Commission) numbers, the standard four-level codes that classify enzymes by the reactions they catalyze.
    • The AI can predict a total of 5,360 EC numbers.
  4. Addressing the Black Box Problem:

    • Unlike some existing AI-based prediction systems, DeepECtransformer allows for the interpretation of the inference process at a fine-grained level.
    • The AI autonomously identifies critical features, such as catalytic active sites and cofactor binding sites, during the learning process.
  5. Implications for Functional Genomics:

    • DeepECtransformer opens avenues for more precise analyses of metabolic processes within organisms.
    • It enables the identification of previously unknown enzymes, which supports a more complete understanding of biosynthetic pathways and aids applications such as the biodegradation of plastics.
  6. Future Applications:

    • The AI's ability to rapidly and accurately predict enzyme functions positions it as a crucial technology in functional genomics.
    • Potential applications include the development of eco-friendly microbial factories based on comprehensive genome-scale metabolic models.
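
As a rough illustration of item 3 above, the following Python sketch shows one plausible way a deep learning predictor and a homology analysis module could be combined. It assumes the homology module acts as a fallback when the network has no confident prediction; the article does not describe the actual decision logic, so this is not the authors' implementation. The names `predict_ec`, `toy_neural_scores`, `toy_homology_lookup`, and the 0.5 threshold are all hypothetical, and the EC numbers are arbitrary placeholder values.

```python
from typing import Callable, Dict, List

EC_LEVELS = 4  # a complete EC number has four dot-separated levels, e.g. 1.1.1.1


def is_full_ec_number(ec: str) -> bool:
    """Return True if `ec` looks like a complete four-level numeric EC number.

    Simplified check: partial classes such as "3.4.-.-" are rejected.
    """
    parts = ec.split(".")
    return len(parts) == EC_LEVELS and all(p.isdigit() for p in parts)


def predict_ec(
    sequence: str,
    neural_scores: Callable[[str], Dict[str, float]],
    homology_lookup: Callable[[str], List[str]],
    threshold: float = 0.5,  # hypothetical confidence cutoff
) -> List[str]:
    """Return EC numbers for a protein sequence.

    Assumed logic: trust the neural predictor when it is confident,
    otherwise fall back to EC numbers from homologous proteins.
    """
    scores = neural_scores(sequence)
    confident = sorted(
        ec for ec, score in scores.items()
        if score >= threshold and is_full_ec_number(ec)
    )
    if confident:
        return confident
    # No confident neural prediction: use the homology module instead.
    return sorted(ec for ec in homology_lookup(sequence) if is_full_ec_number(ec))


# Toy stand-ins for the two modules (hypothetical, for illustration only).
def toy_neural_scores(sequence: str) -> Dict[str, float]:
    if sequence.startswith("MK"):
        return {"1.1.1.1": 0.9, "2.7.1.1": 0.2}
    return {}


def toy_homology_lookup(sequence: str) -> List[str]:
    return ["3.5.1.2"] if "GG" in sequence else []


if __name__ == "__main__":
    for seq in ("MKTAYI", "AAGGAA", "AAAA"):
        print(seq, predict_ec(seq, toy_neural_scores, toy_homology_lookup))
```

In this sketch, a sequence with no confident neural prediction and no homolog simply returns an empty list, which mirrors the idea that some proteins remain unannotated.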

Quoting the Experts: Gi Bae Kim, the first author of the paper, said, "By utilizing the prediction system we developed, we were able to predict the functions of enzymes that had not yet been identified and verify them experimentally." Professor Sang Yup Lee added, "DeepECtransformer, which quickly and accurately predicts enzyme functions, is a key technology in functional genomics, enabling us to analyze the function of entire enzymes at the systems level."

This research marks a significant step at the intersection of artificial intelligence and biochemical discovery, with potential implications for fields including biotechnology and environmental sustainability.