PoinnCARE: Hyperbolic Multi-Modal Learning for Enzyme Classification

ICLR 2026 Conference Submission

Anonymous Authors

OpenReview Score: 6.7

Keywords: EC number prediction, enzyme function, hyperbolic space learning, multi-modal learning, enzyme structure, enzyme active site

Abstract:

Enzyme Commission (EC) number prediction is vital for elucidating enzyme functions and advancing biotechnology applications. However, current methods struggle to capture the hierarchical relationships among enzymes and often overlook critical structural and active site features. To bridge this gap, we introduce PoinnCARE, a novel framework that jointly encodes and aligns multi-modal data from enzyme sequences, structures, and active sites in hyperbolic space. By integrating graph diffusion and alignment techniques, PoinnCARE mitigates data sparsity and enriches functional representations, while hyperbolic embedding preserves the intrinsic hierarchy of the EC system with theoretical guarantees in low-dimensional spaces. Extensive experiments on four datasets from the CARE benchmark demonstrate that PoinnCARE consistently and significantly outperforms state-of-the-art methods in EC number prediction.

Disclaimer

This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.

NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.

If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces PoinnCARE, a framework that jointly encodes enzyme sequences, structures, and active sites in hyperbolic space for EC number prediction. It resides in the 'Multimodal and Hybrid Prediction Models' leaf, which contains four papers total (including PoinnCARE). This leaf sits within the broader 'Prediction Models and Architectures' branch, indicating a moderately populated research direction focused on integrating multiple data modalities. The taxonomy shows that multimodal approaches represent one of several parallel strategies, alongside sequence-only, structure-only, and hierarchical learning frameworks.

The taxonomy reveals neighboring leaves such as 'Hierarchical and Multitask Learning Frameworks' (three papers) and 'Structure-Based Prediction Models' (five papers), suggesting that PoinnCARE bridges structural modeling with multimodal integration. The 'Representation Learning and Embedding Methods' branch (two leaves, three papers) addresses complementary questions about encoding EC numbers and proteins, while 'Reaction-Based and Chemical Transformation Methods' (five papers) explores an orthogonal direction using substrate-product information. PoinnCARE's hyperbolic embedding approach diverges from standard Euclidean representations common in sibling papers, positioning it at the intersection of geometric representation learning and multimodal fusion.

Among the 30 candidates examined, the analysis finds limited overlap with prior work. For the core hyperbolic framework (Contribution 1), none of the 10 candidates examined could refute the claim, suggesting relative novelty in applying hyperbolic geometry to enzyme prediction. However, for multi-modal dataset augmentation (Contribution 2) and graph diffusion for sparsity (Contribution 3), one of the 10 candidates examined in each case was judged able to refute the claim, indicating that structural and active site integration, as well as graph-based augmentation techniques, have precedents even within this limited search scope. These statistics reflect a focused semantic search rather than exhaustive coverage.

Based on the limited search scope of 30 semantically similar papers, PoinnCARE appears to occupy a relatively novel position by combining hyperbolic embeddings with multi-modal enzyme data. The taxonomy context shows a moderately crowded multimodal prediction space but sparse exploration of non-Euclidean geometries. The analysis does not capture potential overlaps outside the top-30 semantic matches or in adjacent fields like graph representation learning, leaving open questions about broader precedents for hyperbolic enzyme embeddings.

Taxonomy

[Taxonomy figure not reproduced. Legend: Core-task Taxonomy Papers; Claimed Contributions; Contribution Candidate Papers Compared; Refutable Paper]

Research Landscape Overview

Core task: Enzyme Commission number prediction aims to assign standardized functional labels to enzymes based on their catalytic activities, typically using sequence or structural information. The field has evolved into a rich landscape organized around several complementary directions. Representation Learning and Embedding Methods explore how to encode protein sequences and structures into informative feature spaces, often leveraging protein language models or geometric embeddings. Prediction Models and Architectures encompass a diverse array of neural network designs (ranging from convolutional and recurrent architectures to transformers and hybrid multimodal frameworks) that map these representations to EC labels. Reaction-Based and Chemical Transformation Methods focus on substrate and product information to infer enzymatic function, while Specialized Prediction Tasks address hierarchical classification, domain-level annotation, and enzyme discovery. Benchmarking, Evaluation, and Comparative Studies provide systematic assessments of model performance, and Computational Assignment and Database Methods support large-scale annotation pipelines. Theoretical Foundations examine the intrinsic limits of function prediction, and Application-Driven approaches target real-world scenarios such as metagenomics or drug discovery.

Recent work has intensified around multimodal and hybrid prediction strategies that combine sequence embeddings with structural or chemical context. For instance, PoinnCARE[0] integrates multiple data modalities to improve prediction robustness, positioning itself within the Multimodal and Hybrid Prediction Models branch alongside efforts like Autoregressive Enzyme Prediction[38] and SST-ResNet[39], which explore sequential decoding and residue-level feature extraction respectively. These approaches contrast with purely sequence-based models such as EC2Vec[1] or transformer-only designs like Transformer Enzyme Classification[47], highlighting a trade-off between model complexity and interpretability. Meanwhile, benchmarking initiatives like EC-Bench[2] and assessments of protein language models (Protein Language Models Assessment[11]) underscore ongoing questions about generalization across enzyme families and the practical limits of data-driven methods. PoinnCARE[0] thus sits at the intersection of representation fusion and architectural innovation, aiming to leverage complementary signals where single-modality methods may plateau.

Claimed Contributions

PoinnCARE framework for hyperbolic multi-modal enzyme learning

10 retrieved papers

The authors propose PoinnCARE, a framework that integrates sequence, structure, and active site information of enzymes and represents them in hyperbolic space. This approach preserves the hierarchical EC taxonomy structure while capturing comprehensive enzyme characteristics through multi-modal learning and alignment.
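
To make the geometric intuition concrete, here is a minimal sketch of the standard Poincaré-ball distance, one common model of hyperbolic space. The report does not state which hyperbolic model or curvature PoinnCARE uses, so the unit ball, the function name `poincare_distance`, and the toy 2-D points are assumptions for illustration only, not the authors' implementation.

```python
import numpy as np


def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance between two points inside the unit Poincare ball."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    sq_diff = np.sum((u - v) ** 2)
    # Clamp the denominator so points very close to the boundary stay finite.
    denom = max((1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2)), eps)
    return float(np.arccosh(1.0 + 2.0 * sq_diff / denom))


# Hypothetical embeddings: a root-like node at the origin and two
# leaf-like nodes near the boundary (stand-ins for general vs. specific labels).
root = np.array([0.0, 0.0])
leaf_a = np.array([0.9, 0.0])
leaf_b = np.array([0.0, 0.9])

d_root_leaf = poincare_distance(root, leaf_a)
d_leaf_leaf = poincare_distance(leaf_a, leaf_b)
euclid_leaf_leaf = float(np.linalg.norm(leaf_a - leaf_b))

print(f"root-leaf (hyperbolic): {d_root_leaf:.3f}")
print(f"leaf-leaf (hyperbolic): {d_leaf_leaf:.3f}")
print(f"leaf-leaf (Euclidean):  {euclid_leaf_leaf:.3f}")
```

In this toy setup the two leaves are much farther apart in hyperbolic distance than in Euclidean distance, and their separation approaches the length of the path through the origin. That tree-like behavior is the usual motivation for embedding hierarchies such as the EC taxonomy in hyperbolic space.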

Multi-modal dataset augmentation with structural and active site information

Can Refute

10 retrieved papers

The authors extend the existing CARE benchmark by adding structural information from PDB and AlphaFold2/ESMFold predictions, along with active site annotations from UniProt. This augmentation transforms the single-modality benchmark into a multi-modal dataset for enzyme classification.

Graph diffusion mechanism for addressing annotation sparsity

Can Refute

10 retrieved papers

The authors develop pairwise similarity graphs for structure and active site modalities, then apply graph diffusion operations to mitigate data sparsity by incorporating both direct and indirect connections. This approach enriches functional representations despite incomplete modality information.
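
As an illustration of how diffusion adds indirect connections to a sparse similarity graph, the sketch below uses closed-form personalized PageRank diffusion. This is a swapped-in, widely used operator; the report does not specify which diffusion operator the authors apply, and `ppr_diffusion`, the restart probability `alpha`, and the 4-node path graph are hypothetical.

```python
import numpy as np


def ppr_diffusion(adj, alpha=0.15):
    """Personalized PageRank diffusion: alpha * (I - (1 - alpha) * T)^-1,
    where T is the row-normalized adjacency matrix."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    deg[deg == 0.0] = 1.0  # avoid division by zero for isolated nodes
    trans = adj / deg[:, None]
    n = adj.shape[0]
    return alpha * np.linalg.inv(np.eye(n) - (1.0 - alpha) * trans)


# Hypothetical sparse similarity graph: a path 0 - 1 - 2 - 3.
# Nodes 0 and 2 (and 0 and 3) share no direct edge.
adj = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])

diffused = ppr_diffusion(adj)
print(np.round(diffused, 3))
```

After diffusion, node 0 carries nonzero weight to nodes 2 and 3 even though the original graph has no such edges, with the weight decaying as the path length grows. This is the sense in which diffusion incorporates both direct and indirect connections.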

Core Task Comparisons

Comparisons with papers in the same taxonomy category

[23] Multimodal Quantum Vision Transformer for Enzyme Commission Classification from Biochemical Representations

Murat Isik, Mandeep Kaur Saggi, H. Gowher, Sabre Kais (2025)

[38] Autoregressive enzyme function prediction with multi-scale multi-modality fusion

Dingyi Rong, Bozitao Zhong, Wenzhuo Zheng, Ning Liu, Liang Hong (2025)

[39] SST-ResNet: A Sequence and Structure Information Integration Model for Protein Property Prediction

Guowei Zhou, Yanpeng Zhao, Song He, Xiaochen Bo (2025)

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution: PoinnCARE framework for hyperbolic multi-modal enzyme learning

[61] OneProt: Towards multi-modal protein foundation models via latent space alignment of sequence, structure, binding sites and text encoders (Cannot Refute)

[62] Atom level enzyme active site scaffolding using RFdiffusion2 (Cannot Refute)

[63] A center-anchored adaptive hierarchical graph neural network with application in structure-aware recognition of enzyme catalytic specificity (Cannot Refute)

[64] Multi-modal deep learning enables efficient and accurate annotation of enzymatic active sites (Cannot Refute)

[65] MMSite: A Multi-modal Framework for the Identification of Active Sites in Proteins (Cannot Refute)

[66] OneProt: Towards multi-modal protein foundation models (Cannot Refute)

[67] Bidirectional Hierarchical Protein Multi-Modal Representation Learning (Cannot Refute)

[68] A multimodal Transformer Network for protein-small molecule interactions enhances predictions of kinase inhibition and enzyme-substrate relationships (Cannot Refute)

[69] A Highly Sensitive Model Based on Graph Neural Networks for Enzyme Key Catalytic Residue Prediction (Cannot Refute)

[70] TUNA: A Target-aware Unified Network for Protein-Ligand Binding Affinity Prediction via Multi-Modal Feature Integration (Cannot Refute)

Contribution: Multi-modal dataset augmentation with structural and active site information

[64] Multi-modal deep learning enables efficient and accurate annotation of enzymatic active sites (Can Refute)

[38] Autoregressive enzyme function prediction with multi-scale multi-modality fusion (Cannot Refute)

[69] A Highly Sensitive Model Based on Graph Neural Networks for Enzyme Key Catalytic Residue Prediction (Cannot Refute)

[71] Protein functional site annotation using local structure embeddings (Cannot Refute)

[72] Predicting enzymatic function of protein sequences with attention (Cannot Refute)

[73] The Computer-Assisted Sequence Annotation (CASA) workflow for enzyme discovery (Cannot Refute)

[74] EnzyMine: a comprehensive database for enzyme function annotation with enzymatic reaction chemical feature (Cannot Refute)

[75] Structure-based activity prediction for an enzyme of unknown function (Cannot Refute)

[76] Enzyme active sites: Identification and prediction of function using computational chemistry (Cannot Refute)

[77] SEFP: Structure-Based Enzyme Function Prediction (Cannot Refute)

Contribution: Graph diffusion mechanism for addressing annotation sparsity

[52] Fast and accurate protein function prediction from sequence through pretrained language model and homology-based label diffusion (Can Refute)

[51] Graph Diffusion Network for Drug-Gene Prediction (Cannot Refute)

[53] DRGAT: Predicting Drug Responses Via Diffusion-Based Graph Attention Network (Cannot Refute)

[54] Single-cell RNA sequencing data imputation using bi-level feature propagation (Cannot Refute)

[55] Label Diffusion Graph Learning network for semi-supervised breast histological image recognition (Cannot Refute)

[56] BGMSDDA: A bipartite graph diffusion algorithm with multiple similarity integration for drug-disease association prediction (Cannot Refute)

[57] Exploiting ontology graph for predicting sparsely annotated gene function (Cannot Refute)

[58] LapDDPM: A Conditional Graph Diffusion Model for scRNA-seq Generation with Spectral Adversarial Perturbations (Cannot Refute)

[59] Semantically Consistent Discrete Diffusion for 3D Biological Graph Modeling (Cannot Refute)

[60] Normalized Laplacian Diffusion for Robust Cancer Pathway Extension and Critical Gene Identification from Limited Data (Cannot Refute)
