PoinnCARE: Hyperbolic Multi-Modal Learning for Enzyme Classification

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: EC number prediction, enzyme function, hyperbolic space learning, multi-modal learning, enzyme structure, enzyme active site
Abstract:

Enzyme Commission (EC) number prediction is vital for elucidating enzyme functions and advancing biotechnology applications. However, current methods struggle to capture the hierarchical relationships among enzymes and often overlook critical structural and active site features. To bridge this gap, we introduce PoinnCARE, a novel framework that jointly encodes and aligns multi-modal data from enzyme sequences, structures, and active sites in hyperbolic space. By integrating graph diffusion and alignment techniques, PoinnCARE mitigates data sparsity and enriches functional representations, while hyperbolic embedding preserves the intrinsic hierarchy of the EC system with theoretical guarantees in low-dimensional spaces. Extensive experiments on four datasets from the CARE benchmark demonstrate that PoinnCARE consistently and significantly outperforms state-of-the-art methods in EC number prediction.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces PoinnCARE, a framework that jointly encodes enzyme sequences, structures, and active sites in hyperbolic space for EC number prediction. It resides in the 'Multimodal and Hybrid Prediction Models' leaf, which contains four papers total (including PoinnCARE). This leaf sits within the broader 'Prediction Models and Architectures' branch, indicating a moderately populated research direction focused on integrating multiple data modalities. The taxonomy shows that multimodal approaches represent one of several parallel strategies, alongside sequence-only, structure-only, and hierarchical learning frameworks.

The taxonomy reveals neighboring leaves such as 'Hierarchical and Multitask Learning Frameworks' (three papers) and 'Structure-Based Prediction Models' (five papers), suggesting that PoinnCARE bridges structural modeling with multimodal integration. The 'Representation Learning and Embedding Methods' branch (two leaves, three papers) addresses complementary questions about encoding EC numbers and proteins, while 'Reaction-Based and Chemical Transformation Methods' (five papers) explores an orthogonal direction using substrate-product information. PoinnCARE's hyperbolic embedding approach diverges from standard Euclidean representations common in sibling papers, positioning it at the intersection of geometric representation learning and multimodal fusion.
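To make the contrast between hyperbolic and Euclidean representations concrete, the following is a minimal sketch of the geodesic distance in the Poincaré ball model. The choice of the Poincaré ball is an assumption suggested by the framework's name; the material summarized here does not state which hyperbolic model PoinnCARE uses. The points `root`, `leaf_a`, and `leaf_b` are hypothetical 2-D embeddings chosen for illustration, not values from the paper.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit ball."""
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    return math.acosh(arg)


def euclidean_distance(u, v):
    """Ordinary straight-line distance, for comparison."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical embeddings (assumption): a general node at the origin and
# two specific nodes placed near the boundary in different directions.
root = (0.0, 0.0)
leaf_a = (0.9, 0.0)
leaf_b = (0.0, 0.9)

# Ratio of the direct leaf-to-leaf distance to the path through the root.
# A ratio close to 1 means the geometry behaves like a tree.
hyperbolic_ratio = poincare_distance(leaf_a, leaf_b) / (
    poincare_distance(leaf_a, root) + poincare_distance(root, leaf_b)
)
euclidean_ratio = euclidean_distance(leaf_a, leaf_b) / (
    euclidean_distance(leaf_a, root) + euclidean_distance(root, leaf_b)
)
print(f"hyperbolic ratio: {hyperbolic_ratio:.3f}")
print(f"euclidean ratio:  {euclidean_ratio:.3f}")
```

In this toy setting the hyperbolic ratio is noticeably closer to 1 than the Euclidean one, which is the usual intuition for why hyperbolic space suits hierarchies: the shortest route between two distant leaves passes close to their common ancestor, as it would in a tree.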

Among the 30 candidates examined, the analysis identifies limited overlap with prior work. For the core hyperbolic framework (Contribution 1), none of the 10 candidates examined refutes the claim, suggesting relative novelty in applying hyperbolic geometry to enzyme prediction. However, for multi-modal dataset augmentation (Contribution 2) and graph diffusion for sparsity (Contribution 3), one of the 10 candidates examined in each case was flagged as able to refute the claim, indicating that structural and active site integration, as well as graph-based augmentation techniques, have precedents even within the limited search scope. The statistics reflect a focused semantic search rather than exhaustive coverage.

Based on the limited search scope of 30 semantically similar papers, PoinnCARE appears to occupy a relatively novel position by combining hyperbolic embeddings with multi-modal enzyme data. The taxonomy context shows a moderately crowded multimodal prediction space but sparse exploration of non-Euclidean geometries. The analysis does not capture potential overlaps outside the top-30 semantic matches or in adjacent fields like graph representation learning, leaving open questions about broader precedents for hyperbolic enzyme embeddings.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 2

Research Landscape Overview

Core task: Enzyme Commission number prediction aims to assign standardized functional labels to enzymes based on their catalytic activities, typically using sequence or structural information. The field has evolved into a rich landscape organized around several complementary directions. Representation Learning and Embedding Methods explore how to encode protein sequences and structures into informative feature spaces, often leveraging protein language models or geometric embeddings. Prediction Models and Architectures encompass a diverse array of neural network designs (ranging from convolutional and recurrent architectures to transformers and hybrid multimodal frameworks) that map these representations to EC labels. Reaction-Based and Chemical Transformation Methods focus on substrate and product information to infer enzymatic function, while Specialized Prediction Tasks address hierarchical classification, domain-level annotation, and enzyme discovery. Benchmarking, Evaluation, and Comparative Studies provide systematic assessments of model performance, and Computational Assignment and Database Methods support large-scale annotation pipelines. Theoretical Foundations examine the intrinsic limits of function prediction, and Application-Driven approaches target real-world scenarios such as metagenomics or drug discovery.

Recent work has intensified around multimodal and hybrid prediction strategies that combine sequence embeddings with structural or chemical context. For instance, PoinnCARE[0] integrates multiple data modalities to improve prediction robustness, positioning itself within the Multimodal and Hybrid Prediction Models branch alongside efforts like Autoregressive Enzyme Prediction[38] and SST-ResNet[39], which explore sequential decoding and residue-level feature extraction respectively. These approaches contrast with purely sequence-based models such as EC2Vec[1] or transformer-only designs like Transformer Enzyme Classification[47], highlighting a trade-off between model complexity and interpretability. Meanwhile, benchmarking initiatives like EC-Bench[2] and assessments of protein language models (Protein Language Models Assessment[11]) underscore ongoing questions about generalization across enzyme families and the practical limits of data-driven methods. PoinnCARE[0] thus sits at the intersection of representation fusion and architectural innovation, aiming to leverage complementary signals where single-modality methods may plateau.

Claimed Contributions

PoinnCARE framework for hyperbolic multi-modal enzyme learning

The authors propose PoinnCARE, a framework that integrates sequence, structure, and active site information of enzymes and represents them in hyperbolic space. This approach preserves the hierarchical EC taxonomy structure while capturing comprehensive enzyme characteristics through multi-modal learning and alignment.

10 retrieved papers

Multi-modal dataset augmentation with structural and active site information

The authors extend the existing CARE benchmark by adding structural information from PDB and AlphaFold2/ESMFold predictions, along with active site annotations from UniProt. This augmentation transforms the single-modality benchmark into a multi-modal dataset for enzyme classification.

10 retrieved papers (Can Refute)

Graph diffusion mechanism for addressing annotation sparsity

The authors develop pairwise similarity graphs for structure and active site modalities, then apply graph diffusion operations to mitigate data sparsity by incorporating both direct and indirect connections. This approach enriches functional representations despite incomplete modality information.

10 retrieved papers (Can Refute)
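
The graph diffusion step described in the third contribution can be illustrated with a small sketch. The contribution summary only says that diffusion over pairwise similarity graphs incorporates both direct and indirect connections; it does not name a diffusion kernel. The version below assumes a truncated personalized-PageRank-style diffusion, and the function names, the `alpha` and `steps` parameters, and the three-enzyme `similarity` graph are all hypothetical.

```python
def row_normalize(adj):
    """Turn a non-negative similarity matrix into a row-stochastic one."""
    out = []
    for row in adj:
        total = sum(row)
        out.append([w / total if total > 0 else 0.0 for w in row])
    return out


def matmul(a, b):
    """Plain dense matrix product for small list-of-lists matrices."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(a_row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for a_row in a
    ]


def ppr_diffusion(adj, alpha=0.15, steps=10):
    """Truncated diffusion: sum over k of alpha * (1 - alpha)^k * T^k.

    Assumption: a personalized-PageRank-style kernel; the source does not
    specify which diffusion operator is used.
    """
    n = len(adj)
    transition = row_normalize(adj)
    # Start from T^0, the identity matrix.
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    diffused = [[0.0] * n for _ in range(n)]
    for k in range(steps + 1):
        coeff = alpha * (1.0 - alpha) ** k
        for i in range(n):
            for j in range(n):
                diffused[i][j] += coeff * power[i][j]
        power = matmul(power, transition)
    return diffused


# Hypothetical similarity graph over three enzymes: 0-1 and 1-2 are
# similar, while 0 and 2 have no direct edge.
similarity = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]
diffused = ppr_diffusion(similarity)
print(f"direct edge 0-2:   {similarity[0][2]}")
print(f"diffused edge 0-2: {diffused[0][2]:.4f}")
```

After diffusion, enzymes 0 and 2 receive a non-zero weight through their shared neighbor, which is the sense in which indirect connections can compensate for missing direct annotations.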
