xRFM: Accurate, scalable, and interpretable feature learning models for tabular data

ICLR 2026 Conference Submission • Anonymous Authors

OpenReview Score: 6.5

Keywords: Tabular data, kernel methods, tree-based methods

Abstract:

Inference from tabular data, collections of continuous and categorical variables organized into matrices, is a foundation for modern technology and science. Yet, in contrast to the explosive changes in the rest of AI, the best practice for these predictive tasks has been relatively unchanged and is still primarily based on variations of Gradient Boosted Decision Trees (GBDTs). Very recently, there has been renewed interest in developing state-of-the-art methods for tabular data based on recent developments in neural networks and feature learning methods. In this work, we introduce xRFM, an algorithm that combines feature learning kernel machines with a tree structure to both adapt to the local structure of the data and scale to essentially unlimited amounts of training data. We show that compared to $31$ other methods, including recently introduced tabular foundation models (TabPFN-v2) and GBDTs, xRFM achieves best performance across $100$ regression datasets and is competitive to the best methods across $200$ classification datasets outperforming GBDTs. Additionally, xRFM provides interpretability natively through the Average Gradient Outer Product.

Disclaimer

This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.

NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.

If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces xRFM, an algorithm combining feature learning kernel machines with an adaptive tree structure for tabular prediction. It resides in the 'Specialized Neural Network Designs' leaf under 'Deep Learning Architectures for Tabular Data', alongside three sibling papers. This leaf represents a moderately populated research direction within a broader taxonomy of 50 papers across approximately 36 topics, suggesting a focused but not overcrowded niche. The work targets general-purpose tabular prediction, contrasting with domain-specific neighbors like medical interpretability methods.

The taxonomy reveals that xRFM's leaf sits within a larger branch exploring neural architectures for tabular data, which includes attention-based transformers and graph neural networks as sibling leaves. Neighboring branches address foundation models, generative pre-training, and retrieval-augmented methods. The 'Specialized Neural Network Designs' scope explicitly excludes general transformers and graph-based methods, positioning xRFM as a custom architecture with domain-specific inductive biases. This placement suggests the work diverges from mainstream transformer adaptations, instead pursuing hybrid kernel-tree designs that balance expressiveness with tabular data constraints.

Among 22 candidates examined across three contributions, the core xRFM algorithm (9 candidates, 0 refutable) and Leaf RFM component (3 candidates, 0 refutable) show no clear prior work overlap within the limited search scope. However, the interpretability contribution via Average Gradient Outer Product (10 candidates, 2 refutable) encounters more substantial prior work. The statistics indicate that the architectural novelty appears stronger than the interpretability mechanism, though the search scale—22 candidates total—means this assessment reflects top-K semantic matches rather than exhaustive coverage. The refutable pairs suggest existing gradient-based interpretability methods may overlap with the proposed approach.

Based on the limited literature search of 22 candidates, the xRFM architecture appears relatively novel within its specialized design niche, while the interpretability component shows more overlap with existing gradient-based methods. The taxonomy context indicates this work occupies a moderately explored research direction, distinct from mainstream transformer or foundation model approaches. A more exhaustive search beyond top-K semantic matches would be needed to fully assess novelty across the broader tabular prediction landscape.

Taxonomy

[Taxonomy figure omitted. Labels: Core-task Taxonomy Papers; Claimed Contributions; Contribution Candidate Papers Compared; Refutable Paper]

Research Landscape Overview

Core task: predictive modeling on tabular data. The field has evolved into several major branches that reflect both methodological diversity and application-driven specialization. Deep learning architectures for tabular data explore neural network designs tailored to structured features, often contrasting with traditional gradient-boosted trees. Foundation models and transfer learning investigate how pretrained representations can generalize across datasets, while generative modeling and data augmentation address data scarcity through synthetic sample creation. Specialized prediction tasks span domains from healthcare to finance, and benchmarking studies rigorously compare methods across standardized datasets. Survey and review literature synthesizes these developments, time series forecasting extends tabular methods to temporal data, and structured prediction frameworks provide theoretical grounding. Representative works include Deep Neural Tabular Survey[3] and Tabular Representation Survey[5], which map out architectural trends, and Generative Tabular Review[2], which examines synthetic data strategies.

Within the deep learning architectures branch, specialized neural network designs have emerged to handle tabular data's unique challenges: heterogeneous feature types, missing values, and the need for interpretability. Some approaches like Goggle[4] and Neural Nets Boosted Trees[7] blend neural components with tree-based methods, while others such as Tiny Classifier Circuits[16] and Regularization Learning Networks[18] focus on compact or regularized architectures. The original paper xRFM[0] sits within this specialized design cluster, emphasizing novel network structures that balance expressiveness with tabular data constraints. Compared to neighbors like Prototype Parts Medical[22], which targets interpretable medical predictions, xRFM[0] appears more focused on general-purpose architectural innovation. This line of work continues to grapple with whether deep learning can consistently outperform classical methods, a debate highlighted by Deep Learning Enough[21] and Deep Learning Not Enough[35], and whether specialized designs can bridge the performance gap observed in many tabular benchmarks.

Claimed Contributions

xRFM algorithm combining feature learning kernel machines with adaptive tree structure

9 retrieved papers

The authors propose xRFM, a tabular prediction method that integrates Recursive Feature Machines with binary tree-based data partitioning. This enables local feature learning (learning different features for different data subsets) while achieving O(n log n) training complexity and O(log n) inference complexity.
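The partition-then-fit idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the split rule (a random projection thresholded at its median) and the leaf model (plain Gaussian kernel ridge regression in place of an RFM) are stand-in assumptions, and all function names are hypothetical. With a fixed maximum leaf size, each tree level costs time roughly linear in n and there are about log n levels, which is one way the quoted O(n log n) training and O(log n) routing costs could arise.

```python
import numpy as np

def fit_leaf(X, y, reg=1e-3, bandwidth=1.0):
    """Stand-in leaf model: Gaussian kernel ridge regression (not a full RFM)."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-sq / (2 * bandwidth ** 2))
    alpha = np.linalg.solve(K + reg * np.eye(len(X)), y)
    return {"X": X, "alpha": alpha, "bandwidth": bandwidth}

def predict_leaf(leaf, x):
    """Evaluate the leaf's kernel predictor at a single point x."""
    sq = ((leaf["X"] - x) ** 2).sum(-1)
    return float(np.exp(-sq / (2 * leaf["bandwidth"] ** 2)) @ leaf["alpha"])

def build_tree(X, y, max_leaf_size, rng):
    """Recursively halve the data until each subset fits in one leaf."""
    if len(X) <= max_leaf_size:
        return {"leaf": fit_leaf(X, y)}
    # Assumed split rule: random unit direction, threshold at the median.
    direction = rng.standard_normal(X.shape[1])
    direction /= np.linalg.norm(direction)
    proj = X @ direction
    threshold = np.median(proj)
    left = proj <= threshold
    if left.all() or not left.any():  # degenerate split, stop here
        return {"leaf": fit_leaf(X, y)}
    return {
        "direction": direction,
        "threshold": threshold,
        "left": build_tree(X[left], y[left], max_leaf_size, rng),
        "right": build_tree(X[~left], y[~left], max_leaf_size, rng),
    }

def predict(tree, x):
    """Route x to its leaf; returns (prediction, number of splits visited)."""
    depth = 0
    while "leaf" not in tree:
        go_left = x @ tree["direction"] <= tree["threshold"]
        tree = tree["left"] if go_left else tree["right"]
        depth += 1
    return predict_leaf(tree["leaf"], x), depth
```

In this sketch, inference touches one split per level and then a single leaf of bounded size, so the routing cost grows with tree depth rather than with the total number of training points.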

Leaf RFM: improved kernel-RFM for tabular data

3 retrieved papers

The authors develop leaf RFM, an enhanced version of kernel-RFM that uses a more general class of kernels and optionally uses only the diagonal of the Average Gradient Outer Product (AGOP). These modifications introduce an axis-aligned bias suited to the structure of tabular data and enable better coordinate selection.
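A rough sketch of a kernel-RFM style iteration with an optional diagonal-only AGOP follows. Several choices here are assumptions made for illustration and are not taken from the paper: a Gaussian kernel with a Mahalanobis distance is used because its gradient has a simple closed form, the feature matrix M is rescaled to have trace d after each update, and the function names and default hyperparameters are hypothetical.

```python
import numpy as np

def mahalanobis_gaussian_kernel(A, B, M, bandwidth):
    """K_M(a, b) = exp(-(a - b)^T M (a - b) / (2 * bandwidth^2))."""
    AM = A @ M
    sq = (AM * A).sum(1)[:, None] - 2 * AM @ B.T + ((B @ M) * B).sum(1)[None, :]
    return np.exp(-np.maximum(sq, 0.0) / (2 * bandwidth ** 2))

def rfm_fit(X, y, n_iters=3, reg=1e-2, bandwidth=2.0, diag_only=False):
    """Alternate kernel ridge fits with AGOP updates of the feature matrix M."""
    n, d = X.shape
    M = np.eye(d)
    for _ in range(n_iters):
        K = mahalanobis_gaussian_kernel(X, X, M, bandwidth)
        alpha = np.linalg.solve(K + reg * np.eye(n), y)
        # Gradient of f(x) = sum_i alpha_i K_M(x, x_i) at each training point:
        # grad f(x_j) = -(1 / bandwidth^2) * M * sum_i alpha_i K_M(x_j, x_i) (x_j - x_i)
        W = K * alpha[None, :]                      # W[j, i] = alpha_i * K_M(x_j, x_i)
        G = -(W.sum(1)[:, None] * X - W @ X) @ M / bandwidth ** 2
        agop = G.T @ G / n                          # average gradient outer product
        if diag_only:
            agop = np.diag(np.diag(agop))           # keep coordinate-wise weights only
        # Assumed normalization: keep trace(M) = d so the kernel scale stays stable.
        M = agop * d / (np.trace(agop) + 1e-12)
    return M
```

With `diag_only=True`, M stays diagonal, so the learned metric only reweights individual coordinates. That is one concrete reading of the axis-aligned bias mentioned above.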

Native interpretability through Average Gradient Outer Product

Can Refute

10 retrieved papers

The authors show that xRFM offers built-in interpretability by exposing learned features through AGOP matrices at each leaf. The diagonal entries indicate coordinate relevance while top eigenvectors reveal directions in data most relevant for prediction, enabling understanding of heterogeneous feature importance across data subpopulations.
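To make the two readouts concrete, the sketch below computes an AGOP from gradients of a toy predictor and extracts the normalized diagonal and the top eigenvector. The toy function f(x) = (u . x)^2, the direction u, and the helper names are assumptions chosen so that the expected answer is known in closed form. Nothing here reproduces the paper's experiments.

```python
import numpy as np

def agop(grads):
    """Average gradient outer product: (1/n) * sum_j g_j g_j^T over rows g_j."""
    return grads.T @ grads / grads.shape[0]

def summarize_agop(G, top_k=1):
    """Return normalized diagonal importances and the top-k eigenpairs of G."""
    importance = np.diag(G) / np.trace(G)        # per-coordinate relevance, sums to 1
    eigvals, eigvecs = np.linalg.eigh(G)         # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:top_k]
    return importance, eigvals[order], eigvecs[:, order]

# Toy predictor with a known gradient: f(x) = (u . x)^2, so grad f(x) = 2 (u . x) u.
rng = np.random.default_rng(0)
u = np.array([0.6, 0.8, 0.0, 0.0])               # unit-norm direction (assumed)
X = rng.standard_normal((500, 4))
grads = 2 * (X @ u)[:, None] * u[None, :]

importance, top_vals, top_vecs = summarize_agop(agop(grads))
```

Here the AGOP is a multiple of the outer product of u with itself, so the diagonal importances are proportional to the squared entries of u and the top eigenvector matches u up to sign. In a tree of leaf models, the same summary could be computed per leaf to compare which coordinates matter in different subpopulations.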

Core Task Comparisons

Comparisons with papers in the same taxonomy category

[16] Low-cost and efficient prediction hardware for tabular data using tiny classifier circuits

Konstantinos Iordanou, Timothy Atkinson, Emre Ozer, Jedrzej Kufel, Grace Aligada, John Biggs, Gavin Brown, Mikel Luján (2024) • Nature Electronics

[18] Regularization learning networks: deep learning for tabular datasets

Ira Shavitt, Eran Segal (2018)

[22] An interpretable prototype parts-based neural network for medical tabular data

J Karolczak, J Stefanowski (2025)

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

xRFM algorithm combining feature learning kernel machines with adaptive tree structure

[51] Leveraging Structural Information in Tree Ensembles for Table Representation Learning

Cannot Refute

[52] Embracing uncertainty flexibility: harnessing a supervised tree kernel to empower ensemble modelling for 2D echocardiography-based prediction of right ventricular volume

Cannot Refute

[53] Geodesic Flow Kernels for Semi-Supervised Learning on Mixed-Variable Tabular Dataset

Cannot Refute

[54] Embracing uncertainty flexibility: harnessing a supervised tree kernel to empower ensemble modelling for 2D echocardiography-based prediction of right ventricular …

Cannot Refute

[55] Instance-based uncertainty estimation for gradient-boosted regression trees

Cannot Refute

[56] Supervised contrastive representation learning with tree-structured parzen estimator Bayesian optimization for imbalanced tabular data

Cannot Refute

[57] Autoencoding Random Forests

Cannot Refute

[58] Navigating the Credit Landscape with Minimal Data: A Transfer Learning and Image-Based Classification Strategy

Cannot Refute

[59] Tree-Regularized Tabular Embeddings

Cannot Refute

Contribution

Leaf RFM: improved kernel-RFM for tabular data

[70] Neural tangent kernels for axis-aligned tree ensembles

Cannot Refute

[71] Topological Activation Maps for Visual Representation Learning from Tabular Data

Cannot Refute

[72] Optimally rotated coordinate systems for adaptive least-squares regression on sparse grids

Cannot Refute

Contribution

Native interpretability through Average Gradient Outer Product

[60] Mechanism for feature learning in neural networks and backpropagation-free machine learning models

Can Refute

[64] Mechanism of feature learning in deep fully connected networks and kernel machines that recursively learn features

Can Refute

[61] Reversed Attention: On The Gradient Descent Of Attention Layers In GPT

Cannot Refute

[62] Feature learning as alignment: a structural property of gradient descent in non-linear neural networks

Cannot Refute

[63] Interpretable QSPR Modeling using Recursive Feature Machines and Multi-scale Fingerprints

Cannot Refute

[65] Images as weight matrices: Sequential image generation through synaptic learning rules

Cannot Refute

[66] Emergence in non-neural models: grokking modular arithmetic via average gradient outer product

Cannot Refute

[67] Efficient Spike Timing Dependent Plasticity rule for Complex-Valued Neurons

Cannot Refute

[68] Grokking Modular Arithmetic Through Group Actions: A Group-Theoretic View of Machine Learning Behavior

Cannot Refute

[69] Jacobian Aligned Random Forests

Cannot Refute
