Deep Learning with Learnable Product-Structured Activations

ICLR 2026 Conference Submission · Anonymous Authors
Keywords: deep learning architecture, implicit neural representation, low-rank tensor decomposition, partial differential equations
Abstract:

Modern neural architectures are fundamentally constrained by their reliance on fixed activation functions, limiting their ability to adapt representations to task-specific structure and efficiently capture high-order interactions. We introduce deep low-rank separated neural networks (LRNNs), a novel architecture generalizing MLPs that achieves enhanced expressivity by learning adaptive, factorized activation functions. LRNNs generalize the core principles underpinning continuous low-rank function decomposition to the setting of deep learning, constructing complex, high-dimensional neuron activations through a multiplicative composition of simpler, learnable univariate transformations. This product structure inherently captures multiplicative interactions and allows each LRNN neuron to learn highly flexible, data-dependent activation functions. We provide a detailed theoretical analysis that establishes the universal approximation property of LRNNs and reveals why they are capable of excellent empirical performance. Specifically, we show that LRNNs can mitigate the curse of dimensionality for functions with low-rank structure. Moreover, the learnable product-structured activations enable LRNNs to adaptively control their spectral bias, crucial for signal representation tasks. These theoretical insights are validated through extensive experiments where LRNNs achieve state-of-the-art performance across diverse domains including image and audio representation, numerical solution of PDEs, sparse-view CT reconstruction, and supervised learning tasks. Our results demonstrate that LRNNs provide a powerful and versatile building block with a distinct inductive bias for learning compact yet expressive representations.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces deep low-rank separated neural networks (LRNNs), which learn adaptive activation functions through multiplicative composition of univariate transformations. According to the taxonomy tree, this work occupies the 'Low-Rank Factorized Activations' leaf under 'Product-Structured Activation Functions'. Notably, this leaf contains only the original paper itself—no sibling papers are present. This isolation suggests the specific combination of low-rank factorization with learnable product-structured activations represents a relatively unexplored niche within the broader field of adaptive activation functions.

The taxonomy reveals that neighboring research directions include 'Fixed Polynomial Product Activations' and 'Logarithmic Product Transformations' within the same parent branch, plus 'Learnable Parametric Activation Functions' in a parallel branch. The scope notes clarify that fixed polynomial approaches lack learnable factorization, while parametric methods avoid product structures entirely. LRNNs appear positioned at the intersection of these themes—combining the adaptivity of learnable parametric activations with the multiplicative interaction modeling of product structures, but through a factorized lens that distinguishes it from both polynomial expansions and simple parameterization.

Among the thirty candidates examined, the contribution-level analysis shows mixed novelty signals. For the core LRNN architecture and for the theoretical analysis, ten candidates each were examined with zero refutations, suggesting these aspects face limited direct prior work within the search scope. For the variance-controlled initialization mechanism, however, ten candidates were examined and one refutable match was found, indicating that this component overlaps more substantially with existing techniques. Because the search covered only the top thirty semantic matches rather than the full literature, unexamined work may contain additional relevant results.

Given the sparse taxonomy leaf and low refutation rates across most contributions, the work appears to occupy a genuinely underexplored intersection of low-rank methods and adaptive activations. The initialization component shows expected overlap with standard neural network practices. The analysis is constrained by examining only thirty candidates from semantic search, leaving open the possibility of relevant work in adjacent subfields not captured by this retrieval strategy.

Taxonomy

Core-task Taxonomy Papers: 8
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 1

Research Landscape Overview

Core task: learning adaptive product-structured activation functions in neural networks.

The field explores how neural networks can move beyond fixed, element-wise nonlinearities by learning activation functions that adapt to data and exploit multiplicative or product-based structures. The taxonomy organizes this landscape into three main branches. Learnable Parametric Activation Functions encompasses methods that parameterize activation shapes with trainable coefficients, allowing networks to discover problem-specific nonlinearities rather than relying on hand-crafted choices like ReLU or sigmoid. Product-Structured Activation Functions focuses on architectures that explicitly incorporate multiplicative interactions—such as low-rank factorizations or polynomial expansions—to capture richer feature dependencies. Attention-Based Product Mechanisms examines how gating and attention operations introduce learned product terms that modulate representations. Representative works illustrate these themes: REAct[1] and Learning Activation Functions[3] exemplify parametric approaches, while Ladder Polynomial Neural Networks[5] and Kernel Product Neural Networks[7] demonstrate polynomial and kernel-based product structures.

Several active lines of work highlight trade-offs between expressiveness and computational cost. Parametric activation methods offer flexibility but require careful regularization to avoid overfitting, whereas product-structured designs can model complex interactions yet may introduce additional parameters or computational overhead.

Deep Learning with Learnable[0] sits within the Product-Structured Activation Functions branch, specifically targeting low-rank factorized activations. This emphasis on factorization distinguishes it from polynomial expansions like Ladder Polynomial Neural Networks[5], which build higher-order terms explicitly, and from kernel-based approaches such as Kernel Product Neural Networks[7], which leverage kernel tricks for feature interactions. By focusing on low-rank structures, Deep Learning with Learnable[0] aims to balance expressive power with parameter efficiency, addressing a central challenge in adaptive activation design.

Claimed Contributions

Deep low-rank separated neural networks (LRNNs) architecture

The authors propose LRNNs, a new neural network architecture that generalizes MLPs by replacing fixed scalar activations with learnable product-structured activation functions. Each LRNN neuron learns a flexible, data-dependent activation through multiplicative composition of simpler univariate transformations, enabling adaptive non-linearities and efficient capture of high-order interactions.
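As a rough illustration of what such a product-structured neuron could look like, the NumPy sketch below composes several univariate transforms multiplicatively. The rank parameter, the `(1 + tanh)` factor, and all names are assumptions chosen for illustration, not the paper's exact construction.

```python
import numpy as np

def product_activation(x, W):
    """Hypothetical rank-R product-structured layer activation.

    W has shape (rank, in_dim, out_dim): one learned projection per
    multiplicative factor. Each output unit is the product of `rank`
    simple univariate transforms of its projections. The (1 + tanh)
    factor is an illustrative choice, not the paper's.
    """
    rank, _, out_dim = W.shape
    out = np.ones(out_dim)
    for r in range(rank):
        z = x @ W[r]                     # per-unit univariate pre-activation
        out = out * (1.0 + np.tanh(z))   # multiplicative composition
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal(8)
W = rng.standard_normal((4, 8, 16)) / np.sqrt(8)  # rank 4, 8 -> 16 units
y = product_activation(x, W)
print(y.shape)  # (16,)
```

Because every factor `1 + tanh(z)` lies in (0, 2), the learned activation stays positive while its shape depends jointly on all four projections, which is one concrete way multiplicative composition yields data-dependent nonlinearities.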

10 retrieved papers
Theoretical analysis of LRNNs

The authors establish theoretical foundations for LRNNs, proving universal approximation capabilities and demonstrating that LRNNs can overcome the curse of dimensionality for functions with decaying functional ANOVA structure. They also show that learnable product-structured activations enable adaptive control of spectral bias, which is crucial for signal representation tasks.
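The low-rank structure this analysis relies on can be stated in generic separated-rank notation (the symbols here are illustrative, not necessarily the paper's):

```latex
f(x_1, \dots, x_d) \;\approx\; \sum_{r=1}^{R} \prod_{j=1}^{d} g_{r,j}(x_j)
```

The approximation is built from $R \cdot d$ learnable univariate factors rather than a grid that grows exponentially in $d$, which is the usual sense in which separated (low-rank) representations mitigate the curse of dimensionality for functions admitting such structure.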

10 retrieved papers
Variance-controlled initialization mechanism

The authors introduce a scaling mechanism that ensures stable gradient flow through arbitrarily wide product structures. This mechanism bounds the variance of LRNN activations and gradients independently of projection width, enabling automatic relevance determination and stable optimization even for wide product structures.
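One standard way such variance control can work, sketched below under assumed details that may differ from the paper's exact rule, is to initialize each multiplicative factor near 1 with per-factor variance scaled as 1/rank, so the variance of the product stays roughly constant as the number of factors grows.

```python
import numpy as np

def product_factors(rng, width, rank, sigma=0.5):
    """Sketch of variance-controlled initialization for a product of
    `rank` factors: each factor is 1 + eps with Var(eps) = sigma**2/rank,
    so the product's variance stays near sigma**2 for any rank.
    (A standard variance-control idea; not necessarily the paper's rule.)
    """
    eps = rng.standard_normal((rank, width)) * (sigma / np.sqrt(rank))
    return 1.0 + eps  # factors near 1: Var(prod) ~ sum of factor variances

rng = np.random.default_rng(0)
for rank in (2, 16, 128):
    prod = product_factors(rng, width=100_000, rank=rank).prod(axis=0)
    print(rank, round(prod.var(), 3))  # variance stays O(sigma**2) as rank grows
```

For independent factors $1+\varepsilon_r$ with $\mathrm{Var}(\varepsilon_r)=\sigma^2/R$, the product's variance is $(1+\sigma^2/R)^R - 1 \le e^{\sigma^2}-1$, bounded independently of $R$; without the $1/R$ scaling it would grow with the number of factors.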

10 retrieved papers (Can Refute)

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Within the taxonomy built over the current top-K core-task papers, the original paper is assigned to a leaf with no direct siblings and no cousin branches under the same grandparent topic. In this retrieved landscape it appears structurally isolated, which is a partial signal of novelty, though one constrained by both search coverage and taxonomy granularity.

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Deep low-rank separated neural networks (LRNNs) architecture


Contribution

Theoretical analysis of LRNNs


Contribution

Variance-controlled initialization mechanism
