Light Differentiable Logic Gate Networks

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: reparameterization, logic gate networks, vanishing gradients
Abstract:

Differentiable logic gate networks (DLGNs) exhibit extraordinary inference efficiency while sustaining competitive accuracy, but vanishing gradients, discretization errors, and high training cost impede scaling them. Even with dedicated parameter initialization schemes from follow-up work, increasing depth still harms accuracy. We show that the root cause of these issues lies in the underlying parametrization of the logic gate neurons themselves. To overcome this, we propose a reparametrization that also shrinks the per-gate parameter count to the logarithm of the original (from 2^(2^n) to 2^n for n inputs). For two-input gates, this already reduces model size by 4x, speeds up the backward pass by up to 1.86x, and converges in 8.5x fewer training steps. On top of that, accuracy on CIFAR-100 remains stable and is sometimes superior to the original parametrization.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes an academic paper's tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes a reparametrization of logic gate neurons in differentiable logic gate networks (DLGNs) to address vanishing gradients, discretization errors, and training inefficiency. It occupies the 'Gate-Level Reparametrization for Gradient Stability' leaf of the taxonomy, which currently contains only this work. This positioning suggests the paper targets a relatively sparse research direction focused specifically on gate-level parameter reformulation rather than broader architectural or hybrid approaches.

The taxonomy reveals three main branches: Core Parametrization and Training Efficiency (where this work resides), Architectural Scaling and Interconnect Design, and Hybrid Differentiable Logic Systems. Neighboring leaves include 'Continuous Optimization Reformulations' (transforming discrete logic into continuous tasks) and 'Scalable Interconnect Learning' (addressing connectivity patterns). The paper's focus on gate-level numerics distinguishes it from interconnect-focused methods and hybrid probabilistic or algorithmic integrations, carving out a niche in foundational parametrization rather than compositional or integration challenges.

Among the three candidates examined for the input-wise parametrization contribution, none were found to refute the approach. No candidates were examined for the other two contributions: the analysis of gradient-instability root causes and the characterization of initialization schemes. Given the limited search scope of three total candidates, the absence of refutable prior work suggests either genuine novelty in this specific parametrization strategy or insufficient coverage of closely related DLGN training literature. The core reparametrization appears less explored than broader architectural or hybrid methods.

Based on the limited literature search (three candidates), the work appears to address an underexplored aspect of differentiable logic networks. The taxonomy structure confirms that gate-level parametrization receives less attention than architectural scaling or hybrid integration. However, the small search scope leaves open the possibility of relevant prior work in adjacent optimization or neural network reparametrization domains not captured here.

Taxonomy

Core-task Taxonomy Papers: 4
Claimed Contributions: 3
Contribution Candidate Papers Compared: 3
Refutable Papers: 0

Research Landscape Overview

Core task: Reparametrizing differentiable logic gate networks for improved scalability. The field centers on making discrete logical operations amenable to gradient-based learning while maintaining computational efficiency at scale. The taxonomy reveals three main branches: Core Parametrization and Training Efficiency focuses on how individual gates are represented and optimized to ensure stable gradients during backpropagation; Architectural Scaling and Interconnect Design addresses the challenge of composing many gates into larger networks without prohibitive memory or compute costs; and Hybrid Differentiable Logic Systems explores integration with probabilistic or neural components to leverage complementary strengths.

Representative works such as Learning with differentiable algorithms[2] and Soft-unification in deep probabilistic[3] illustrate early efforts to bridge symbolic reasoning with continuous optimization, while more recent studies like Scalable Interconnect Learning in[1] and DEFT[4] tackle the practical demands of deploying these ideas in larger architectures. A central tension across these branches is the trade-off between expressiveness and gradient stability: richer gate parametrizations can capture complex logic but may suffer from vanishing or exploding gradients, whereas simpler formulations scale more reliably but limit representational power.

Light Differentiable Logic Gate[0] sits squarely within the Gate-Level Reparametrization for Gradient Stability cluster, emphasizing a lightweight reparametrization strategy that prioritizes training stability over architectural complexity. This contrasts with approaches like DEFT[4], which may incorporate more elaborate hybrid mechanisms, and Scalable Interconnect Learning in[1], which focuses on connectivity patterns rather than gate-level numerics. By concentrating on how individual gates are parametrized, Light Differentiable Logic Gate[0] complements broader architectural innovations and offers a foundational building block for scaling differentiable logic networks without sacrificing gradient flow.

Claimed Contributions

Input-wise parametrization (IWP) of logic gate neurons

The authors introduce a new parametrization for differentiable logic gate neurons that reduces parameters from 2^(2^n) to 2^n for n inputs. This reparametrization eliminates redundancies causing vanishing gradients and discretization errors while maintaining expressivity.

Retrieved papers: 3
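The parameter counts above, and one plausible reading of an input-wise scheme, can be sketched as follows. The paper's exact formulation is not reproduced in this report; `soft_truth_table_gate` and its row layout are illustrative assumptions (one parameter per truth-table row), not the authors' code:

```python
import math

def dlgn_param_count(n):
    """Original DLGN neuron: one logit per Boolean function of n inputs."""
    return 2 ** (2 ** n)

def iwp_param_count(n):
    """Input-wise parametrization: 2^n parameters per gate."""
    return 2 ** n

# For two-input gates (n = 2): 16 vs 4 parameters, the 4x reduction cited above.
assert dlgn_param_count(2) == 16 and iwp_param_count(2) == 4

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def soft_truth_table_gate(theta, a, b):
    """A natural 2^n-parameter relaxation for n = 2: a soft truth table,
    where sigmoid(theta[r]) is the gate's output on truth-table row r and
    relaxed inputs a, b in [0, 1] interpolate between the rows."""
    t = [sigmoid(th) for th in theta]
    return (t[0] * (1 - a) * (1 - b)   # row (0, 0)
          + t[1] * (1 - a) * b         # row (0, 1)
          + t[2] * a * (1 - b)         # row (1, 0)
          + t[3] * a * b)              # row (1, 1)

# With logits pushed toward the AND truth table (0, 0, 0, 1), the relaxed
# gate approaches exact AND on Boolean inputs, so discretization loses little.
and_theta = [-8.0, -8.0, -8.0, 8.0]
for a in (0.0, 1.0):
    for b in (0.0, 1.0):
        assert abs(soft_truth_table_gate(and_theta, a, b) - (a * b)) < 1e-3
```

At binary corners of θ this scheme still realizes all 2^(2^n) Boolean functions, which is consistent with the claim that expressivity is maintained despite the smaller parameter count.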
Analysis of gradient instability root causes in DLGNs

The authors demonstrate that sign-symmetric redundancies in the original parametrization cause self-cancellations in partial derivatives, leading to vanishing gradients. They show how independent weights for negated gate pairs create destructive interference in gradient signals.

Retrieved papers: 0
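The self-cancellation mechanism described above can be illustrated on a hypothetical two-gate slice of a softmax-mixed neuron: a relaxed AND gate (a*b) and its negation NAND (1 - a*b), each with an independent logit. The setup and function names are illustrative, not taken from the paper:

```python
import math

def mixture_and_input_grad(w_and, w_nand, a, b):
    """Mix relaxed AND and NAND via softmax over independent logits and
    return the output plus its partial derivative w.r.t. input a."""
    e = [math.exp(w_and), math.exp(w_nand)]
    z = sum(e)
    p_and, p_nand = e[0] / z, e[1] / z
    y = p_and * (a * b) + p_nand * (1 - a * b)
    # d y / d a: AND contributes +p_and * b, NAND contributes -p_nand * b,
    # so the negation pair interferes destructively in the gradient.
    dy_da = (p_and - p_nand) * b
    return y, dy_da

# With equal logits for a gate and its negation (e.g., a symmetric
# initialization), the two terms cancel exactly: no gradient reaches a.
_, g_sym = mixture_and_input_grad(0.0, 0.0, a=0.7, b=0.9)
assert g_sym == 0.0

# Breaking the sign symmetry restores gradient flow to the input.
_, g_asym = mixture_and_input_grad(1.0, 0.0, a=0.7, b=0.9)
assert g_asym != 0.0
```

Stacked over many layers, such near-cancellations compound, which matches the report's description of vanishing gradients arising from sign-symmetric redundancies rather than from depth alone.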
Characterization of effective initialization schemes for deep logic gate networks

The authors formalize residual initializations as part of a broader class of negation-asymmetric heavy-tail initialization schemes and explain why such initializations are beneficial for information flow in both forward and backward passes during training of deep logic gate networks.

Retrieved papers: 0
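A minimal sketch of what a residual-style, negation-asymmetric initialization might look like for a 16-gate (two-input) DLGN neuron. The gate ordering, the index of the pass-through gate f(a, b) = a, and the noise scale are assumptions for illustration, not the paper's specification:

```python
import math
import random

def residual_init_logits(n_gates=16, passthrough_idx=3, scale=8.0):
    """Hypothetical residual initialization: a large logit on the
    pass-through gate (heavy tail, asymmetric w.r.t. negation) plus
    small noise elsewhere, so softmax mass concentrates on the
    identity-like gate at the start of training."""
    logits = [random.gauss(0.0, 0.1) for _ in range(n_gates)]
    logits[passthrough_idx] += scale
    return logits

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    e = [math.exp(x - m) for x in logits]
    z = sum(e)
    return [x / z for x in e]

random.seed(0)
probs = softmax(residual_init_logits())
# Nearly all probability mass sits on the pass-through gate, so each
# layer initially forwards its input: information survives depth in
# both the forward and backward pass, as the characterization suggests.
assert probs[3] > 0.99
```

A symmetric initialization (uniform logits over all 16 gates) would instead give each gate and its negation equal weight, which is exactly the self-cancelling regime the gradient analysis warns about.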

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Within the taxonomy built over the current TopK core-task papers, the original paper is assigned to a leaf with no direct siblings and no cousin branches under the same grandparent topic. In this retrieved landscape, it appears structurally isolated, which is a partial signal of novelty, though one constrained by search coverage and taxonomy granularity.

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution 1: Input-wise parametrization (IWP) of logic gate neurons

Contribution 2: Analysis of gradient instability root causes in DLGNs

Contribution 3: Characterization of effective initialization schemes for deep logic gate networks

The contribution descriptions are given above under Claimed Contributions. Candidate comparisons were retrieved only for the first contribution, and none refuted it.
