The Counting Power of Transformers
Overview
Overall Novelty Assessment
The paper establishes that transformers can express semialgebraic counting properties, i.e., those defined by boolean combinations of inequalities between arbitrary multivariate polynomials, and provides a formal framework for analyzing counting power. It resides in the Logic-Based Expressiveness Characterization leaf, which contains five papers total, including the original work. This leaf sits within the broader Theoretical Expressiveness and Formal Characterization branch, indicating a moderately populated research direction focused on formal proofs rather than empirical validation. The sibling papers similarly use logical or formal language frameworks to characterize transformer expressiveness, suggesting this is an active but not overcrowded niche.
The taxonomy reveals that theoretical expressiveness research divides into logic-based characterizations, complexity-theoretic bounds, and positional encoding analyses. The original paper's leaf neighbors the Complexity-Theoretic Bounds leaf (four papers) and the Positional Encoding leaf (three papers), both addressing complementary aspects of transformer power. The taxonomy's scope_note clarifies that logic-based work establishes equivalences with formal systems, while complexity studies prove computational limits—this paper bridges both by using logical frameworks to demonstrate nonlinear counting capacity. The broader Empirical Analysis branch (seven papers across three leaves) explores whether these theoretical capabilities manifest in practice, highlighting a field-wide gap between formal expressiveness and learned behavior.
Among twenty candidates examined, Contribution A (formal framework) showed no clear refutation across ten candidates, suggesting novelty in how the framework is structured. Contribution B (semialgebraic properties) examined four candidates and found one refutable match, indicating some prior work addresses nonlinear counting expressiveness, though the scale of overlap remains unclear from this limited search. Contribution C (uniform average hard attention characterization) examined six candidates with one refutable match, suggesting partial overlap in architectural characterization approaches. The analysis explicitly covers top-K semantic matches plus citation expansion, not an exhaustive survey of all transformer expressiveness literature.
Given the limited search scope of twenty candidates, the paper appears to advance the logic-based expressiveness agenda by moving beyond semilinear properties to semialgebraic ones, though at least two contributions show some prior work overlap. The taxonomy context suggests this work extends an established research direction rather than opening an entirely new area. A broader literature search might reveal additional overlapping results, particularly in the complexity-theoretic or empirical branches not fully covered here.
Taxonomy
Research Landscape Overview
Claimed Contributions
The authors introduce a formal framework to study the expressiveness of transformers for counting properties, which are permutation-closed languages where acceptance depends only on the number of occurrences of tokens rather than their order.
The main theoretical result demonstrates that transformers can express counting properties defined by boolean combinations of polynomial inequalities of arbitrary degree, generalizing beyond the linear counting properties captured by existing models like C-RASP.
The authors prove that NoPE-AHAT and NoPE-AHAT[U] (transformers without positional encodings using average hard attention with uniform layers) precisely capture semialgebraic counting properties, providing a complete characterization of this expressiveness class.
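The notion of a counting property in the claims above can be made concrete with a small sketch. The specific language below (accept iff (#a)² > 2·#b + 1) is our own illustrative example, not one taken from the paper; it shows both ingredients of the definition: acceptance depends only on token counts, and the defining inequality is a polynomial of degree greater than one.

```python
from collections import Counter
from itertools import permutations

def accepts(word: str) -> bool:
    """Illustrative semialgebraic counting property over {a, b}:
    accept iff (#a)^2 > 2 * #b + 1, a degree-2 polynomial inequality.
    Acceptance depends only on the counts (the Parikh vector), not on
    token order, so the language is permutation-closed."""
    counts = Counter(word)  # Counter returns 0 for absent symbols
    na, nb = counts["a"], counts["b"]
    return na * na > 2 * nb + 1

# Permutation-closure: every reordering of a word gets the same verdict.
w = "aabab"
assert all(accepts("".join(p)) == accepts(w) for p in permutations(w))
```

For instance, `accepts("aabab")` holds (3² > 2·2 + 1) while `accepts("abb")` does not (1² > 2·2 + 1 fails); any permutation of either word yields the same answer.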
Core Task Comparisons
Comparisons with papers in the same taxonomy category
[5] Counting like transformers: Compiling temporal counting logic into softmax transformers
[16] Logical languages accepted by transformer encoders with hard attention
[23] Tighter bounds on the expressivity of transformer encoders
[49] Expressive Power of Graph Transformers via Logic
Contribution Analysis
Detailed comparisons for each claimed contribution
Formal framework for counting power of transformers
The authors introduce a formal framework to study the expressiveness of transformers for counting properties, which are permutation-closed languages where acceptance depends only on the number of occurrences of tokens rather than their order.
[1] When Can Transformers Count to n?
[2] Nope: the counting power of transformers with no positional encodings
[5] Counting like transformers: Compiling temporal counting logic into softmax transformers
[8] Language models need inductive biases to count inductively
[13] Contextual Counting: A Mechanistic Study of Transformers on a Quantitative Task
[23] Tighter bounds on the expressivity of transformer encoders
[54] Dynamic unary convolution in transformers
[55] Homomorphism Counts as Structural Encodings for Graph Learning
[56] Trees in transformers: a theoretical analysis of the Transformer's ability to represent trees
[57] Counting ability of large language models and impact of tokenization
Transformers capture semialgebraic counting properties
The main theoretical result demonstrates that transformers can express counting properties defined by boolean combinations of polynomial inequalities of arbitrary degree, generalizing beyond the linear counting properties captured by existing models like C-RASP.
[2] Nope: the counting power of transformers with no positional encodings
[58] Transformers in Uniform TC⁰
[59] BOLD: Boolean Logic Deep Learning
[60] Solving Mathematical Problems with Transformers
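The gap this contribution targets, between the linear counting properties captured by models like C-RASP and polynomial inequalities of arbitrary degree, can be sketched side by side. Both languages below are our own illustrative choices, not examples from the paper: the first is defined by a degree-1 inequality, the second by a degree-2 inequality that multiplies counts together.

```python
from collections import Counter

def parikh(word: str) -> tuple[int, int, int]:
    """Map a word over {a, b, c} to its Parikh vector (#a, #b, #c)."""
    counts = Counter(word)
    return counts["a"], counts["b"], counts["c"]

def majority(word: str) -> bool:
    """Linear counting property: #a > #b.
    Degree-1 inequalities like this fall within the semilinear
    (C-RASP-style) regime."""
    na, nb, _ = parikh(word)
    return na > nb

def nonlinear(word: str) -> bool:
    """Semialgebraic counting property of degree 2: #a * #b > (#c)^2.
    Multiplying counts takes it beyond boolean combinations of
    linear inequalities."""
    na, nb, nc = parikh(word)
    return na * nb > nc * nc
```

For example, `nonlinear("aabbc")` holds (2·2 > 1²) while `nonlinear("abcc")` does not (1·1 > 2² fails); no single linear inequality over the three counts defines this language.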
Characterization via uniform average hard attention transformers
The authors prove that NoPE-AHAT and NoPE-AHAT[U] (transformers without positional encodings using average hard attention with uniform layers) precisely capture semialgebraic counting properties, providing a complete characterization of this expressiveness class.
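One standard way to state a characterization of this shape, using our own notation rather than necessarily the paper's, identifies a counting property over an alphabet with k symbols with its set of Parikh vectors S ⊆ ℕᵏ:

```latex
% A counting property S \subseteq \mathbb{N}^k is semialgebraic iff it is a
% finite boolean combination of polynomial inequality sets:
S \;=\; \bigcup_{i} \bigcap_{j} \bigl\{\, \bar{v} \in \mathbb{N}^k \;\bigm|\; p_{ij}(\bar{v}) \ge 0 \,\bigr\},
\qquad p_{ij} \in \mathbb{Z}[x_1, \dots, x_k].
% The claimed result: a counting property is expressible by a NoPE-AHAT
% (equivalently, NoPE-AHAT[U]) transformer iff it is semialgebraic.
```

Under this reading, "precisely capture" means both directions hold: every NoPE-AHAT-recognizable counting property is of this form, and every set of this form is recognized by some NoPE-AHAT transformer.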