Automata Learning and Identification of the Support of Language Models

ICLR 2026 Conference Submission · Anonymous Authors
Keywords: automata learning, regular languages, learning theory, DFA extraction, language models
Abstract:

We study the learnability of languages in the Next Symbol Prediction (NSP) setting, where a learner receives only positive examples from a language together with, for every prefix, (i) whether the prefix itself is in the language and (ii) which next symbols can lead to an accepting string. This setting has been used in prior work to empirically analyze neural sequence models, and we observe that efficient algorithms for the NSP setting can also be used to learn the (truncated) support of language models. We first show that the class of DFAs with at most n states is identifiable from positive examples augmented with these NSP labels. Nevertheless, even with this richer supervision, we show that PAC-learning DFAs remains computationally hard, and that exact identification using only membership queries cannot be achieved in polynomial time. We then present L*_nsp, an extension of Angluin's L* algorithm, and show that DFAs can be PAC-learned efficiently using a language-model-based teacher that answers membership queries and generates valid strings conditioned on prefix prompts. Finally, we conduct a comprehensive experimental evaluation on 11 regular languages of varying complexity. Using L*_nsp, we extract DFAs from Transformer-based language models trained on regular languages to evaluate the algorithm's effectiveness and identify erroneous examples.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper contributes formal learnability results for DFAs in the Next Symbol Prediction setting, establishing identifiability from positive examples with NSP labels and proving computational hardness for PAC-learning. It resides in the 'PAC-Learnability and Identifiability Results' leaf alongside two sibling papers within the 'Theoretical Foundations and Complexity Analysis' branch. This leaf represents a focused but not overcrowded research direction, with only three papers addressing formal sample and time complexity guarantees for NSP-based DFA learning, suggesting the work enters a relatively sparse theoretical niche.

The taxonomy reveals neighboring research directions that contextualize this contribution. The sibling leaf 'Average-Case and Probabilistic Learning Models' explores distributional assumptions rather than worst-case analysis, while the broader 'Algorithmic Approaches for DFA Extraction' branch contains query-based and passive learning methods that implement extraction procedures. The paper bridges theoretical complexity analysis with algorithmic design through its L-star extension, connecting the 'Theoretical Foundations' branch to practical 'Query-Based Learning Algorithms'. The taxonomy's scope and exclude notes clarify that this work focuses on formal guarantees rather than empirical extraction heuristics or neural architecture analysis.

Among 25 candidates examined across three contributions, the identifiability result shows one refutable candidate from nine examined, indicating some prior overlap in this specific theoretical direction. The computational hardness results examined ten candidates with zero refutations, suggesting this contribution may occupy less-explored territory within NSP complexity theory. The L-star-nsp algorithm examined six candidates without refutation, though the limited search scope means undiscovered related query-based methods may exist. The statistics reflect a targeted literature search rather than exhaustive coverage, with contribution-level examination ranging from six to ten papers per claim.

Based on top-25 semantic matches plus citation expansion, the work appears to make substantive theoretical contributions, particularly in hardness characterization and algorithmic design for NSP settings. The identifiability result encounters some prior work overlap within the examined scope, while the hardness and algorithmic contributions show less direct precedent among candidates reviewed. The analysis covers formal learnability questions but does not exhaustively survey all automata extraction methods or neural analysis techniques in adjacent taxonomy branches.

Taxonomy

15 Core-task Taxonomy Papers
3 Claimed Contributions
25 Contribution Candidate Papers Compared
1 Refutable Paper

Research Landscape Overview

Core task: learning DFAs from next symbol prediction labels. This field explores how to extract interpretable finite automata from models trained to predict the next symbol in a sequence, bridging neural learning and symbolic representation. The taxonomy organizes work into four main branches: Theoretical Foundations and Complexity Analysis examines the fundamental limits and sample complexity of learning automata from prediction oracles, including PAC-learnability results and hardness characterizations (e.g., Hardness Regular Languages[2], Next Symbol PAC[6]); Algorithmic Approaches for DFA Extraction develops concrete methods for constructing automata from trained predictors, often leveraging state merging or spectral techniques (e.g., Spectral Automata Distillation[3], DFAs Sparse Data[9]); Neural Network Analysis and Automata Extraction investigates how neural architectures encode regular structure and how to distill symbolic rules from them (e.g., Rule Extraction RNN[10], Next Token Mechanics[11]); and Application Domains and Specialized Extensions applies these ideas to tasks like temporal logic synthesis, complex event processing, and world model evaluation (e.g., Neurosymbolic Temporal Logic[7], Automata Complex Events[8], Evaluating World Model[1]).

A central tension runs through the field: while next-symbol prediction is a natural and widely used training signal, theoretical work reveals that learning exact DFAs from such labels can be computationally hard or require exponentially many queries in certain settings, as highlighted by Hardness Regular Languages[2]. Practical extraction algorithms must therefore balance expressiveness, sample efficiency, and computational tractability, often employing heuristics or restricting to subclasses of automata.

Automata Learning Support[0] sits squarely within the Theoretical Foundations branch, focusing on PAC-learnability and identifiability results. Its emphasis on formal guarantees places it alongside Next Symbol PAC[6], which also investigates sample complexity bounds, though Automata Learning Support[0] may explore different oracle models or tighter characterizations. This contrasts with more algorithm-oriented neighbors like Spectral Automata Distillation[3], which prioritizes practical extraction methods over worst-case complexity, illustrating the field's dual commitment to rigorous theory and deployable techniques.

Claimed Contributions

Identifiability of DFAs from positive NSP-labeled examples

The authors prove that positive examples with Next Symbol Prediction labels are information-theoretically sufficient to uniquely identify minimal DFAs. They show that distinct minimal DFAs always disagree on NSP labeling of some positive string, establishing finite teaching sets and well-defined equivalence oracles.
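To make the NSP labels concrete, the following sketch computes them for every prefix of a string under a toy DFA (strings over {a, b} with an even number of a's). The dictionary-based DFA encoding and function names are illustrative assumptions, not the paper's actual representation.

```python
# Hypothetical sketch: NSP labels under a toy DFA, not the paper's code.
# Language: strings over {a, b} with an even number of a's.
EVEN_A = {
    "start": "even",
    "accept": {"even"},
    "delta": {("even", "a"): "odd", ("even", "b"): "even",
              ("odd", "a"): "even", ("odd", "b"): "odd"},
}

def can_reach_accept(dfa, state):
    """True if some string leads from `state` to an accepting state."""
    seen, frontier = set(), [state]
    while frontier:
        q = frontier.pop()
        if q in dfa["accept"]:
            return True
        if q in seen:
            continue
        seen.add(q)
        frontier.extend(dfa["delta"][(q, s)] for s in "ab")
    return False

def nsp_labels(dfa, string):
    """For each prefix of `string`: (is the prefix in the language?,
    set of next symbols that can still lead to an accepting string)."""
    q = dfa["start"]
    labels = []
    for i in range(len(string) + 1):
        viable = {s for s in "ab"
                  if can_reach_accept(dfa, dfa["delta"][(q, s)])}
        labels.append((q in dfa["accept"], viable))
        if i < len(string):
            q = dfa["delta"][(q, string[i])]
    return labels
```

Per the identifiability claim, two distinct minimal DFAs must produce different label sequences on some positive string, so such labelings pin down the automaton.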

9 retrieved papers (1 can refute)
Computational hardness results for NSP learning

The authors demonstrate that despite richer NSP supervision, efficient PAC-learning of DFAs remains computationally intractable under cryptographic assumptions. They prove this through a reduction showing that NSP learning of acyclic DFAs is as hard as conventional binary classification learning.

10 retrieved papers
L-star-nsp algorithm for learning with language model teachers

The authors develop L-star-nsp, an extension of the classical L-star algorithm that uses membership and generative queries to extract DFAs from language models. The algorithm provides distribution-specific PAC guarantees with respect to the teacher's distribution and can identify the truncated support of language models.

6 retrieved papers
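The distribution-specific PAC guarantee described above suggests a sampling-based equivalence check: draw strings from the teacher's generative distribution and look for a disagreement with the current hypothesis. The sketch below illustrates that idea only; the `member`/`sample` interfaces and the toy teacher are assumptions, not the paper's L-star-nsp implementation.

```python
# Illustrative sketch of a distribution-specific PAC equivalence check;
# not the paper's L-star-nsp code. `sample` stands in for the teacher's
# generative query, `member` for its membership query.
import random

def pac_equivalence_check(member, sample, hyp_accepts, n_samples=200, seed=0):
    """Draw strings from the teacher's distribution; return a string on
    which hypothesis and teacher disagree, or None if none is found."""
    rng = random.Random(seed)
    for _ in range(n_samples):
        w = sample(rng)
        if hyp_accepts(w) != member(w):
            return w   # counterexample used to refine the hypothesis
    return None        # hypothesis consistent on the sampled distribution

# Toy teacher for the language a* (only runs of 'a' are valid).
def member(w):
    return all(c == "a" for c in w)

def sample(rng):
    return "a" * rng.randrange(5)  # teacher generates valid strings

# A wrong hypothesis that rejects the empty string is caught:
bad_hyp = lambda w: len(w) > 0 and member(w)
cex = pac_equivalence_check(member, sample, bad_hyp)
```

Because samples come from the teacher's own distribution, the resulting guarantee is with respect to that distribution, matching the contribution's framing of identifying the (truncated) support of the language model.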

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution
