Abstract:

Natural Language to SQL (NL2SQL) aims to translate natural language queries into executable SQL statements, offering non-expert users intuitive access to databases. While recent approaches leveraging large-scale private LLMs such as GPT-4 have achieved state-of-the-art results, they face two critical challenges: the lack of openness and reproducibility, and the prohibitive computational cost of test-time scaling. To address these issues, we explore improving the model-level performance of small-scale public LLMs in NL2SQL under resource-constrained settings. Our exploratory experiments reveal the potential of task decomposition for enhancing NL2SQL performance, but also highlight the difficulty of enabling LLMs to decompose queries effectively. Motivated by these findings, we propose LearNAT, a novel framework designed to enhance LLMs’ decomposition capabilities. LearNAT introduces (1) a Decomposition Synthesis Procedure, which leverages AST-guided search with pruning strategies to generate verifiable and efficient decompositions, and (2) Margin-Aware Reinforcement Learning, which provides fine-grained preference optimization for multi-step reasoning beyond standard DPO. Extensive experiments on benchmark datasets demonstrate that LearNAT significantly improves the performance of small-scale LLMs, achieving results comparable to GPT-4 with only a 7B parameter model. These results validate the effectiveness of verifiable decomposition and fine-grained preference learning in advancing NL2SQL towards openness, transparency, and efficiency. Our code is publicly available at https://anonymous.4open.science/r/LearNAT.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes LearNAT, a framework combining task decomposition with reinforcement learning to improve small-scale open-source LLMs for NL2SQL translation. It resides in the 'Reinforcement Learning and Preference Optimization' leaf under 'Model Architectures and Training Paradigms', which contains only two papers total. This represents a relatively sparse research direction within the broader taxonomy of fifty papers, suggesting the specific combination of RL-based training and decomposition-guided SQL generation remains underexplored compared to prompting-based or supervised fine-tuning approaches.

The taxonomy reveals neighboring work in 'Large Language Model Fine-Tuning and Adaptation' (five papers on supervised/preference learning) and 'Decomposition and Multi-Step Reasoning' (three papers on chain-of-thought prompting). LearNAT bridges these directions by applying RL to decomposition rather than relying on prompting alone. The 'Prompting and In-Context Learning' branch contains methods like SQL-R1 that achieve decomposition through iterative prompting without model training, highlighting a methodological divide between training-based and inference-time approaches to multi-step reasoning in NL2SQL.

Across the three claimed contributions, sixteen candidate papers were examined in total. For the core LearNAT framework, one of ten examined candidates was judged refutable; for the AST-guided decomposition synthesis, only one candidate was examined and no clear refutation was found; for the margin-aware RL component, one of five examined candidates was a refutable match. The limited search scope (top-K semantic retrieval plus citations) means these statistics reflect a focused sample rather than exhaustive coverage. Within this constrained search, the decomposition synthesis procedure appears least contested, while the overall framework and the RL training approach encounter more substantial prior work.

Based on the limited sixteen-candidate search, the work appears to occupy a moderately explored intersection of decomposition and RL-based training. The taxonomy structure suggests this combination is less crowded than pure prompting or supervised fine-tuning directions, though the small sibling set (one other paper) may reflect taxonomy granularity rather than absolute novelty. A broader literature search would be needed to assess whether similar decomposition-RL hybrids exist beyond the examined candidates.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 16
Refutable Papers: 2

Research Landscape Overview

Core task: translating natural language queries into executable SQL statements. The field has evolved from foundational sequence-to-sequence methods like Seq2SQL[23] to a rich ecosystem organized around several key challenges. Model Architectures and Training Paradigms explore how neural networks and large language models can be trained or fine-tuned for SQL generation, including reinforcement learning and preference optimization techniques. Prompting and In-Context Learning with LLMs investigate how to leverage pre-trained models through careful prompt design and few-shot examples, while Schema Understanding and Linking address the critical problem of mapping natural language mentions to database elements. Query Refinement and Error Correction focus on iterative improvement and debugging of generated queries, and Evaluation Frameworks and Benchmarking provide standardized testbeds for measuring progress. Domain-Specific and Cross-Domain Applications examine how methods generalize across different database schemas and specialized domains, complemented by System Implementation and Deployment work that bridges research prototypes and production systems. Surveys and Literature Reviews synthesize progress across these dimensions, while Code Generation and Execution-Based Methods emphasize verifying correctness through actual query execution.

Recent work has increasingly turned to reinforcement learning and preference-based training to refine SQL generation beyond supervised learning alone. LearNAT[0] exemplifies this trend by incorporating reinforcement learning to optimize translation quality, positioning itself within a small but growing cluster of methods that use execution feedback and human preferences to guide model training. This approach contrasts with purely prompt-based strategies like those in SQL-R1[5], which relies on in-context learning and iterative refinement without explicit reward-based training.
The central trade-off involves whether to invest in specialized training procedures that adapt models to SQL-specific objectives or to exploit the general reasoning capabilities of large pre-trained models through clever prompting. Open questions remain about how much domain-specific training is necessary when foundation models continue to improve, and whether hybrid approaches combining both paradigms will dominate future systems.

Claimed Contributions

LearNAT framework for NL2SQL via task decomposition

The authors introduce LearNAT, the first framework to improve LLM performance on NL2SQL tasks by explicitly leveraging task decomposition. This framework addresses the challenge of enabling LLMs to comprehend users' high-level semantics and map them to database schemas for complex NL2SQL queries.

10 retrieved papers
Can Refute
Decomposition Synthesis Procedure with AST-guided search

A novel procedure that leverages abstract syntax tree (AST)-guided search with pruning strategies to generate verifiable and efficient decompositions. This component uses AST-based validation to ensure correctness of generated subtasks and employs pruning to improve search efficiency.

1 retrieved paper
Margin-Aware Reinforcement Learning for preference optimization

A reinforcement learning framework that enables fine-grained preference learning tailored to multi-step reasoning. It introduces an AST-based margin-aware DPO algorithm that differentiates between varying levels of step correctness, providing more precise optimization than standard Direct Preference Optimization.

5 retrieved papers
Can Refute

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

LearNAT framework for NL2SQL via task decomposition

The authors introduce LearNAT, the first framework to improve LLM performance on NL2SQL tasks by explicitly leveraging task decomposition. This framework addresses the challenge of enabling LLMs to comprehend users' high-level semantics and map them to database schemas for complex NL2SQL queries.

Contribution

Decomposition Synthesis Procedure with AST-guided search

A novel procedure that leverages abstract syntax tree (AST)-guided search with pruning strategies to generate verifiable and efficient decompositions. This component uses AST-based validation to ensure correctness of generated subtasks and employs pruning to improve search efficiency.
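To make the idea of AST-based validation with pruning concrete, here is a hypothetical sketch (not the authors' implementation, which is not reproduced in this report): sub-queries are represented as nested-tuple ASTs, candidates are scored by node-multiset overlap against a reference AST, and low-overlap candidates are pruned from the search.

```python
from collections import Counter

def ast_nodes(tree):
    """Flatten a nested AST (tuples = internal nodes, strings = leaves)
    into a multiset of node labels."""
    if not isinstance(tree, tuple):
        return Counter([tree])
    counts = Counter([tree[0]])
    for child in tree[1:]:
        counts.update(ast_nodes(child))
    return counts

def ast_similarity(pred, gold):
    """Dice overlap of node multisets, in [0, 1]; 1.0 = identical multisets."""
    p, g = ast_nodes(pred), ast_nodes(gold)
    overlap = sum((p & g).values())
    return 2.0 * overlap / (sum(p.values()) + sum(g.values()))

def prune(candidates, gold, threshold=0.5):
    """Discard candidate sub-query ASTs whose overlap with the gold AST
    falls below the threshold (a crude stand-in for search pruning)."""
    return [c for c in candidates if ast_similarity(c, gold) >= threshold]

# Toy ASTs for "SELECT name FROM users WHERE age > 18" and an off-target candidate.
gold = ("SELECT", ("COL", "name"),
        ("FROM", ("TABLE", "users")),
        ("WHERE", (">", ("COL", "age"), ("LIT", "18"))))
off_target = ("SELECT", ("COL", "id"), ("FROM", ("TABLE", "orders")))
```

The bag-of-nodes Dice score is only one of many possible AST comparisons (tree edit distance would be stricter); the tuple encoding, the `prune` helper, and the 0.5 threshold are illustrative assumptions.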

Contribution

Margin-Aware Reinforcement Learning for preference optimization

A reinforcement learning framework that enables fine-grained preference learning tailored to multi-step reasoning. It introduces an AST-based margin-aware DPO algorithm that differentiates between varying levels of step correctness, providing more precise optimization than standard Direct Preference Optimization.
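A minimal sketch of what a margin-aware DPO objective could look like (an illustrative assumption; the paper's AST-based margin computation is not reproduced here): standard DPO compares scaled log-probability ratios of chosen vs. rejected completions, and an additive margin term demands a larger preference gap before the loss saturates near zero.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def margin_dpo_loss(logp_chosen, logp_rejected,
                    ref_chosen, ref_rejected,
                    beta=0.1, margin=0.0):
    """DPO loss with an additive margin: the policy must prefer the chosen
    completion over the rejected one by more than `margin` (in scaled
    log-odds) before the loss approaches zero."""
    chosen_ratio = beta * (logp_chosen - ref_chosen)      # policy vs. reference, chosen
    rejected_ratio = beta * (logp_rejected - ref_rejected)  # policy vs. reference, rejected
    return -math.log(sigmoid(chosen_ratio - rejected_ratio - margin))
```

In a margin-aware setup, `margin` would presumably be scaled by how much better the chosen reasoning step is than the rejected one, for instance by their AST-similarity gap; that scaling, and the function name itself, are assumptions made for this sketch. With `margin = 0` the expression reduces to standard DPO; a larger margin makes the same preference gap more costly, so pairs with bigger quality differences can be pushed apart more aggressively.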