RECAST: Expanding the Boundaries of LLMs' Complex Instruction Following with Multi-Constraint Data

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: LLM, Complex Instruction Following, Data Synthesis, Reinforcement Learning
Abstract:

Large language models (LLMs) are increasingly expected to tackle complex tasks, driven by their expanding applications and users' growing proficiency in crafting sophisticated prompts. However, as the number of explicitly stated requirements increases (particularly beyond 10 constraints), LLMs often struggle to follow such complex instructions accurately, which limits their applicability in complex real-world scenarios. To the best of our knowledge, existing datasets do not exceed 10 constraints per instance. To address this challenge, we propose RECAST, an efficient and scalable framework for synthesizing datasets in which each example incorporates far more constraints than those in existing benchmarks, aiming to challenge and extend the boundaries of models' ability to follow complex instructions. These constraints are extracted from real-world prompt-response pairs to ensure practical relevance. Using this framework, we construct RECAST-30K, a large-scale, high-quality dataset comprising 30k instances spanning 19 constraint types. Experimental results demonstrate that models fine-tuned on RECAST-30K improve substantially at following complex instructions while maintaining their general capabilities without degradation. Moreover, RECAST enables automatic verification of constraint satisfaction via rule-based validators for quantitative constraints and LLM-based validators for qualitative ones. This verifiability supports the design of reward functions for reinforcement learning, which further boosts model performance on complex and challenging tasks.
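The abstract's split between rule-based validators (quantitative constraints) and LLM-based validators (qualitative ones) can be illustrated with a minimal sketch. The constraint names, thresholds, and the `satisfaction_rate` helper below are illustrative assumptions, not RECAST's actual schema:

```python
import re

# Illustrative rule-based validators for quantitative constraints.
# Constraint types and parameters here are hypothetical examples.
def check_max_words(response: str, limit: int) -> bool:
    """Pass if the response contains at most `limit` words."""
    return len(response.split()) <= limit

def check_min_bullets(response: str, minimum: int) -> bool:
    """Pass if the response has at least `minimum` bulleted lines."""
    bullets = [ln for ln in response.splitlines()
               if ln.lstrip().startswith(("-", "*"))]
    return len(bullets) >= minimum

def check_contains_keyword(response: str, keyword: str) -> bool:
    """Pass if the keyword appears in the response (case-insensitive)."""
    return re.search(re.escape(keyword), response, re.IGNORECASE) is not None

def satisfaction_rate(response: str, checks) -> float:
    """Fraction of constraints satisfied. Qualitative constraints (e.g.
    tone, helpfulness) would instead be routed to an LLM judge, omitted here."""
    results = [fn(response, arg) for fn, arg in checks]
    return sum(results) / len(results)
```

A per-constraint pass/fail signal of this kind is what makes constraint satisfaction automatically checkable at scale.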

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces RECAST, a framework for synthesizing instruction-following datasets with far more constraints per instance than existing benchmarks (exceeding 10 constraints, targeting practical scenarios with 19 constraint types). It resides in the 'Constraint-Based Data Generation' leaf under 'Training Data Construction and Synthesis', alongside three sibling papers: Constraint Back-translation, RECAST Verifiable, and Conifer. This leaf represents a focused but not overcrowded research direction within the broader taxonomy of 50 papers across 22 leaf nodes, indicating moderate activity in constraint-based data synthesis methods.

The taxonomy reveals neighboring leaves such as 'Iterative and Refinement-Based Generation' (automated complexity expansion) and 'Long-Form and Multi-Constraint Datasets' (specialized corpora for extended text). RECAST's approach diverges from iterative refinement methods by emphasizing static constraint extraction from real-world prompt-response pairs, aligning more closely with verification-oriented synthesis. The broader 'Training Data Construction and Synthesis' branch contains six papers total, suggesting this is an active but not saturated area. Related branches like 'Evaluation Benchmarks and Metrics' (14 papers) and 'Training Methods and Optimization' (7 papers) indicate the field prioritizes assessment and optimization alongside data generation.

Among 30 candidates examined, the RECAST framework (Contribution 1) shows one refutable candidate out of 10 examined, suggesting some overlap with prior constraint-based synthesis work. The RECAST-30K dataset (Contribution 2) and RLVC training method (Contribution 3) each examined 10 candidates with zero refutations, indicating these contributions appear more distinct within the limited search scope. The framework's emphasis on scaling beyond 10 constraints per instance and extracting constraints from real-world data may differentiate it from existing synthesis pipelines, though the single refutable match warrants attention to prior constraint extraction techniques.

Based on the top-30 semantic matches examined, the work appears to occupy a moderately explored niche within constraint-based data generation. The taxonomy structure suggests the field is actively developing evaluation benchmarks and training methods, but data synthesis approaches remain less saturated. The analysis does not cover exhaustive literature review or domain-specific constraint generation methods outside the examined candidates, leaving open questions about overlap with specialized constraint frameworks or industry-scale synthesis pipelines not captured in this search scope.

Taxonomy

- 50 Core-task Taxonomy Papers
- 3 Claimed Contributions
- 30 Contribution Candidate Papers Compared
- 1 Refutable Paper

Research Landscape Overview

Core task: complex instruction following with multiple constraints. This field addresses how language models can satisfy diverse, simultaneous requirements—ranging from format and length restrictions to content and stylistic rules—within a single response. The taxonomy organizes research into several major branches: Training Data Construction and Synthesis focuses on generating high-quality constraint-rich examples, often through automated pipelines or back-translation techniques (e.g., Constraint Back-translation[3], Conifer[7]); Training Methods and Optimization explores curriculum strategies and specialized loss functions; Evaluation Benchmarks and Metrics develops testbeds that measure adherence to multi-dimensional constraints (e.g., FollowBench[17], CFBench[23]); Inference-Time Strategies and Self-Correction examines runtime verification and iterative refinement; and additional branches cover mechanistic analysis, generalization, domain-specific applications, and multi-task architectures. Together, these branches reflect a maturing effort to move beyond single-objective instruction tuning toward systems that juggle competing or layered requirements.

Within the data construction landscape, a handful of works emphasize constraint-based generation to produce training corpora that stress-test models on verifiable rules. RECAST[0] sits squarely in this cluster, proposing methods to synthesize instruction–response pairs where constraints are explicit and checkable, much like RECAST Verifiable[6] and Conifer[7]. Compared to Constraint Back-translation[3], which reverses the generation process to ensure coverage, RECAST[0] prioritizes scalable synthesis with built-in verification hooks. Meanwhile, evaluation-focused efforts such as LIFEBench[4] and Multi-Dimensional Constraint[5] highlight the need for rigorous benchmarks that go beyond surface-level metrics.

Open questions remain around balancing constraint diversity with data efficiency, and whether models trained on synthetic constraint-heavy data generalize to real-world multi-constraint scenarios. RECAST[0] contributes to this ongoing dialogue by offering a framework that bridges data synthesis and verifiability, positioning itself as a practical tool for scaling constraint-aware instruction tuning.

Claimed Contributions

RECAST framework for synthesizing multi-constraint instruction-following datasets

The authors introduce RECAST, a data-synthesis framework that systematically mines and verifies both rule-based and model-based constraints from existing prompt-response pairs. This framework enables the construction of instruction-following datasets with unprecedented constraint complexity, addressing limitations in existing benchmarks that typically contain fewer than 10 constraints per instance.

10 retrieved papers
Can Refute
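The framework's core idea of mining verifiable constraints from existing prompt-response pairs can be sketched as follows. This is a hedged illustration: the constraint schema below is invented for the example, and the paper's 19 actual constraint types are not reproduced here.

```python
# Hedged sketch of back-extracting constraints from an existing response,
# in the spirit of RECAST's extraction step: measure properties the response
# already satisfies, then restate them as explicit, checkable requirements.
def mine_constraints(response: str) -> list[dict]:
    """Derive verifiable constraints from a response so they can be
    appended to its prompt as explicit requirements."""
    constraints = []

    # Quantitative: a length bound the response already meets (margin is arbitrary).
    n_words = len(response.split())
    constraints.append({"type": "length",
                        "rule": f"respond in at most {n_words + 20} words"})

    # Format: detect bulleted structure.
    if any(ln.lstrip().startswith(("-", "*")) for ln in response.splitlines()):
        constraints.append({"type": "format", "rule": "use bulleted lists"})

    # Style: a trivially checkable surface property.
    if response and response[0].isupper():
        constraints.append({"type": "style",
                            "rule": "begin with a capitalized sentence"})
    return constraints
```

Because every mined constraint is derived from a response that satisfies it, the augmented instruction is guaranteed to have at least one valid completion, which is what keeps the synthesis pipeline scalable.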
RECAST-30K dataset with 30k instances spanning 19 constraint types

The authors release RECAST-30K, a large-scale dataset constructed using the RECAST framework. This dataset contains 30,000 training instances with diverse, verifiable constraints across 19 types, designed to benchmark and improve complex instruction-following performance in language models.

10 retrieved papers
RLVC reinforcement learning method using constraint-specific rewards

The authors propose RLVC (Reinforcement Learning with Verifiable Constraints), which exploits the verifiable nature of constraints in RECAST-30K to provide fine-grained, per-constraint reward signals during policy optimization. This method treats each constraint as a separate optimization target, enabling more effective learning for complex multi-constraint scenarios.

10 retrieved papers
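The per-constraint reward idea behind RLVC can be sketched minimally. The binary reward and the simple averaging aggregation below are assumptions for illustration; the paper may weight or shape rewards differently.

```python
# Hedged sketch of per-constraint reward signals in the spirit of RLVC:
# each verifiable constraint yields its own reward, rather than a single
# pass/fail signal for the whole response.
def constraint_rewards(response: str, validators) -> list[float]:
    """Return one binary reward per constraint validator."""
    return [1.0 if validate(response) else 0.0 for validate in validators]

def aggregate_reward(rewards: list[float]) -> float:
    """Collapse per-constraint rewards into a scalar for policy optimization
    (uniform averaging is an assumption, not necessarily the paper's choice)."""
    return sum(rewards) / len(rewards)
```

Keeping rewards separated per constraint preserves credit assignment: the policy learns which specific requirement it violated instead of receiving one coarse signal.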

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

RECAST framework for synthesizing multi-constraint instruction-following datasets


Contribution

RECAST-30K dataset with 30k instances spanning 19 constraint types


Contribution

RLVC reinforcement learning method using constraint-specific rewards
