AutoEP: LLMs-Driven Automation of Hyperparameter Evolution for Metaheuristic Algorithms

ICLR 2026 Conference Submission (Anonymous Authors)
Keywords: LLMs, Optimization, Metaheuristic Algorithms, Automatic Algorithm Design
Abstract:

Dynamically configuring algorithm hyperparameters is a fundamental challenge in computational intelligence. While learning-based methods offer automation, they suffer from prohibitive sample complexity and poor generalization. We introduce AutoEP, a novel framework that bypasses training entirely by leveraging Large Language Models (LLMs) as zero-shot reasoning engines for algorithm control. AutoEP's core innovation lies in a tight synergy between two components: (1) an online Exploratory Landscape Analysis (ELA) module that provides real-time, quantitative feedback on the search dynamics, and (2) a multi-LLM reasoning chain that interprets this feedback to generate adaptive hyperparameter strategies. This approach grounds high-level reasoning in empirical data, mitigating hallucination. Evaluated on three distinct metaheuristics across diverse combinatorial optimization benchmarks, AutoEP consistently outperforms state-of-the-art tuners, including neural evolution and other LLM-based methods. Notably, our framework enables open-source models like Qwen3-30B to match the performance of GPT-4, demonstrating a powerful and accessible new paradigm for automated hyperparameter design. Our code is available at https://anonymous.4open.science/r/AutoEP-3E11.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces AutoEP, a zero-shot LLM-driven framework for dynamic hyperparameter control in metaheuristic algorithms. It resides in the 'Q-Learning Integration with Metaheuristics' leaf, which contains only three papers total, indicating a relatively sparse research direction within the broader Reinforcement Learning-Based Parameter Adaptation branch. This leaf focuses on Q-learning approaches for operator selection and parameter tuning, contrasting with the six papers in the sibling 'Deep Reinforcement Learning for Hyperparameter Tuning' leaf that employ more complex neural architectures.

The taxonomy reveals that AutoEP's parent branch, Reinforcement Learning-Based Parameter Adaptation, sits alongside Fuzzy Logic-Based Parameter Adaptation (six papers across two leaves) and Analytical Adaptive Control Mechanisms (six papers across two leaves). These neighboring branches represent alternative paradigms: fuzzy systems use expert-defined rules, while analytical methods employ mathematical feedback models. AutoEP diverges by leveraging LLMs as reasoning engines rather than traditional RL training loops, positioning it at the intersection of learning-based adaptation and symbolic reasoning. The taxonomy also shows substantial activity in Metaheuristic-Optimized Hyperparameter Tuning for Machine Learning (nine papers across three leaves), which addresses ML model tuning rather than metaheuristic self-configuration.

Among 30 candidates examined through semantic search, none clearly refute any of AutoEP's three core contributions. The first contribution (zero-shot LLM framework) examined 10 candidates with no refutations; the second (grounding LLM reasoning with real-time Exploratory Landscape Analysis) examined 10 with no refutations; and the third (multi-LLM chain of reasoning) examined 10 with no refutations. This suggests that within the limited search scope, the combination of LLM-driven control, online landscape analysis feedback, and multi-model reasoning chains appears relatively unexplored. However, the search examined only top-30 semantic matches, not the full literature, and the taxonomy shows the Q-learning leaf itself is sparsely populated.

Based on the limited 30-candidate search and the sparse three-paper leaf containing AutoEP, the work appears to occupy a relatively novel position. The taxonomy structure indicates that while RL-based parameter adaptation is an established direction, the specific integration of LLMs for zero-shot reasoning represents a departure from traditional Q-learning or deep RL training paradigms. The analysis cannot assess whether broader literature beyond the top-30 semantic matches contains overlapping work, particularly in emerging LLM-for-optimization research.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 0

Research Landscape Overview

Core task: dynamic hyperparameter configuration for metaheuristic algorithms. The field addresses how to automatically adjust the control parameters of metaheuristic optimization methods during search, rather than fixing them in advance. The taxonomy reveals several major branches:

- Reinforcement Learning-Based Parameter Adaptation leverages Q-learning and related techniques to learn optimal parameter schedules online, as surveyed in Q-learning Metaheuristics Survey[1].
- Fuzzy Logic-Based Parameter Adaptation employs fuzzy inference systems to modulate parameters based on search-state indicators.
- Analytical Adaptive Control Mechanisms derive parameter updates from mathematical models of algorithm behavior.
- Metaheuristic-Optimized Hyperparameter Tuning for Machine Learning applies metaheuristics to tune hyperparameters of neural networks and other ML models.
- Frameworks and Toolkits provide reusable software infrastructure such as MetaGen[2].
- Novel Metaheuristic Algorithm Design introduces entirely new nature-inspired or bio-inspired algorithms with built-in adaptive mechanisms.
- Hybrid and Multi-Level Metaheuristic Architectures combine multiple metaheuristics or operate at different abstraction levels.
- Domain-Specific Applications demonstrate adaptive metaheuristics in areas ranging from energy systems to medical diagnosis.
- Theoretical Foundations and Comparative Studies offer rigorous analysis and benchmarking across methods.

Within Reinforcement Learning-Based Parameter Adaptation, a dense cluster of works integrates Q-learning with classic metaheuristics so that agents select parameter values or operator choices based on accumulated reward signals. AutoEP[0] sits squarely in this Q-Learning Integration with Metaheuristics subgroup, alongside Adaptive Hyperheuristic[3], which also explores learning-driven parameter control.
These approaches contrast with the Fuzzy Logic branch, where parameter adjustments rely on expert-defined membership functions rather than trial-and-error learning, and with Analytical Adaptive Control, which uses closed-form update rules. A key trade-off is between the sample efficiency and interpretability of analytical or fuzzy methods and the flexibility and potential for discovery offered by learning-based adaptation. AutoEP[0] emphasizes automated, data-driven tuning through zero-shot LLM reasoning rather than a trained Q-learning policy, distinguishing it from Adaptive Hyperheuristic[3]; both share the goal of reducing manual parameter calibration and improving robustness across diverse problem instances.

Claimed Contributions

Zero-shot LLM-driven framework for hyperparameter control

The authors introduce AutoEP, a framework that uses Large Language Models as zero-shot reasoning engines to automatically configure metaheuristic algorithm hyperparameters without requiring any training phase. This approach is designed to be applicable to any metaheuristic algorithm as a plug-and-play module.

10 retrieved papers
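As a rough illustration of what such a plug-and-play, zero-shot control loop could look like, the sketch below queries an LLM for new hyperparameters from a compact search-state summary and parses the reply. The `query_llm` callback, the `SearchState` fields, and the parameter names are all illustrative assumptions, not AutoEP's actual interface.

```python
# Minimal sketch of a zero-shot LLM control step for a metaheuristic.
# Hypothetical names throughout; not taken from the AutoEP code base.
from dataclasses import dataclass


@dataclass
class SearchState:
    iteration: int
    best_fitness: float
    mean_fitness: float


def propose_hyperparameters(state: SearchState, query_llm) -> dict:
    """Ask the LLM for new hyperparameters given the current search state."""
    prompt = (
        "You control a metaheuristic. Current state: "
        f"iteration={state.iteration}, best={state.best_fitness:.4f}, "
        f"mean={state.mean_fitness:.4f}. "
        "Reply as 'mutation_rate=<float>, crossover_rate=<float>'."
    )
    reply = query_llm(prompt)  # zero-shot: no fine-tuning, no reward model
    params = {}
    for part in reply.split(","):
        key, _, value = part.partition("=")
        params[key.strip()] = float(value)
    return params
```

Because the loop only requires a text-in/text-out callback, any metaheuristic that can serialize its state and accept a parameter dictionary could, in principle, be wrapped this way.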
Grounding LLM reasoning with real-time Exploratory Landscape Analysis

The framework incorporates an online Exploratory Landscape Analysis module that continuously provides quantitative metrics about the optimization state to the LLM. This grounding mechanism anchors the model's abstract reasoning in observable search dynamics, reducing hallucination and enabling data-driven hyperparameter decisions.

10 retrieved papers
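To make the kind of quantitative, real-time feedback concrete, the sketch below computes two standard landscape statistics from the current population: mean pairwise diversity and fitness-distance correlation. These are generic ELA-style features offered as a stand-in, not a reproduction of AutoEP's actual feature set.

```python
# Sketch of online landscape features an ELA module might report to the LLM.
import math


def population_diversity(population):
    """Mean pairwise Euclidean distance between candidate solutions."""
    n = len(population)
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            total += math.dist(population[i], population[j])
            pairs += 1
    return total / pairs if pairs else 0.0


def fitness_distance_correlation(population, fitnesses):
    """Pearson correlation between fitness and distance to the best solution
    (minimization convention: lower fitness is better)."""
    best = population[fitnesses.index(min(fitnesses))]
    dists = [math.dist(x, best) for x in population]
    mf = sum(fitnesses) / len(fitnesses)
    md = sum(dists) / len(dists)
    cov = sum((f - mf) * (d - md) for f, d in zip(fitnesses, dists))
    sf = math.sqrt(sum((f - mf) ** 2 for f in fitnesses))
    sd = math.sqrt(sum((d - md) ** 2 for d in dists))
    return cov / (sf * sd) if sf and sd else 0.0
```

Serialized into the prompt, such numbers give the LLM an empirical anchor (e.g., collapsing diversity suggests raising exploration pressure) instead of leaving it to reason about the search state in the abstract.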
Multi-LLM Chain of Reasoning for complex control tasks

The authors develop a Chain of Reasoning architecture that decomposes the hyperparameter control task into specialized reasoning steps handled by multiple collaborating LLMs. This design enables open-source models to achieve performance comparable to large proprietary models while maintaining lower inference latency.

10 retrieved papers
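The multi-stage decomposition can be sketched as a simple pipeline of role-specialized LLM calls, where each stage's output becomes the next stage's input and each stage may be served by a different model. The role split and prompt format below are assumptions for illustration, not the paper's exact architecture.

```python
# Sketch of a chain of reasoning built from role-specialized LLM calls.
# Each stage is a (role, llm_call) pair; llm_call maps a prompt string to a
# reply string, so different stages can be backed by different models.
def chain_of_reasoning(ela_report: str, stages) -> str:
    """Run the ELA report through each stage in sequence and return the
    final stage's output (e.g., a hyperparameter recommendation)."""
    context = ela_report
    for role, llm_call in stages:
        prompt = f"Role: {role}\nInput:\n{context}\nRespond concisely."
        context = llm_call(prompt)
    return context
```

Decomposing the task this way keeps each individual prompt small and focused, which is one plausible reason the paper reports smaller open-source models remaining competitive with larger proprietary ones.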

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution. As summarized in the overall assessment, none of the 30 retrieved candidate papers refuted any of the three claimed contributions.
