Trinity: An Evolved LLM Coordinator

ICLR 2026 Conference Submission · Anonymous Authors
Keywords: evolutionary strategies, multi-agent LLM systems, role-based delegation, logits-to-agent mapping
Abstract:

Combining diverse foundation models is promising, but weight-merging is limited by mismatched architectures and closed APIs. Trinity addresses this with a lightweight coordinator that orchestrates collaboration among large language models (LLMs). The coordinator, comprising a compact language model (≈0.6B parameters) and a lightweight head (≈10K parameters), is optimized with an evolutionary strategy for efficient and adaptive delegation. Trinity processes queries over multiple turns, where at each turn the coordinator assigns one of three roles (Thinker, Worker, or Verifier) to a selected LLM, effectively offloading complex skill acquisition from the coordinator itself. Extensive experiments demonstrate that Trinity consistently outperforms individual models and existing methods across tasks including coding, math, reasoning, and domain knowledge, while generalizing robustly to out-of-distribution tasks. On established benchmarks, Trinity achieves state-of-the-art performance, including a new record of 86.2% on LiveCodeBench. Theoretical and empirical analyses highlight two key factors driving this success: (1) the coordinator's hidden-state representations provide rich contextualization of inputs, and (2) under high dimensionality and strict budget constraints, the separable Covariance Matrix Adaptation Evolution Strategy (sep-CMA-ES) algorithm provides substantial advantages over RL, imitation learning, and random search, leveraging potential block-ε-separability.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

Trinity proposes a lightweight coordinator (approximately 0.6B parameters plus a 10K-parameter head) that orchestrates collaboration among multiple LLMs through dynamic role assignment across Thinker, Worker, and Verifier functions. The paper positions itself within the 'Evolved and Learned Coordination Strategies' leaf of the taxonomy, which currently contains this paper alone, with no siblings. This leaf sits under the broader 'LLM Routing and Selection' branch, indicating a relatively sparse research direction focused specifically on adaptive coordination policies learned through evolutionary or reinforcement-based methods rather than static routing heuristics.

The taxonomy reveals that Trinity's approach bridges multiple neighboring research areas. It shares conceptual territory with 'Dynamic and Adaptive Orchestration' under centralized architectures, which includes systems that adjust strategies at runtime, and with 'Hierarchical and Role-Based Coordination', which emphasizes explicit role differentiation. However, Trinity diverges by using evolutionary strategies for policy optimization rather than predefined workflows or hierarchical control structures. The 'Query-Specific Model Selection and Routing' leaf contains methods for dynamic LLM selection, but these typically lack the multi-turn, role-based coordination protocol that Trinity employs. This positioning suggests Trinity occupies a niche intersection between adaptive routing and structured multi-agent collaboration.

Among the thirty candidates examined across three contributions, none were identified as clearly refuting Trinity's core claims. The 'Lightweight coordinator for LLM orchestration' contribution examined ten candidates with zero refutable overlaps, as did the 'Tri-role coordination protocol' and 'Evolutionary strategy training methodology' contributions. This absence of refutation reflects the limited search scope rather than definitive novelty: the analysis covers top-K semantic matches and citation expansion, not an exhaustive survey. The tri-role protocol and evolutionary training appear particularly distinctive within this sample, though the small candidate pool and sparse taxonomy leaf suggest these areas remain underexplored in the broader literature.

Given the limited thirty-candidate search and the paper's placement in a singleton taxonomy leaf, the analysis suggests Trinity introduces mechanisms not prominently represented in the examined prior work. However, the sparse population of the 'Evolved and Learned Coordination Strategies' category and the absence of sibling papers indicate this assessment is based on a narrow slice of the literature. A more comprehensive search across adjacent branches—particularly in dynamic orchestration and hierarchical coordination—would be necessary to fully contextualize Trinity's contributions against the field's complete landscape.

Taxonomy

Core-task Taxonomy Papers: 49
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 0

Research Landscape Overview

Core task: orchestrating collaboration among large language models through lightweight coordination. The field has evolved into a rich landscape of approaches that balance centralized control with distributed autonomy. Centralized orchestration architectures, exemplified by frameworks like MetaGPT[1] and hierarchical designs, provide structured coordination through explicit role assignment and workflow management. In contrast, decentralized and distributed coordination explores emergent collaboration patterns, as seen in Internet of Agents[9] and self-organizing networks. LLM routing and selection mechanisms address the challenge of dynamically choosing among models or strategies, with evolved and learned coordination strategies representing a particularly active subfield where systems adapt their orchestration policies over time. Function and tool orchestration focuses on managing external capabilities, while multimodal and cross-domain orchestration extends coordination across diverse data types and application boundaries. Resource-constrained deployment branches, including edge-focused work like Llmind IoT[3], tackle efficiency under hardware limitations, complemented by inference optimization techniques that reduce computational overhead.

Within the routing and selection landscape, a key tension emerges between hand-crafted heuristics and adaptive strategies that learn from experience. Evolved and learned coordination strategies represent a frontier where systems like Evolving Orchestration[2] and MARCO[5] develop dynamic routing policies, contrasting with static assignment schemes. Trinity[0] sits squarely in this evolved coordination cluster, emphasizing lightweight mechanisms that adapt collaboration patterns without heavy training overhead. Compared to Training-Free Orchestration[4], which avoids learning entirely, Trinity[0] occupies a middle ground by incorporating evolutionary or feedback-driven adjustments while maintaining computational efficiency.
This positioning reflects broader trade-offs in the field: whether to invest in upfront learning for better long-term coordination versus deploying simpler, more interpretable routing rules that sacrifice some adaptability for immediate deployment and transparency.

Claimed Contributions

Lightweight coordinator for LLM orchestration

The authors introduce a lightweight coordination mechanism that uses a small language model (0.6B parameters) with a tiny head (under 20K total learnable parameters) to orchestrate multiple diverse LLMs. This coordinator extracts rich contextual signals from hidden states to make effective delegation decisions without requiring weight merging or architectural compatibility.
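A minimal sketch of how such a head could map the coordinator LM's hidden state to a joint (role, agent) decision. The hidden size (1024), the three-agent pool, and the single linear layer are illustrative assumptions; the paper specifies only a ≈0.6B LM plus a head on the order of 10–20K learnable parameters.

```python
import numpy as np

RNG = np.random.default_rng(0)

HIDDEN = 1024          # hidden size of the small coordinator LM (assumed)
ROLES = ["Thinker", "Worker", "Verifier"]
AGENTS = ["llm_a", "llm_b", "llm_c"]   # hypothetical pool of orchestrated LLMs

# Lightweight head: one linear map from the pooled hidden state to joint
# (role, agent) logits. 1024 * 9 = 9216 learnable weights, i.e. ~10K params.
W = RNG.standard_normal((HIDDEN, len(ROLES) * len(AGENTS))) * 0.01

def delegate(hidden_state: np.ndarray) -> tuple[str, str]:
    """Map the coordinator LM's hidden state to a (role, agent) decision."""
    logits = hidden_state @ W                  # joint logits over 9 pairs
    idx = int(np.argmax(logits))
    role = ROLES[idx // len(AGENTS)]
    agent = AGENTS[idx % len(AGENTS)]
    return role, agent

role, agent = delegate(RNG.standard_normal(HIDDEN))
print(role in ROLES and agent in AGENTS)   # True
```

Because the head consumes the LM's hidden state rather than raw text, it inherits the contextualization the abstract credits for effective delegation, while keeping the trainable parameter count tiny.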

10 retrieved papers
Tri-role coordination protocol

The authors propose a multi-turn coordination protocol where the coordinator assigns one of three distinct roles to selected LLMs: Thinker (for strategizing and planning), Worker (for execution), or Verifier (for evaluation). This design offloads complex skill acquisition from the coordinator to the orchestrated agents.
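The multi-turn protocol above can be sketched as a loop in which a policy assigns a role and an agent each turn. Here `call_llm`, the transcript handling, and the Verifier-based stopping rule are hypothetical stand-ins, not the paper's implementation.

```python
def call_llm(agent: str, role: str, context: str) -> str:
    # Stand-in for a real API call to the selected LLM.
    return f"[{agent}/{role}] response to: {context[-40:]}"

def coordinate(query: str, pick, max_turns: int = 6) -> str:
    """Run the coordinator loop: each turn, `pick` assigns a role and an
    agent given the transcript so far; the Verifier can end the episode."""
    transcript = query
    answer = ""
    for _ in range(max_turns):
        role, agent = pick(transcript)
        reply = call_llm(agent, role, transcript)
        transcript += "\n" + reply
        if role == "Worker":
            answer = reply              # latest execution result
        if role == "Verifier" and answer:
            break                       # assumed stop: answer was checked
    return answer

# Toy policy cycling through the three roles on a single agent.
cycle = iter(["Thinker", "Worker", "Verifier"] * 2)
result = coordinate("Sum 1..10?", lambda t: (next(cycle), "llm_a"))
print(result.startswith("[llm_a/Worker]"))   # True
```

The key property the sketch preserves is that the coordinator never solves the task itself: planning, execution, and checking are all delegated, so the hard skills live in the orchestrated LLMs.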

10 retrieved papers
Evolutionary strategy training methodology

The authors develop a training methodology using separable Covariance Matrix Adaptation Evolution Strategy (sep-CMA-ES) to optimize the coordinator. They provide theoretical and empirical evidence that this approach substantially outperforms reinforcement learning, imitation learning, and random search under high dimensionality and strict budget constraints.
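For intuition, the following is a simplified diagonal (separable) evolution strategy in the spirit of sep-CMA-ES: it keeps only per-coordinate variances (O(n) memory), which is what makes the approach tractable at the dimensionality of the coordinator's parameters. The learning rates and the toy sphere objective are illustrative choices, not the tuned sep-CMA-ES constants.

```python
import numpy as np

def sep_es_minimize(f, x0, sigma=0.3, popsize=16, iters=200, seed=0):
    """Simplified separable ES: diagonal covariance, rank-based selection,
    weighted recombination. Illustrative constants, not the paper's setup."""
    rng = np.random.default_rng(seed)
    n = len(x0)
    mean = np.asarray(x0, dtype=float)
    d = np.ones(n)                      # diagonal of the covariance matrix
    mu = popsize // 2
    w = np.log(mu + 0.5) - np.log(np.arange(1, mu + 1))
    w /= w.sum()                        # recombination weights
    for _ in range(iters):
        z = rng.standard_normal((popsize, n))
        x = mean + sigma * z * np.sqrt(d)              # sample offspring
        order = np.argsort([f(xi) for xi in x])[:mu]   # select the best mu
        zsel = z[order]
        mean = mean + sigma * np.sqrt(d) * (w @ zsel)  # move the mean
        # Rank-mu style update of the diagonal covariance only.
        d = 0.9 * d + 0.1 * (w @ zsel**2) * d
    return mean

best = sep_es_minimize(lambda x: float(np.sum(x**2)), x0=np.full(20, 3.0))
loss = float(np.sum(best**2))
print(loss < 180.0)   # improved over the start, where f(x0) = 180
```

Because it needs only fitness rankings, no gradients, a strategy like this can optimize the coordinator end-to-end through black-box LLM calls, which is the regime where the paper argues sep-CMA-ES beats RL and imitation learning under tight evaluation budgets.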

10 retrieved papers

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Within the taxonomy built over the current top-K core-task papers, the original paper is assigned to a leaf with no direct siblings and no cousin branches under the same grandparent topic. In this retrieved landscape, it appears structurally isolated, which is a partial signal of novelty, though still constrained by search coverage and taxonomy granularity.

Contribution Analysis

Detailed comparisons for each claimed contribution

Each of the three claimed contributions (the lightweight coordinator, the tri-role coordination protocol, and the evolutionary strategy training methodology) was compared against its ten retrieved candidate papers; as noted in the Overview, none of these comparisons surfaced a refutable overlap.