Trinity: An Evolved LLM Coordinator

ICLR 2026 Conference Submission · Anonymous Authors
Keywords: evolutionary strategies, multi-agent LLM systems, role-based delegation, logits-to-agent mapping
Abstract:

Combining diverse foundation models is promising, but weight-merging is limited by mismatched architectures and closed APIs. Trinity addresses this with a lightweight coordinator that orchestrates collaboration among large language models (LLMs). The coordinator, comprising a compact language model (≈0.6B parameters) and a lightweight head (≈10K parameters), is optimized with an evolutionary strategy for efficient and adaptive delegation. Trinity processes queries over multiple turns, where at each turn the coordinator assigns one of three roles (Thinker, Worker, or Verifier) to a selected LLM, effectively offloading complex skill acquisition from the coordinator itself. Extensive experiments demonstrate that Trinity consistently outperforms individual models and existing methods across tasks including coding, math, reasoning, and domain knowledge, while generalizing robustly to out-of-distribution tasks. On established benchmarks, Trinity achieves state-of-the-art performance, including a new record of 86.2% on LiveCodeBench. Theoretical and empirical analyses highlight two key factors driving this success: (1) the coordinator's hidden-state representations provide rich contextualization of inputs, and (2) under high dimensionality and strict budget constraints, the separable Covariance Matrix Adaptation Evolution Strategy (sep-CMA-ES) algorithm provides substantial advantages over RL, imitation learning, and random search, leveraging potential block-ε-separability.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

Trinity proposes a lightweight coordinator (approximately 0.6B parameters plus a 10K-parameter head) that orchestrates collaboration among multiple LLMs through dynamic role assignment across Thinker, Worker, and Verifier functions. The paper positions itself within the 'Evolved and Learned Coordination Strategies' leaf of the taxonomy, which currently contains this paper alone, with no siblings. This leaf sits under the broader 'LLM Routing and Selection' branch, indicating a relatively sparse research direction focused specifically on adaptive coordination policies learned through evolutionary or reinforcement-based methods rather than static routing heuristics.

The taxonomy reveals that Trinity's approach bridges multiple neighboring research areas. It shares conceptual territory with 'Dynamic and Adaptive Orchestration' under centralized architectures, which includes systems that adjust strategies at runtime, and with 'Hierarchical and Role-Based Coordination', which emphasizes explicit role differentiation. However, Trinity diverges by using evolutionary strategies for policy optimization rather than predefined workflows or hierarchical control structures. The 'Query-Specific Model Selection and Routing' leaf contains methods for dynamic LLM selection, but these typically lack the multi-turn, role-based coordination protocol that Trinity employs. This positioning suggests Trinity occupies a niche intersection between adaptive routing and structured multi-agent collaboration.

Among the thirty candidates examined across three contributions, none were identified as clearly refuting Trinity's core claims. The 'Lightweight coordinator for LLM orchestration' contribution examined ten candidates with zero refutable overlaps, as did the 'Tri-role coordination protocol' and 'Evolutionary strategy training methodology' contributions. This absence of refutation reflects the limited search scope rather than definitive novelty: the analysis covers top-K semantic matches and citation expansion, not an exhaustive survey. The tri-role protocol and evolutionary training appear particularly distinctive within this sample, though the small candidate pool and sparse taxonomy leaf suggest these areas remain underexplored in the broader literature.

Given the limited thirty-candidate search and the paper's placement in a singleton taxonomy leaf, the analysis suggests Trinity introduces mechanisms not prominently represented in the examined prior work. However, the sparse population of the 'Evolved and Learned Coordination Strategies' category and the absence of sibling papers indicate this assessment is based on a narrow slice of the literature. A more comprehensive search across adjacent branches—particularly in dynamic orchestration and hierarchical coordination—would be necessary to fully contextualize Trinity's contributions against the field's complete landscape.

Taxonomy

Core-task Taxonomy Papers: 49
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 0

Research Landscape Overview

Core task: orchestrating collaboration among large language models through lightweight coordination. The field has evolved into a rich landscape of approaches that balance centralized control with distributed autonomy. Centralized orchestration architectures, exemplified by frameworks like MetaGPT[1] and hierarchical designs, provide structured coordination through explicit role assignment and workflow management. In contrast, decentralized and distributed coordination explores emergent collaboration patterns, as seen in Internet of Agents[9] and self-organizing networks. LLM routing and selection mechanisms address the challenge of dynamically choosing among models or strategies, with evolved and learned coordination strategies representing a particularly active subfield where systems adapt their orchestration policies over time. Function and tool orchestration focuses on managing external capabilities, while multimodal and cross-domain orchestration extends coordination across diverse data types and application boundaries. Resource-constrained deployment branches, including edge-focused work like Llmind IoT[3], tackle efficiency under hardware limitations, complemented by inference optimization techniques that reduce computational overhead.

Within the routing and selection landscape, a key tension emerges between hand-crafted heuristics and adaptive strategies that learn from experience. Evolved and learned coordination strategies represent a frontier where systems like Evolving Orchestration[2] and MARCO[5] develop dynamic routing policies, contrasting with static assignment schemes. Trinity[0] sits squarely in this evolved coordination cluster, emphasizing lightweight mechanisms that adapt collaboration patterns without heavy training overhead. Compared to Training-Free Orchestration[4], which avoids learning entirely, Trinity[0] occupies a middle ground by incorporating evolutionary or feedback-driven adjustments while maintaining computational efficiency.
This positioning reflects broader trade-offs in the field: whether to invest in upfront learning for better long-term coordination versus deploying simpler, more interpretable routing rules that sacrifice some adaptability for immediate deployment and transparency.

Claimed Contributions

Lightweight coordinator for LLM orchestration

The authors introduce a lightweight coordination mechanism that uses a small language model (0.6B parameters) with a tiny head (under 20K total learnable parameters) to orchestrate multiple diverse LLMs. This coordinator extracts rich contextual signals from hidden states to make effective delegation decisions without requiring weight merging or architectural compatibility.
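A minimal sketch of how such a head could map the coordinator LM's hidden state to a joint (role, agent) decision. The hidden size (1024), the three-agent pool, and the single linear layer are illustrative assumptions; the paper specifies only a ≈0.6B LM plus a head on the order of 10–20K learnable parameters.

```python
import numpy as np

RNG = np.random.default_rng(0)

HIDDEN = 1024          # hidden size of the small coordinator LM (assumed)
ROLES = ["Thinker", "Worker", "Verifier"]
AGENTS = ["llm_a", "llm_b", "llm_c"]   # hypothetical pool of orchestrated LLMs

# Lightweight head: one linear map from the pooled hidden state to joint
# (role, agent) logits. 1024 * 9 = 9216 learnable weights, i.e. ~10K params.
W = RNG.standard_normal((HIDDEN, len(ROLES) * len(AGENTS))) * 0.01

def delegate(hidden_state: np.ndarray) -> tuple[str, str]:
    """Map the coordinator LM's hidden state to a (role, agent) decision."""
    logits = hidden_state @ W                  # joint logits over 9 pairs
    idx = int(np.argmax(logits))
    role = ROLES[idx // len(AGENTS)]
    agent = AGENTS[idx % len(AGENTS)]
    return role, agent

role, agent = delegate(RNG.standard_normal(HIDDEN))
print(role in ROLES and agent in AGENTS)   # True
```

Because the head consumes the LM's hidden state rather than raw text, it inherits the contextualization the abstract credits for effective delegation, while keeping the trainable parameter count tiny.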

10 retrieved papers
Tri-role coordination protocol

The authors propose a multi-turn coordination protocol where the coordinator assigns one of three distinct roles to selected LLMs: Thinker (for strategizing and planning), Worker (for execution), or Verifier (for evaluation). This design offloads complex skill acquisition from the coordinator to the orchestrated agents.
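The multi-turn protocol above can be sketched as a loop in which a policy assigns a role and an agent each turn. Here `call_llm`, the transcript handling, and the Verifier-based stopping rule are hypothetical stand-ins, not the paper's implementation.

```python
def call_llm(agent: str, role: str, context: str) -> str:
    # Stand-in for a real API call to the selected LLM.
    return f"[{agent}/{role}] response to: {context[-40:]}"

def coordinate(query: str, pick, max_turns: int = 6) -> str:
    """Run the coordinator loop: each turn, `pick` assigns a role and an
    agent given the transcript so far; the Verifier can end the episode."""
    transcript = query
    answer = ""
    for _ in range(max_turns):
        role, agent = pick(transcript)
        reply = call_llm(agent, role, transcript)
        transcript += "\n" + reply
        if role == "Worker":
            answer = reply              # latest execution result
        if role == "Verifier" and answer:
            break                       # assumed stop: answer was checked
    return answer

# Toy policy cycling through the three roles on a single agent.
cycle = iter(["Thinker", "Worker", "Verifier"] * 2)
result = coordinate("Sum 1..10?", lambda t: (next(cycle), "llm_a"))
print(result.startswith("[llm_a/Worker]"))   # True
```

The key property the sketch preserves is that the coordinator never solves the task itself: planning, execution, and checking are all delegated, so the hard skills live in the orchestrated LLMs.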

10 retrieved papers
Evolutionary strategy training methodology

The authors develop a training methodology using separable Covariance Matrix Adaptation Evolution Strategy (sep-CMA-ES) to optimize the coordinator. They provide theoretical and empirical evidence that this approach substantially outperforms reinforcement learning, imitation learning, and random search under high dimensionality and strict budget constraints.
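For intuition, the following is a simplified diagonal (separable) evolution strategy in the spirit of sep-CMA-ES: it keeps only per-coordinate variances (O(n) memory), which is what makes the approach tractable at the dimensionality of the coordinator's parameters. The learning rates and the toy sphere objective are illustrative choices, not the tuned sep-CMA-ES constants.

```python
import numpy as np

def sep_es_minimize(f, x0, sigma=0.3, popsize=16, iters=200, seed=0):
    """Simplified separable ES: diagonal covariance, rank-based selection,
    weighted recombination. Illustrative constants, not the paper's setup."""
    rng = np.random.default_rng(seed)
    n = len(x0)
    mean = np.asarray(x0, dtype=float)
    d = np.ones(n)                      # diagonal of the covariance matrix
    mu = popsize // 2
    w = np.log(mu + 0.5) - np.log(np.arange(1, mu + 1))
    w /= w.sum()                        # recombination weights
    for _ in range(iters):
        z = rng.standard_normal((popsize, n))
        x = mean + sigma * z * np.sqrt(d)              # sample offspring
        order = np.argsort([f(xi) for xi in x])[:mu]   # select the best mu
        zsel = z[order]
        mean = mean + sigma * np.sqrt(d) * (w @ zsel)  # move the mean
        # Rank-mu style update of the diagonal covariance only.
        d = 0.9 * d + 0.1 * (w @ zsel**2) * d
    return mean

best = sep_es_minimize(lambda x: float(np.sum(x**2)), x0=np.full(20, 3.0))
loss = float(np.sum(best**2))
print(loss < 180.0)   # improved over the start, where f(x0) = 180
```

Because it needs only fitness rankings, no gradients, a strategy like this can optimize the coordinator end-to-end through black-box LLM calls, which is the regime where the paper argues sep-CMA-ES beats RL and imitation learning under tight evaluation budgets.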

10 retrieved papers

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Within the taxonomy built over the current top-K core-task papers, the original paper is assigned to a leaf with no direct siblings and no cousin branches under the same grandparent topic. In this retrieved landscape, it appears structurally isolated, which is a partial signal of novelty, though still constrained by search coverage and taxonomy granularity.

Contribution Analysis

Detailed comparisons for each claimed contribution

Each of the three claimed contributions (the lightweight coordinator, the tri-role coordination protocol, and the evolutionary strategy training methodology) was compared against its ten retrieved candidate papers; as noted in the Overview, none of these comparisons surfaced a refutable overlap.