Task-Agnostic Amortized Multi-Objective Optimization

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: Multi-Objective Optimization, Bayesian Optimization, Transformers, Neural Processes
Abstract:

Balancing competing objectives is omnipresent across disciplines, from drug design to autonomous systems. Multi-objective Bayesian optimization is a promising solution for such expensive, black-box problems: it fits probabilistic surrogates and selects new designs via an acquisition function that balances exploration and exploitation. In practice, however, it requires tailored choices of surrogate and acquisition function that rarely transfer to the next problem, is myopic even though multi-step planning is often required, and incurs refitting overhead, particularly in parallel or time-sensitive loops. We present TAMO, a fully amortized, universal policy for multi-objective black-box optimization. TAMO uses a transformer architecture that operates across varying input and objective dimensions, enabling pretraining on diverse corpora and transfer to new problems without retraining: at test time, the pretrained model proposes the next design with a single forward pass. We pretrain the policy with reinforcement learning to maximize cumulative hypervolume improvement over full trajectories, conditioning on the entire query history to approximate the Pareto frontier. Across synthetic benchmarks and real tasks, TAMO produces fast proposals, reducing proposal time by 50–1000× versus alternatives while matching or improving Pareto quality under tight evaluation budgets. These results show that transformers can perform multi-objective optimization entirely in-context, eliminating per-task surrogate fitting and acquisition engineering, and they open a path to foundation-style, plug-and-play optimizers for scientific discovery workflows.

Disclaimer
This report is AI-generated using large language models and WisPaper (a scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While the system identifies potential overlaps and novel directions, its coverage is not exhaustive and its judgments are approximate. These results are intended to assist human reviewers and should not be relied upon as a definitive verdict on novelty.
Note that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

TAMO proposes a fully amortized transformer policy for multi-objective black-box optimization that operates across varying input and objective dimensions without per-task retraining. The paper sits in the 'Transformer-Based In-Context Multi-Objective Optimization' leaf, which contains only two papers including TAMO itself. This represents a relatively sparse research direction within the broader taxonomy, suggesting the work targets an emerging area where transformer-based amortized policies are applied to multi-objective settings with full in-context reasoning over query histories.

The taxonomy reveals that TAMO's immediate neighbors include 'Preferential Amortized Optimization' (learning from pairwise preferences) and more distant branches covering task-specific Bayesian methods, evolutionary algorithms, and domain-specific generative models. The scope note for TAMO's leaf explicitly excludes preferential feedback and single-objective approaches, positioning the work at the intersection of amortized policy learning and direct multi-objective evaluation. Nearby branches like 'Constrained and Scalable Bayesian Optimization' require per-task surrogate fitting, highlighting TAMO's departure from traditional Bayesian paradigms toward universal, pretrained policies.

Among the thirty candidates examined, the analysis found limited overlap with prior work. For the core amortized policy contribution (Contribution A), ten candidates were examined with zero refutations, and the dimension-agnostic architecture (Contribution B) likewise showed no clear prior work among its ten candidates. However, for the non-myopic trajectory-level reinforcement learning objective (Contribution C), one refutable candidate was identified among the ten examined, suggesting some existing work on multi-step planning in related optimization contexts. These statistics indicate that, within the limited search scope, most contributions appear relatively novel, though the trajectory-level RL framing has at least one overlapping precedent.

Based on the top-thirty semantic matches and taxonomy structure, TAMO appears to occupy a sparsely populated niche combining transformer amortization with multi-objective black-box optimization. The single sibling paper and limited refutations suggest novelty within the examined scope, though the analysis does not cover exhaustive literature or domain-specific evolutionary or Bayesian methods outside the semantic search radius. The trajectory-level RL component shows the most prior work overlap among the three contributions analyzed.

Taxonomy

Core-task Taxonomy Papers: 6
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Paper: 1

Research Landscape Overview

Core task: amortized multi-objective black-box optimization with transformers. The field addresses how to efficiently solve multiple optimization problems by learning reusable policies or models that generalize across tasks, rather than optimizing each problem from scratch. The taxonomy reveals four main branches: Amortized Policy Learning for Multi-Objective Optimization focuses on training neural policies (often transformer-based) that can propose solutions in-context or via learned mappings; Task-Specific Bayesian Optimization Frameworks emphasize surrogate modeling and acquisition strategies tailored to individual problem instances; Deep Learning-Enhanced Evolutionary Algorithms integrate neural components into population-based search; and Domain-Specific Generative Optimization targets specialized application areas such as molecular design or hardware synthesis.

Works like In-Context Multi-Objective[3] and Preferential Amortized[6] illustrate how transformers can be leveraged to handle multi-objective trade-offs directly, while approaches such as Efficient Scalable Bayesian[5] and PABBO[2] represent more classical Bayesian paradigms augmented with modern scalability techniques. A central tension across these branches is the trade-off between task-agnostic generalization and domain-specific performance: amortized methods promise rapid adaptation to new objectives but may sacrifice the fine-grained tuning that task-specific Bayesian or evolutionary methods provide.

Task-Agnostic Amortized[0] sits squarely within the transformer-based amortized policy learning branch, emphasizing in-context learning for multi-objective scenarios much like In-Context Multi-Objective[3]. Compared to In-Context Multi-Objective[3], which also explores transformer-driven multi-objective reasoning, Task-Agnostic Amortized[0] appears to push further on generalization across diverse black-box functions without requiring task-specific retraining. Meanwhile, works like Preferential Amortized[6] incorporate human preferences into the amortization process, and Transformer Multimodal[1] extends the paradigm to multimodal data, highlighting ongoing efforts to broaden the scope and applicability of learned optimization policies.

Claimed Contributions

TAMO: Fully amortized multi-objective optimization policy

The authors introduce TAMO, a transformer-based policy that performs multi-objective optimization through a single forward pass at test time, eliminating the need for per-task surrogate fitting and acquisition function optimization. The policy is pretrained using reinforcement learning to maximize cumulative hypervolume improvement over full trajectories.

10 retrieved papers
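To make the single-forward-pass interface concrete, here is a minimal NumPy sketch: a toy attention layer reads the query history (X, Y) and emits one new design, with no surrogate fitting or acquisition optimization in the loop. The class name, weight shapes, random initialization, and sigmoid squashing are illustrative assumptions, not TAMO's actual architecture; in TAMO the weights would come from RL pretraining.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

class ToyAmortizedPolicy:
    """Toy stand-in for a pretrained amortized policy: one attention layer
    over the query history, then a linear head that emits the next design."""

    def __init__(self, d_in, d_obj, d_model=16, seed=0):
        r = np.random.default_rng(seed)
        self.W_tok = r.normal(0, 0.1, (d_in + d_obj, d_model))  # token embed
        self.W_q = r.normal(0, 0.1, (d_model, d_model))
        self.W_k = r.normal(0, 0.1, (d_model, d_model))
        self.W_v = r.normal(0, 0.1, (d_model, d_model))
        self.W_out = r.normal(0, 0.1, (d_model, d_in))          # proposal head

    def propose(self, X, Y):
        """Single forward pass: (n, d_in) designs and (n, d_obj) objective
        values from the history -> one new design in [0, 1]^d_in."""
        H = np.concatenate([X, Y], axis=1) @ self.W_tok         # (n, d_model)
        q = H.mean(axis=0, keepdims=True) @ self.W_q            # summary query
        att = softmax(q @ (H @ self.W_k).T / np.sqrt(H.shape[1]))
        ctx = att @ (H @ self.W_v)                              # (1, d_model)
        z = ctx @ self.W_out                                    # (1, d_in)
        return 1.0 / (1.0 + np.exp(-z[0]))                      # sigmoid -> box

rng = np.random.default_rng(0)
X = rng.uniform(size=(8, 3))   # 8 evaluated designs in [0, 1]^3
Y = rng.uniform(size=(8, 2))   # 2 objective values per design
policy = ToyAmortizedPolicy(d_in=3, d_obj=2)
x_next = policy.propose(X, Y)  # one forward pass, no per-task refitting
```

The point of the sketch is the interface: the entire history is consumed in-context, so proposing the next design costs one forward pass rather than a surrogate fit plus an inner acquisition optimization.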
Dimension-agnostic transformer architecture

The authors develop a novel transformer architecture with a dimension-aggregating embedder that can handle varying input and output dimensionalities. This enables the model to be pretrained on heterogeneous tasks and transfer to new problems with different dimensions without requiring retraining.

10 retrieved papers
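The report does not spell out the embedder's mechanics, but one common recipe for dimension-agnostic embedding, shown below purely as an assumption, is to lift each scalar coordinate with shared weights and then pool across coordinates, so the same parameters accept inputs of any dimensionality.

```python
import numpy as np

class DimensionAggregatingEmbedder:
    """Hypothetical sketch of a dimension-aggregating embedder: each scalar
    coordinate is lifted with shared weights, then coordinates are pooled
    by mean, so one set of parameters handles any input dimension d."""

    def __init__(self, d_model=16, seed=0):
        r = np.random.default_rng(seed)
        self.w = r.normal(0, 0.5, (1, d_model))  # shared per-coordinate lift
        self.b = np.zeros(d_model)

    def __call__(self, x):
        # x: (d,) -> per-coordinate features (d, d_model) -> pooled (d_model,)
        feats = np.tanh(x[:, None] @ self.w + self.b)
        return feats.mean(axis=0)  # size-invariant pooling

emb = DimensionAggregatingEmbedder()
e3 = emb(np.array([0.1, 0.5, 0.9]))  # 3-D input
e7 = emb(np.full(7, 0.5))            # 7-D input, same parameters
assert e3.shape == e7.shape == (16,)
```

Note that plain mean pooling discards coordinate identity; practical designs of this kind typically add per-dimension positional information before pooling.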
Non-myopic trajectory-level reinforcement learning objective

The authors formulate the optimization problem as a Markov decision process and train the policy using REINFORCE to optimize hypervolume-based rewards over entire trajectories rather than single-step gains. This encourages long-horizon planning instead of myopic one-step optimization typical in traditional acquisition functions.

10 retrieved papers
Can Refute
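The trajectory-level objective can be illustrated with a small sketch: per-step rewards are hypervolume improvements r_t = HV(Y_1..t) - HV(Y_1..t-1), so the return telescopes to the final hypervolume, and REINFORCE would weight the log-probability gradient of each proposal by this return. The two-objective, minimization-form hypervolume routine below is a standard construction, not code from the paper.

```python
import numpy as np

def hypervolume_2d(Y, ref):
    """Hypervolume dominated by points Y w.r.t. reference point ref
    (two objectives, both minimized)."""
    P = Y[np.all(Y <= ref, axis=1)]
    if len(P) == 0:
        return 0.0
    P = P[np.argsort(P[:, 0])]           # sort by first objective
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in P:
        if f2 < prev_f2:                 # dominated points add no area
            hv += (ref[0] - f1) * (prev_f2 - f2)
            prev_f2 = f2
    return hv

def hvi_rewards(trajectory, ref):
    """Per-step hypervolume improvement: r_t = HV(Y_{1..t}) - HV(Y_{1..t-1})."""
    rewards, prev_hv = [], 0.0
    for t in range(1, len(trajectory) + 1):
        hv = hypervolume_2d(np.array(trajectory[:t]), ref)
        rewards.append(hv - prev_hv)
        prev_hv = hv
    return np.array(rewards)

ref = np.array([1.0, 1.0])
traj = [(0.8, 0.2), (0.2, 0.8), (0.4, 0.4), (0.9, 0.9)]  # evaluated objectives
r = hvi_rewards(traj, ref)
# The return telescopes to the final hypervolume of the whole trajectory:
assert np.isclose(r.sum(), hypervolume_2d(np.array(traj), ref))
```

Because the return is trajectory-level, a dominated query (the last point above, with reward 0) is penalized only through the opportunity cost of the whole trajectory, which is what pushes the policy toward long-horizon rather than greedy one-step behavior.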

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

TAMO: Fully amortized multi-objective optimization policy

Contribution

Dimension-agnostic transformer architecture

Contribution

Non-myopic trajectory-level reinforcement learning objective