Task Tokens: A Flexible Approach to Adapting Behavior Foundation Models

ICLR 2026 Conference Submission (Anonymous Authors)
Keywords: Reinforcement Learning, Hierarchical Reinforcement Learning, Behavior Foundation Models, Humanoid Control
Abstract:

Recent advancements in imitation learning for robotic control have led to transformer-based behavior foundation models (BFMs) that enable multi-modal, human-like control for humanoid agents. These models generate solutions when conditioned on high-level goals or prompts, for example, walking to a coordinate when conditioned on the position of the robot's pelvis. While excelling at zero-shot generation of robust behaviors, BFMs often require meticulous prompt engineering for specific tasks, potentially yielding suboptimal results. In this work, we introduce "Task Tokens", a method to effectively tailor BFMs to specific tasks while preserving their flexibility. Our approach integrates naturally within the transformer architecture of BFMs. Task Tokens trains a task-specific encoder (tokenizer), with the original BFM remaining untouched. Our method reduces trainable parameters per task by up to ×125 and converges up to ×6 faster compared to standard baselines. In addition, by keeping the original BFM unchanged, Task Tokens enables utilizing the pre-existing encoders. This allows incorporating user-defined priors, balancing reward design and prompt engineering. We demonstrate Task Tokens' efficacy across various tasks, including out-of-distribution scenarios, and show their compatibility with other prompting modalities. Our results suggest that Task Tokens offer a promising approach for adapting BFMs to specific control tasks while retaining their generalization capabilities.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces Task Tokens, a method for adapting behavior foundation models (BFMs) to specific tasks by training a task-specific encoder while keeping the original BFM frozen. Within the taxonomy, this work resides in the Prompt-Based and Token-Based Adaptation leaf, which contains only three papers total. This is a relatively sparse research direction compared to broader branches like Domain-Specific Adaptation Applications or Full Model Fine-Tuning. The sibling papers in this leaf explore related prompt-based mechanisms, suggesting that token-based adaptation for behavioral control is an emerging but not yet crowded area.

The taxonomy reveals that Task Tokens sits at the intersection of Parameter-Efficient Adaptation Methods and Behavioral Foundation Models. Neighboring leaves include Memory-Efficient and Zeroth-Order Optimization (which addresses forward-only adaptation) and Humanoid and Robotic Control (which focuses on whole-body control architectures). The scope note for Prompt-Based Adaptation explicitly excludes methods that update model weights, positioning Task Tokens as a pure conditioning approach. This distinguishes it from full fine-tuning branches and aligns it with works that manipulate input representations rather than internal parameters.

Among the three contributions analyzed, the parameter-efficiency claim examined ten candidates and found six potentially refutable prior works, indicating substantial overlap with existing parameter-efficient methods in the broader literature. The core Task Tokens mechanism examined five candidates with zero refutations, suggesting greater novelty in the specific application to behavioral control. The hybrid control paradigm examined ten candidates with no refutations, though this may reflect the limited search scope (twenty-five total candidates) rather than definitive novelty. The analysis does not claim exhaustive coverage of all relevant prior work.

Based on the limited search scope, Task Tokens appears to occupy a relatively sparse niche within prompt-based adaptation for behavioral foundation models. The parameter-efficiency aspect shows more overlap with existing techniques, while the application to humanoid control and the hybrid control paradigm appear less explored. The analysis reflects top-K semantic matches and does not guarantee comprehensive coverage of all related work in robotics or transformer-based control.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 25
Refutable Papers: 6

Research Landscape Overview

Core task: adapting behavior foundation models to specific tasks. The field has organized itself around several major branches that reflect different strategic emphases. Parameter-Efficient Adaptation Methods explore lightweight techniques—such as prompt-based and token-based approaches (e.g., Task Tokens[0], Self-regulating Prompts[4])—that modify only a small subset of parameters or inject learnable tokens to steer pre-trained models toward new objectives. Full Model Fine-Tuning encompasses end-to-end retraining strategies, including works that align models with human preferences (Human Preferences Fine-tuning[19]) or address domain-specific constraints (Fault Diagnosis Fine-tuning[5]). Domain-Specific Adaptation Applications demonstrate how foundation models are tailored to specialized contexts—ranging from health coaching (Health Coaching LLMs[7], Physical Activity Coaching[17]) and pathology (Pathology Foundation Survey[24], Free Lunch Pathology[25]) to robotics (Robotics Foundation Models[36]) and animal behavior analysis (Animal Behavior Vision[12], Elephant Vocalization Transfer[18]). Transfer Learning and Generalization investigates how knowledge acquired in one setting generalizes to new environments (Transfer Learning Code[11], Zero-Shot Dynamics Adaptation[35]), while Behavioral Foundation Models and Computer Vision Foundation Models address the architectures and pre-training regimes that underpin these systems. Federated and Distributed Adaptation (Federated Foundation Adaptation[9]) and Foundation Model Vulnerabilities and Security (Model Stealing Threats[16], Fine-tuning Compromises Safety[8]) round out the taxonomy by considering deployment constraints and adversarial risks. 
Across these branches, a recurring tension emerges between efficiency and expressiveness: parameter-efficient methods promise rapid, low-cost adaptation but may sacrifice task-specific performance, whereas full fine-tuning can achieve stronger alignment at the expense of computational overhead and potential safety degradation (Fine-tuning Compromises Safety[8]). Task Tokens[0] sits squarely within the Prompt-Based and Token-Based Adaptation cluster, proposing a mechanism to inject task-specific information without retraining the entire backbone—an approach closely related to Self-regulating Prompts[4] and Personalized Sequential Prompt[28], which similarly manipulate input representations to guide model behavior. Compared to Forward Pass Fine-tuning[3], which modifies activations during inference, Task Tokens[0] emphasizes learnable token embeddings that can be optimized offline and then deployed with minimal runtime cost. This positioning highlights an active line of inquiry: how to balance the modularity and scalability of prompt-based methods with the need for task-specific expressiveness, a question that also motivates recent work on parameter-efficient fine-tuning (Parameter-efficient Fine-tuning[29]) and domain-specific applications (Adapting LLMs Downstream[6]).

Claimed Contributions

Task Tokens method for adapting behavior foundation models

The authors propose Task Tokens, a novel approach that trains a task-specific encoder (tokenizer) to generate specialized token representations for each new task, while keeping the original behavior foundation model frozen. This enables task-specific adaptation without fine-tuning the entire foundation model, preserving its zero-shot capabilities and generalization.

5 retrieved papers
Parameter-efficient and fast-converging adaptation approach

The method achieves significant efficiency gains by requiring only approximately 200K trainable parameters per task (compared to millions in baseline methods) and demonstrates faster convergence during training. This makes the approach highly scalable for adapting foundation models to multiple downstream tasks.

10 retrieved papers
Can Refute
Hybrid control paradigm combining user-defined priors and learned optimization

The approach establishes a hybrid control framework where users can provide high-level behavioral priors via goals (such as "walk toward an object while facing forward"), which are then enhanced by task-specific embeddings learned through reinforcement learning to optimize dense rewards. This integration leverages the tokenization framework of goal-conditioned behavior foundation models.

10 retrieved papers

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Task Tokens method for adapting behavior foundation models

The authors propose Task Tokens, a novel approach that trains a task-specific encoder (tokenizer) to generate specialized token representations for each new task, while keeping the original behavior foundation model frozen. This enables task-specific adaptation without fine-tuning the entire foundation model, preserving its zero-shot capabilities and generalization.
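As a rough illustration (not the authors' implementation), the mechanism described above can be sketched as a small trainable encoder whose output token is prepended to the prompt sequence of a frozen backbone. All dimensions below, the tanh encoder, and the linear stand-in for the transformer BFM are assumptions for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

D = 64    # token embedding width (assumed)
OBS = 32  # task observation size (assumed)

# Frozen BFM stand-in: a fixed linear read-out over the token sequence.
# The real method uses a pretrained transformer policy here.
W_bfm = rng.standard_normal((D, D)) / np.sqrt(D)  # frozen, never updated

# Trainable task tokenizer: maps a task observation to one extra token.
W_tok = rng.standard_normal((OBS, D)) * 0.01      # the only trainable weights

def forward(prompt_tokens: np.ndarray, task_obs: np.ndarray) -> np.ndarray:
    """Prepend the learned task token to the prompt, run the frozen model."""
    task_token = np.tanh(task_obs @ W_tok)         # shape (D,)
    seq = np.vstack([task_token, prompt_tokens])   # shape (T+1, D)
    return seq.mean(axis=0) @ W_bfm                # stand-in for the BFM output

prompt = rng.standard_normal((3, D))  # e.g. goal-conditioning tokens
obs = rng.standard_normal(OBS)
out = forward(prompt, obs)
print(out.shape)  # (64,)
```

Only `W_tok` would receive gradients during task adaptation; the frozen backbone is shared across all tasks, which is what preserves its zero-shot behavior.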

Contribution

Parameter-efficient and fast-converging adaptation approach

The method achieves significant efficiency gains by requiring only approximately 200K trainable parameters per task (compared to millions in baseline methods) and demonstrates faster convergence during training. This makes the approach highly scalable for adapting foundation models to multiple downstream tasks.
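The ×125 figure quoted in the abstract, combined with the ~200K trainable parameters stated here, implies a baseline of roughly 25M trained parameters. A quick sanity check of that arithmetic, with a hypothetical MLP tokenizer whose layer sizes are chosen only to land in the reported ballpark:

```python
# Rough parameter accounting for the efficiency claim.
tokenizer_params = 200_000       # reported order of magnitude per task
backbone_params = 25_000_000     # hypothetical baseline implied by the x125 figure

reduction = backbone_params / tokenizer_params
print(f"x{reduction:.0f} fewer trainable parameters")  # x125

def mlp_param_count(dims):
    """Weights plus biases for a fully connected stack of the given widths."""
    return sum(d_in * d_out + d_out for d_in, d_out in zip(dims, dims[1:]))

# Hypothetical tokenizer shape: obs -> hidden -> token embedding.
print(mlp_param_count([358, 384, 256]))  # ~0.24M, the right order of magnitude
```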

Contribution

Hybrid control paradigm combining user-defined priors and learned optimization

The approach establishes a hybrid control framework where users can provide high-level behavioral priors via goals (such as walk toward object while facing forward), which are then enhanced by task-specific embeddings learned through reinforcement learning to optimize dense rewards. This integration leverages the tokenization framework of goal-conditioned behavior foundation models.
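A minimal sketch of the hybrid paradigm, under stated assumptions: the user-defined prior and the learned task token are both fed to a frozen backbone, and only the task token is optimized against a dense reward. Random search stands in for the RL algorithm, and the linear backbone, goal vector, and reward are all illustrative stand-ins:

```python
import numpy as np

rng = np.random.default_rng(1)
D = 16  # embedding width (assumed)

W_bfm = rng.standard_normal((D, D)) / np.sqrt(D)  # frozen backbone stand-in
W_frozen_copy = W_bfm.copy()                      # kept to verify it never changes

user_goal = rng.standard_normal(D)  # user-defined prior, e.g. "walk toward object"
target = rng.standard_normal(D)     # stand-in for the dense-reward optimum

def reward(task_token: np.ndarray) -> float:
    # Backbone conditioned on both the user prior and the learned token.
    out = (user_goal + task_token) @ W_bfm
    return -float(np.linalg.norm(out - target))   # dense reward: negative distance

# Random-search stand-in for RL: only the task token is ever updated.
task_token = np.zeros(D)
best = reward(task_token)
for _ in range(300):
    cand = task_token + 0.1 * rng.standard_normal(D)
    r = reward(cand)
    if r > best:
        task_token, best = cand, r

print(best >= reward(np.zeros(D)))  # True: learned token never hurts the prior
```

The design point this illustrates is the division of labor the contribution claims: the user supplies coarse intent through the goal prompt, while the learned embedding absorbs the fine-grained reward shaping, with the backbone untouched throughout.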