Adaptive Social Learning via Mode Policy Optimization for Language Agents

ICLR 2026 Conference Submission
Anonymous Authors
Social Intelligence, Large Language Models, Adaptive Social Learning
Abstract:

Effective social intelligence simulation requires language agents to dynamically adjust reasoning depth, a capability notably absent in current studies. Existing methods either lack explicit reasoning or employ lengthy Chain-of-Thought reasoning uniformly across all scenarios, resulting in excessive token usage and inflexible social behaviors in tasks such as negotiation or collaboration. To address this, we propose an Adaptive Social Learning (ASL) framework in this paper, aiming to improve the adaptive reasoning ability of language agents in dynamic social interactions. To this end, we first identify the hierarchical reasoning modes under such context, ranging from intuitive response to deep deliberation based on the cognitive control theory. We then develop the Adaptive Mode Policy Optimization (AMPO) algorithm to learn the context-aware mode adaptation and reasoning. Our framework advances existing research in three key aspects: (1) Multi-granular reasoning mode design, (2) Context-aware mode switching in rich social interaction, and (3) Token-efficient reasoning with depth adaptation. Extensive experiments on the benchmark social intelligence environment verify that ASL achieves 15.6% higher task performance than GPT-4o. Notably, our AMPO outperforms GRPO by 7.0% with 32.8% shorter thinking chains, demonstrating the advantages of our AMPO and the learned adaptive reasoning ability over GRPO's solution.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes an Adaptive Social Learning (ASL) framework enabling language agents to dynamically adjust reasoning depth in social interactions, from intuitive responses to deep deliberation. It resides in the Language Agent Adaptive Reasoning Systems leaf, which contains only two papers total. This sparse population suggests the specific combination of language-model-based agents with adaptive reasoning depth in social tasks represents an emerging rather than crowded research direction within the broader computational agent frameworks branch.

The taxonomy reveals that neighboring leaves address trajectory prediction, embodied agents, and computational models of social norms, but none explicitly tackle adaptive reasoning depth in language-based social agents. The broader Computational Agent Frameworks branch contrasts sharply with the Human Cognitive and Social Processes branch, which contains nine papers on cognitive flexibility in educational contexts alone. This structural asymmetry indicates that while human adaptive reasoning is well-studied, computational implementations for language agents remain relatively underexplored, particularly those integrating hierarchical reasoning modes with context-aware switching.

Among twenty-six candidates examined, the ASL framework contribution shows one refutable candidate from ten examined, while the AMPO algorithm and hierarchical reasoning modes show zero refutations from six and ten candidates respectively. The limited search scope means these statistics reflect top-K semantic matches rather than exhaustive coverage. The AMPO algorithm and reasoning mode design appear more novel within this bounded search, whereas the broader ASL framework concept encounters at least one overlapping prior work among the examined candidates.
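The per-contribution counts quoted above can be cross-checked with a few lines of arithmetic. The sketch below only restates the numbers given in this report (ten, six, and ten candidates, with one refutable candidate under the ASL framework). The dictionary layout and variable names are illustrative and not part of the assessment pipeline.

```python
# (candidates examined, refutable candidates) per claimed contribution,
# copied from the figures stated in this report.
candidates = {
    "ASL framework": (10, 1),
    "AMPO algorithm": (6, 0),
    "Hierarchical reasoning modes": (10, 0),
}

total_examined = sum(examined for examined, _ in candidates.values())
total_refutable = sum(refutable for _, refutable in candidates.values())

print(total_examined, total_refutable)
```

The totals agree with the twenty-six candidates and the single refutable paper reported for the search as a whole.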

Based on the top-twenty-six semantic matches and taxonomy structure, the work addresses a sparsely populated research direction with limited direct prior work. The analysis does not cover the full literature landscape, particularly domain-specific applications or recent preprints outside the search scope. The hierarchical reasoning modes and token-efficient adaptation appear to offer substantive contributions, though the framework-level novelty is tempered by at least one identified overlap.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 26
Refutable Paper: 1

Research Landscape Overview

Core task: adaptive reasoning in dynamic social interactions. This field examines how agents (computational, human, or animal) adjust their cognitive strategies and behaviors in response to evolving social contexts. The taxonomy reveals six major branches that together capture the breadth of this challenge. Computational Agent Frameworks for Social Interaction focus on building artificial systems capable of flexible reasoning and learning in multi-agent environments, often leveraging language models and reinforcement learning architectures such as Adaptive Mobile Agent[3] and Adaptive Thinking Mode[1]. Human Cognitive and Social Processes explore psychological mechanisms underlying flexibility, including cognitive flexibility training interventions (Cognitive Flexibility Training[11], Cognitive Flexibility Support[10]) and the interplay between emotion regulation, mental flexibility, and social competence (Emotion Regulation Mediation[2], Cognitive Flexibility Anxiety[24]). Neuroscience and Biological Mechanisms investigate neural substrates and developmental factors, such as hippocampal contributions to social learning (Hippocampus Social Learning[14]) and the impact of early stress on adaptive capacities (Early Stress Impairment[18]). Theoretical and Methodological Frameworks provide formal models, including Bayesian approaches to joint action (Bayesian Joint Action[43]) and complex adaptive systems theory (Complex Adaptive Systems[36]). Applied and Domain-Specific Interaction Studies address real-world settings like therapeutic responsiveness (Therapist Interpersonal Responsiveness[27]) and driver interactions (Driver Social Interactions[23]), while Animal and Comparative Studies examine adaptive foraging and environmental enrichment effects (Adaptive Social Foraging[41], Environmental Enrichment Flexibility[45]).
Several active lines of work highlight key trade-offs and open questions. One prominent theme contrasts top-down cognitive training interventions aimed at enhancing flexibility with bottom-up investigations of how environmental and affective factors shape adaptive capacities, raising questions about the relative malleability of these processes across development and contexts. Another tension emerges between formal computational models that seek to capture reasoning dynamics in tractable frameworks and empirical studies documenting the messy, context-dependent nature of real social interactions. Adaptive Social Learning[0] sits squarely within the Computational Agent Frameworks branch, specifically among Language Agent Adaptive Reasoning Systems. Its emphasis on learning-driven adaptation in social contexts aligns closely with Adaptive Thinking Mode[1], which similarly explores how agents modulate reasoning strategies. Compared to more domain-specific applied work or neuroscience-focused studies, Adaptive Social Learning[0] prioritizes the design of general-purpose computational architectures that can flexibly adjust to diverse social scenarios, positioning it as a bridge between theoretical models of adaptive reasoning and practical agent deployment.

Claimed Contributions

Adaptive Social Learning (ASL) framework for language agents

The authors introduce ASL, a novel framework that enables language agents to dynamically adjust their reasoning depth in social interactions. It combines hierarchical reasoning modes inspired by cognitive control theory with reinforcement learning to achieve context-aware adaptive reasoning in dynamic social environments.

10 retrieved papers (can refute)

Adaptive Mode Policy Optimization (AMPO) algorithm

The authors develop AMPO, a reinforcement learning algorithm that incorporates both mode-level and sample-level information into advantage estimation. This enables context-aware reasoning mode switching while improving token efficiency and flexible inference in social interactions.
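To make the idea of combining mode-level and sample-level information concrete, here is a minimal sketch. It is not the authors' AMPO formulation, which this report does not reproduce. It assumes a GRPO-style group-normalized reward as the sample-level term and a normalized per-mode mean reward as the mode-level term, blended by a hypothetical `mode_weight`. The function names and the blending rule are illustrative assumptions.

```python
from collections import defaultdict
from statistics import mean, pstdev


def group_normalize(values, eps=1e-8):
    """Center values on their mean and scale by their population std."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / (sigma + eps) for v in values]


def mode_aware_advantages(modes, rewards, mode_weight=0.5):
    """Blend a mode-level and a sample-level advantage for each rollout.

    ASSUMPTION: this blending rule is a stand-in for illustration only;
    it is not taken from the paper.
    """
    # Sample-level term: each rollout compared with the whole group.
    sample_adv = group_normalize(rewards)

    # Mode-level term: mean reward of each reasoning mode, normalized
    # across the modes that appear in the group.
    rewards_by_mode = defaultdict(list)
    for mode, reward in zip(modes, rewards):
        rewards_by_mode[mode].append(reward)
    mode_names = sorted(rewards_by_mode)
    mode_means = [mean(rewards_by_mode[m]) for m in mode_names]
    mode_adv = dict(zip(mode_names, group_normalize(mode_means)))

    return [
        mode_weight * mode_adv[m] + (1.0 - mode_weight) * a
        for m, a in zip(modes, sample_adv)
    ]


# Four rollouts for one prompt: two in a hypothetical "fast" mode, two in "deep".
advantages = mode_aware_advantages(
    modes=["fast", "fast", "deep", "deep"],
    rewards=[1.0, 0.0, 1.0, 1.0],
)
```

In this toy group, the successful "fast" rollout receives a lower advantage than the "deep" rollouts because its mode performed worse on average, which is the kind of signal a mode-level term is meant to add on top of per-sample comparison.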

6 retrieved papers

Hierarchical reasoning modes for social intelligence

The authors design a hierarchy of reasoning modes based on cognitive control theory, ranging from intuitive responses to deep deliberation. These modes enable multi-granular reasoning and context-aware mode switching in social interactions, addressing the limitation of uniform reasoning approaches.
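A hierarchy of reasoning modes with depth-dependent token budgets can be pictured roughly as follows. This is a sketch under stated assumptions: the number of modes, their names, the token budgets, and the threshold rule in `select_mode` are all hypothetical. In the paper the switching behavior is learned through reinforcement learning rather than hand-set thresholds.

```python
from enum import IntEnum


class ReasoningMode(IntEnum):
    """Hypothetical modes ordered from shallow to deep reasoning."""
    INTUITIVE = 1
    MODERATE = 2
    DELIBERATE = 3


# Hypothetical caps on thinking tokens per mode (illustrative values only).
MAX_THINKING_TOKENS = {
    ReasoningMode.INTUITIVE: 0,
    ReasoningMode.MODERATE: 256,
    ReasoningMode.DELIBERATE: 1024,
}


def select_mode(difficulty: float) -> ReasoningMode:
    """Map a context difficulty score in [0, 1] to a reasoning mode.

    ASSUMPTION: a fixed threshold rule standing in for a learned policy.
    """
    if not 0.0 <= difficulty <= 1.0:
        raise ValueError("difficulty must lie in [0, 1]")
    if difficulty < 0.3:
        return ReasoningMode.INTUITIVE
    if difficulty < 0.7:
        return ReasoningMode.MODERATE
    return ReasoningMode.DELIBERATE


mode = select_mode(0.8)
budget = MAX_THINKING_TOKENS[mode]
```

Ordering the modes as an `IntEnum` makes "deeper" and "shallower" directly comparable, which is convenient when reasoning about token cost versus depth.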

10 retrieved papers
