MemAgent: Reshaping Long-Context LLM with Multi-Conv RL-based Memory Agent
Overview
Taxonomy
Research Landscape Overview
Claimed Contributions
The authors propose MEMAGENT, a new agent workflow that handles long-context tasks by dividing documents into segments and iteratively updating a fixed-length memory using an overwrite strategy. This approach enables processing of arbitrarily long texts with linear time complexity while maintaining performance.
The authors extend the DAPO reinforcement learning algorithm into Multi-Conv DAPO, which optimizes memory capability end-to-end by treating each context-independent conversation in a trajectory as a separate sample in the optimization objective. This makes it possible to train agent workflows that perform multiple rounds of memory updates across independent contexts.
The authors introduce a reinforcement learning method that enables LLMs to dynamically maintain and update a fixed-length memory as they process text segment by segment. This allows the model to handle arbitrary text lengths while keeping processing time linear in document length.
Core Task Comparisons
Comparisons with papers in the same taxonomy category
Contribution Analysis
Detailed comparisons for each claimed contribution
MEMAGENT agent workflow with overwrite-based memory management
The authors propose MEMAGENT, a new agent workflow that handles long-context tasks by dividing documents into segments and iteratively updating a fixed-length memory using an overwrite strategy. This approach enables processing of arbitrarily long texts with linear time complexity while maintaining performance.
[28] HiAgent: Hierarchical working memory management for solving long-horizon agent tasks with large language model
[32] Long context scaling: Divide and conquer via multi-agent question-driven collaboration
[51] Evaluating memory in LLM agents via incremental multi-turn interactions
[52] State and Memory is All You Need for Robust and Reliable AI Agents
[53] Agentic Troubleshooting Guide Automation for Incident Management
[54] Memory-Augmented Agent Training for Business Document Understanding
[55] Dialog Generation Using Multi-Turn Reasoning Neural Networks
[56] DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory
[57] Step up your game: A research on two/multi-step summarisation of long, regulatory documents
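The overwrite-based workflow described above can be sketched as a simple loop: split the document into fixed-size chunks, and at each step ask the model to rewrite its entire fixed-length memory in light of the new chunk, so the context seen per step stays bounded. The `llm_update` and `llm_answer` callables below are hypothetical stand-ins for calls to the trained model, and chunking by characters rather than tokens is a simplification:

```python
def memagent_process(llm_update, llm_answer, document, question,
                     chunk_size=4096, memory=""):
    """Process an arbitrarily long document with a fixed-length memory.

    llm_update(memory, chunk, question) -> new memory (overwrites the old one
    rather than appending, so memory length stays bounded).
    llm_answer(memory, question) -> final answer from memory alone.
    Both are illustrative stand-ins, not the authors' actual API.
    """
    chunks = [document[i:i + chunk_size]
              for i in range(0, len(document), chunk_size)]
    for chunk in chunks:
        # Each step sees only (memory + one chunk), never the whole document.
        memory = llm_update(memory, chunk, question)
    return llm_answer(memory, question)
```

Because every step processes a bounded window, total work grows linearly with the number of chunks, i.e. with document length.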
Multi-Conv DAPO algorithm for end-to-end memory optimization
The authors extend the DAPO reinforcement learning algorithm into Multi-Conv DAPO, which optimizes memory capability end-to-end by treating each context-independent conversation in a trajectory as a separate sample in the optimization objective. This makes it possible to train agent workflows that perform multiple rounds of memory updates across independent contexts.
[58] SeCom: On memory construction and retrieval for personalized conversational agents
[59] DoctorAgent-RL: A multi-agent collaborative reinforcement learning system for multi-turn clinical dialogue
[60] Experience replay-based deep reinforcement learning for dialogue management optimisation
[61] High-Resolution Visual Reasoning via Multi-Turn Grounding-Based Reinforcement Learning
[62] In prospect and retrospect: Reflective memory management for long-term personalized dialogue agents
[63] Planning without Search: Refining Frontier LLMs with Offline Goal-Conditioned RL
[64] Context-lite multi-turn reinforcement learning for LLM agents
[65] RAIDEN-R1: Improving Role-awareness of LLMs via GRPO with Verifiable Reward
[66] History-Aware Cross-Attention Reinforcement: Self-Supervised Multi Turn and Chain-of-Thought Fine-Tuning with vLLM
[67] Memory-T1: Reinforcement Learning for Temporal Reasoning in Multi-session Agents
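Multi-Conv DAPO's key departure from single-conversation RL is that one trajectory consists of several context-independent conversations (one memory update per chunk, plus the final answer), and all of them share the outcome reward earned by the final answer. A minimal sketch of this credit assignment, assuming GRPO/DAPO-style group-normalized advantages over a group of sampled trajectories (function and variable names are illustrative, and DAPO's other components such as clip-higher and dynamic sampling are omitted):

```python
import statistics

def multi_conv_advantages(trajectory_rewards, convs_per_trajectory):
    """Group-normalized advantage, broadcast so that every context-independent
    conversation within a trajectory shares one advantage value.

    trajectory_rewards: final-answer reward of each sampled trajectory in the
    group (one scalar per trajectory).
    convs_per_trajectory: number of conversations (memory updates + answer)
    that make up one trajectory. Returns one advantage list per trajectory.
    """
    mean = statistics.mean(trajectory_rewards)
    std = statistics.pstdev(trajectory_rewards) or 1.0  # avoid divide-by-zero
    return [
        [(r - mean) / std] * convs_per_trajectory  # same advantage per conv
        for r in trajectory_rewards
    ]
```

Broadcasting the trajectory-level advantage to every conversation is what lets a standard policy-gradient update train the intermediate memory-rewriting steps, even though each conversation is generated in its own independent context.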
RL-based approach for dynamically updated fixed-length memory in LLMs
The authors introduce a reinforcement learning method that enables LLMs to dynamically maintain and update a fixed-length memory as they process text segment by segment. This allows the model to handle arbitrary text lengths while keeping processing time linear in document length.
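The linear-time claim follows from the fact that each forward pass attends only over the fixed-length memory plus one chunk, regardless of total document length. A back-of-the-envelope token count (the chunk and memory sizes are illustrative, not the paper's settings):

```python
def total_tokens(doc_len, chunk_len=4096, memory_len=1024):
    """Tokens processed across all memory-update steps.

    Each step attends over (memory + one chunk), so the per-step cost is a
    constant, and the total grows linearly with document length.
    """
    steps = -(-doc_len // chunk_len)  # ceil(doc_len / chunk_len)
    return steps * (memory_len + chunk_len)
```

Doubling the document length doubles the total work, in contrast to full self-attention over the whole document, whose cost grows quadratically with length.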