Social Agents: Collective Intelligence Improves LLM Predictions

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: wisdom of crowds, LLM, multi-agent systems
Abstract:

In human society, collective decision making has often outperformed the judgment of individuals. Classic examples range from estimating livestock weights to predicting elections and financial markets, where averaging many independent guesses often yields results more accurate than those of individual experts. These successes arise because groups bring together diverse perspectives, independent voices, and distributed knowledge, combining them in ways that cancel individual biases. This principle, known as the Wisdom of Crowds, underpins practices in forecasting, marketing, and preference modeling. Large Language Models (LLMs), however, typically produce a single definitive answer. While effective in many settings, this uniformity overlooks the diversity of human judgments shaping responses to ads, videos, and webpages. Inspired by how societies benefit from diverse opinions, we ask whether LLM predictions can be improved by simulating not one answer but many. We introduce Social Agents, a multi-agent framework that instantiates a synthetic society of human-like personas with diverse demographic (e.g., age, gender) and psychographic (e.g., values, interests) attributes. Each persona independently appraises a stimulus such as an advertisement, video, or webpage, offering both a quantitative score (e.g., click-through likelihood, recall score, likability) and a qualitative rationale. Aggregating these opinions produces a distribution of preferences that more closely mirrors real human crowds. Across eleven behavioral prediction tasks, Social Agents outperforms single-LLM baselines by up to 67.45% on simple judgments (e.g., webpage likability) and by up to 9.88% on complex interpretive reasoning (e.g., video memorability). Social Agents' individual persona predictions also align with human judgments, reaching Pearson correlations of up to 0.71.
These results position computational crowd simulation as a scalable, interpretable tool for improving behavioral prediction and supporting societal decision making.
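As a rough illustration of the crowd-averaging principle the abstract invokes (not the authors' actual pipeline), the sketch below simulates many biased, noisy "persona" estimates of a set of stimuli, averages them per stimulus, and correlates the crowd mean with ground truth. All distributions, counts, and the 0-10 scale here are assumptions for illustration only:

```python
import random
import statistics

random.seed(0)

def pearson(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical "true" human scores for 20 stimuli, on a 0-10 scale.
true_scores = [random.uniform(0, 10) for _ in range(20)]

# 50 simulated personas: each has a fixed idiosyncratic bias plus
# per-judgment noise, standing in for diverse human-like judges.
biases = [random.gauss(0, 1.0) for _ in range(50)]

def persona_estimate(truth, bias):
    return truth + bias + random.gauss(0, 2.0)

crowd_means, individual_errors = [], []
for truth in true_scores:
    guesses = [persona_estimate(truth, b) for b in biases]
    crowd_means.append(statistics.fmean(guesses))
    individual_errors.append(statistics.fmean(abs(g - truth) for g in guesses))

crowd_errors = [abs(m - t) for m, t in zip(crowd_means, true_scores)]
print("mean individual error:", round(statistics.fmean(individual_errors), 2))
print("mean crowd error:", round(statistics.fmean(crowd_errors), 2))
print("crowd-vs-truth Pearson r:", round(pearson(crowd_means, true_scores), 2))
```

Because the personas' independent noise largely cancels in the average, the crowd mean tracks the true scores far more closely than a typical individual estimate does, which is the effect the framework aims to exploit.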

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces Social Agents, a multi-agent framework that simulates diverse human personas to improve behavioral prediction through collective intelligence. It resides in the LLM-Based Behavioral Prediction leaf, which contains only three papers total, including this work and two siblings (Simulating Society and ElliottAgents). This represents a relatively sparse research direction within the broader taxonomy of fifty papers, suggesting the application of LLM-driven multi-agent systems specifically to behavioral forecasting remains an emerging area compared to more established branches like trajectory prediction or reinforcement learning.

The taxonomy reveals that LLM-Based Behavioral Prediction sits within the larger LLM-Based Multi-Agent Systems and Collaboration branch, which also includes LLM-Driven Collaborative Frameworks focused on task coordination rather than prediction. Neighboring branches pursue fundamentally different approaches: Multi-Agent Trajectory and Motion Prediction emphasizes spatial forecasting using graph networks and diffusion models, while Collective Behavior and Emergent Phenomena studies group dynamics through simulation and theoretical models. The scope note explicitly excludes trajectory prediction, positioning this work as complementary to geometric reasoning methods while sharing conceptual ground with Collective Intelligence and Crowd Simulation studies.

Among twenty-six candidates examined across three contributions, none were identified as clearly refuting the work. The Social Agents framework examined nine candidates with zero refutable overlaps, the empirical evaluation across eleven tasks examined seven candidates with similar results, and the synthetic dataset contribution examined ten candidates without finding substantial prior work. This suggests that within the limited search scope—primarily top-K semantic matches and citation expansion—the specific combination of LLM-based persona simulation, collective aggregation mechanisms, and systematic evaluation across diverse behavioral tasks appears relatively unexplored, though the analysis does not claim exhaustive coverage of all potentially relevant literature.

The limited search scope and sparse taxonomy leaf indicate the work occupies a nascent research direction where LLM capabilities meet collective intelligence principles. The absence of refutable candidates among twenty-six examined papers suggests novelty within the sampled literature, though the small sibling count and focused search strategy mean substantial related work may exist outside the top-K semantic neighborhood or in adjacent application domains not captured by this taxonomy structure.

Taxonomy

- Core-task Taxonomy Papers: 50
- Claimed Contributions: 3
- Contribution Candidate Papers Compared: 26
- Refutable Papers: 0

Research Landscape Overview

Core task: Improving behavioral prediction using multi-agent collective intelligence. The field encompasses diverse approaches to understanding and forecasting how multiple agents (whether autonomous vehicles, software agents, or simulated populations) interact and evolve over time. The taxonomy reveals six major branches:

- Multi-Agent Trajectory and Motion Prediction focuses on spatial forecasting in domains like autonomous driving, often leveraging graph-based architectures such as AgentFormer[5] and collaborative frameworks like Cooperative Motion Prediction[13].
- Multi-Agent Reinforcement Learning and Decision-Making addresses sequential decision problems where agents learn policies through interaction, exemplified by platforms like Magent Platform[6].
- LLM-Based Multi-Agent Systems and Collaboration explores how large language models enable richer agent communication and coordination, as seen in AgentVerse Collaboration[3] and AgentVerse Emergent[4].
- Collective Behavior and Emergent Phenomena investigates how simple rules yield complex group dynamics, from early work like Designing Emergent Behaviors[8] to recent studies on Emergent Collective Intelligence[37].
- Explainability and Interpretability in Multi-Agent Systems tackles the challenge of making agent decisions transparent.
- Domain Applications and Cross-Disciplinary Studies applies these methods to finance, supply chains, and social simulation.

Recent activity highlights contrasting philosophies: trajectory prediction methods emphasize geometric reasoning and uncertainty quantification, whereas LLM-based approaches prioritize semantic understanding and flexible collaboration. Social Agents[0] sits squarely within the LLM-Based Behavioral Prediction cluster, leveraging language models to simulate nuanced social interactions and predict human-like behaviors in multi-agent settings.
This positions it alongside works like Simulating Society[19] and ElliottAgents[27], which similarly use LLMs to model complex social phenomena, but Social Agents[0] emphasizes collective intelligence mechanisms that emerge from agent interactions rather than purely individual reasoning. The tension between data-driven motion forecasting (e.g., MotionLM[15]) and knowledge-grounded social simulation remains a central open question, with hybrid approaches like Hybrid Prediction[34] attempting to bridge these paradigms by combining neural trajectory models with symbolic reasoning about agent intentions.

Claimed Contributions

Social Agents multi-agent framework

The authors propose Social Agents, a framework that operationalizes the wisdom of crowds principle by creating ensembles of LLM-based persona agents with diverse demographic and psychographic characteristics. Each persona independently evaluates stimuli and provides quantitative predictions with qualitative rationales, which are then aggregated to produce collective judgments that mirror real human crowds.

9 retrieved papers
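A minimal, hedged sketch of the persona-ensemble loop this contribution describes. The `Persona` fields, the prompt wording, and the `llm_judge` stub are all hypothetical; in the actual framework each judgment would come from a real LLM call, and the paper's aggregation scheme may differ from the simple mean shown here:

```python
from dataclasses import dataclass
from statistics import fmean

@dataclass
class Persona:
    """Hypothetical persona profile; attribute names are illustrative."""
    age: int
    gender: str
    values: str
    interests: str

def render_prompt(persona: Persona, stimulus: str) -> str:
    # One plausible way to condition an LLM on a persona; the paper's
    # actual prompt format is not specified here.
    return (
        f"You are a {persona.age}-year-old {persona.gender} who values "
        f"{persona.values} and is interested in {persona.interests}.\n"
        f"Rate how likely you are to click on this ad (0-10) and explain why.\n"
        f"Ad: {stimulus}"
    )

def llm_judge(prompt: str) -> tuple[float, str]:
    # Stand-in for a real LLM API call: returns a deterministic dummy
    # (score, rationale) pair so the sketch is runnable offline.
    score = (len(prompt) * 7) % 11  # arbitrary placeholder in 0..10
    return float(score), "placeholder rationale"

def crowd_predict(personas: list[Persona], stimulus: str) -> dict:
    """Query each persona independently, then aggregate the opinions."""
    results = [llm_judge(render_prompt(p, stimulus)) for p in personas]
    scores = [s for s, _ in results]
    return {
        "mean_score": fmean(scores),
        "distribution": scores,  # the crowd's preference distribution
        "rationales": [r for _, r in results],
    }

personas = [
    Persona(34, "woman", "sustainability", "cycling"),
    Persona(58, "man", "frugality", "gardening"),
    Persona(22, "woman", "novelty", "gaming"),
]
print(crowd_predict(personas, "Eco-friendly electric bike, 20% off this week."))
```

The key structural point is that each persona is queried independently before any aggregation, which is what lets the ensemble return a distribution of scores and rationales rather than a single answer.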
Empirical evaluation across eleven behavioral prediction tasks

The authors conduct a comprehensive evaluation of Social Agents across eleven diverse tasks spanning low-, medium-, and high-level construals based on Construal Level Theory. The framework demonstrates consistent improvements over single-LLM baselines and often exceeds task-specific trained models, showing that collective intelligence can improve LLM predictions across different cognitive domains.

7 retrieved papers
Synthetic dataset of persona-conditioned predictions

The authors release a dataset containing persona-conditioned predictions, definitions, and rationales generated by Social Agents across all eleven behavioral tasks. This dataset captures how diverse personas interact with and evaluate digital content, providing a resource for understanding collective intelligence in synthetic crowds.

10 retrieved papers

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution
