Abstract:

Converging evidence suggests that human systems of semantic categories achieve near-optimal compression via the Information Bottleneck (IB) complexity-accuracy tradeoff. Large language models (LLMs) are not trained for this objective, which raises the question: are LLMs capable of evolving efficient human-aligned semantic systems? To address this question, we focus on color categorization --- a key testbed for cognitive theories of categorization with uniquely rich human data --- and replicate with LLMs two influential human studies. First, we conduct an English color-naming study, showing that LLMs vary widely in their complexity and English-alignment, with larger instruction-tuned models achieving better alignment and IB-efficiency. Second, to test whether these LLMs simply mimic patterns in their training data or actually exhibit a human-like inductive bias toward IB-efficiency, we simulate cultural evolution of pseudo color-naming systems in LLMs via a method we refer to as Iterated In-Context Language Learning (IICLL). We find that, akin to humans, LLMs iteratively restructure initially random systems toward greater IB-efficiency. However, only the model with the strongest in-context capabilities (Gemini 2.0) is able to recapitulate the wide range of near-optimal IB tradeoffs observed in humans, while other state-of-the-art models converge to low-complexity solutions. These findings demonstrate how human-aligned semantic categories can emerge in LLMs via the same fundamental principle that underlies semantic efficiency in humans.

Disclaimer
This report is AI-generated using large language models and WisPaper (a scholarly search engine). It analyzes an academic paper's tasks and contributions against retrieved prior work. While this system identifies potential overlaps and novel directions, its coverage is not exhaustive and its judgments are approximate. These results are intended to assist human reviewers and should not be relied upon as a definitive verdict on novelty.
Note that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs), so the system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases; human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper investigates whether large language models can evolve efficient human-aligned semantic systems through the lens of Information Bottleneck (IB) theory, focusing on color categorization as a testbed. It resides in the 'Information Bottleneck Efficiency in Semantic Systems' leaf, which contains only two papers total. This is a notably sparse research direction within the broader taxonomy of 50 papers across 36 topics, suggesting the specific intersection of IB-efficiency and LLM semantic categorization remains relatively unexplored compared to more crowded areas like conceptual representation alignment or preference-based methods.

The taxonomy reveals neighboring work in 'Abstraction Hierarchy and Granularity' examining hierarchical concept organization, and broader branches in 'Conceptual Representation Alignment' focusing on object concepts and semantic networks. The sibling paper 'Tokens to thoughts' explores transformation from token-level processing to conceptual units, emphasizing compression principles but not specifically targeting human-aligned category boundaries through cultural evolution paradigms. The taxonomy's scope notes clarify that this leaf excludes general conceptual alignment without compression analysis, positioning the work at a distinct methodological intersection between information theory and cognitive alignment.

Across the 25 candidates examined in total, the IICLL paradigm contribution shows potential overlap: 2 of its 10 examined candidates were judged refutable. The demonstration of a human-like IB-efficiency bias appears more novel, with 0 refutable candidates among 5 examined, and the theoretical-framework contribution likewise shows no clear refutation across its 10 candidates. The limited search scope means these statistics reflect top-K semantic matches and citation expansion, not exhaustive coverage. The IICLL paradigm's higher refutation rate suggests iterative in-context learning methods may have precedents, while the IB-efficiency bias finding appears less anticipated in prior work.

Based on the limited 25-candidate search, the work appears to occupy a relatively sparse position combining IB theory with LLM categorization dynamics. The taxonomy structure confirms this intersection remains underpopulated compared to adjacent areas. However, the analysis cannot rule out relevant work outside the semantic search scope, particularly in cognitive science or information theory venues that may not surface through LLM-centric queries. The contribution-level statistics suggest differential novelty across claims, with methodological aspects potentially more incremental than theoretical insights.

Taxonomy

Core-task Taxonomy Papers: 34
Claimed Contributions: 3
Contribution Candidate Papers Compared: 25
Refutable Papers: 2

Research Landscape Overview

Core task: emergence of human-aligned semantic categorization in large language models. The field examines how LLMs develop category structures that mirror human conceptual organization, spanning multiple complementary perspectives. Conceptual Representation Alignment investigates whether models form mental-like representations similar to human cognition, with works like Human-like object concepts[1] and Conceptual representations prediction[4] exploring object-level and neural alignment. Information-Theoretic and Compression-Based Categorization applies principles from information bottleneck theory to understand efficient semantic compression, while Preference and Value Alignment focuses on moral and normative dimensions. Theoretical Frameworks probe cognitive mechanisms underlying categorization, and Computational Ingredients dissect architectural components enabling alignment. Domain-Specific Semantic Applications demonstrate categorization in specialized contexts like biomedical language, whereas Evaluation Methodologies develop benchmarks such as HelloBench[3] to measure alignment quality. Interactive and Strategic Alignment examines categorization in multi-agent and game-theoretic settings.

Several active lines reveal key tensions between mechanistic understanding and practical alignment. Information-theoretic approaches emphasize compression efficiency and minimal sufficient statistics, contrasting with representation-focused work that seeks direct neural or conceptual correspondence with human brain activity. Human-aligned categorization[0] sits within the information-theoretic branch alongside Tokens to thoughts[21], both examining how models distill semantic structure through compression principles. While Tokens to thoughts[21] explores the transformation from token-level processing to higher-order conceptual units, Human-aligned categorization[0] specifically targets the bottleneck efficiency that enables human-like category boundaries to emerge. This contrasts with alignment work in neighboring branches like Cognitive alignment multimodal[17] or Conceptual groupings embeddings[33], which prioritize matching human similarity judgments or embedding geometries rather than compression-driven emergence. The central open question remains whether efficient information processing alone suffices to produce human-aligned categories, or whether additional cognitive constraints are necessary.

Claimed Contributions

Iterated in-Context Language Learning (IICLL) paradigm

The authors introduce IICLL, a novel method that adapts iterated learning to LLMs by simulating cultural transmission of category systems through in-context learning. This paradigm enables direct comparison of LLMs' inductive biases with human behavioral experiments in language learning.

10 retrieved papers (2 judged refutable)
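To make the paradigm concrete: IICLL adapts the classic iterated-learning loop, in which each generation learns a naming system from a sample of the previous generation's output and its inductive bias gradually reshapes the system. The sketch below is a minimal non-LLM toy over a discrete "color" line; the `learn` step is a stand-in for the LLM's in-context generalization, and all function names, the nearest-neighbor smoothness bias, and the parameter values are illustrative assumptions, not the paper's implementation.

```python
import random

def transmit(system, n_examples, rng):
    """Sample (color, word) training pairs from the current generation's system."""
    shown = rng.sample(sorted(system), n_examples)
    return {c: system[c] for c in shown}

def learn(examples, colors):
    """Stand-in learner: label each color with the word of the nearest seen
    color. In IICLL an LLM prompted with the examples plays this role; the
    nearest-neighbor smoothness bias here is only a proxy for its bias."""
    return {c: examples[min(examples, key=lambda seen: abs(seen - c))]
            for c in colors}

def iterated_learning(n_colors=20, n_words=6, n_examples=8,
                      generations=10, seed=0):
    """Run a transmission chain starting from a random naming system."""
    rng = random.Random(seed)
    colors = range(n_colors)
    system = {c: rng.randrange(n_words) for c in colors}  # generation 0: random
    for _ in range(generations):
        system = learn(transmit(system, n_examples, rng), colors)
    return system
```

Over generations, the categories in this toy chain typically drift toward contiguous regions of the color line, i.e. simpler systems; the paper quantifies the analogous drift in LLM chains using IB coordinates rather than this heuristic.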
Demonstration that LLMs exhibit human-like inductive bias toward IB-efficiency

The authors demonstrate through IICLL experiments that LLMs iteratively restructure initially random artificial color naming systems toward greater Information Bottleneck efficiency and increased human alignment, revealing an underlying bias similar to humans rather than mere pattern memorization.

5 retrieved papers
Theoretical framework for studying semantic systems in LLMs via Information Bottleneck

The authors propose a framework that applies the Information Bottleneck principle to evaluate whether LLMs can develop efficient, human-aligned semantic category systems. This framework enables systematic assessment of LLMs' complexity-accuracy tradeoffs in categorization tasks compared to human languages.

10 retrieved papers
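For reference, the two IB coordinates this framework evaluates are the complexity of a naming system, I(M;W) between meanings and words, and its accuracy, I(W;U) between words and the universe of referents (here, color chips). The snippet below is a generic sketch of how these coordinates can be computed from a probabilistic naming system, following the standard IB color-naming formulation; the array shapes and function names are assumptions, not the authors' code.

```python
import numpy as np

def mutual_info(joint):
    """I(X;Y) in bits from a joint distribution p(x, y)."""
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return float((joint[mask] * np.log2(joint[mask] / (px @ py)[mask])).sum())

def ib_tradeoff(p_m, q_w_given_m, meanings):
    """Complexity I(M;W) and accuracy I(W;U) of a naming system.

    p_m:         (M,)  prior over meanings (communicative need)
    q_w_given_m: (M,W) encoder, rows sum to 1
    meanings:    (M,U) each row m(u) is a distribution over the universe
    """
    joint_mw = p_m[:, None] * q_w_given_m          # p(m, w)
    complexity = mutual_info(joint_mw)             # I(M;W)
    p_w = joint_mw.sum(axis=0)                     # p(w)
    # p(u | w) = sum_m p(m | w) m(u)
    p_u_given_w = (joint_mw.T @ meanings) / np.maximum(p_w[:, None], 1e-12)
    joint_wu = p_w[:, None] * p_u_given_w          # p(w, u)
    accuracy = mutual_info(joint_wu)               # I(W;U)
    return complexity, accuracy
```

For example, two equiprobable point-mass meanings named by two distinct words give complexity and accuracy of 1 bit each; plotting such (complexity, accuracy) pairs against the IB frontier is what allows the LLM systems to be compared with human languages.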

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Iterated in-Context Language Learning (IICLL) paradigm

The authors introduce IICLL, a novel method that adapts iterated learning to LLMs by simulating cultural transmission of category systems through in-context learning. This paradigm enables direct comparison of LLMs' inductive biases with human behavioral experiments in language learning.

Contribution

Demonstration that LLMs exhibit human-like inductive bias toward IB-efficiency

The authors demonstrate through IICLL experiments that LLMs iteratively restructure initially random artificial color naming systems toward greater Information Bottleneck efficiency and increased human alignment, revealing an underlying bias similar to humans rather than mere pattern memorization.

Contribution

Theoretical framework for studying semantic systems in LLMs via Information Bottleneck

The authors propose a framework that applies the Information Bottleneck principle to evaluate whether LLMs can develop efficient, human-aligned semantic category systems. This framework enables systematic assessment of LLMs' complexity-accuracy tradeoffs in categorization tasks compared to human languages.