On the Thinking-Language Modeling Gap in Large Language Models

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: LLM, Reasoning, Structural Causal Models
Abstract:

Large Language Models (LLMs) demonstrate remarkable capabilities in solving complicated reasoning tasks by imitating the human thinking process as expressed in human language. However, even the most capable LLMs can still fail at tasks that are simple for humans. To understand this gap, we construct structural causal models of next-token predictors trained on human language. Because language is primarily a tool for humans to share knowledge rather than a medium of thought, modeling human thinking from language can integrate language expression biases into LLMs. More specifically, we show that LLMs can fail to understand implicit expressions -- expression patterns that occur less frequently during training. Consequently, LLMs can easily overlook critical information when biased by implicit expressions. We verify our theoretical claims on carefully constructed realistic datasets containing implicit expressions. Furthermore, we propose a prompt-level intervention that instructs LLMs to carefully expand and attend to all available expressions. The empirical success of this intervention across 11 tasks and 4 representative LLMs, along with improvements on general reasoning tasks, reaffirms our findings.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes an academic paper's task and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND ITS JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes structural causal models to explain LLM failures on implicit expressions, arguing that language-as-communication introduces biases when modeling human thinking. It occupies the 'Implicit Expression and Thinking-Language Gap' leaf within the 'Expression and Representation Biases' branch. Notably, this leaf contains only the paper under review itself; no sibling papers exist in this specific category. The taxonomy shows this is a relatively sparse research direction compared to more crowded areas such as 'Social and Demographic Bias in Reasoning', which contains four distinct leaf nodes examining persona effects, implicit associations, logic puzzles, and narrative-based biases.

The taxonomy reveals neighboring work in sibling leaves: 'Semantic Representations and World Models' examines contextual representations as discourse models, 'Implicit Causality and Discourse Continuations' studies coreference bias in text generation, and 'Social Bias Frames and Pragmatic Implicatures' addresses implied meanings. These directions focus on representational properties or pragmatic inference rather than the thinking-language gap framed through causal modeling. The parent branch 'Expression and Representation Biases' excludes token-level statistical patterns and social demographic biases, positioning this work at the intersection of linguistic expression and cognitive modeling rather than surface-level or identity-based bias.

Among thirty candidates examined, none clearly refuted any of the three contributions: structural causal models for next-token prediction (ten candidates, zero refutations), formalization of implicit expressions (ten candidates, zero refutations), and the Language-of-Thoughts prompt intervention (ten candidates, zero refutations). This suggests that within the limited search scope, the specific combination of causal modeling, implicit expression formalization, and prompt-level intervention appears relatively unexplored. However, the search examined top-K semantic matches and citations, not an exhaustive literature review, so related work outside this scope may exist.

Based on the limited search of thirty candidates and the sparse taxonomy position, the work appears to occupy a distinct niche. The absence of sibling papers and zero refutations across contributions suggest novelty within the examined scope, though the small search scale means potentially relevant work in causal inference, prompt engineering, or linguistic bias may not have been captured. The taxonomy structure indicates this is an emerging rather than saturated research direction.

Taxonomy

Core-task Taxonomy Papers: 14
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 0

Research Landscape Overview

Core task: Understanding and mitigating implicit expression bias in language model reasoning. The field structure reflects a multifaceted approach to how language models encode and propagate subtle biases that are not overtly stated but emerge through representational choices and reasoning patterns. The taxonomy organizes work into several main branches: detection and characterization of implicit biases, intervention strategies to reduce these biases, domain-specific applications where such biases manifest critically, the role of reward modeling and training dynamics in shaping bias, and broader philosophical foundations.

Detection-focused work such as Token Bias Peek[3] and Implicit Bias Patterns[4] examines how models reveal hidden preferences through token-level choices and structural patterns, while mitigation branches explore techniques ranging from prompt engineering to architectural interventions. Domain-specific studies like Cognitive Biases Medical[7] highlight how implicit biases can have serious consequences in specialized contexts, and theoretical work such as Interrogative Moves Reasoning[13] provides conceptual grounding for understanding reasoning processes.

A particularly active line of inquiry centers on the gap between what models internally represent and what they explicitly express, often termed the thinking-language divide. Thinking Language Gap[0] sits squarely within this cluster, investigating how implicit expression biases emerge when models reason internally in ways that differ from their surface outputs. This work closely relates to Token Bias Peek[3], which examines token-level manifestations of hidden preferences, and Implicit Bias Patterns[4], which characterizes recurring structural biases across reasoning traces. Meanwhile, studies like Bias Runs Deep[1] and Implicit Meaning Representations[2] explore how biases are embedded at deeper representational levels, suggesting that surface-level interventions may be insufficient.

The central tension across these branches involves whether implicit biases are primarily artifacts of training data, emergent properties of model architecture, or fundamental limitations in aligning internal reasoning with external expression, a question that shapes both detection methodologies and mitigation strategies.

Claimed Contributions

Structural Causal Models for LLM Next-Token Prediction on Human Language

The authors develop structural causal models (SCMs) that formalize how LLMs learn to reason from human language through next-token prediction. These models instantiate the intermediate mechanism between thinking (latent variables) and language expressions (observed tokens), revealing how language expression biases can be integrated into LLMs during training.

10 retrieved papers
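The SCM framing described above can be illustrated with a toy generative process: a latent "thought" variable produces observed tokens through an expression mechanism whose style is skewed toward explicit phrasings. This is a minimal sketch under assumed variable names and probabilities (none of which come from the paper), showing how a next-token predictor that sees only token frequencies inherits the expression bias.

```python
import random
from collections import Counter

random.seed(0)

def sample():
    # Latent "thought": the fact the speaker intends to convey.
    thought = random.choice(["A", "B"])
    # Expression mechanism: the same thought is verbalized explicitly
    # most of the time and implicitly only rarely (the expression bias).
    style = "explicit" if random.random() < 0.9 else "implicit"
    return thought, f"{thought}_{style}"

corpus = [sample() for _ in range(10_000)]

# A next-token predictor only observes token frequencies, never the latent
# thought, so implicit expressions of the same thought are under-represented.
counts = Counter(token for _, token in corpus)
for token, n in sorted(counts.items()):
    print(token, n / len(corpus))
```

The point of the sketch is that both `A_explicit` and `A_implicit` carry the same latent thought, yet the predictor's empirical distribution treats them very differently purely because of how the thought was expressed.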
Formalization of Implicit Expressions and Language-Thought Gap

The authors formalize the concept of implicit expressions (patterns that occur infrequently during training) and prove theoretically (Theorem 2.4) that LLMs can overlook critical information when biased by such expressions, even with perfect knowledge. This establishes the language-thought gap where reasoning biases emerge from the mismatch between language expressions and underlying thought processes.

10 retrieved papers
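The paper's informal definition, that an implicit expression is a pattern occurring infrequently during training, can be sketched as a simple frequency threshold over a toy corpus. The corpus, the 10% threshold, and the explicit/implicit labels below are illustrative assumptions, not the paper's formal construction.

```python
from collections import Counter

# Toy training corpus: the same underlying fact is usually stated
# explicitly and only rarely via an indirect (implicit) phrasing.
corpus = (
    ["the patient improved"] * 95
    + ["the patient was not worse"] * 5
)

freq = Counter(corpus)
total = sum(freq.values())

# "Implicit" here means low relative frequency in training data,
# mirroring the paper's informal definition with an assumed threshold.
labels = {
    pattern: "implicit" if n / total < 0.1 else "explicit"
    for pattern, n in freq.items()
}
for pattern, tag in labels.items():
    print(f"{pattern!r}: {freq[pattern]/total:.2f} ({tag})")
```

Under this reading, Theorem 2.4's claim is that a predictor trained on such skewed frequencies can down-weight the rare phrasing's critical content even when the underlying knowledge is available.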
Language-of-Thoughts (LoT) Prompt-Level Intervention

The authors introduce a prompting technique called Language-of-Thoughts (LoT) that instructs LLMs to observe, expand, and echo all relevant information. This intervention is designed to mitigate language modeling biases by improving both the context and expression understanding, thereby alleviating biased reasoning caused by implicit expressions.

10 retrieved papers
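A prompt-level intervention of this kind can be sketched as a thin wrapper that prepends an instruction before the user's question. The instruction text below paraphrases the reported "observe, expand, and echo" idea; it is not the paper's exact prompt, and the function name is hypothetical.

```python
# Hypothetical sketch of a Language-of-Thoughts (LoT) style prompt wrapper.
# The wording paraphrases the reported "observe, expand, and echo"
# instruction and is not the paper's exact prompt.

LOT_INSTRUCTION = (
    "Before answering, first observe and list every piece of information "
    "given in the question, then expand any implicit or indirect "
    "expressions into explicit statements, and echo the expanded facts "
    "before reasoning to the final answer."
)

def wrap_with_lot(question: str) -> str:
    """Prepend the LoT-style instruction to a user question."""
    return f"{LOT_INSTRUCTION}\n\nQuestion: {question}"

prompt = wrap_with_lot("If Tom is not unhappy about the result, did he fail?")
print(prompt)
```

The design intent, per the contribution description, is that forcing the model to restate implicit expressions explicitly reduces the chance that low-frequency phrasings are overlooked during reasoning.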

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Within the taxonomy built over the current top-K core-task papers, the paper under review is assigned to a leaf with no direct siblings and no cousin branches under the same grandparent topic. In this retrieved landscape it appears structurally isolated, which is a partial signal of novelty, though one constrained by search coverage and taxonomy granularity.

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Structural Causal Models for LLM Next-Token Prediction on Human Language

The authors develop structural causal models (SCMs) that formalize how LLMs learn to reason from human language through next-token prediction. These models instantiate the intermediate mechanism between thinking (latent variables) and language expressions (observed tokens), revealing how language expression biases can be integrated into LLMs during training.

Contribution

Formalization of Implicit Expressions and Language-Thought Gap

The authors formalize the concept of implicit expressions (patterns that occur infrequently during training) and prove theoretically (Theorem 2.4) that LLMs can overlook critical information when biased by such expressions, even with perfect knowledge. This establishes the language-thought gap where reasoning biases emerge from the mismatch between language expressions and underlying thought processes.

Contribution

Language-of-Thoughts (LoT) Prompt-Level Intervention

The authors introduce a prompting technique called Language-of-Thoughts (LoT) that instructs LLMs to observe, expand, and echo all relevant information. This intervention is designed to mitigate language modeling biases by improving both the context and expression understanding, thereby alleviating biased reasoning caused by implicit expressions.