LLMs Can Hide Text in Other Text of the Same Length
Overview
Taxonomy
Research Landscape Overview
Claimed Contributions
The authors introduce Calgacus, a steganographic protocol that uses Large Language Models to encode an arbitrary meaningful text within a different well-formed and plausible text of exactly the same token length. The method is efficient and works with modest open-source LLMs on consumer hardware.
The authors argue that their protocol reveals a fundamental shift in the nature of written communication, showing that coherent text can be generated without reflecting the author's true intentions, thereby challenging trust in written communication and the meaning of LLM-generated content.
The authors present a practical application where an AI company could use the protocol to hide uncensored responses from a powerful unfiltered LLM within the compliant outputs of an aligned model, raising urgent questions for AI safety and challenging notions of what it means for an LLM to possess knowledge.
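The document does not specify Calgacus's internal mechanism, but the claim of hiding one text inside another of exactly the same token length can be illustrated with a minimal rank-based sketch: two parties sharing the same deterministic model map each secret token to its rank under the model's next-token ranking for the secret context, then emit the cover token holding that same rank under the cover context. One cover token per secret token guarantees equal length. Everything below (the toy vocabulary, the hash-based `ranked_vocab` stand-in for LLM logits) is an illustrative assumption, not the authors' implementation.

```python
# Rank-based steganography sketch. A hash-based ranking stands in for an
# LLM's next-token distribution; a real protocol would rank by model
# probability under a shared LLM (assumption, not the Calgacus mechanism).
import hashlib

VOCAB = ["the", "a", "cat", "dog", "sat", "ran", "on", "under", "mat", "tree"]

def ranked_vocab(context):
    # Deterministically rank the vocabulary given the context, so that
    # encoder and decoder always agree on the ranking.
    def score(tok):
        h = hashlib.sha256((" ".join(context) + "|" + tok).encode()).hexdigest()
        return int(h, 16)
    return sorted(VOCAB, key=score)

def encode(secret_tokens, secret_prompt, cover_prompt):
    cover, s_ctx, c_ctx = [], list(secret_prompt), list(cover_prompt)
    for tok in secret_tokens:
        rank = ranked_vocab(s_ctx).index(tok)    # rank of the secret token
        cover_tok = ranked_vocab(c_ctx)[rank]    # same rank under cover context
        cover.append(cover_tok)
        s_ctx.append(tok)
        c_ctx.append(cover_tok)
    return cover  # exactly len(secret_tokens) tokens, by construction

def decode(cover_tokens, secret_prompt, cover_prompt):
    secret, s_ctx, c_ctx = [], list(secret_prompt), list(cover_prompt)
    for tok in cover_tokens:
        rank = ranked_vocab(c_ctx).index(tok)    # invert the rank mapping
        secret_tok = ranked_vocab(s_ctx)[rank]
        secret.append(secret_tok)
        s_ctx.append(secret_tok)
        c_ctx.append(tok)
    return secret
```

Because both sides rebuild identical contexts step by step, the per-step rank lookup is a bijection and decoding recovers the secret exactly; making the cover text *plausible* as well as decodable is the harder part the paper claims to solve with an LLM's actual probability ranking.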
Core Task Comparisons
Comparisons with papers in the same taxonomy category
Contribution Analysis
Detailed comparisons for each claimed contribution
Calgacus protocol for hiding text in text of the same length
The Calgacus protocol encodes an arbitrary meaningful text within a different well-formed, plausible text of exactly the same token length; it is efficient and runs with modest open-source LLMs on consumer hardware.
[1] The Steganographic Potentials of Language Models
[2] Generative text steganography with large language model
[3] Robust Steganography from Large Language Models
[4] Black-box Steganography for Large Language Models
[7] Multi-Classification of Linguistic Steganography Driven by Large Language Models
[12] DeepStego: Privacy-Preserving Natural Language Steganography Using Large Language Models and Advanced Neural Architectures
[13] Undetectable Steganography for Language Models
[71] A watermark for large language models
[72] A character based steganography using masked language modeling
[73] Minimizing distortion in steganography via adaptive language model tuning
Demonstration of radical decoupling of text from authorial intent
The protocol shows that coherent, well-formed text can be generated without reflecting its author's true intentions, challenging trust in written communication and the interpretation of LLM-generated content.
[64] Distant Writing: Literary Production in the Age of Artificial Intelligence
[61] From intentions to techniques: A comprehensive taxonomy and challenges in text watermarking for large language models
[62] Be Sure to Use the Same Writing Style: Applying Authorship Verification on Large-Language-Model-Generated Texts
[63] Decoupling Content and Expression: Two-Dimensional Detection of AI-Generated Text
[65] Step-by-step: Separating planning from realization in neural data-to-text generation
[66] The hermeneutics of computer-generated texts
[67] Can factual statements be deceptive? The DeFaBel corpus of belief-based deception
[68] Fine-grained sentiment controlled text generation
[69] Mutual disentanglement learning for joint fine-grained sentiment classification and controllable text generation
[70] Text interpretation: hermeneutic approach
Concrete scenario illustrating AI safety implications
In the presented scenario, an AI company could use the protocol to hide uncensored responses from a powerful unfiltered LLM within the compliant outputs of an aligned model, raising urgent questions for AI safety and challenging notions of what it means for an LLM to possess knowledge.