LLMs Can Hide Text in Other Text of the Same Length
Overview
Taxonomy
Research Landscape Overview
Claimed Contributions
The authors introduce Calgacus, a steganographic protocol that uses Large Language Models to encode an arbitrary meaningful text within a different well-formed and plausible text of exactly the same token length. The method is efficient and works with modest open-source LLMs on consumer hardware.
The authors argue that their protocol reveals a fundamental shift in the nature of written communication, showing that coherent text can be generated without reflecting the author's true intentions, thereby challenging trust in written communication and the meaning of LLM-generated content.
The authors present a practical application where an AI company could use the protocol to hide uncensored responses from a powerful unfiltered LLM within the compliant outputs of an aligned model, raising urgent questions for AI safety and challenging notions of what it means for an LLM to possess knowledge.
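The document does not specify Calgacus's internal mechanism, but the claim of hiding one text inside another of exactly the same token length can be illustrated with a minimal rank-based sketch: two parties sharing the same deterministic model map each secret token to its rank under the model's next-token ranking for the secret context, then emit the cover token holding that same rank under the cover context. One cover token per secret token guarantees equal length. Everything below (the toy vocabulary, the hash-based `ranked_vocab` stand-in for LLM logits) is an illustrative assumption, not the authors' implementation.

```python
# Rank-based steganography sketch. A hash-based ranking stands in for an
# LLM's next-token distribution; a real protocol would rank by model
# probability under a shared LLM (assumption, not the Calgacus mechanism).
import hashlib

VOCAB = ["the", "a", "cat", "dog", "sat", "ran", "on", "under", "mat", "tree"]

def ranked_vocab(context):
    # Deterministically rank the vocabulary given the context, so that
    # encoder and decoder always agree on the ranking.
    def score(tok):
        h = hashlib.sha256((" ".join(context) + "|" + tok).encode()).hexdigest()
        return int(h, 16)
    return sorted(VOCAB, key=score)

def encode(secret_tokens, secret_prompt, cover_prompt):
    cover, s_ctx, c_ctx = [], list(secret_prompt), list(cover_prompt)
    for tok in secret_tokens:
        rank = ranked_vocab(s_ctx).index(tok)    # rank of the secret token
        cover_tok = ranked_vocab(c_ctx)[rank]    # same rank under cover context
        cover.append(cover_tok)
        s_ctx.append(tok)
        c_ctx.append(cover_tok)
    return cover  # exactly len(secret_tokens) tokens, by construction

def decode(cover_tokens, secret_prompt, cover_prompt):
    secret, s_ctx, c_ctx = [], list(secret_prompt), list(cover_prompt)
    for tok in cover_tokens:
        rank = ranked_vocab(c_ctx).index(tok)    # invert the rank mapping
        secret_tok = ranked_vocab(s_ctx)[rank]
        secret.append(secret_tok)
        s_ctx.append(secret_tok)
        c_ctx.append(tok)
    return secret
```

Because both sides rebuild identical contexts step by step, the per-step rank lookup is a bijection and decoding recovers the secret exactly; making the cover text *plausible* as well as decodable is the harder part the paper claims to solve with an LLM's actual probability ranking.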
Core Task Comparisons
Comparisons with papers in the same taxonomy category
Contribution Analysis
Detailed comparisons for each claimed contribution
Calgacus protocol for hiding text in text of the same length
The Calgacus protocol encodes an arbitrary meaningful text within a different well-formed, plausible text of exactly the same token length; it is efficient and runs with modest open-source LLMs on consumer hardware.
[1] The Steganographic Potentials of Language Models
[2] Generative text steganography with large language model
[3] Robust Steganography from Large Language Models
[4] Black-box Steganography for Large Language Models
[7] Multi-Classification of Linguistic Steganography Driven by Large Language Models
[12] DeepStego: Privacy-Preserving Natural Language Steganography Using Large Language Models and Advanced Neural Architectures
[13] Undetectable Steganography for Language Models
[71] A watermark for large language models
[72] A character based steganography using masked language modeling
[73] Minimizing distortion in steganography via adaptive language model tuning
Demonstration of radical decoupling of text from authorial intent
The protocol shows that coherent, well-formed text can be generated without reflecting its author's true intentions, challenging trust in written communication and the interpretation of LLM-generated content.
[64] Distant Writing: Literary Production in the Age of Artificial Intelligence
[61] From intentions to techniques: A comprehensive taxonomy and challenges in text watermarking for large language models
[62] Be Sure to Use the Same Writing Style: Applying Authorship Verification on Large-Language-Model-Generated Texts
[63] Decoupling Content and Expression: Two-Dimensional Detection of AI-Generated Text
[65] Step-by-step: Separating planning from realization in neural data-to-text generation
[66] The hermeneutics of computer-generated texts
[67] Can factual statements be deceptive? The DeFaBel corpus of belief-based deception
[68] Fine-grained sentiment controlled text generation
[69] Mutual disentanglement learning for joint fine-grained sentiment classification and controllable text generation
[70] Text interpretation: hermeneutic approach
Concrete scenario illustrating AI safety implications
In the presented scenario, an AI company could use the protocol to hide uncensored responses from a powerful unfiltered LLM within the compliant outputs of an aligned model, raising urgent questions for AI safety and challenging notions of what it means for an LLM to possess knowledge.