PMark: Towards Robust and Distortion-free Semantic-level Watermarking with Channel Constraints

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: Semantic-level Watermark; Text Watermark; AI Security
Abstract:

Semantic-level watermarking (SWM) for large language models (LLMs) enhances watermarking robustness against text modifications and paraphrasing attacks by treating the sentence as the fundamental unit. However, existing methods still lack strong theoretical guarantees of robustness, and rejection-sampling-based generation often introduces significant distribution distortions compared with unwatermarked outputs. In this work, we introduce a new theoretical framework for SWM through the concept of proxy functions (PFs), functions that map sentences to scalar values. Building on this framework, we propose PMark, a simple yet powerful SWM method that dynamically estimates the PF median for the next sentence through sampling while enforcing multiple PF constraints (which we call channels) to strengthen watermark evidence. Equipped with solid theoretical guarantees, PMark achieves the desired distortion-free property and improves robustness against paraphrasing-style attacks. We also provide an empirically optimized version that removes the requirement for dynamic median estimation for better sampling efficiency. Experimental results show that PMark consistently outperforms existing SWM baselines in both text quality and robustness, offering a more effective paradigm for detecting machine-generated text. The source code is available at https://anonymous.4open.science/r/PMark.
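
One way to read the distortion-free claim is the short derivation below. It is a sketch under assumptions that the abstract does not state: a single proxy function F, a next-sentence distribution P under which F(s) has no point mass at its median m (so each side of the median has probability exactly 1/2), and a pseudorandom key bit b that is uniform on {0, 1}. The symbols Q_b, m, and b are introduced here for illustration only and may differ from the paper's notation.

```latex
% Q_b: distribution of the watermarked sentence given key bit b (illustrative notation).
\begin{align*}
Q_b(s) &=
  \begin{cases}
    P(s \mid F(s) > m),   & b = 1,\\
    P(s \mid F(s) \le m), & b = 0,
  \end{cases} \\
\mathbb{E}_b\bigl[Q_b(s)\bigr]
  &= \tfrac{1}{2}\,\frac{P(s)\,\mathbf{1}[F(s) > m]}{P(F > m)}
   + \tfrac{1}{2}\,\frac{P(s)\,\mathbf{1}[F(s) \le m]}{P(F \le m)} \\
  &= P(s)\,\bigl(\mathbf{1}[F(s) > m] + \mathbf{1}[F(s) \le m]\bigr)
   = P(s),
\end{align*}
% using P(F > m) = P(F \le m) = 1/2 in the last step.
```

Averaged over the key bit, the watermarked sentence distribution equals the unwatermarked one. The argument relies on m being the true median, which is presumably why an accurate median estimate matters for the distortion-free property.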

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes PMark, a semantic-level watermarking method built on a theoretical framework of proxy functions that map sentences to scalar values. It sits within the 'Semantic Invariance and Proxy Functions' leaf of the taxonomy, which contains only three papers total. This represents a relatively sparse research direction compared to more crowded areas like token-level logit manipulation, suggesting the paper targets a less saturated niche focused on sentence-based watermarking with formal semantic guarantees.

The taxonomy reveals that semantic-level watermarking divides into three main approaches: proxy functions (this paper's leaf), sentence-level similarity/hashing, and topic-aware methods. Neighboring leaves address related but distinct challenges—similarity-based methods leverage embeddings for detection, while topic-aware techniques incorporate contextual information. The paper's proxy function framework appears to bridge theoretical optimization (a separate branch with formal guarantees) and semantic robustness, positioning it at the intersection of multiple research threads within the generation mechanisms category.

Among eighteen candidates examined across three contributions, the multi-channel constraint mechanism shows the most substantial prior overlap, with two refutable candidates identified among the eight examined. The theoretical proxy function framework and the PMark method itself appear more novel, with zero refutable candidates among the nine and one candidates examined, respectively. This suggests the core formalism and implementation may be relatively fresh, while the idea of using multiple constraints for robustness has closer precedents within the limited search scope.

Based on the top-eighteen semantic matches examined, the work appears to introduce a distinct theoretical angle within a sparsely populated research direction. The analysis does not cover the full breadth of watermarking literature, and the small candidate pool means potentially relevant work outside the semantic search radius may exist. The taxonomy structure indicates this is an emerging area with room for novel contributions, though the multi-channel mechanism overlaps with existing robustness strategies.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 18
Refutable Papers: 2

Research Landscape Overview

Core task: semantic-level watermarking for large language models. The field has organized itself around several complementary branches that address different facets of embedding and verifying invisible signals in LLM-generated text. Watermark Generation Mechanisms explores how to inject marks during or after generation, ranging from token-level biasing to sentence-based semantic transformations that preserve meaning while encoding information. Detection and Verification methods develop statistical tests and learned classifiers to identify watermarked content, while Robustness and Attack Resistance investigates defenses against paraphrasing, adversarial edits, and spoofing attempts. Application-Specific Watermarking tailors schemes to domains such as code generation or personalized outputs, and Quality-Aware and Search-Based Optimization balances watermark strength against text fluency through search or scoring heuristics. Nested and Hierarchical Watermarking enables multi-level or multi-bit encoding, and Survey and Comparative Analysis provides overviews of the rapidly evolving landscape.

Within the generation mechanisms, a particularly active line of work focuses on semantic-level and sentence-based approaches that modify text at a higher abstraction than individual tokens. PMark[0] exemplifies this direction by introducing proxy functions to maintain semantic invariance during watermark embedding, ensuring that paraphrases or meaning-preserving edits do not erase the signal. This contrasts with earlier token-biasing schemes and aligns closely with Semantic Invariant Watermark[6], which similarly emphasizes preserving semantics through carefully designed transformations. Robust Semantics Watermark[4] also operates in this space, exploring how to achieve both semantic fidelity and resilience to attacks. A central trade-off across these works is between the strength of the watermark signal and the risk of degrading text quality or introducing detectable artifacts, with PMark[0] addressing this challenge through its proxy-based optimization framework that balances imperceptibility and robustness.

Claimed Contributions

Theoretical framework for semantic-level watermarking via proxy functions

The authors introduce a theoretical framework that unifies existing semantic-level watermarking methods through the concept of proxy functions—functions that map sentences to scalar values. This framework provides analytical foundations for evaluating watermarking performance and enables formal analysis of distortion and robustness properties.
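
To make "functions that map sentences to scalar values" concrete, here is a minimal sketch of one possible proxy function: the cosine similarity between a sentence embedding and a fixed pivot vector. Everything named in the code is an assumption for illustration. In particular, `toy_embedding` is a hypothetical, hash-based stand-in that carries no semantics; a real system would use a trained sentence encoder, and the paper's actual proxy functions may be defined differently.

```python
import hashlib
import math


def toy_embedding(sentence: str, dim: int = 8) -> list[float]:
    """Hypothetical stand-in for a sentence encoder.

    Produces a deterministic unit vector from a SHA-256 digest. It is NOT
    semantic; it only keeps the example self-contained.
    """
    if not 1 <= dim <= 32:
        raise ValueError("dim must be between 1 and 32 (SHA-256 yields 32 bytes)")
    digest = hashlib.sha256(sentence.encode("utf-8")).digest()
    raw = [digest[i] / 255.0 - 0.5 for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in raw)) or 1.0
    return [x / norm for x in raw]


def proxy_function(sentence: str, pivot: list[float]) -> float:
    """Map a sentence to a scalar: cosine similarity of its embedding with a pivot.

    Assumes the pivot is a nonzero vector.
    """
    emb = toy_embedding(sentence, dim=len(pivot))
    pivot_norm = math.sqrt(sum(p * p for p in pivot))
    return sum(e * p for e, p in zip(emb, pivot)) / pivot_norm
```

With a semantic encoder in place of the toy one, paraphrases of a sentence would be expected to land near the same scalar value, which is the property a sentence-level watermark relies on.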

9 retrieved papers

Multi-channel constraint mechanism for enhanced robustness

The authors identify that sparse watermark evidence in existing semantic-level watermarking methods weakens robustness against attacks. They address this by introducing multiple channel constraints (using orthogonal pivot vectors) to increase the density of watermark evidence, thereby improving robustness against paraphrasing and word-level attacks.
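
The idea of several channels built from orthogonal pivot vectors can be sketched as follows. This is an illustrative reading, not the authors' implementation: the Gram-Schmidt construction, the function names, and the per-channel thresholds are all assumptions. Each channel contributes one bit of evidence per sentence, so a sentence carries as many bits as there are channels instead of one.

```python
import math
import random


def orthonormal_pivots(num_channels: int, dim: int, seed: int = 0) -> list[list[float]]:
    """Draw random Gaussian vectors and orthonormalize them with Gram-Schmidt."""
    if num_channels > dim:
        raise ValueError("cannot have more orthogonal pivots than dimensions")
    rng = random.Random(seed)
    pivots: list[list[float]] = []
    while len(pivots) < num_channels:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for p in pivots:  # remove the components along earlier pivots
            coeff = sum(a * b for a, b in zip(v, p))
            v = [a - coeff * b for a, b in zip(v, p)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-9:  # skip the (unlikely) degenerate draw
            pivots.append([a / norm for a in v])
    return pivots


def channel_bits(
    embedding: list[float],
    pivots: list[list[float]],
    thresholds: list[float],
) -> list[int]:
    """One bit per channel: is the projection above that channel's threshold?"""
    return [
        int(sum(e * p for e, p in zip(embedding, pivot)) > t)
        for pivot, t in zip(pivots, thresholds)
    ]
```

Orthogonality is a natural choice here because, for roughly isotropic embeddings, projections onto orthogonal directions are close to uncorrelated, so each channel adds evidence that the others do not already provide.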

8 retrieved papers (Can Refute)

PMark: distortion-free semantic watermarking with online and offline variants

The authors propose PMark, a semantic-level watermarking method with two variants: an online version that dynamically estimates the proxy function median and is theoretically distortion-free, and an offline version that uses a prior median assumption (zero) to reduce computational cost while maintaining low distortion. Both variants enforce multiple channel constraints to strengthen watermark evidence.
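
The difference between the two variants can be illustrated with a small sketch for a single channel. The function names, the way the key bit is consumed, and the selection rule are hypothetical; only the two threshold choices (a median estimated from sampled candidates versus a fixed prior of zero) come from the description above.

```python
import statistics


def online_threshold(candidate_scores: list[float]) -> float:
    """Online variant (sketch): estimate the proxy-function median from samples."""
    return statistics.median(candidate_scores)


def offline_threshold() -> float:
    """Offline variant (sketch): use the prior assumption that the median is zero."""
    return 0.0


def select_candidates(
    candidate_scores: list[float], threshold: float, key_bit: int
) -> list[int]:
    """Return indices of candidates on the side of the threshold chosen by key_bit.

    key_bit = 1 keeps scores above the threshold; key_bit = 0 keeps the rest.
    How key_bit is derived from a secret key is outside this sketch.
    """
    return [
        i for i, s in enumerate(candidate_scores)
        if (s > threshold) == bool(key_bit)
    ]
```

For proxy scores `[-0.4, -0.1, 0.2, 0.3, 0.7, 0.9]`, the online threshold is 0.25 and both key bits keep three candidates each. The offline threshold of zero keeps four candidates for `key_bit = 1` and two for `key_bit = 0`. An unbalanced split of this kind is one plausible source of the residual distortion in the offline variant, in exchange for skipping the sampling needed to estimate the median.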

1 retrieved paper
