DP-Fusion: Token-Level Differentially Private Inference for Large Language Models

ICLR 2026 Conference Submission, Anonymous Authors
Keywords: Privacy, Large Language Models, Document Privatization
Abstract:

Large language models (LLMs) do not preserve privacy at inference time. An LLM's outputs can inadvertently reveal information about the model's context, which presents a privacy challenge when the LLM is augmented with tools or databases containing sensitive information. Existing inference-time privacy-preserving methods have significant limitations: they either (i) lack provable guarantees or (ii) offer a poor utility/privacy trade-off. We propose DP-Fusion, a Differentially Private Inference (DPI) mechanism for LLMs that provably bounds the influence a set of tokens in the context can have on the LLM's output. DP-Fusion works as follows: (1) label a subset of tokens as sensitive, (2) run the LLM without the sensitive tokens to obtain a baseline output distribution, (3) run the LLM with the sensitive tokens, and (4) blend the two distributions so that the final output distribution remains within a bounded distance of the baseline. While this per-token influence bound also mitigates jailbreak-style prompt injection, we focus on document privatization, where the goal is to paraphrase a document containing sensitive tokens, e.g., personally identifiable information, so that no attacker can reliably infer them from the paraphrased document, while preserving high text quality. The privacy/utility trade-off is controlled by ε: ε = 0 hides sensitive tokens entirely, while higher values trade privacy for improved text quality. We show that our method produces token-level provably privatized documents with substantially improved theoretical and empirical privacy, achieving 6× lower perplexity than related DPI methods.

Disclaimer
This report is AI-generated using large language models and WisPaper (a scholarly search engine). It analyzes an academic paper's tasks and contributions against retrieved prior work. While this system identifies potential overlaps and novel directions, its coverage is not exhaustive and its judgments are approximate. These results are intended to assist human reviewers and should not be relied upon as a definitive verdict on novelty.
Note that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces DP-Fusion, a mechanism for differentially private LLM inference that bounds the influence of sensitive tokens on generated outputs. It resides in the 'Differential Privacy for Next-Token Prediction' leaf, which contains five papers total, including the original work. This leaf sits within the broader 'Privacy-Preserving Inference Mechanisms' branch, indicating a moderately populated research direction focused on runtime privacy protections. The taxonomy shows this is an active but not overcrowded area, with sibling papers exploring related noise-injection and decoding strategies for autoregressive generation.

The taxonomy reveals that DP-Fusion's leaf is one of four under Privacy-Preserving Inference Mechanisms, alongside 'Privacy-Preserving In-Context Learning and Prompting' (five papers), 'Cryptographic and Secure Computation for Inference' (four papers), and 'Instance Obfuscation and Masking for Inference Privacy' (two papers). These neighboring leaves address complementary challenges: protecting prompts and exemplars, leveraging cryptographic primitives, or perturbing inputs rather than outputs. The scope note for the parent branch explicitly excludes training-time privacy, clarifying that DP-Fusion's focus on inference-time token influence bounds distinguishes it from adaptation or fine-tuning methods found in other taxonomy branches.

Among twenty-three candidates examined across three contributions, no refutable prior work was identified. For the core DP-Fusion mechanism, ten candidates were examined with zero refutations; for the document privatization application, ten candidates with zero refutations; and for the per-group privacy budget framework, three candidates with zero refutations. This limited search scope (top-K semantic matches plus citation expansion) suggests that, within the examined set, the fusion-based approach and per-token influence bounding appear distinct from prior noise-injection or decoding strategies. However, the analysis does not claim exhaustive coverage of all differentially private inference techniques in the broader literature.

Based on the limited search of twenty-three candidates, DP-Fusion appears to occupy a recognizable niche within differential privacy for next-token prediction, with no clear overlap detected in the examined set. The taxonomy context indicates a moderately active research direction with established sibling work on noise calibration and adaptive budgets, suggesting the paper builds on known challenges in balancing privacy and utility during autoregressive generation. The absence of refutations in this scope does not preclude related work outside the top-K matches or in adjacent taxonomy leaves.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 23
Refutable Papers: 0

Research Landscape Overview

Core task: differentially private inference for large language models. The field addresses how to deploy LLMs while protecting sensitive information in user queries, model outputs, or training data. The taxonomy reveals a broad landscape organized around several complementary themes. Privacy-Preserving Inference Mechanisms focuses on runtime protections during model serving, including techniques that add noise to next-token predictions or embeddings. Privacy-Preserving Model Adaptation and Fine-Tuning explores how to customize models on private data without leaking individual records, often via differentially private training or parameter-efficient methods. Privacy-Preserving Data Generation and Sharing examines synthetic text creation and secure data-exchange protocols. Privacy Risk Analysis and Attack Methods investigates vulnerabilities such as membership inference and prompt extraction, while Comprehensive Privacy Frameworks and Multi-Technique Approaches combines cryptographic, differential privacy, and federated learning tools. Surveys and Overviews synthesizes these directions, Domain-Specific Privacy Applications targets sectors like healthcare, Distributed and Federated Privacy-Preserving LLM Systems addresses decentralized settings, and Conceptual and Theoretical Privacy Perspectives provides foundational analysis.

Within Privacy-Preserving Inference Mechanisms, a particularly active line of work targets differential privacy for next-token prediction, where the challenge is to add calibrated noise to autoregressive generation without destroying output quality. DP-Fusion[0] sits squarely in this cluster, proposing a fusion-based approach to balance privacy and utility during token sampling. It shares thematic ground with Private Decoding[5], which introduced early noise-injection strategies for decoding, and Private Next-Token[7], which refined noise calibration for sequential predictions. Nearby efforts like Submix[6] and Adaptively Private Prediction[13] explore alternative noise mechanisms and adaptive privacy budgets, highlighting ongoing trade-offs between tight privacy guarantees and coherent text generation. These works collectively grapple with the tension between strong formal privacy and the autoregressive nature of LLMs, a challenge that distinguishes inference-time protections from training-time and data-sharing approaches elsewhere in the taxonomy.

Claimed Contributions

DP-FUSION mechanism for token-level differentially private LLM inference

The authors introduce DP-FUSION, a novel differentially private inference mechanism that provides provable token-level privacy guarantees for large language models. The method works by inferring the LLM with and without sensitive tokens, then blending the output distributions to bound the influence of sensitive tokens on generated outputs.

10 retrieved papers

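The token-level guarantee behind this contribution can be paraphrased in the standard differential-privacy form. This is a hedged restatement assuming the usual definition; the paper's exact theorem may be formulated differently:

```latex
\sup_{y}\,\left|\log\frac{\Pr[M(x)=y]}{\Pr[M(x')=y]}\right|\;\le\;\epsilon,
```

where \(x\) and \(x'\) are contexts that differ only in the labeled sensitive tokens, \(M\) is the fused generation mechanism, and \(y\) ranges over possible outputs. Setting \(\epsilon = 0\) forces the output distribution to coincide with the baseline computed without the sensitive tokens.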
Document privatization application with improved privacy-utility trade-off

The authors apply DP-FUSION to document privatization, demonstrating that their method can paraphrase documents containing personally identifiable information while achieving substantially better privacy-utility trade-offs than existing methods, with 6× lower perplexity than related DPI approaches.

10 retrieved papers

Per-group privacy budget framework with parallelizable inference

The authors develop a framework that allows assigning different privacy budgets to different groups of sensitive tokens and implements a parallelizable inference procedure that computes multiple distributions (one public and multiple private) per generation step, enabling efficient token-level privacy control.

3 retrieved papers
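The per-group scheme described above can be sketched as follows. This is a hedged illustration only: the function names, the sequential folding of group mixtures, and the per-group log-ratio bound against the running mixture are assumptions of this sketch, not the authors' exact procedure. In an actual system, the one public and several private distributions would come from LLM forward passes that can run in parallel; here they are plain lists.

```python
import math

def max_log_ratio(p, q):
    # Largest absolute log-probability ratio between distributions p and q.
    return max(abs(math.log(a) - math.log(b)) for a, b in zip(p, q))

def fuse_per_group(p_pub, private_dists, budgets, steps=60):
    """Sketch of a per-group budget scheme (assumed, not the paper's
    exact procedure): each group of sensitive tokens has its own private
    next-token distribution and its own epsilon. Each group's mixture
    weight is chosen by bisection so that the blend's log-ratio to the
    pre-blend running mixture (the output "without" that group) stays
    within the group's budget."""
    mix = list(p_pub)
    for p_priv, eps in zip(private_dists, budgets):
        lo, hi = 0.0, 1.0
        for _ in range(steps):
            lam = 0.5 * (lo + hi)
            cand = [lam * pr + (1.0 - lam) * m
                    for pr, m in zip(p_priv, mix)]
            if max_log_ratio(cand, mix) <= eps:
                lo = lam   # this group's bound holds: admit more mass
            else:
                hi = lam
        lam = lo
        mix = [lam * pr + (1.0 - lam) * m for pr, m in zip(p_priv, mix)]
    return mix
```

A group with budget 0 contributes nothing to the output, while a group with a large budget can dominate it, so sensitive-token groups of different criticality can be protected to different degrees in one generation pass.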

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

DP-FUSION mechanism for token-level differentially private LLM inference

Contribution

Document privatization application with improved privacy-utility trade-off

Contribution

Per-group privacy budget framework with parallelizable inference