Secret-Protected Evolution for Differentially Private Synthetic Text Generation

ICLR 2026 Conference Submission · Anonymous Authors
Keywords: synthetic data, differential privacy
Abstract:

Text data has become extremely valuable for training large language models (LLMs) and may even drive progress toward artificial general intelligence (AGI). However, much high-quality text in the real world is private and cannot be freely used due to privacy concerns. Differentially private (DP) synthetic text generation has therefore been proposed, aiming to produce high-utility synthetic data while protecting sensitive information. Existing DP synthetic text generation methods impose uniform guarantees that often overprotect non-sensitive content, resulting in substantial utility loss and computational overhead. We propose Secret-Protected Evolution (SecPE), a novel framework that extends private evolution with secret-aware protection. Theoretically, we show that SecPE satisfies (p, r)-secret protection, a relaxation of Gaussian DP that enables tighter utility–privacy trade-offs while substantially reducing computational complexity relative to baseline methods. Empirically, across the OpenReview, PubMed, and Yelp benchmarks, SecPE consistently achieves lower Fréchet Inception Distance (FID) and higher downstream task accuracy than GDP-based Aug-PE baselines, while requiring less noise to attain the same level of protection. Our results highlight that secret-aware guarantees can unlock more practical and effective privacy-preserving synthetic text generation.

Disclaimer
This report is AI-GENERATED using large language models and WisPaper (a scholarly search engine). It analyzes a paper's claimed tasks and contributions against retrieved prior work. While the system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND ITS JUDGMENTS ARE APPROXIMATE. The results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes Secret-Protected Evolution (SecPE), a framework that extends private evolution with secret-aware protection for differentially private synthetic text generation. It resides in the 'Secret-Aware and Selective DP' leaf under 'Evolutionary and Iterative DP Text Synthesis', which contains only two papers including this one. This places the work in a relatively sparse research direction within the broader field of differentially private text generation, suggesting that secret-aware evolutionary approaches remain underexplored compared to more established methods like DP fine-tuning or GAN-based synthesis.

The taxonomy reveals that SecPE's nearest neighbors include genetic and distribution-alignment methods in a sibling leaf, as well as DP fine-tuning approaches and private next-token prediction techniques in parallel branches. While evolutionary synthesis methods exist (e.g., genetic algorithms for distribution alignment), the secret-aware dimension distinguishes this work from uniform-privacy approaches. The framework diverges from end-to-end generative models and knowledge distillation techniques that apply global privacy budgets, instead targeting selective protection of sensitive content—a boundary explicitly noted in the taxonomy's scope definitions.

Among thirty candidates examined, the analysis found limited prior work overlap. The SecPE framework itself shows one refutable candidate out of ten examined, suggesting some evolutionary privacy mechanisms exist but are not densely represented. The secret-protected clustering method appears more novel, with zero refutable candidates among ten examined. However, the theoretical formalization of secret protection encountered four refutable candidates out of ten, indicating that formal privacy relaxations and secret-aware guarantees have received prior theoretical attention, though the specific application to evolutionary text synthesis may be less explored.

Based on the limited search scope of thirty semantically similar papers, the work appears to occupy a relatively novel position within secret-aware evolutionary synthesis. The framework's combination of selective privacy and iterative refinement addresses a gap between uniform-noise methods and application-specific approaches, though the theoretical foundations draw on existing relaxations of differential privacy. The analysis does not cover exhaustive citation networks or domain-specific venues, so additional related work may exist beyond the top-K semantic matches examined.

Taxonomy

- Core-task Taxonomy Papers: 50
- Claimed Contributions: 3
- Contribution Candidate Papers Compared: 30
- Refutable Papers: 5

Research Landscape Overview

Core task: differentially private synthetic text generation. The field has organized itself into several major branches that reflect both methodological diversity and application-driven concerns. Core DP Text Generation Methods encompass foundational techniques—ranging from evolutionary and iterative synthesis approaches (such as Secret Protected Evolution[0] and Selective Privacy[35]) to knowledge distillation and GAN-based frameworks (e.g., Private GAN[3], Private Knowledge Distillation[4])—that directly tackle the challenge of generating text under formal privacy guarantees. Application-Specific DP Text Generation addresses domain needs in healthcare (Healthcare Synthetic Data[18], Term2Note[21]), recommendation systems (Recommendation Systems[20]), and instruction tuning (Private Instructions[10]), while DP Generative Models for Non-Text Modalities extends similar privacy mechanisms to images and tabular data (Medical Convolutional GANs[7], Echonet Synthetic[9]). Meanwhile, Evaluation, Privacy Metrics, and Theoretical Foundations (Synthetic Privacy Metrics[17], Evaluation Metrics Review[33]) provide the analytical backbone, and Specialized DP Mechanisms and Extensions explore advanced privacy notions (User Level Privacy[36], Formal Privacy Guarantees[37]) alongside Privacy-Utility Trade-offs and Practical Considerations that guide real-world deployment.

Within the evolutionary and iterative synthesis cluster, a particularly active line of work focuses on secret-aware and selective privacy mechanisms that protect only sensitive portions of data rather than applying uniform noise. Secret Protected Evolution[0] exemplifies this direction by evolving synthetic text while explicitly safeguarding designated secrets, contrasting with broader approaches like Selective Privacy[35] that allow fine-grained control over which attributes receive protection.
These methods sit at the intersection of iterative refinement and targeted privacy, differing from end-to-end generative models (Private GAN[3]) that treat all data uniformly and from distillation-based techniques (Private Knowledge Distillation[4]) that transfer knowledge under global privacy budgets. The trade-off between granular secret protection and overall utility remains a central open question, as does the scalability of evolutionary strategies compared to one-shot generation pipelines like those in Synthetic Text APIs[5] or RAG-based frameworks (RAG Differential Privacy[2]).

Claimed Contributions

Secret-Protected Evolution (SecPE) framework

The authors introduce SecPE, a framework that shifts from uniform differential privacy guarantees to secret-aware protection. This framework provides (p,r)-secret protection, which relaxes Gaussian DP by requiring protection only at specific prior points rather than over the entire trade-off curve, enabling tighter utility-privacy trade-offs.

Retrieved papers: 10 · Verdict: Can Refute
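The private-evolution paradigm that SecPE extends can be illustrated with a minimal selection step: each private record votes for its nearest synthetic candidate, Gaussian noise on the vote histogram supplies the privacy guarantee, and the top-voted candidates seed the next generation. The sketch below is a hedged illustration of this general Aug-PE-style loop, not the authors' implementation; all names, dimensions, and parameters are assumptions for demonstration.

```python
import numpy as np

def private_evolution_step(private_emb, candidate_emb, sigma, rng):
    """One selection step of a private-evolution loop (illustrative sketch).

    Each private embedding votes for its nearest synthetic candidate;
    Gaussian noise added to the vote histogram provides the DP guarantee.
    """
    # Nearest-candidate vote for every private record: cost O(M * N_syn).
    dists = np.linalg.norm(
        private_emb[:, None, :] - candidate_emb[None, :, :], axis=-1
    )
    votes = np.bincount(
        dists.argmin(axis=1), minlength=len(candidate_emb)
    ).astype(float)
    votes += rng.normal(0.0, sigma, size=votes.shape)  # privacy noise
    # Keep the top-voted half of the candidates for the next generation.
    keep = np.argsort(votes)[::-1][: len(candidate_emb) // 2]
    return candidate_emb[keep]

rng = np.random.default_rng(0)
private_emb = rng.normal(size=(100, 8))  # M = 100 private embeddings
candidates = rng.normal(size=(20, 8))    # N_syn = 20 synthetic candidates
survivors = private_evolution_step(private_emb, candidates, sigma=1.0, rng=rng)
print(survivors.shape)  # (10, 8)
```

In a full pipeline the surviving candidates would be re-expanded by an LLM and the step repeated; the sketch only shows the privately scored selection that the secret-aware variant modifies.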
Secret-protected clustering method

The authors propose a clustering-based method that detects sensitive attributes and forms representative centers by updating public clusters with noisy private data. This approach reduces computational complexity from O(M·N_syn) to O(K·N_syn), where K ≪ M, enabling scalability to larger datasets.

10 retrieved papers
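The claimed O(M·N_syn) → O(K·N_syn) reduction comes from scoring synthetic candidates against K representative centers rather than all M private records. Below is a minimal sketch of how such centers might be formed, assuming embedding vectors, a public-data initialization, and a Gaussian-noised private update; the initialization scheme, blend weight, and function names are illustrative assumptions, not the paper's method.

```python
import numpy as np

def secret_protected_centers(public_emb, private_emb, k, sigma, rng):
    """Form K representative centers (illustrative sketch).

    Start from centers drawn from public data, then shift each center
    toward the Gaussian-noised mean of the private points assigned to it.
    Candidate scoring against these centers costs O(K * N_syn) instead
    of O(M * N_syn).
    """
    # Crude public-data initialization (a stand-in for proper clustering).
    centers = public_emb[rng.choice(len(public_emb), size=k, replace=False)]
    # Assign each private point to its nearest public center.
    assign = np.linalg.norm(
        private_emb[:, None, :] - centers[None, :, :], axis=-1
    ).argmin(axis=1)
    for j in range(k):
        pts = private_emb[assign == j]
        if len(pts):
            noisy_mean = pts.mean(axis=0) + rng.normal(0, sigma, size=pts.shape[1])
            centers[j] = 0.5 * (centers[j] + noisy_mean)  # public/private blend
    return centers

rng = np.random.default_rng(1)
public = rng.normal(size=(200, 8))    # public reference embeddings
private = rng.normal(size=(1000, 8))  # M = 1000 private records
centers = secret_protected_centers(public, private, k=16, sigma=0.5, rng=rng)
print(centers.shape)  # (16, 8)
```

With K = 16 centers standing in for M = 1000 records, every subsequent scoring pass over the synthetic candidates touches 16 rather than 1000 reference points, which is the source of the claimed speedup.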
Theoretical formalization of secret protection for text generation

The authors provide a theoretical framework showing that their method satisfies (p,r)-secret protection, which is a relaxation of Gaussian differential privacy. This formalization bounds the reconstruction success probability calibrated to specific secrets rather than enforcing uniform protection across all records.

Retrieved papers: 10 · Verdict: Can Refute
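The report does not reproduce the formal definition of (p, r)-secret protection. One plausible reading, consistent with the statement that the guarantee "bounds the reconstruction success probability calibrated to specific secrets," is sketched below; the mechanism M, adversary A, secret s, and exact quantifiers are assumptions for illustration, not the paper's statement.

```latex
% Hypothetical sketch, not the paper's exact definition.
% A mechanism M gives (p, r)-secret protection for a designated secret s
% if every reconstruction adversary A holding prior r on s succeeds with
% probability at most p after observing the output M(D):
\[
  \Pr\bigl[\mathcal{A}(\mathcal{M}(D)) = s\bigr] \;\le\; p .
\]
% Gaussian DP would instead constrain the full trade-off curve, i.e.
% protection at every prior rather than only at the designated point r,
% which is strictly stronger and hence requires more noise.
```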

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution: Secret-Protected Evolution (SecPE) framework

Contribution: Secret-protected clustering method

Contribution: Theoretical formalization of secret protection for text generation