Doxing via the Lens: Revealing Location-related Privacy Leakage on Multi-modal Large Reasoning Models
Overview
Overall Novelty Assessment
The paper introduces DoxBench, a 500-image dataset with a three-level privacy risk framework, alongside two tools, ClueMiner and GeoMiner, for analyzing geolocation inference attacks by multi-modal models. It resides in the Privacy Risk Assessment and Mitigation leaf, which contains five papers in total, a moderately populated niche within the broader 50-paper taxonomy. This leaf focuses specifically on identifying and quantifying privacy threats from geolocation systems, which distinguishes it from the performance-oriented benchmarking and model development branches. The work addresses adversarial doxing scenarios, a narrower framing than the general geo-privacy policy or mitigation strategies explored by sibling papers.
The taxonomy reveals that Privacy Risk Assessment sits under Evaluation, Benchmarking, and Privacy Analysis, adjacent to Benchmark Datasets and Comparative Evaluation and Geospatial AI and Trajectory Prediction. Neighboring branches like Multi-Modal Foundation Model Architectures focus on advancing model capabilities (e.g., Gaea, LLMGeo), while Specialized Geolocation Contexts address domain-specific challenges such as disaster response or indoor localization. The paper's emphasis on adversarial exploitation of existing models contrasts with these capability-building efforts, positioning it as a critical counterpoint that examines societal risks rather than technical performance gains. Its scope excludes mitigation mechanisms beyond analysis, per the leaf's exclude_note.
Among the 28 candidates examined, none clearly refutes any of the three core contributions. The DoxBench dataset and privacy framework (8 candidates, 0 refutable) appear novel in their focus on real-world doxing scenarios with structured risk categorization. For ClueMiner (10 candidates, 0 refutable) and GeoMiner (10 candidates, 0 refutable), no direct prior work was found within the limited search scope. Sibling papers such as Geolocation Privacy Risks and Granular Privacy Control address related privacy concerns but do not present equivalent datasets or collaborative attack frameworks. The absence of refutable candidates suggests these specific artifacts are new, though the small scale of the search means earlier work cannot be ruled out.
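As a quick consistency check on the figures above, the per-contribution counts can be tallied in a few lines. This is only a sketch: the dictionary keys and the `summarize` helper are illustrative names, not part of the assessment itself.

```python
# Candidate counts as reported in the assessment above.
# The dictionary layout and key names are illustrative assumptions.
candidates = {
    "DoxBench dataset and privacy framework": {"examined": 8, "refutable": 0},
    "ClueMiner": {"examined": 10, "refutable": 0},
    "GeoMiner": {"examined": 10, "refutable": 0},
}


def summarize(counts):
    """Return (total candidates examined, total refutable candidates)."""
    total = sum(entry["examined"] for entry in counts.values())
    refutable = sum(entry["refutable"] for entry in counts.values())
    return total, refutable


total_examined, total_refutable = summarize(candidates)
print(f"{total_examined} candidates examined, {total_refutable} refutable")
```

The per-contribution counts sum to the 28 candidates stated at the start of the paragraph.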
Based on the top 28 semantic matches, the work introduces concrete evaluation artifacts (a dataset, a risk taxonomy, and attack tools) that fill a gap in adversarial privacy analysis for geolocation models. Because the search scope is limited, undiscovered prior work may exist, particularly in adjacent security or privacy communities outside the core geolocation literature. The contributions appear incremental in concept, since privacy risks from geolocation are already known, but novel in execution, since they provide structured benchmarks for doxing attacks. Searching broader security venues would strengthen confidence in this assessment.
Claimed Contributions
The authors introduce a novel three-level framework (individual risk, household risk, and both) grounded in GDPR and CCPA regulations to categorize privacy risks in images. They also construct DOXBENCH, a benchmark dataset of 500 high-resolution images from California representing diverse privacy scenarios across six categories to evaluate location-related privacy leakage.
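One way to picture the three-level labeling is as a pair of flags whose combination gives the third level. The sketch below is an assumption about representation only: the names `PrivacyRisk`, `BenchmarkImage`, `image_id`, and `affects_household` are hypothetical and do not come from the paper, and the six scenario categories are left out because they are not listed here.

```python
from dataclasses import dataclass
from enum import Flag, auto


class PrivacyRisk(Flag):
    """Three risk levels: individual, household, or both (hypothetical encoding)."""
    INDIVIDUAL = auto()
    HOUSEHOLD = auto()
    BOTH = INDIVIDUAL | HOUSEHOLD


@dataclass(frozen=True)
class BenchmarkImage:
    """Hypothetical record for one labeled benchmark image."""
    image_id: str
    risk: PrivacyRisk


def affects_household(item: BenchmarkImage) -> bool:
    # True when the label includes household-level risk (HOUSEHOLD or BOTH).
    return bool(item.risk & PrivacyRisk.HOUSEHOLD)


sample = BenchmarkImage(image_id="example-001", risk=PrivacyRisk.BOTH)
print(affects_household(sample))
```

Encoding "both" as the union of the two base levels keeps queries such as "all images with any household risk" to a single bitwise test.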
The authors develop CLUEMINER, a test-time adaptation algorithm that iteratively derives unified semantic categories of visual clues from unstructured model reasoning outputs. This tool reveals that MLRMs frequently rely on privacy-sensitive visual clues without built-in mechanisms to suppress such usage.
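The iterative idea, growing a shared category list while passing over unstructured reasoning outputs, can be illustrated roughly as follows. This is not the authors' algorithm: `derive_categories` and `toy_propose` are hypothetical names, and the keyword lookup is a toy stand-in for whatever model call would propose categories in practice.

```python
from collections import Counter
from typing import Callable, Iterable, List


def derive_categories(
    outputs: Iterable[str],
    propose: Callable[[str, List[str]], List[str]],
) -> Counter:
    """Pass over reasoning outputs, growing one shared category list.

    `propose` sees the text and the categories found so far, and returns
    the category labels that apply to that text (reusing known labels or
    introducing new ones).
    """
    categories: List[str] = []
    usage: Counter = Counter()
    for text in outputs:
        for label in propose(text, categories):
            if label not in categories:
                categories.append(label)  # new category enters the shared list
            usage[label] += 1
    return usage


def toy_propose(text: str, known: List[str]) -> List[str]:
    # Toy stand-in for a model call: plain keyword lookup.
    # `known` is ignored here; a real proposer would use it to reuse labels.
    keywords = {"sign": "street signage", "plate": "license plate", "palm": "vegetation"}
    lowered = text.lower()
    return [label for keyword, label in keywords.items() if keyword in lowered]


usage = derive_categories(
    ["A street sign next to a palm tree.", "The license plate format is visible."],
    toy_propose,
)
print(usage.most_common())
```

Counting how often each category appears is what would let an analysis flag frequent reliance on privacy-sensitive clue types.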
The authors propose GEOMINER, a two-stage framework simulating realistic adversarial scenarios where a Detector MLLM extracts visual clues and an Analyzer MLLM uses them for geolocation inference. This framework demonstrates how attackers can amplify location-related privacy leakage by providing contextual hints to MLLMs.
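Structurally, the two-stage design is a composition of two model calls. The sketch below shows only that control flow with placeholder stubs; `run_two_stage`, `toy_detector`, and `toy_analyzer` are hypothetical names, no real model is invoked, and passing only the clues (not the image) to the second stage is an assumption.

```python
from typing import Callable, Sequence

# Hypothetical signatures: the detector maps image bytes to textual clues,
# and the analyzer maps those clues to a free-text inference.
Detector = Callable[[bytes], Sequence[str]]
Analyzer = Callable[[Sequence[str]], str]


def run_two_stage(image: bytes, detect: Detector, analyze: Analyzer) -> str:
    """Stage 1 extracts clues; stage 2 reasons over them."""
    clues = detect(image)
    return analyze(clues)


def toy_detector(image: bytes) -> Sequence[str]:
    # Placeholder stub: returns fixed strings instead of calling a model.
    return ["placeholder clue A", "placeholder clue B"]


def toy_analyzer(clues: Sequence[str]) -> str:
    # Placeholder stub: reports only how many clues it received.
    return f"inference from {len(clues)} clues"


print(run_two_stage(b"", toy_detector, toy_analyzer))
```

Separating the stages makes the amplification claim testable: the same analyzer can be evaluated with and without the detector's clues as context.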
Core Task Comparisons
Comparisons with papers in the same taxonomy category
[8] Evaluation of Geolocation Capabilities of Multimodal Large Language Models and Analysis of Associated Privacy Risks
[14] Granular privacy control for geolocation with vision language models
[20] GeoLocator: A location-integrated large multimodal model (LMM) for inferring geo-privacy
[44] AI Knows Where You Are: Exposure, Bias, and Inference in Multimodal Geolocation with KoreaGEO
Contribution Analysis
Detailed comparisons for each claimed contribution
DOXBENCH dataset and three-level privacy risk framework
The authors introduce a novel three-level framework (individual risk, household risk, and both) grounded in GDPR and CCPA regulations to categorize privacy risks in images. They also construct DOXBENCH, a benchmark dataset of 500 high-resolution images from California representing diverse privacy scenarios across six categories to evaluate location-related privacy leakage.
[51] Image-based geolocation using large vision-language models
[52] Where you go is who you are: a study on machine learning based semantic privacy attacks
[53] The long road to computational location privacy: A survey
[54] From object obfuscation to contextually-dependent identification: enhancing automated privacy protection in street-level image platforms (SLIPs)
[55] Systematic Evaluation of Geolocation Privacy Mechanisms
[56] Context Adaptive Personalized Privacy for Location-based Systems
[57] Cardea: Context-aware visual privacy protection for photo taking and sharing
[58] Protecting location privacy in mobile geoservices using fuzzy inference systems
CLUEMINER analysis tool
The authors develop CLUEMINER, a test-time adaptation algorithm that iteratively derives unified semantic categories of visual clues from unstructured model reasoning outputs. This tool reveals that MLRMs frequently rely on privacy-sensitive visual clues without built-in mechanisms to suppress such usage.
[8] Evaluation of Geolocation Capabilities of Multimodal Large Language Models and Analysis of Associated Privacy Risks
[59] Using generative AI to investigate medical imagery models and datasets
[60] Visual content privacy protection: A survey
[61] Privacy-preserving visual localization with event cameras
[62] You can use but cannot recognize: Preserving visual privacy in deep neural networks
[63] Image-guided topic modeling for interpretable privacy classification
[64] Deep gated multi-modal fusion for image privacy prediction
[65] Learning Privacy from Visual Entities
[66] A user-centric context-aware framework for real-time optimisation of multimedia data privacy protection, and information retention within multimodal AI systems
[67] Connecting Visual Data to Privacy: Predicting and Measuring Privacy Risks in Images
GEOMINER collaborative attack framework
The authors propose GEOMINER, a two-stage framework simulating realistic adversarial scenarios where a Detector MLLM extracts visual clues and an Analyzer MLLM uses them for geolocation inference. This framework demonstrates how attackers can amplify location-related privacy leakage by providing contextual hints to MLLMs.