AWM: Accurate Weight-Matrix Fingerprint for Large Language Models
Overall Novelty Assessment
The paper proposes a training-free fingerprinting method using weight matrices, Linear Assignment Problem (LAP), and unbiased Centered Kernel Alignment (CKA) to verify whether a suspect LLM derives from a base model. It sits within the Weight-Based Fingerprinting Approaches leaf, which contains four papers total including this one. This leaf represents a focused but not overcrowded research direction, with sibling works exploring gradient-based fingerprinting, intrinsic fingerprints from initialization artifacts, and seed-based signatures. The taxonomy shows this is an active subfield within the broader Model Derivation Detection and Fingerprinting branch.
The taxonomy reveals neighboring leaves addressing Behavioral Similarity and Provenance Testing (five papers analyzing output patterns and functional representations) and Spectral and Structural Signature Methods (two papers leveraging spectral properties). The paper's weight-based approach contrasts with these behavioral methods, which probe model outputs rather than internal parameters. The scope note clarifies that weight-based techniques focus on parameter distributions and gradient information, while excluding behavioral or output-based methods. This positioning suggests the work bridges parameter-level analysis with the robustness concerns typically addressed by spectral approaches.
Among the twenty-four candidates examined across three contributions, the analysis found limited prior-work overlap. The core weight-matrix fingerprinting contribution examined ten candidates with one potentially refutable match, the LAP-enhanced CKA metric examined four candidates with none refutable, and the comprehensive robustness claim examined ten candidates with one potentially refutable match. These statistics indicate that within the top-24 semantic matches, most contributions appear relatively distinct, though the search scope is explicitly limited and does not constitute an exhaustive literature review. The LAP-CKA combination appears particularly novel within this candidate set.
Based on the limited search scope of twenty-four candidates, the work appears to occupy a relatively distinct position within weight-based fingerprinting, particularly in its combination of LAP and unbiased CKA for robustness. However, the analysis acknowledges it examined only top-K semantic matches plus citation expansion, not the full literature. The taxonomy context suggests this is an evolving subfield where standardization and multi-parent scenarios remain open challenges, positioning the work within an active but not saturated research direction.
Claimed Contributions
The authors introduce a novel fingerprinting approach that operates directly on weight matrices without requiring additional training. This method leverages the Linear Assignment Problem and an unbiased Centered Kernel Alignment similarity metric to identify whether a suspect LLM is derived from an existing base model or trained from scratch.
The authors develop a robust similarity metric that combines the Linear Assignment Problem to extract permutation and signature matrices from word embeddings with an unbiased variant of Centered Kernel Alignment. This metric is designed to be invariant to various weight manipulations including scaling, permutation, pruning, and rotation.
The authors establish that their method maintains perfect classification performance across six challenging post-training scenarios: supervised fine-tuning, extensive continued pretraining, reinforcement learning, multi-modal extension, pruning, and upcycling. This is validated on a testbed of 60 positive and 90 negative model pairs, on which the method achieves a perfect AUC.
Core Task Comparisons
Comparisons with papers in the same taxonomy category
[8] Gradient-Based Model Fingerprinting for LLM Similarity Detection and Family Classification
[15] Intrinsic Fingerprint of LLMs: Continue Training is NOT All You Need to Steal A Model!
[30] SeedPrints: Fingerprints Can Even Tell Which Seed Your Large Language Model Was Trained From
Contribution Analysis
Detailed comparisons for each claimed contribution
Training-free weight-matrix fingerprinting method for LLMs
The authors introduce a novel fingerprinting approach that operates directly on weight matrices without requiring additional training. This method leverages the Linear Assignment Problem and an unbiased Centered Kernel Alignment similarity metric to identify whether a suspect LLM is derived from an existing base model or trained from scratch.
[59] HuRef: Human-Readable Fingerprint for Large Language Models
[51] LLMmap: Fingerprinting for Large Language Models
[52] EditMF: Drawing an Invisible Fingerprint for Your Large Language Models
[53] REEF: Representation Encoding Fingerprints for Large Language Models
[54] MergePrint: Robust Fingerprinting Against Merging Large Language Models
[55] RoFL: Robust Fingerprinting of Language Models
[56] Instructional Fingerprinting of Large Language Models
[57] Behavioral Fingerprinting of Large Language Models
[58] A Fingerprint for Large Language Models
[60] Watermarking for Large Language Models: A Survey
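The training-free setting can be made concrete with a small sketch. Everything here is illustrative rather than the paper's implementation: `weight_similarity` is a stand-in (a normalized Frobenius inner product, not the LAP-plus-CKA metric), and the 0.5 threshold is an arbitrary assumption.

```python
import numpy as np

def weight_similarity(w_base: np.ndarray, w_suspect: np.ndarray) -> float:
    # Placeholder metric: cosine similarity between flattened, norm-scaled
    # weight matrices. The paper instead aligns dimensions with LAP and
    # scores with unbiased CKA; this only shows the interface.
    a = w_base / np.linalg.norm(w_base)
    b = w_suspect / np.linalg.norm(w_suspect)
    return float(np.abs((a * b).sum()))

def is_derived(w_base: np.ndarray, w_suspect: np.ndarray,
               threshold: float = 0.5) -> bool:
    # Training-free: a single comparison of static weight matrices, with no
    # gradient computation, fine-tuning, or watermark embedding required.
    return weight_similarity(w_base, w_suspect) >= threshold
```

A derived model's weights stay strongly correlated with its base, while two independently trained matrices are close to orthogonal, which is what makes a simple threshold plausible in this sketch.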
LAP-enhanced unbiased CKA similarity metric
The authors develop a robust similarity metric that combines the Linear Assignment Problem to extract permutation and signature matrices from word embeddings with an unbiased variant of Centered Kernel Alignment. This metric is designed to be invariant to various weight manipulations including scaling, permutation, pruning, and rotation.
[61] ShERPA: Leveraging Neuron Alignment for Knowledge-Preserving Fine-Tuning
[62] Do Vision and Language Encoders Represent the World Similarly?
[63] Cycle-Consistent Multi-Model Merging
[64] Optimizing Loss Landscape Connectivity via Neuron Alignment
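The two ingredients of the metric can be sketched as follows, under assumptions: the unbiased CKA variant is modeled with the unbiased HSIC estimator of Song et al. (2012), and the LAP step is modeled as matching hidden dimensions of two embedding matrices by absolute correlation via `scipy.optimize.linear_sum_assignment`. Function names and the exact alignment target are illustrative, not the paper's.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def unbiased_hsic(K: np.ndarray, L: np.ndarray) -> float:
    # Unbiased HSIC estimator (Song et al., 2012); requires n > 3 samples.
    n = K.shape[0]
    K, L = K.copy(), L.copy()
    np.fill_diagonal(K, 0.0)          # zero the diagonals ("tilde" kernels)
    np.fill_diagonal(L, 0.0)
    t1 = np.trace(K @ L)
    t2 = K.sum() * L.sum() / ((n - 1) * (n - 2))
    t3 = 2.0 * (K.sum(axis=0) @ L.sum(axis=1)) / (n - 2)
    return (t1 + t2 - t3) / (n * (n - 3))

def unbiased_cka(X: np.ndarray, Y: np.ndarray) -> float:
    # Linear-kernel CKA over the row Gram matrices, with the unbiased HSIC.
    K, L = X @ X.T, Y @ Y.T
    return unbiased_hsic(K, L) / np.sqrt(unbiased_hsic(K, K) * unbiased_hsic(L, L))

def align_dims(E_base: np.ndarray, E_suspect: np.ndarray):
    # LAP over |column correlations|: recovers a permutation of hidden
    # dimensions and the sign flip applied to each matched dimension.
    C = E_base.T @ E_suspect
    rows, cols = linear_sum_assignment(-np.abs(C))   # maximize total |corr|
    return cols, np.sign(C[rows, cols])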
Comprehensive robustness against six post-training categories
The authors establish that their method maintains perfect classification performance across six challenging post-training scenarios: supervised fine-tuning, extensive continued pretraining, reinforcement learning, multi-modal extension, pruning, and upcycling. This is validated on a testbed of 60 positive and 90 negative model pairs, on which the method achieves a perfect AUC.
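On such a testbed, a perfect AUC means every derived (positive) pair receives a strictly higher similarity score than every independent (negative) pair. A minimal sketch of the evaluation, using made-up scores rather than the paper's measurements:

```python
def roc_auc(pos_scores, neg_scores):
    # Mann-Whitney U formulation of AUC: the fraction of (positive, negative)
    # score pairs that are ranked correctly, counting ties as half a win.
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Synthetic illustration mirroring the testbed sizes: 60 derived pairs with
# high similarity, 90 independent pairs with low similarity (values invented).
pos = [0.9 + 0.001 * i for i in range(60)]
neg = [0.1 + 0.001 * i for i in range(90)]
print(roc_auc(pos, neg))  # → 1.0
```

A single misranked pair on this testbed would already drop the AUC below 1.0 by 1/(60 * 90), so "perfect AUC" is a strong claim about complete score separation, not just high average accuracy.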