AWM: Accurate Weight-Matrix Fingerprint for Large Language Models

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: fingerprint, large language models, intellectual property
Abstract:

Protecting the intellectual property of large language models (LLMs) is crucial, given the substantial resources required for their training. Consequently, there is an urgent need for both model owners and third parties to determine whether a suspect LLM is trained from scratch or derived from an existing base model. However, the intensive post-training processes that models typically undergo—such as supervised fine-tuning, extensive continued pretraining, reinforcement learning, multi-modal extension, pruning, and upcycling—pose significant challenges to reliable identification. In this work, we propose a training-free fingerprinting method based on weight matrices. We leverage the Linear Assignment Problem (LAP) and an unbiased Centered Kernel Alignment (CKA) similarity to neutralize the effects of parameter manipulations, yielding a highly robust and high-fidelity similarity metric. On a comprehensive testbed of 60 positive and 90 negative model pairs, our method demonstrates exceptional robustness against all six aforementioned post-training categories while exhibiting a near-zero risk of false positives. By achieving perfect scores on all classification metrics, our approach establishes a strong basis for reliable model lineage verification. Moreover, the entire computation completes within 30s on an NVIDIA 3090 GPU.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes a training-free fingerprinting method using weight matrices, Linear Assignment Problem (LAP), and unbiased Centered Kernel Alignment (CKA) to verify whether a suspect LLM derives from a base model. It sits within the Weight-Based Fingerprinting Approaches leaf, which contains four papers total including this one. This leaf represents a focused but not overcrowded research direction, with sibling works exploring gradient-based fingerprinting, intrinsic fingerprints from initialization artifacts, and seed-based signatures. The taxonomy shows this is an active subfield within the broader Model Derivation Detection and Fingerprinting branch.

The taxonomy reveals neighboring leaves addressing Behavioral Similarity and Provenance Testing (five papers analyzing output patterns and functional representations) and Spectral and Structural Signature Methods (two papers leveraging spectral properties). The paper's weight-based approach contrasts with these behavioral methods, which probe model outputs rather than internal parameters. The scope note clarifies that weight-based techniques focus on parameter distributions and gradient information, while excluding behavioral or output-based methods. This positioning suggests the work bridges parameter-level analysis with robustness concerns typically addressed by spectral approaches.

Among twenty-four candidates examined across three contributions, the analysis found limited prior work overlap. The core weight-matrix fingerprinting contribution examined ten candidates with one potentially refutable match, while the LAP-enhanced CKA metric examined four candidates with none refutable. The comprehensive robustness claim examined ten candidates with one overlap. These statistics indicate that within the top-24 semantic matches, most contributions appear relatively distinct, though the search scope is explicitly limited and does not constitute an exhaustive literature review. The LAP-CKA combination appears particularly novel within this candidate set.

Based on the limited search scope of twenty-four candidates, the work appears to occupy a relatively distinct position within weight-based fingerprinting, particularly in its combination of LAP and unbiased CKA for robustness. However, the analysis acknowledges it examined only top-K semantic matches plus citation expansion, not the full literature. The taxonomy context suggests this is an evolving subfield where standardization and multi-parent scenarios remain open challenges, positioning the work within an active but not saturated research direction.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 24
Refutable Papers: 2

Research Landscape Overview

Core task: Model lineage verification for large language models. As LLMs proliferate through fine-tuning, merging, and distillation, determining the ancestry and derivation relationships among models has become essential for intellectual property enforcement, security auditing, and ecosystem transparency. The field organizes around several complementary branches.

Model Derivation Detection and Fingerprinting develops techniques to identify parent-child relationships via weight-based signatures, gradient analysis, and intrinsic model properties, with works like Gradient-Based Fingerprinting[8], Intrinsic Fingerprint[15], and SeedPrints[30] exploring how to extract stable identifiers from model parameters. Model Ownership Protection and Intellectual Property focuses on watermarking and copyright mechanisms to assert provenance claims, exemplified by LLM Copyright Auditing[16] and EchoLeak[21]. Data Provenance Tracking and Verification examines the origins of training corpora and their influence on downstream models, as seen in Data Provenance Project[14] and Dataset Identifiers Signature[38]. LLM Ecosystem Analysis studies the broader network of model relationships on platforms like Hugging Face, with Hugginggraph[20] and Ecosystem LLM Code[4] mapping these genealogies. Meanwhile, LLM-Assisted Verification leverages language models themselves for tasks like multimedia fact-checking or reasoning trace analysis, and Specialized LLM Topics addresses emerging challenges such as hallucination propagation across model families.

A particularly active line of work centers on weight-based fingerprinting, where researchers seek compact, robust signatures that survive fine-tuning and quantization. AWM[0] contributes to this cluster by proposing a method that examines model weights to verify lineage, positioning itself alongside Intrinsic Fingerprint[15] and SeedPrints[30], which similarly extract fingerprints from parameter distributions or initialization artifacts.
These approaches contrast with gradient-based methods like Gradient-Based Fingerprinting[8], which require access to training dynamics, and with behavioral fingerprinting techniques such as HaLLMark Effect[3], which probe model outputs rather than internal states. A key trade-off emerges between fingerprint robustness under adversarial modifications and the computational overhead of extraction. Meanwhile, works like Model Provenance Testing[1] and Origin Tracing LLMs[11] explore complementary angles by testing provenance claims through statistical inference or by tracing the flow of knowledge through model families. Open questions remain around standardizing fingerprint formats, handling models derived from multiple parents, and balancing transparency with the need to protect proprietary training recipes.

Claimed Contributions

Training-free weight-matrix fingerprinting method for LLMs

The authors introduce a novel fingerprinting approach that operates directly on weight matrices without requiring additional training. This method leverages the Linear Assignment Problem and an unbiased Centered Kernel Alignment similarity metric to identify whether a suspect LLM is derived from an existing base model or trained from scratch.

10 retrieved papers · Can Refute
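The report does not reproduce the paper's exact metric. As a hedged sketch of the unbiased CKA component the abstract describes, the standard debiased HSIC estimator can be used; all function names below are illustrative, not the authors':

```python
import numpy as np

def _hsic_unbiased(K: np.ndarray, L: np.ndarray) -> float:
    """Unbiased HSIC_1 estimator on Gram matrices K, L (requires n >= 4)."""
    n = K.shape[0]
    K = K.copy()
    L = L.copy()
    np.fill_diagonal(K, 0.0)  # the estimator zeroes the diagonals
    np.fill_diagonal(L, 0.0)
    term1 = np.trace(K @ L)
    term2 = K.sum() * L.sum() / ((n - 1) * (n - 2))
    term3 = 2.0 * (K.sum(axis=0) @ L.sum(axis=1)) / (n - 2)
    return (term1 + term2 - term3) / (n * (n - 3))

def unbiased_cka(X: np.ndarray, Y: np.ndarray) -> float:
    """Debiased linear CKA between feature matrices X (n x d1) and Y (n x d2)."""
    K, L = X @ X.T, Y @ Y.T
    return _hsic_unbiased(K, L) / np.sqrt(_hsic_unbiased(K, K) * _hsic_unbiased(L, L))
```

Because linear CKA depends on the inputs only through their Gram matrices, the score is unchanged by uniform rescaling or orthogonal transforms of either matrix, which is the property that makes it attractive for comparing manipulated weights.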
LAP-enhanced unbiased CKA similarity metric

The authors develop a robust similarity metric that combines the Linear Assignment Problem to extract permutation and signature matrices from word embeddings with an unbiased variant of Centered Kernel Alignment. This metric is designed to be invariant to various weight manipulations including scaling, permutation, pruning, and rotation.

4 retrieved papers
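The paper's exact LAP construction is not reproduced here. As a rough sketch of the general recipe, a hidden-dimension permutation and per-dimension sign flips relating two word-embedding matrices can be recovered by solving a linear assignment problem on absolute cross-correlations with `scipy.optimize.linear_sum_assignment`; the function and variable names are ours:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def recover_perm_and_signs(E_base: np.ndarray, E_susp: np.ndarray):
    """Hypothetical sketch: given embeddings (vocab x dim) that may differ by a
    column permutation and per-column sign flips, solve a LAP that maximizes
    the total absolute column correlation to find the matching."""
    C = E_base.T @ E_susp                            # dim x dim cross-correlations
    rows, cols = linear_sum_assignment(-np.abs(C))   # maximize sum of |C|
    signs = np.sign(C[rows, cols])                   # sign of each matched pair
    return cols, signs

def align(E_susp: np.ndarray, perm: np.ndarray, signs: np.ndarray) -> np.ndarray:
    """Undo the recovered permutation and sign flips to align with the base."""
    return E_susp[:, perm] * signs
```

After alignment, a similarity metric such as CKA can be applied as if the manipulation had never occurred; this is one plausible way the permutation/signature extraction the contribution describes could be realized.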
Comprehensive robustness against six post-training categories

The authors establish that their method maintains perfect classification performance across six challenging post-training scenarios: supervised fine-tuning, extensive continued pretraining, reinforcement learning, multi-modal extension, pruning, and upcycling. This is validated on a testbed of 60 positive and 90 negative model pairs with perfect AUC scores.

10 retrieved papers · Can Refute
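For reference on the headline metric: with 60 positive and 90 negative model pairs, AUC equals the probability that a randomly drawn positive pair scores above a randomly drawn negative one (the Mann-Whitney formulation). The scores below are synthetic, not taken from the paper:

```python
def auc(pos_scores, neg_scores):
    """ROC AUC via the Mann-Whitney U statistic: fraction of (positive,
    negative) pairs where the positive pair's similarity is higher
    (ties count as half a win)."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos_scores
        for n in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))
```

A "perfect AUC" therefore means every one of the 60 x 90 positive/negative score comparisons comes out in the right order, i.e. the two score distributions do not overlap at all.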

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Training-free weight-matrix fingerprinting method for LLMs

Contribution

LAP-enhanced unbiased CKA similarity metric

Contribution

Comprehensive robustness against six post-training categories