MRAD: Zero-Shot Anomaly Detection with Memory-Driven Retrieval
Overview
Overall Novelty Assessment
The paper proposes MRAD, a memory-retrieval framework for zero-shot anomaly detection that replaces parametric fitting with direct similarity-based retrieval from a two-level memory bank. It resides in the 'Pseudo-Anomaly and Correlation-Weighted Approaches' leaf under Vision-Language Model-Based Industrial Anomaly Detection, alongside two sibling papers. The leaf is a focused research direction within a broader taxonomy of 25 papers spanning multiple modalities, which suggests a moderately active but not overcrowded area in which CLIP-based industrial defect detection methods explore different strategies for zero-shot generalization.
The taxonomy reveals that MRAD's leaf sits within a larger branch of Vision-Language Model-Based Industrial Anomaly Detection, which also includes Multi-Scale Memory Comparison Frameworks and Additive Manufacturing Anomaly Detection. Neighboring branches address Video Anomaly Detection with Temporal Memory and Log Anomaly Detection with Retrieval Augmentation, indicating that memory-driven retrieval is a cross-cutting theme across modalities. The scope note for MRAD's leaf emphasizes pseudo-anomaly generation and correlation weighting, while explicitly excluding multi-scale memory comparison methods that appear in a sibling leaf, suggesting MRAD's single-scale retrieval approach occupies a distinct methodological niche.
Among the three contributions analyzed, the core MRAD framework was compared against ten candidates, one of which is prior work that could refute its novelty claim, indicating some overlap in the memory-retrieval paradigm within the limited search scope. The MRAD-FT variant was compared against four candidates, none of which clearly refutes it, suggesting its lightweight fine-tuning approach may be more novel. The MRAD-CLIP variant was compared against ten candidates, two of which could refute it, implying that region-prior-guided dynamic prompts have been explored more substantially in prior work. These statistics reflect a search of 24 candidates in total, not an exhaustive literature review, so a potentially refuting work indicates overlap within this specific sample rather than a definitive lack of novelty.
Based on the limited search scope of 24 semantically similar candidates, the work appears to offer incremental refinements to memory-driven retrieval in zero-shot anomaly detection, with the MRAD-FT variant showing the least prior overlap. The taxonomy structure suggests the paper operates in a moderately explored area where CLIP-based industrial methods are actively being developed, though the specific combination of train-free retrieval and lightweight variants may differentiate it from existing pseudo-anomaly and correlation-weighted approaches.
Taxonomy
Research Landscape Overview
Claimed Contributions
The authors introduce MRAD, a framework that constructs a two-level memory bank (image-level and pixel-level) from auxiliary data and performs anomaly detection through direct similarity retrieval rather than parametric model fitting. This approach stores feature-label pairs explicitly and obtains anomaly scores via retrieval during inference.
Building on the train-free base model, the authors propose MRAD-FT, which adds only two linear layers to calibrate the retrieval metric. This lightweight fine-tuning improves discriminative ability for both classification and segmentation tasks while keeping training cost low.
The authors develop MRAD-CLIP, which enhances traditional prompt learning by injecting normal and anomalous region priors from MRAD-FT into learnable CLIP text prompts as dynamic biases. This approach improves cross-modal alignment, anomaly localization, and generalization to unseen categories compared to conventional dynamic prompt methods.
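To make the retrieval idea in the first contribution concrete, the following is a minimal sketch, not the authors' implementation. It assumes each memory entry is a (feature, label) pair with label 0.0 for normal and 1.0 for anomalous, and it assumes the anomaly score is a softmax-weighted average of labels over cosine similarities. The names `retrieve_score` and `detect`, the `temperature` parameter, and the toy two-dimensional features are all hypothetical; the source only states that scores are obtained by direct similarity retrieval from image-level and pixel-level memories.

```python
import math

def cosine(a, b, eps=1e-8):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + eps)

def retrieve_score(query, memory, temperature=0.1):
    """Similarity-weighted average of stored labels (assumed aggregation rule).

    memory is a list of (feature, label) pairs; label 0.0 = normal, 1.0 = anomalous.
    """
    sims = [cosine(query, feat) for feat, _ in memory]
    top = max(sims)  # subtract the max for a numerically stable softmax
    weights = [math.exp((s - top) / temperature) for s in sims]
    total = sum(weights)
    return sum(w * label for w, (_, label) in zip(weights, memory)) / total

def detect(global_feat, patch_feats, image_memory, pixel_memory):
    """Two-level retrieval: one image-level score plus one score per patch."""
    image_score = retrieve_score(global_feat, image_memory)
    pixel_scores = [retrieve_score(p, pixel_memory) for p in patch_feats]
    return image_score, pixel_scores

# Toy memories standing in for features extracted from auxiliary data.
image_memory = [([1.0, 0.0], 0.0), ([0.9, 0.1], 0.0), ([0.0, 1.0], 1.0)]
pixel_memory = [([1.0, 0.0], 0.0), ([0.0, 1.0], 1.0)]

image_score, pixel_scores = detect(
    [0.1, 1.0],                 # global feature close to the anomalous entry
    [[1.0, 0.0], [0.0, 1.0]],   # one normal-looking patch, one anomalous-looking patch
    image_memory,
    pixel_memory,
)
```

In this toy setup the image-level score and the second patch score land near 1, while the first patch score lands near 0. No parameters are fitted; the memory itself acts as the model, which is what makes the base variant train-free.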
Core Task Comparisons
Comparisons with papers in the same taxonomy category
[1] PA-CLIP: Enhancing Zero-Shot Anomaly Detection through Pseudo-Anomaly Awareness
[2] A Training-Free Correlation-Weighted Model for Zero-/Few-Shot Industrial Anomaly Detection with Retrieval Augmentation
Contribution Analysis
Detailed comparisons for each claimed contribution
MRAD framework with memory-driven retrieval paradigm
The authors introduce MRAD, a framework that constructs a two-level memory bank (image-level and pixel-level) from auxiliary data and performs anomaly detection through direct similarity retrieval rather than parametric model fitting. This approach stores feature-label pairs explicitly and obtains anomaly scores via retrieval during inference.
[27] Memorizing normality to detect anomaly: Memory-augmented deep autoencoder for unsupervised anomaly detection
[9] RAGLog: Log Anomaly Detection using Retrieval Augmented Generation
[26] Visual anomaly detection via partition memory bank module and error estimation
[28] FedDyMem: Efficient Federated Learning with Dynamic Memory and Memory-Reduce for Unsupervised Image Anomaly Detection
[29] HyADS: A Hybrid Lightweight Anomaly Detection Framework for Edge-Based Industrial Systems with Limited Data
[30] Appearance-Motion Memory Consistency Network for Video Anomaly Detection
[31] Diffusion for out-of-distribution detection on road scenes and beyond
[32] Learning Memory-guided Normality for Anomaly Detection
[33] RAN Cortex: Memory-Augmented Intelligence for Context-Aware Decision-Making in AI-Native Networks
[34] Anomaly detection with dual-stream memory network
MRAD-FT variant with lightweight fine-tuning
Building on the train-free base model, the authors propose MRAD-FT, which adds only two linear layers to calibrate the retrieval metric. This lightweight fine-tuning improves discriminative ability for both classification and segmentation tasks while keeping training cost low.
[45] CLIP-AD: A Language-Guided Staged Dual-Path Model for Zero-shot Anomaly Detection
[46] Intelligent Anomaly Detection for Lane Rendering Using Transformer with Self-Supervised Pretraining and Customized Fine-Tuning
[47] REMEMBER: Retrieval-based Explainable Multimodal Evidence-guided Modeling for Brain Evaluation and Reasoning in Zero-and Few-shot Neurodegenerative …
[48] LogTIW: A log anomaly detection model based on TF-IDF weighted semantic features
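As an illustration of what "two linear layers to calibrate the retrieval metric" could mean for MRAD-FT, here is a hedged sketch. It assumes one linear map is applied to the query feature and one to the memory feature before cosine similarity is computed; the source does not say where the two layers sit, so this placement, the function name `calibrated_similarity`, and the example weights are all assumptions. Training of the weights is omitted.

```python
import math

def linear(weight, vec):
    """Apply a bias-free linear layer given as a list of weight rows."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]

def cosine(a, b, eps=1e-8):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + eps)

def identity(n):
    """Identity weights: with these, the calibrated metric equals plain cosine."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def calibrated_similarity(query, key, w_query, w_key):
    """Cosine similarity after projecting query and key (assumed placement)."""
    return cosine(linear(w_query, query), linear(w_key, key))

query = [1.0, 1.0]
key = [1.0, 0.0]

# Identity initialisation reproduces the train-free retrieval metric.
plain = calibrated_similarity(query, key, identity(2), identity(2))

# A hypothetical learned query projection that down-weights the second dimension.
w_query = [[1.0, 0.0], [0.0, 0.1]]
tuned = calibrated_similarity(query, key, w_query, identity(2))
```

Starting from identity weights means fine-tuning begins at the train-free behaviour and only has to learn a correction to the metric, which is one plausible reading of why the training cost stays low.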
MRAD-CLIP variant with region-prior-guided dynamic prompts
The authors develop MRAD-CLIP, which enhances traditional prompt learning by injecting normal and anomalous region priors from MRAD-FT into learnable CLIP text prompts as dynamic biases. This approach improves cross-modal alignment, anomaly localization, and generalization to unseen categories compared to conventional dynamic prompt methods.
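The injection of region priors into prompts can be pictured with the following sketch, which is an assumption-laden illustration rather than the MRAD-CLIP design. It assumes the priors are anomaly-map-weighted means of patch features, that they share the dimensionality of the prompt context tokens (in practice a projection would likely be needed), and that they are simply added to every context token. The names `region_priors` and `bias_prompt` and the `scale` parameter are hypothetical.

```python
def region_priors(patch_feats, anomaly_map, eps=1e-8):
    """Weighted means of patch features.

    anomaly_map holds one score in [0, 1] per patch (for example, from a
    retrieval stage). Normal prior uses weights (1 - a); anomalous prior uses a.
    """
    dim = len(patch_feats[0])
    normal = [0.0] * dim
    anomalous = [0.0] * dim
    for feat, a in zip(patch_feats, anomaly_map):
        for i, x in enumerate(feat):
            normal[i] += (1.0 - a) * x
            anomalous[i] += a * x
    w_normal = sum(1.0 - a for a in anomaly_map) + eps
    w_anomalous = sum(anomaly_map) + eps
    return [x / w_normal for x in normal], [x / w_anomalous for x in anomalous]

def bias_prompt(context_tokens, prior, scale=1.0):
    """Add the same image-dependent bias to every learnable context token."""
    return [[t + scale * p for t, p in zip(tok, prior)] for tok in context_tokens]

patch_feats = [[1.0, 0.0], [0.0, 1.0]]
anomaly_map = [0.0, 1.0]  # first patch looks normal, second looks anomalous

normal_prior, anomalous_prior = region_priors(patch_feats, anomaly_map)

# Two zero-initialised context tokens stand in for learned prompt embeddings.
context = [[0.0, 0.0], [0.0, 0.0]]
normal_prompt = bias_prompt(context, normal_prior)
anomalous_prompt = bias_prompt(context, anomalous_prior)
```

Because the bias depends on the input image, the "normal" and "anomalous" prompts shift per image, which is the sense in which the prompts are dynamic in this sketch.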