Vision Hopfield Memory Networks

ICLR 2026 Conference Submission, Anonymous Authors
Keywords: Associative Memory, Hopfield Networks, Image Classification
Abstract:

Recent vision and multimodal foundation backbones, such as Transformer families and state-space models like Mamba, have achieved remarkable progress, enabling unified modeling across images, text, and beyond. Despite their empirical success, these architectures remain far from the computational principles of the human brain, often demanding enormous amounts of training data while offering limited interpretability. In this work, we propose the Vision Hopfield Memory Network (V-HMN), a brain-inspired foundation backbone that integrates hierarchical memory mechanisms with iterative refinement updates. Specifically, V-HMN incorporates local Hopfield modules that provide associative memory dynamics at the image patch level, global Hopfield modules that function as episodic memory for contextual modulation, and a predictive-coding–inspired refinement rule for iterative error correction. By organizing these memory-based modules hierarchically, V-HMN captures both local and global dynamics in a unified framework. Memory retrieval exposes the relationship between inputs and stored patterns, making decisions more interpretable, while the reuse of stored patterns improves data efficiency. This brain-inspired design therefore enhances interpretability and data efficiency beyond existing self-attention- or state-space–based approaches. We conducted extensive experiments on public computer vision benchmarks, and V-HMN achieved competitive results against widely adopted backbone architectures, while offering better interpretability, higher data efficiency, and stronger biological plausibility. These findings highlight the potential of V-HMN to serve as a next-generation vision foundation model, while also providing a generalizable blueprint for multimodal backbones in domains such as text and audio, thereby bridging brain-inspired computation with large-scale machine learning.

Disclaimer
This report is AI-generated using large language models and WisPaper (a scholarly search engine). It analyzes an academic paper's tasks and contributions against retrieved prior work. While this system identifies potential overlaps and novel directions, its coverage is not exhaustive and its judgments are approximate. These results are intended to assist human reviewers and should not be relied upon as a definitive verdict on novelty.
Note that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Taxonomy

Core-task taxonomy papers: 50
Claimed contributions: 3
Contribution candidate papers compared: 20
Refutable papers: 2

Research Landscape Overview

Core task: brain-inspired vision backbone with hierarchical memory mechanisms. The field encompasses diverse approaches to integrating memory and biological principles into visual processing systems. At the broadest level, the taxonomy reveals several major branches: hierarchical memory architectures that organize multi-level storage for vision tasks, neuromorphic hardware implementations that exploit in-memory computing substrates, spiking neural networks that mimic temporal dynamics of biological neurons, attention and recurrent mechanisms that enable iterative refinement, biologically inspired feature extraction methods that mirror cortical organization, memory-augmented learning frameworks for recognition, hierarchical temporal memory (HTM) algorithms, computational models of visual cortex structure, and neuroscience-informed cognitive architectures. Some branches emphasize hardware efficiency and novel computing paradigms (Neuromorphic Visual Resistive[3], In-sensor Image Memorization[4]), while others focus on algorithmic innovations such as spiking dynamics (Hierarchical Spiking Classification[6]) or cortex-inspired hierarchies (Hierarchical Activation Backbone[10]). The interplay between these branches reflects ongoing efforts to balance biological fidelity, computational tractability, and practical performance.

Particularly active lines of work explore multi-memory integration frameworks for embodied agents, where systems must coordinate short-term sensorimotor memory with long-term episodic storage, exemplified by RoboMemory Lifelong[1] and RoboMemory Interactive[2], which address continual learning and interactive scenarios in robotics. Vision Hopfield Memory[0] sits within this cluster, emphasizing associative memory mechanisms that enable robust retrieval and pattern completion in visual backbones.
Compared to the RoboMemory works that target embodied agent workflows, Vision Hopfield Memory[0] focuses more directly on the backbone architecture itself, leveraging Hopfield-style dynamics to create hierarchical memory layers. This contrasts with approaches like Neural Brain Framework[16], which integrates broader cognitive modeling, and with hardware-centric efforts such as Hierarchical Interactive In-memory[5] that prioritize physical substrate design. The central trade-off across these directions involves the granularity of memory organization, the degree of biological inspiration, and the balance between general-purpose learning and task-specific optimization.

Claimed Contributions

Vision Hopfield Memory Network (V-HMN) architecture

The authors introduce V-HMN, a novel vision backbone that replaces conventional self-attention or convolution with hierarchical Hopfield-style associative memory modules. The architecture combines local memory for patch-level pattern completion and global memory for scene-level context, organized in a unified framework with iterative refinement.

8 retrieved papers
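To make the hierarchical design concrete, the sketch below illustrates one plausible reading of the architecture: a single modern-Hopfield (softmax) retrieval step applied first per patch against a local memory, then against a global memory over a pooled scene descriptor. The function name, the inverse temperature `beta`, the pooling choice, and the two-level composition are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def hopfield_retrieve(queries, memory, beta=4.0):
    """One modern-Hopfield update: softmax(beta * Q M^T) M.

    queries: (n, d) patterns to complete; memory: (m, d) stored patterns.
    """
    scores = beta * queries @ memory.T            # (n, m) query/memory similarities
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)       # softmax over memory slots
    return attn @ memory                          # retrieved (completed) patterns

# Hypothetical two-level composition: patch-level (local) retrieval,
# then scene-level (global) retrieval over a pooled descriptor.
rng = np.random.default_rng(0)
local_mem = rng.standard_normal((16, 8))          # local associative memory
global_mem = rng.standard_normal((4, 8))          # global/episodic memory

patches = rng.standard_normal((10, 8))            # patch embeddings
local_out = hopfield_retrieve(patches, local_mem)       # patch-level completion
scene = local_out.mean(axis=0, keepdims=True)           # pooled scene descriptor
context = hopfield_retrieve(scene, global_mem)          # contextual modulation
```

With a large `beta`, retrieval sharpens toward the single closest stored pattern, which is what gives the module its pattern-completion behavior.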
Predictive-coding–inspired iterative refinement mechanism

The authors develop a lightweight refinement update rule where representations are gradually corrected toward memory-predicted prototypes through learnable error-correction steps. This mechanism provides an interpretable, brain-inspired alternative to purely feedforward processing.

10 retrieved papers
Can Refute
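The refinement rule described above can be sketched as a small fixed-point loop: at each step, the representation is compared against its memory-predicted prototype and a fraction of the prediction error is fed back. In this sketch the step size `alpha` is a fixed constant and the prototype prediction is a nearest-neighbor lookup; in the paper's description the correction steps are learnable, so both choices are simplifying assumptions.

```python
import numpy as np

def refine(z, prototypes, alpha=0.5, steps=5):
    """Iteratively nudge a representation toward its memory-predicted prototype.

    Each step computes the prediction error e = z_hat - z against the nearest
    stored prototype and applies a fraction alpha of it (error correction).
    """
    errors = []
    for _ in range(steps):
        idx = np.argmin(((prototypes - z) ** 2).sum(axis=1))  # predicted prototype
        e = prototypes[idx] - z                               # prediction error
        errors.append(float(np.linalg.norm(e)))
        z = z + alpha * e                                     # error-correction step
    return z, errors

protos = np.array([[1.0, 0.0], [0.0, 1.0]])   # two stored prototypes
z0 = np.array([0.7, 0.4])                     # noisy representation near prototype 0
z_final, errs = refine(z0, protos)            # error shrinks geometrically
```

Because each step removes a fixed fraction of the remaining error, the representation converges geometrically toward the retrieved prototype, which is the sense in which the mechanism is an iterative, predictive-coding-style correction rather than a single feedforward pass.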
Class-balanced persistent memory banks with content-addressable retrieval

The authors design explicit memory banks that store real sample embeddings in a class-balanced manner during training and remain frozen during inference. These banks enable content-addressable retrieval where stored prototypes act as reusable priors, improving data efficiency and interpretability.

2 retrieved papers
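A minimal sketch of such a bank is shown below: per-class slots capped at a fixed size (class balance), writes disabled after freezing, and retrieval by cosine similarity over all stored embeddings (content addressing). The class name, cap parameter, and similarity choice are assumptions for illustration, not details taken from the paper.

```python
import numpy as np
from collections import defaultdict

class MemoryBank:
    """Class-balanced bank of sample embeddings with cosine-similarity retrieval.

    per_class caps storage so no class dominates; the bank is filled during
    training and then frozen (read-only) at inference.
    """
    def __init__(self, per_class=4):
        self.per_class = per_class
        self.slots = defaultdict(list)    # class label -> list of embeddings
        self.frozen = False

    def write(self, label, emb):
        if self.frozen or len(self.slots[label]) >= self.per_class:
            return                        # enforce class balance / freeze
        self.slots[label].append(np.asarray(emb, dtype=float))

    def retrieve(self, query, k=1):
        """Content-addressable lookup: labels of top-k entries by cosine sim."""
        query = np.asarray(query, dtype=float)
        items = [(lbl, e) for lbl, es in self.slots.items() for e in es]
        sims = [
            float(query @ e / (np.linalg.norm(query) * np.linalg.norm(e) + 1e-8))
            for _, e in items
        ]
        order = np.argsort(sims)[::-1][:k]
        return [items[i][0] for i in order]

bank = MemoryBank(per_class=2)
for lbl, e in [(0, [1, 0]), (0, [0.9, 0.1]), (0, [1, 1]), (1, [0, 1])]:
    bank.write(lbl, e)                    # third class-0 write is dropped (cap)
bank.frozen = True                        # read-only at inference
```

The frozen, real-embedding design is what makes retrieval interpretable: every lookup can be traced back to the specific stored samples that matched the query.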

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution


Vision Hopfield Memory Networks | Novelty Validation