SP-MoMamba: Superpixel-driven Mixture of State Space Experts for Efficient Image Super-Resolution

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: Low-level vision; Efficient Image Super-resolution; State Space Model
Abstract:

The state space model (SSM) has recently garnered significant attention for its exceptional long-range modeling capability at linear-time complexity, enabling notable success in efficient super-resolution. However, applying SSMs to vision tasks typically requires scanning 2D visual data in 1D-sequence form, which disrupts inherent semantic relationships and introduces artifacts and distortions during image restoration. To address these challenges, we propose SP-MoMamba, a novel method that integrates SSMs with the semantic-preservation capability of superpixels and the efficiency advantage of Mixture-of-Experts (MoE). Specifically, we pioneer the use of superpixel features as semantic units to reconstruct the SSM scanning order, proposing the Superpixel-driven State Space Model (SP-SSM) as the basic building block of SP-MoMamba. Furthermore, we introduce the Multi-Scale Superpixel Mixture of State Space Experts (MSS-MoE) scheme to strategically integrate SP-SSMs across scales, effectively harnessing the complementary semantic information from multiple experts. This multi-scale expert integration significantly reduces the number of pixels processed by each SSM while enhancing the reconstruction of fine details through specialized experts operating at different semantic scales. Together, these components enable our model to deliver superior performance with minimal computational overhead.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes SP-MoMamba, which integrates superpixel-driven scanning with Mixture-of-Experts (MoE) routing in state space models for efficient super-resolution. It resides in the 'Mamba-Based Super-Resolution Frameworks' leaf, which contains six papers including MambaIR, MambaIRv2, and S3SR. This leaf represents a moderately populated research direction within the broader taxonomy of fifty papers across thirty-six topics, indicating active but not overcrowded exploration of foundational Mamba architectures for super-resolution tasks.

The taxonomy reveals that SP-MoMamba sits within 'Core State Space Model Architectures for Super-Resolution,' adjacent to branches exploring hybrid integration (Mamba-Transformer, Mamba-CNN) and modality-specific methods (hyperspectral, light field, video). Neighboring leaves include 'State-Control and Predictive Modeling' and 'Efficient and Lightweight SSM Designs,' which address complementary concerns of dynamic control and parameter reduction. The paper's superpixel-based scanning diverges from typical directional or hierarchical strategies seen in sibling works, while its MoE integration connects conceptually to efficiency-focused branches without crossing into hybrid architectures.

Across the eighteen candidates examined, the Superpixel-driven State Space Model (SP-SSM) contribution yielded one refutable candidate out of ten, suggesting some prior work on semantic-aware scanning exists within this limited search scope. The Multi-Scale Superpixel Mixture of State Space Experts (MSS-MoE) contribution was compared against six candidates with none refutable, indicating relatively less overlap in multi-scale expert routing mechanisms. The overall SP-MoMamba framework was compared against two candidates with no refutations, though this small sample size limits definitive conclusions about novelty in the broader literature.

Based on the top-eighteen semantic matches examined, the work appears to introduce a distinctive combination of superpixel semantics and MoE routing within Mamba frameworks. The analysis covers a focused subset of the field rather than exhaustive prior work, and the taxonomy structure suggests this direction—semantic-aware scanning with adaptive expert selection—occupies a relatively underexplored niche within the moderately active Mamba-based super-resolution research area.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 18
Refutable Papers: 1

Research Landscape Overview

Core task: efficient image super-resolution using state space models. The field has rapidly evolved around leveraging state space models, particularly Mamba-based architectures, to achieve efficient super-resolution with linear complexity.

The taxonomy reveals five main branches. Core State Space Model Architectures for Super-Resolution focuses on foundational Mamba frameworks like MambaIR[1] and its successor MambaIRv2[8], which establish baseline SSM designs for image restoration. Hybrid State Space and Alternative Architecture Integration explores combinations of SSMs with transformers or convolutional layers, as seen in MambaFormerSR[27] and ConvMambaSR[35]. Modality-Specific State Space Super-Resolution addresses specialized domains such as hyperspectral imaging (MambaHSISR[28]), light field data (LFMamba[5]), and event-based video (Event Video Mamba[3]). Degradation-Aware and Restoration-Focused SSM Methods targets real-world degradations and blind restoration scenarios. Alternative Paradigms and Complementary Techniques encompasses non-SSM approaches and hybrid methods that provide context for SSM innovations.

Recent work has concentrated on refining Mamba-based designs to balance efficiency and representational power, with many studies exploring directional scanning strategies, hierarchical feature aggregation, and frequency-domain enhancements. SP-MoMamba[0] sits within the Core State Space Model Architectures branch, specifically among Mamba-Based Super-Resolution Frameworks, where it shares conceptual ground with works like MambaIR[1], MambaIRv2[8], and S3SR[11]. While these neighbors emphasize various scanning patterns or multi-scale processing, SP-MoMamba[0] distinguishes itself by integrating mixture-of-experts mechanisms to adaptively route features, aiming to improve parameter efficiency without sacrificing reconstruction quality. This positions it as a natural evolution of core Mamba frameworks, addressing the trade-off between model capacity and computational cost that remains a central challenge across the field.

Claimed Contributions

Superpixel-driven State Space Model (SP-SSM)

The authors introduce SP-SSM, which uses superpixel features as semantic units to restructure SSM input, effectively resolving semantic disruption issues inherent in traditional Mamba-based scanning methods that convert 2D images to 1D sequences.
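To make the claimed mechanism concrete, here is a minimal NumPy sketch of superpixel-driven scanning as described above: pixels are regrouped so that each superpixel's pixels form a contiguous run of the 1D sequence fed to the SSM, instead of raster order. This is not the authors' implementation; the function names, the toy labels (in practice they would come from a segmenter such as SLIC), and the identity stand-in for the SSM are all assumptions for illustration.

```python
import numpy as np

def superpixel_scan(features, labels):
    """Reorder a 2D feature map into a 1D sequence grouped by superpixel.

    features: (H, W, C) array of pixel features.
    labels:   (H, W) integer superpixel assignment (e.g. from SLIC).
    Returns the grouped sequence and the permutation needed to undo it.
    """
    H, W, C = features.shape
    flat = features.reshape(H * W, C)
    # Stable sort keeps raster order within each superpixel while making
    # each superpixel's pixels contiguous in the scanned sequence.
    order = np.argsort(labels.reshape(H * W), kind="stable")
    return flat[order], order

def superpixel_unscan(seq, order, shape):
    """Scatter the processed sequence back to its original 2D layout."""
    H, W, C = shape
    out = np.empty((H * W, C), dtype=seq.dtype)
    out[order] = seq
    return out.reshape(H, W, C)

# Toy example: a 4x4 feature map with 2 superpixels (left/right halves).
feats = np.arange(4 * 4, dtype=float).reshape(4, 4, 1)
labs = np.zeros((4, 4), dtype=int)
labs[:, 2:] = 1
seq, order = superpixel_scan(feats, labs)
# A real SP-SSM would process `seq` here; we use the identity instead.
restored = superpixel_unscan(seq, order, feats.shape)
```

The point of the sketch is that a raster scan would interleave the two regions every two pixels, whereas the grouped sequence keeps each semantic region intact, which is the disruption the contribution claims to resolve.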

10 retrieved papers (1 refutable)

Multi-Scale Superpixel Mixture of State Space Experts (MSS-MoE)

The authors propose MSS-MoE, a scheme that dynamically selects optimal SP-SSM experts across multiple scales, enabling comprehensive global modeling by leveraging complementary semantic information from different scale experts while reducing computational overhead.
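The dynamic expert selection described here can be sketched as standard top-k MoE gating over experts operating at different superpixel scales. This is a generic illustration under assumptions, not the paper's routing scheme: `mss_moe`, the linear router `router_w`, and the choice of k are hypothetical.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def mss_moe(token, expert_outputs, router_w, k=2):
    """Top-k mixture over multi-scale experts (illustrative only).

    token:          (C,) feature used by the router.
    expert_outputs: (E, C) outputs of E superpixel-scale experts for this token.
    router_w:       (E, C) router weights (a hypothetical linear gate).
    Only the k highest-scoring experts contribute, which is how MoE keeps
    per-token compute low while retaining a large expert pool.
    """
    scores = router_w @ token          # one logit per scale expert
    topk = np.argsort(scores)[-k:]     # indices of the k best experts
    gate = softmax(scores[topk])       # renormalise over the selected experts
    return (gate[:, None] * expert_outputs[topk]).sum(axis=0)

rng = np.random.default_rng(0)
C, E = 8, 4                            # feature dim, number of scale experts
tok = rng.standard_normal(C)
outs = rng.standard_normal((E, C))     # stand-ins for SP-SSM expert outputs
w = rng.standard_normal((E, C))
fused = mss_moe(tok, outs, w, k=2)
```

With k=2 of 4 experts active, only half of the expert computation runs per token, matching the contribution's stated goal of reducing overhead while combining complementary scales.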

6 retrieved papers (none refutable)

SP-MoMamba framework integrating superpixels with SSMs and MoE

The authors develop SP-MoMamba, a complete framework that combines state space models with superpixel semantic preservation and mixture-of-experts efficiency, achieving superior super-resolution performance with minimal computational overhead through strategic integration of global and local experts.

2 retrieved papers (none refutable)

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution: Superpixel-driven State Space Model (SP-SSM)

The authors introduce SP-SSM, which uses superpixel features as semantic units to restructure SSM input, effectively resolving semantic disruption issues inherent in traditional Mamba-based scanning methods that convert 2D images to 1D sequences.

Contribution: Multi-Scale Superpixel Mixture of State Space Experts (MSS-MoE)

The authors propose MSS-MoE, a scheme that dynamically selects optimal SP-SSM experts across multiple scales, enabling comprehensive global modeling by leveraging complementary semantic information from different scale experts while reducing computational overhead.

Contribution: SP-MoMamba framework integrating superpixels with SSMs and MoE

The authors develop SP-MoMamba, a complete framework that combines state space models with superpixel semantic preservation and mixture-of-experts efficiency, achieving superior super-resolution performance with minimal computational overhead through strategic integration of global and local experts.