SP-MoMamba: Superpixel-driven Mixture of State Space Experts for Efficient Image Super-Resolution

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: Low-level vision; Efficient Image Super-resolution; State Space Model
Abstract:

The state space model (SSM) has recently garnered significant attention for its exceptional long-range modeling capability at linear-time complexity, enabling notable success in efficient super-resolution. However, applying SSMs to vision tasks typically requires scanning 2D visual data in 1D-sequence form, which disrupts inherent semantic relationships and introduces artifacts and distortions during image restoration. To address these challenges, we propose SP-MoMamba, a novel method that integrates SSMs with the semantic-preservation capability of superpixels and the efficiency advantage of Mixture-of-Experts (MoE). Specifically, we pioneer the use of superpixel features as semantic units to reconstruct the SSM scanning order, proposing the Superpixel-driven State Space Model (SP-SSM) as the basic building block of SP-MoMamba. Furthermore, we introduce the Multi-Scale Superpixel Mixture of State Space Experts (MSS-MoE) scheme to strategically integrate SP-SSMs across scales, effectively harnessing the complementary semantic information from multiple experts. This multi-scale expert integration significantly reduces the number of pixels processed by each SSM while enhancing the reconstruction of fine details through specialized experts operating at different semantic scales. Together, these components enable our model to deliver superior performance with minimal computational overhead.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes SP-MoMamba, which integrates superpixel-driven scanning with Mixture-of-Experts (MoE) routing in state space models for efficient super-resolution. It resides in the 'Mamba-Based Super-Resolution Frameworks' leaf, which contains six papers including MambaIR, MambaIRv2, and S3SR. This leaf represents a moderately populated research direction within the broader taxonomy of fifty papers across thirty-six topics, indicating active but not overcrowded exploration of foundational Mamba architectures for super-resolution tasks.

The taxonomy reveals that SP-MoMamba sits within 'Core State Space Model Architectures for Super-Resolution,' adjacent to branches exploring hybrid integration (Mamba-Transformer, Mamba-CNN) and modality-specific methods (hyperspectral, light field, video). Neighboring leaves include 'State-Control and Predictive Modeling' and 'Efficient and Lightweight SSM Designs,' which address complementary concerns of dynamic control and parameter reduction. The paper's superpixel-based scanning diverges from typical directional or hierarchical strategies seen in sibling works, while its MoE integration connects conceptually to efficiency-focused branches without crossing into hybrid architectures.

Across the eighteen candidates examined, the Superpixel-driven State Space Model (SP-SSM) contribution yielded one refutable candidate out of ten, suggesting some prior work on semantic-aware scanning exists within this limited search scope. The Multi-Scale Superpixel Mixture of State Space Experts (MSS-MoE) contribution was compared against six candidates with none refutable, indicating relatively less overlap in multi-scale expert routing mechanisms. The overall SP-MoMamba framework was compared against two candidates with no refutations, though this small sample size limits definitive conclusions about novelty in the broader literature.

Based on the top-eighteen semantic matches examined, the work appears to introduce a distinctive combination of superpixel semantics and MoE routing within Mamba frameworks. The analysis covers a focused subset of the field rather than exhaustive prior work, and the taxonomy structure suggests this direction—semantic-aware scanning with adaptive expert selection—occupies a relatively underexplored niche within the moderately active Mamba-based super-resolution research area.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 18
Refutable Papers: 1

Research Landscape Overview

Core task: efficient image super-resolution using state space models. The field has rapidly evolved around leveraging state space models, particularly Mamba-based architectures, to achieve efficient super-resolution with linear complexity.

The taxonomy reveals five main branches. Core State Space Model Architectures for Super-Resolution focuses on foundational Mamba frameworks like MambaIR[1] and its successor MambaIRv2[8], which establish baseline SSM designs for image restoration. Hybrid State Space and Alternative Architecture Integration explores combinations of SSMs with transformers or convolutional layers, as seen in MambaFormerSR[27] and ConvMambaSR[35]. Modality-Specific State Space Super-Resolution addresses specialized domains such as hyperspectral imaging (MambaHSISR[28]), light field data (LFMamba[5]), and event-based video (Event Video Mamba[3]). Degradation-Aware and Restoration-Focused SSM Methods targets real-world degradations and blind restoration scenarios. Alternative Paradigms and Complementary Techniques encompasses non-SSM approaches and hybrid methods that provide context for SSM innovations.

Recent work has concentrated on refining Mamba-based designs to balance efficiency and representational power, with many studies exploring directional scanning strategies, hierarchical feature aggregation, and frequency-domain enhancements. SP-MoMamba[0] sits within the Core State Space Model Architectures branch, specifically among Mamba-Based Super-Resolution Frameworks, where it shares conceptual ground with works like MambaIR[1], MambaIRv2[8], and S3SR[11]. While these neighbors emphasize various scanning patterns or multi-scale processing, SP-MoMamba[0] distinguishes itself by integrating mixture-of-experts mechanisms to adaptively route features, aiming to improve parameter efficiency without sacrificing reconstruction quality. This positions it as a natural evolution of core Mamba frameworks, addressing the trade-off between model capacity and computational cost that remains a central challenge across the field.

Claimed Contributions

Superpixel-driven State Space Model (SP-SSM)

The authors introduce SP-SSM, which uses superpixel features as semantic units to restructure SSM input, effectively resolving semantic disruption issues inherent in traditional Mamba-based scanning methods that convert 2D images to 1D sequences.
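To make the claimed mechanism concrete, here is a minimal NumPy sketch of superpixel-driven scanning as described above: pixels are regrouped so that each superpixel's pixels form a contiguous run of the 1D sequence fed to the SSM, instead of raster order. This is not the authors' implementation; the function names, the toy labels (in practice they would come from a segmenter such as SLIC), and the identity stand-in for the SSM are all assumptions for illustration.

```python
import numpy as np

def superpixel_scan(features, labels):
    """Reorder a 2D feature map into a 1D sequence grouped by superpixel.

    features: (H, W, C) array of pixel features.
    labels:   (H, W) integer superpixel assignment (e.g. from SLIC).
    Returns the grouped sequence and the permutation needed to undo it.
    """
    H, W, C = features.shape
    flat = features.reshape(H * W, C)
    # Stable sort keeps raster order within each superpixel while making
    # each superpixel's pixels contiguous in the scanned sequence.
    order = np.argsort(labels.reshape(H * W), kind="stable")
    return flat[order], order

def superpixel_unscan(seq, order, shape):
    """Scatter the processed sequence back to its original 2D layout."""
    H, W, C = shape
    out = np.empty((H * W, C), dtype=seq.dtype)
    out[order] = seq
    return out.reshape(H, W, C)

# Toy example: a 4x4 feature map with 2 superpixels (left/right halves).
feats = np.arange(4 * 4, dtype=float).reshape(4, 4, 1)
labs = np.zeros((4, 4), dtype=int)
labs[:, 2:] = 1
seq, order = superpixel_scan(feats, labs)
# A real SP-SSM would process `seq` here; we use the identity instead.
restored = superpixel_unscan(seq, order, feats.shape)
```

The point of the sketch is that a raster scan would interleave the two regions every two pixels, whereas the grouped sequence keeps each semantic region intact, which is the disruption the contribution claims to resolve.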

10 retrieved papers (1 refutable)

Multi-Scale Superpixel Mixture of State Space Experts (MSS-MoE)

The authors propose MSS-MoE, a scheme that dynamically selects optimal SP-SSM experts across multiple scales, enabling comprehensive global modeling by leveraging complementary semantic information from different scale experts while reducing computational overhead.
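The dynamic expert selection described here can be sketched as standard top-k MoE gating over experts operating at different superpixel scales. This is a generic illustration under assumptions, not the paper's routing scheme: `mss_moe`, the linear router `router_w`, and the choice of k are hypothetical.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def mss_moe(token, expert_outputs, router_w, k=2):
    """Top-k mixture over multi-scale experts (illustrative only).

    token:          (C,) feature used by the router.
    expert_outputs: (E, C) outputs of E superpixel-scale experts for this token.
    router_w:       (E, C) router weights (a hypothetical linear gate).
    Only the k highest-scoring experts contribute, which is how MoE keeps
    per-token compute low while retaining a large expert pool.
    """
    scores = router_w @ token          # one logit per scale expert
    topk = np.argsort(scores)[-k:]     # indices of the k best experts
    gate = softmax(scores[topk])       # renormalise over the selected experts
    return (gate[:, None] * expert_outputs[topk]).sum(axis=0)

rng = np.random.default_rng(0)
C, E = 8, 4                            # feature dim, number of scale experts
tok = rng.standard_normal(C)
outs = rng.standard_normal((E, C))     # stand-ins for SP-SSM expert outputs
w = rng.standard_normal((E, C))
fused = mss_moe(tok, outs, w, k=2)
```

With k=2 of 4 experts active, only half of the expert computation runs per token, matching the contribution's stated goal of reducing overhead while combining complementary scales.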

6 retrieved papers (none refutable)

SP-MoMamba framework integrating superpixels with SSMs and MoE

The authors develop SP-MoMamba, a complete framework that combines state space models with superpixel semantic preservation and mixture-of-experts efficiency, achieving superior super-resolution performance with minimal computational overhead through strategic integration of global and local experts.

2 retrieved papers (none refutable)

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution: Superpixel-driven State Space Model (SP-SSM)

The authors introduce SP-SSM, which uses superpixel features as semantic units to restructure SSM input, effectively resolving semantic disruption issues inherent in traditional Mamba-based scanning methods that convert 2D images to 1D sequences.

Contribution: Multi-Scale Superpixel Mixture of State Space Experts (MSS-MoE)

The authors propose MSS-MoE, a scheme that dynamically selects optimal SP-SSM experts across multiple scales, enabling comprehensive global modeling by leveraging complementary semantic information from different scale experts while reducing computational overhead.

Contribution: SP-MoMamba framework integrating superpixels with SSMs and MoE

The authors develop SP-MoMamba, a complete framework that combines state space models with superpixel semantic preservation and mixture-of-experts efficiency, achieving superior super-resolution performance with minimal computational overhead through strategic integration of global and local experts.