
MambaMIC: An Efficient Baseline for Microscopic Image Classification with State Space Models
*Equal Contribution, †Corresponding authors
arXiv · PDF · Code
Abstract
In recent years, CNN- and Transformer-based methods have made significant progress in Microscopic Image Classification (MIC). However, existing approaches still face a trade-off between global modeling and computational efficiency. While the Selective State Space Model (SSM) can model long-range dependencies with linear complexity, it still encounters challenges in MIC, such as local pixel forgetting, channel redundancy, and a lack of local perception. To address these issues, we propose a simple yet efficient vision backbone for MIC tasks, named MambaMIC. Specifically, we introduce a local-global dual-branch aggregation module, the MambaMIC Block, designed to effectively capture and fuse local connectivity and global dependencies. In the local branch, we use local convolutions to capture pixel similarity, mitigating local pixel forgetting and enhancing perception. In the global branch, the SSM extracts global dependencies, while a Locally Aware Enhanced Filter reduces channel redundancy and local pixel forgetting. Additionally, we design a Feature Modulation Interaction Aggregation Module for deep feature interaction and key-feature re-localization. Extensive benchmarking shows that MambaMIC achieves state-of-the-art performance across five datasets.
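To make the dual-branch idea concrete, the following is a minimal, illustrative sketch of a local-global aggregation on a 1-D signal: a small convolution captures pixel similarity (local branch), while a linear-time state-space recurrence accumulates global context (global branch), and the two are fused by summation. All function names, the kernel, and the scan coefficients here are illustrative assumptions, not the authors' implementation.

```python
# Toy sketch of a local-global dual-branch aggregation in the spirit of
# the MambaMIC Block. Hypothetical names and parameters throughout.

def local_branch(x, kernel=(0.25, 0.5, 0.25)):
    """Local convolution capturing pixel similarity (replicate-padded)."""
    pad = len(kernel) // 2
    padded = [x[0]] * pad + list(x) + [x[-1]] * pad
    return [sum(k * padded[i + j] for j, k in enumerate(kernel))
            for i in range(len(x))]

def global_branch(x, a=0.9, b=0.1):
    """Linear-complexity state-space scan: h_t = a*h_{t-1} + b*x_t."""
    h, out = 0.0, []
    for xt in x:
        h = a * h + b * xt
        out.append(h)
    return out

def mamba_mic_block(x):
    """Fuse local connectivity and global dependencies by summation."""
    return [l + g for l, g in zip(local_branch(x), global_branch(x))]

# A unit impulse shows both behaviors: the local branch spreads energy to
# neighbors, while the scan's state carries it across the whole sequence.
print(mamba_mic_block([1.0, 0.0, 0.0, 0.0]))
```

In the actual model these branches operate on 2-D feature maps with learned parameters (and the selective scan's coefficients are input-dependent); the sketch only conveys how local and global responses are computed in parallel and aggregated.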
Pipeline of MambaMIC

Quantitative results

Qualitative visualization

Ablation Study

Bibtex
@inproceedings{zou2024microscopic,
  title={MambaMIC: An Efficient Baseline for Microscopic Image Classification with State Space Models},
  author={Zou, Shun and Zhang, Zhuo and Zou, Yi and Gao, Guangwei},
  booktitle={IEEE International Conference on Multimedia and Expo (ICME)},
  year={2025}
}