DAIT: Dual Aggregation Interaction Transformer for Image Deraining
*Equal Contribution, †Corresponding authors
arXiv | PDF | Code
Abstract
Transformer-based networks leveraging spatial or channel self-attention have achieved remarkable performance on image deraining tasks, motivating us to exploit the spatial and channel dimensions simultaneously. However, single-architecture feature representations struggle to handle the variability of real-world rain, making rich global-local information essential. We propose the Dual Aggregation Interaction Transformer (DAIT), which leverages the strengths of two paradigms (spatial-channel and global-local) while enabling deep intra- and inter-paradigm interaction and fusion. Specifically, we propose two types of self-attention: Sparse Prompt Channel Self-Attention (SPC-SA) and Spatial Pixel Refinement Self-Attention (SPR-SA). SPC-SA uses dynamic sparsity to enhance global dependencies and facilitate channel context learning. SPR-SA emphasizes spatially varying rain distribution and local texture restoration. To address the knowledge disparity between the two paradigms, we introduce the Frequency Adaptive Interaction Module (FAIM), which progressively eliminates feature isolation within and between paradigms. Additionally, we employ a Multi-Scale Flow Gating Network (MSGN) to extract scale-aware features. Extensive experiments demonstrate that DAIT achieves state-of-the-art performance on six benchmark datasets.
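The two attention paradigms contrasted in the abstract can be illustrated with a minimal NumPy sketch: channel self-attention builds a C×C affinity map over feature channels (a global operation), while spatial self-attention builds an N×N map over pixel positions (capturing spatially varying structure). The top-k mask below is a simple stand-in for SPC-SA's dynamic sparsity; shapes, shared projections, and the masking rule are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def channel_self_attention(x, k_keep):
    """Sketch of channel attention with top-k sparsity.

    x: (C, N) feature map flattened over N spatial positions.
    Projections are omitted for brevity; q = k = v = x.
    """
    attn = x @ x.T / np.sqrt(x.shape[1])          # (C, C) channel affinities
    # Hypothetical sparsity: keep the k largest affinities per row,
    # standing in for DAIT's learned sparse prompt mechanism.
    thresh = np.sort(attn, axis=-1)[:, -k_keep][:, None]
    attn = np.where(attn >= thresh, attn, -np.inf)
    return softmax(attn) @ x                      # (C, N)

def spatial_self_attention(x):
    """Sketch of spatial attention: an (N, N) map over pixel positions."""
    attn = softmax(x.T @ x / np.sqrt(x.shape[0])) # (N, N)
    return (attn @ x.T).T                         # (C, N)
```

Note how the two paradigms differ only in which axis the affinity map is taken over: channel attention scales with C (global, cheap for large images), spatial attention with N (local detail, expensive), which is the complementarity DAIT aggregates.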
Overall Pipeline

Quantitative Results
Visualization
Bibtex
@inproceedings{zou2025dait,
  title={DAIT: Dual Aggregation Interaction Transformer for Image Deraining},
  author={Zou, Shun and Zou, Yi and Chen, Zhihao and Gao, Guangwei and Qi, Guojun},
  booktitle={},
  year={2025}
}