DAIT: Dual Aggregation Interaction Transformer for Image Deraining
*Equal Contribution, †Corresponding authors
arXiv | PDF | Code
Abstract
Transformer-based networks leveraging spatial or channel self-attention have achieved remarkable performance on image deraining tasks, motivating us to exploit the spatial and channel dimensions simultaneously. However, single-architecture feature representations struggle to handle the variability of real-world rain, making rich global-local information essential. We propose the Dual Aggregation Interaction Transformer (DAIT), which leverages the strengths of two paradigms (spatial-channel and global-local) while enabling deep intra- and inter-paradigm interaction and fusion. Specifically, we propose two types of self-attention: Sparse Prompt Channel Self-Attention (SPC-SA) and Spatial Pixel Refinement Self-Attention (SPR-SA). SPC-SA uses dynamic sparsity to enhance global dependencies and facilitate channel context learning. SPR-SA emphasizes spatially varying rain distribution and local texture restoration. To address the knowledge disparity between the two paradigms, we introduce the Frequency Adaptive Interaction Module (FAIM), which progressively eliminates feature isolation within and between paradigms. Additionally, we employ a Multi-Scale Flow Gating Network (MSGN) to extract scale-aware features. Extensive experiments demonstrate that DAIT achieves state-of-the-art performance on six benchmark datasets.
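The two attention paradigms contrasted in the abstract can be illustrated with a minimal NumPy sketch: channel self-attention builds a C×C affinity map over feature channels (a global operation), while spatial self-attention builds an N×N map over pixel positions (capturing spatially varying structure). The top-k mask below is a simple stand-in for SPC-SA's dynamic sparsity; shapes, shared projections, and the masking rule are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def channel_self_attention(x, k_keep):
    """Sketch of channel attention with top-k sparsity.

    x: (C, N) feature map flattened over N spatial positions.
    Projections are omitted for brevity; q = k = v = x.
    """
    attn = x @ x.T / np.sqrt(x.shape[1])          # (C, C) channel affinities
    # Hypothetical sparsity: keep the k largest affinities per row,
    # standing in for DAIT's learned sparse prompt mechanism.
    thresh = np.sort(attn, axis=-1)[:, -k_keep][:, None]
    attn = np.where(attn >= thresh, attn, -np.inf)
    return softmax(attn) @ x                      # (C, N)

def spatial_self_attention(x):
    """Sketch of spatial attention: an (N, N) map over pixel positions."""
    attn = softmax(x.T @ x / np.sqrt(x.shape[0])) # (N, N)
    return (attn @ x.T).T                         # (C, N)
```

Note how the two paradigms differ only in which axis the affinity map is taken over: channel attention scales with C (global, cheap for large images), spatial attention with N (local detail, expensive), which is the complementarity DAIT aggregates.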
Overall Pipeline

Quantitative Results
Visualization
Bibtex
@inproceedings{zou2025dait,
  title={DAIT: Dual Aggregation Interaction Transformer for Image Deraining},
  author={Zou, Shun and Zou, Yi and Chen, Zhihao and Gao, Guangwei and Qi, Guojun},
  booktitle={},
  year={2025}
}