FastTextSpotter

High-Efficiency Transformer for Multilingual Scene Text Spotting

FastTextSpotter is an end-to-end, transformer-based scene text spotting model designed for high efficiency and multilingual robustness. It was published at WACV 2024.

Key Contributions

  • Lightweight deformable attention backbone for fast text region proposals
  • Unified detection + recognition head trained end-to-end
  • Pre-training on multilingual synthetic datasets for cross-lingual transfer
  • Strong speed-accuracy trade-off on Total-Text, CTW1500, and MLT benchmarks
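
A speed-accuracy trade-off is usually reported as two numbers: throughput in frames per second and an F-measure (H-mean) over the spotted words. The sketch below shows how both can be computed. It is an illustration only: the function names `h_mean` and `frames_per_second` are hypothetical, and the source does not specify the evaluation protocol or timing setup used for FastTextSpotter.

```python
import time
from typing import Callable, Sequence


def h_mean(true_positives: int, num_predictions: int, num_ground_truth: int) -> float:
    """Harmonic mean of precision and recall (the F-measure).

    Assumption: a prediction counts as a true positive when it matches a
    ground-truth word; the matching rule itself is outside this sketch.
    """
    if num_predictions == 0 or num_ground_truth == 0:
        return 0.0
    precision = true_positives / num_predictions
    recall = true_positives / num_ground_truth
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def frames_per_second(run_model: Callable[[object], object],
                      images: Sequence[object]) -> float:
    """Average throughput of `run_model` over `images`.

    `run_model` is a placeholder for any callable that processes one image.
    """
    start = time.perf_counter()
    for image in images:
        run_model(image)
    elapsed = time.perf_counter() - start
    return len(images) / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Toy numbers, not results from the paper.
    print(f"H-mean: {h_mean(8, 10, 16):.3f}")
    print(f"FPS: {frames_per_second(lambda img: img, list(range(100))):.1f}")
```

Reporting the two values side by side makes the trade-off explicit: a model that gains throughput at the cost of a lower H-mean can be compared directly against a slower, more accurate one.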

Publication

Das, A., Biswas, S., Pal, U., Lladós, J., Bhattacharya, S. FastTextSpotter: A High-Efficiency Transformer for Multilingual Scene Text Spotting. WACV 2024.

Work done at the Computer Vision and Pattern Recognition Unit (CVPRU), Indian Statistical Institute, Kolkata.