Alloy Das (অলয়)

PhD Student · Iowa State University · Advised by Prof. Soumik Sarkar

Ames, Iowa, USA

alloyuit@gmail.com

I am a PhD student in the Department of Mechanical Engineering at Iowa State University, working in the SCSLab under the supervision of Prof. Soumik Sarkar.

My research lies at the intersection of computer vision, multi-modal representation learning, and agricultural AI. I am currently working on:

  • EmbodiedMAE — a multi-modal masked autoencoder for 3D plant reconstruction from RGB images, depth maps, and point clouds, targeting Sorghum phenotyping.
  • Lighting-robust instance segmentation — extending SAM with a custom Lighting Convolutional Attention (LCA) module for robust segmentation under challenging illumination conditions.

Previously, I was a Research Assistant at the Computer Vision and Pattern Recognition Unit (CVPRU), Indian Statistical Institute, Kolkata, supervised by Prof. Umapada Pal. My work there focused on scene text spotting, recognition, and editing — resulting in publications at WACV 2024, WACV 2025, ICRA 2024, and ICPR 2024.

I am a peer reviewer for The Visual Computer journal.

selected publications

  1. Tricho-Vision: The use of computer vision in trichotaxonomy for enhancing wildlife conservation of priority species
    Alloy Das, Priyanka Banerjee, Sanket Biswas, Manokaran Kamalakannan, Joydev Chattopadhyay, Dhriti Banerjee, and Tanoy Mukherjee
    Ecological Informatics, 2025
  2. FASTER: A Font-Agnostic Scene Text Editing and Rendering Framework
    Alloy Das, Sanket Biswas, Prasun Roy, Subhankar Ghosh, Umapada Pal, Michael Blumenstein, Josep Lladós, and Saumik Bhattacharya
    2025
  3. FastTextSpotter: A High-Efficiency Transformer for Multilingual Scene Text Spotting
    Alloy Das, Sanket Biswas, Umapada Pal, Josep Lladós, and Saumik Bhattacharya
    Lecture Notes in Computer Science, 2024
  4. Diving into the Depths of Spotting Text in Multi-Domain Noisy Scenes
    Alloy Das, Sanket Biswas, Umapada Pal, and Josep Lladós
    2024
  5. Harnessing the Power of Multi-Lingual Datasets for Pre-training: Towards Enhancing Text Spotting Performance
    Alloy Das, Sanket Biswas, Ayan Banerjee, Josep Lladós, Umapada Pal, and Saumik Bhattacharya
    2024