Alloy Das (অলয়)
PhD Student · Iowa State University · Advised by Prof. Soumik Sarkar
Ames, Iowa, USA
I am a PhD student in the Department of Mechanical Engineering at Iowa State University, working in the SCSLab under the supervision of Prof. Soumik Sarkar.
My research lies at the intersection of computer vision, multi-modal representation learning, and agricultural AI. I am currently working on:
- EmbodiedMAE: a multi-modal masked autoencoder for 3D plant reconstruction from RGB images, depth maps, and point clouds, targeting sorghum phenotyping.
- Lighting-robust instance segmentation: extending SAM with a custom Lighting Convolutional Attention (LCA) module so that segmentation stays reliable under challenging illumination conditions.
Previously, I was a Research Assistant at the Computer Vision and Pattern Recognition Unit (CVPRU), Indian Statistical Institute, Kolkata, supervised by Prof. Umapada Pal. My work there focused on scene text spotting, recognition, and editing — resulting in publications at WACV 2024, WACV 2025, ICRA 2024, and ICPR 2024.
I serve as a peer reviewer for the journals The Visual Computer and Scientific Reports.
News
| Date | Update |
|---|---|
| May 25, 2026 | 🎉 NoTeS-Bank has been accepted at ECML PKDD 2026! [arXiv] |
| May 22, 2026 | 🚀 Our paper Lighting-aware Unified Model for Instance Segmentation is now live on arXiv! |
| Aug 01, 2025 | 🎓 Started my PhD at Iowa State University, advised by Prof. Soumik Sarkar. |
| Feb 01, 2025 | 📄 FASTER: A Font-Agnostic Scene Text Editing and Rendering Framework accepted at WACV 2025! |
| Dec 01, 2024 | 📄 FastTextSpotter accepted at ICPR 2024! |
Research at a Glance
Publication Timeline
{
"tooltip": { "trigger": "axis" },
"grid": { "left": "5%", "right": "5%", "bottom": "10%", "containLabel": true },
"xAxis": {
"type": "category",
"data": ["2021", "2022", "2024", "2025", "2026"],
"axisLabel": { "color": "#666" }
},
"yAxis": {
"type": "value",
"name": "Papers",
"minInterval": 1,
"axisLabel": { "color": "#666" }
},
"series": [
{
"name": "Publications",
"type": "bar",
"barMaxWidth": 40,
"data": [1, 2, 5, 7, 1],
"itemStyle": {
"color": {
"type": "linear",
"x": 0, "y": 0, "x2": 0, "y2": 1,
"colorStops": [
{ "offset": 0, "color": "#4f8ef7" },
{ "offset": 1, "color": "#7fcfe8" }
]
},
"borderRadius": [4, 4, 0, 0]
},
"label": { "show": true, "position": "top" }
}
]
}
Research Skills
{
"tooltip": {},
"radar": {
"indicator": [
{ "name": "Computer Vision", "max": 10 },
{ "name": "Deep Learning", "max": 10 },
{ "name": "Multi-modal Learning", "max": 10 },
{ "name": "Scene Text Spotting", "max": 10 },
{ "name": "Agricultural AI", "max": 10 },
{ "name": "3D Reconstruction", "max": 10 }
],
"radius": "65%"
},
"series": [
{
"type": "radar",
"data": [
{
"value": [9, 9, 8, 9, 7, 6],
"name": "Expertise",
"areaStyle": { "opacity": 0.3 },
"lineStyle": { "color": "#4f8ef7", "width": 2 },
"itemStyle": { "color": "#4f8ef7" }
}
]
}
]
}
Publication Venues
{
"tooltip": { "trigger": "item", "formatter": "{b}: {c} papers ({d}%)" },
"legend": {
"orient": "vertical",
"right": "5%",
"top": "center"
},
"series": [
{
"type": "pie",
"radius": ["35%", "60%"],
"center": ["38%", "50%"],
"avoidLabelOverlap": true,
"itemStyle": { "borderRadius": 6, "borderColor": "#fff", "borderWidth": 2 },
"label": { "show": false },
"emphasis": {
"label": { "show": true, "fontSize": 13, "fontWeight": "bold" }
},
"data": [
{ "value": 4, "name": "WACV / ICRA / ICPR", "itemStyle": { "color": "#4f8ef7" } },
{ "value": 3, "name": "Journals (KBS / MTA / Eco. Inf.)", "itemStyle": { "color": "#7fcfe8" } },
{ "value": 4, "name": "ICDAR / LNCS", "itemStyle": { "color": "#5cc88a" } },
{ "value": 3, "name": "Preprints / Workshops", "itemStyle": { "color": "#f7a64f" } },
{ "value": 2, "name": "AIP / IJPRAI", "itemStyle": { "color": "#e87c7c" } }
]
}
]
}