Vision-Language-Action Models for Sewer Inspection Robotics in India
Abstract
We present SafAI, India's first indigenously trained vision-language-action (VLA) model designed for autonomous sewer inspection in non-standardized underground infrastructure. Manual scavenging, although legally prohibited since 2013, continues to claim over 400 lives annually in India. Existing robotic solutions are designed for standardized Western sewer systems and fail within meters of deployment in Indian conditions, which are characterized by irregular brick-lined tunnels, open drains, and unmapped septic tanks. SafAI addresses this gap with a compact VLA architecture (SmolVLA-SewerBot) trained on field data collected, under an ethical consent framework, from Indian sanitation workers wearing sensor arrays. The model performs autonomous navigation, blockage assessment, sludge extraction, and material deposition on battery-powered hardware costing under INR 2 lakh. We report results from simulation in Isaac Sim, with domain randomization calibrated to Indian sewer geometry, and from initial field trials in municipal drain systems. The paper also describes our data collection methodology, which compensates participating workers and ensures informed consent at every stage.
Citation
Chanda, S. (2026). "Vision-Language-Action Models for Sewer Inspection Robotics in India." Saral Systems Council Working Paper SSC-WP-2026-002. DOI: 10.xxxx/ssc-wp-2026-002