

ବାର୍ତ୍ତାଳାପ (Bartālāpa)
Building Odia Speech Technology for the People

Problem Statement
Odia remains underrepresented in conversational AI. Existing speech systems struggle with strong dialectal variation, regional accents, and noisy real-world speech. Most available Odia speech datasets are limited in scale, biased toward standard forms, and recorded in controlled environments, so systems built on them perform poorly in natural conversation. As a result, Odia speakers experience inaccurate recognition, unnatural responses, and limited access to voice-based AI technologies.
Solution
ବାର୍ତ୍ତାଳାପ (Bartālāpa) is a community-driven initiative to collect, curate, and process Odia speech data for developing advanced speech technologies.
The project focuses on building robust models for speech recognition, text-to-speech, and conversational AI, making Odia language technologies accessible to the common people while preserving linguistic and cultural heritage.
Scope
The scope of ବାର୍ତ୍ତାଳାପ (Bartālāpa) centers on Odia speech dataset preparation and ASR research, with particular emphasis on dialectal variation and noisy real-world speech.
The project includes:
- Collection and curation of diverse Odia speech data covering multiple regional dialects, accents, speaking styles, and background noise conditions
- Annotation and quality control of speech transcripts with rich metadata (dialect, region, noise type, recording setup)
- Design and benchmarking of Odia ASR systems, including dialect-aware and noise-robust models, to study performance gaps and mitigation strategies
- Dataset standardization and documentation to support reproducibility and broader research use
As one component of the project, selected datasets and evaluation protocols will be adapted for participation in international shared tasks (e.g., IWSLT-style low-resource or dialectal ASR tracks), enabling external benchmarking and community engagement.
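As an illustration of how the metadata and benchmarking items above could fit together, here is a minimal Python sketch that scores ASR output per dialect using word error rate (WER). It is not the project's actual tooling: the `Utterance` record, its field names, and the `wer_by` helper are hypothetical, and the transcripts are placeholder tokens rather than real Odia data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Utterance:
    """One transcribed recording with illustrative metadata fields (hypothetical schema)."""
    reference: str        # human transcript
    hypothesis: str       # ASR system output
    dialect: str
    region: str
    noise_type: str
    recording_setup: str


def word_errors(reference: str, hypothesis: str) -> tuple[int, int]:
    """Return (word-level edit distance, number of reference words)."""
    ref, hyp = reference.split(), hypothesis.split()
    # Levenshtein distance over words, keeping only one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion of a reference word
                curr[j - 1] + 1,         # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = curr
    return prev[-1], len(ref)


def wer_by(utterances, key="dialect"):
    """Corpus-level WER for each value of one metadata field."""
    errors, words = defaultdict(int), defaultdict(int)
    for u in utterances:
        e, n = word_errors(u.reference, u.hypothesis)
        group = getattr(u, key)
        errors[group] += e
        words[group] += n
    return {g: errors[g] / words[g] for g in errors if words[g] > 0}


# Placeholder data: the tokens stand in for words, not real transcripts.
sample = [
    Utterance("a b c d", "a b c d", "dialect_A", "region_1", "clean", "headset"),
    Utterance("a b c d", "a x c", "dialect_B", "region_2", "street", "phone"),
]
scores = wer_by(sample, key="dialect")
```

Passing a different `key`, such as `"noise_type"`, breaks the same scores down by recording condition, which is one simple way to surface the performance gaps the scope describes.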
Ongoing Speech Data Collection & Annotation
We are currently collecting and annotating Odia speech, with the goal of producing high-quality transcribed speech for use in a shared task setting.
To volunteer, contact Anshuman (anshumanmishra274@gmail.com) for annotation guidelines and process details.


Team
Researchers and Volunteers Contributing to the Project

Sushanta Mishra
Volunteer
Speech Processing System Developed by OdiaGenAI
ASR
Contact
Feel free to reach out to us with any questions about the project or collaboration opportunities.












