Meta on Monday, November 10, unveiled a suite of open-weight AI models with automatic speech recognition (ASR) capabilities that support more than 1,600 languages worldwide, including 500 “low-resource languages” that have been transcribed using artificial intelligence (AI) for the first time.

The Omnilingual ASR models have been developed by Meta’s Fundamental AI Research (FAIR) team. The social media giant has also introduced an open-weight, multilingual speech representation model known as Omnilingual wav2vec 2.0 that can be scaled up to seven billion parameters, enabling developers to build a wide range of AI-driven speech applications.
The Indian languages supported by Meta’s ASR models include Hindi, Marathi, Malayalam, Tulu, Telugu, Odia, Punjabi, Marwari, and Urdu, among others. Notably, the models can also transcribe a range of long-tail Indian languages that are less widely spoken in the country, such as Kui, Chhattisgarhi, Maithili, Bagheli, Mahasu Pahari, Awadhi, and Rajbanshi.
Additionally, Meta has made its Omnilingual ASR Corpus of transcribed speech in 350 underserved languages publicly available.
Meta Omnilingual ASR expands speech recognition to 1,600+ languages, including 500 never before supported, as a major step towards truly universal AI.
We are open-sourcing a full suite of models and a dataset: https://t.co/AIaYrqSF0h https://t.co/qC79jrF7BY
— Alexandr Wang (@alexandr_wang) November 10, 2025
The announcement comes as Indian AI startups race to develop Indic language models, bolstered by government-backed initiatives such as Mission Bhashini that aims to advance local language AI innovation in the country.
However, startups developing large language models (LLMs) using datasets sourced under the Bhashini AI mission face stiff competition from AI giants such as Meta and OpenAI, which are seeking to strengthen their foothold in India, one of their key growth markets.
The lack of high-quality training datasets remains a challenge for most players because long-tail languages are not well represented on the internet. “This means high-quality transcriptions are often unavailable for speakers of less widely represented or low-resource languages, furthering the digital divide,” Meta said in a blog post, adding that current AI model architectures are too resource intensive to be scaled to every language.
This is why Meta has designed Omnilingual ASR to be community-driven, allowing users to add new languages to the framework by supplying the models with a few of their own samples. “In practice, this means that a speaker of an unsupported language can provide only a handful of paired audio-text samples and obtain usable transcription quality — without training data at scale, onerous expertise, or access to high-end compute,” the company said.
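Meta has not spelled out the exact developer workflow for this, but conceptually the few-shot extension could look like the sketch below. The `transcribe` call and the `context` argument are illustrative placeholders, not the real Omnilingual ASR API:

```python
# Hypothetical sketch of few-shot language extension; function and
# argument names are illustrative, not Meta's actual API.
from dataclasses import dataclass


@dataclass
class PairedSample:
    audio_path: str   # short recording by a native speaker
    transcript: str   # matching text in the target language


def transcribe_new_language(model, audio_path: str, examples: list[PairedSample]) -> str:
    """Condition the decoder on a handful of in-context audio-text pairs
    instead of fine-tuning the model's weights."""
    context = [(ex.audio_path, ex.transcript) for ex in examples]
    return model.transcribe(audio_path, context=context)  # hypothetical call
```

The key idea is that the community-provided pairs act as in-context examples at inference time, so no large-scale training run or specialist compute is needed.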
Omnilingual wav2vec 2.0
Omnilingual wav2vec 2.0, Meta’s new self-supervised multilingual speech representation model, has been released under a permissive Apache 2.0 license, and it serves as the foundation for the ASR models, including the variant dubbed LLM-ASR. “First, we scaled our previous wav2vec 2.0 speech encoder to 7B parameters for the first time, producing rich, massively multilingual semantic representations from raw, untranscribed speech data,” Meta said.
“We then built two decoder variants to map those into character tokens. The first decoder relies on a traditional connectionist temporal classification (CTC) objective, while the second leverages a transformer decoder, commonly used in LLMs,” it added.
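In rough terms, the two decoding strategies described above could be sketched as follows; this is a minimal, illustrative PyTorch outline with placeholder dimensions, not Meta’s actual implementation:

```python
import torch.nn as nn

# Minimal sketch of the two decoding strategies; dimensions are placeholders.
class CTCHead(nn.Module):
    """Maps each encoder frame directly to character logits (CTC-style)."""
    def __init__(self, d_model=1024, vocab_size=256):
        super().__init__()
        self.proj = nn.Linear(d_model, vocab_size)

    def forward(self, encoder_frames):           # (batch, time, d_model)
        return self.proj(encoder_frames)          # (batch, time, vocab)


class LLMStyleDecoder(nn.Module):
    """Autoregressive transformer decoder that cross-attends to the speech encoder."""
    def __init__(self, d_model=1024, vocab_size=256, layers=4, heads=8):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerDecoderLayer(d_model, heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=layers)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, prev_tokens, encoder_frames):
        x = self.embed(prev_tokens)                # (batch, tgt_len, d_model)
        x = self.decoder(x, encoder_frames)        # cross-attend to speech frames
        return self.out(x)                         # next-character logits
```

The CTC head emits a character prediction per speech frame, while the LLM-style decoder generates characters one at a time while attending to the encoder’s representations.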
In terms of performance, the LLM-ASR model recorded character error rates (CER) below 10 per cent for 78 per cent of the more than 1,600 languages supported by Omnilingual ASR.
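CER measures the proportion of characters the model gets wrong relative to a ground-truth transcript, computed from the character-level edit distance. The self-contained Python snippet below (not from Meta’s codebase) shows the standard calculation:

```python
def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = character-level edit distance / reference length, as a percentage."""
    m, n = len(reference), len(hypothesis)
    # Standard dynamic-programming edit distance over characters.
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # substitution
        prev = curr
    return 100.0 * prev[n] / max(m, 1)


# Example: one deleted character out of a 10-character reference gives a CER of 10.0.
print(character_error_rate("namaste ji", "namaste j"))
```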
Omnilingual ASR Corpus
Meta said that its publicly available Omnilingual ASR Corpus, comprising transcribed speech in 350 underserved languages, was compiled in partnership with “local organisations that recruited and compensated native speakers, often in remote or under-documented regions.”
The company also said it worked with linguists, researchers, and language communities, collaborating with organisations such as the Mozilla Foundation’s Common Voice initiative, which works directly with local communities. This corpus has been released under the CC-BY license, enabling researchers and developers to use it to build AI-powered speech applications.
In September this year, reports said that Meta was looking to develop AI-powered, role-playing chatbots in Hindi by working with third-party contractors to help customise the chatbots with more cultural nuances.
The company has reportedly hired US-based contractors to collaborate with local residents in India and other key markets such as Indonesia and Mexico, who are tasked with providing creative direction and tailoring character-driven chatbots to help make them feel more authentic, as per Business Insider.
