The Opportunities & Challenges for Vernacular Content in India

Image Courtesy: Anubha Upadhya M


For more than a decade, the story of India’s internet has had Facebook, Twitter, and Google as its protagonists. But the role of these players has diminished as the world now attempts to bring the country’s next billion users online.

People in India’s second-tier and third-tier cities, often referred to as ‘Bharat’, rely on their vernacular languages to enter the digital world.

Contrary to appearances, these users seek out progressive information. They are aspirational, argumentative, and curious, and they love to stay updated through local networks and WhatsApp forwards.

But what they are matters less than what they are not: they are not English speakers.

Vernacular language startups

The potential to reach these users has given rise to local-language startups. There is ample room to win over this user base by offering online community groups, social media apps, knowledge forums, and similar services in native languages.

Local-language startups include ShareChat, with 8 million daily active users; Roposo, with 11 million users; and Helo (launched by China’s Toutiao), with over 10 million installs. These apps continue to add users every day.

The space is also attracting foreign giants: Google has launched its vernacular initiative, Navlekha, and Amazon is developing tools to improve Alexa’s ability to pick up commands in regional languages.

Opportunity for the Local Apps

India has been known for its knowledge and wisdom for ages. A large part of that knowledge still lies in the minds of its people, who have not had a real opportunity to share it.

India needs the easiest possible way for people to share this knowledge with millions. The country has over 12 official languages and 6,000 dialects, yet merely 12% of its population is well-versed in English. Since very few Indian apps and websites offer content in multiple languages, the masses have to depend on Google Translate to make use of the information available online in English. There, they often come across wrong translations and erratic sentences.

By the numbers, nearly 6% of the world’s population speaks English, while over 18% speaks at least one Indian language.

This means that speakers of Indian languages outnumber the world’s English speakers roughly three to one. Also, the non-English-speaking segment of India is almost double the population of the United States.

A user base this large deserves an internet that it can use and enjoy without relying on erratic translations on the web.

Service providers should also bear in mind that text is not the dominant format of consumption once they move beyond Tier 1 cities. Indians have an inclination for the visual medium. While the largest newspaper reaches just 2% of the country, over 25% of India’s population watches YouTube and television. Audio and video formats are friendlier.

With the next billion in mind, any vernacular product must cater to multiple preferences and multiple segments. A homogeneous offering will not survive here.

What do vernacular players lack?

Though Indian apps are rising rapidly, these social networks have shown no greater sensitivity in dealing with issues like abuse, misinformation campaigns, fake news, and hoaxes.

Another challenge is that these platforms let users operate the platform in a chosen language but do not present the “terms and conditions” in a language those users can understand. People can pick a preferred language on the landing page and create their account, but the link to the terms page always loads in English. Understanding the T&Cs is important because they educate users about copyright issues, the kind of content that can be posted, the responsibility for spreading inflammatory content, and so on.

The founders seem not to care about the negative outcomes that might arise from their products. Their sole aim appears to be becoming the de facto social network. They fail to understand that a social network is a beast, and that it is very difficult to rein in once a few bad actors get in and begin misusing it.

This is perhaps why Facebook and Twitter are embroiled in some controversy or other almost every week.

Founders must address ‘user education’ to establish responsible practices for using social media. When the menace spills over, they cannot escape by saying “we are just a platform” or dismiss incendiary content in the name of free speech.

Major challenge: Content moderation

Content moderation is of paramount importance for preventing, detecting, and reporting inflammatory material on a platform. But it is a daunting task on local-language platforms.

English-language platforms like Twitter and Facebook use AI to monitor posted content and to remove porn, spam, and fake accounts. Local-language platforms have no reliable data in their languages with which to develop such AI algorithms, so they fail miserably at detecting hate speech, provocative topics, and sensitive content.

Another approach is to employ content moderators, adept in local languages, who go through images and posts and classify them. But startups cannot afford manual filtering on their limited budgets. Monitoring requires huge manpower and money. Facebook is reported to have more than 7,500 people for content moderation alone.