Rohini Srivathsa, National Technology Officer, Microsoft India, is responsible for strategic initiatives to accelerate digital transformation across industries and government. In an interview with Shivani Shinde, she talks about how artificial intelligence (AI) is impacting language translation, how Microsoft is reducing the complexity of Indian languages for machines, and the need to get serious about responsible AI. Edited excerpts:
What are some of the breakthroughs in AI, in both the Indian and global contexts?
AI in language. Natural language processing, especially, is one space that has seen an explosion of research breakthroughs and has been gaining traction. I would add to that very large-scale language models of the GPT-3 kind, on which Microsoft also has a partnership with OpenAI, because these language models have the capability to even synthesise, or create, text from a given corpus. So it’s not just about translation, but also about knowledge mining, cognitive search, and the ability to create very different levels of interaction with knowledge.
The second big breakthrough is responsible AI. This is important because it goes specifically into topics around privacy, confidential computing, and fairness, and that needs a lot of schooling as well as governance and principles to be able to put it into practice.
The third is autonomous systems. It is in the early stages but it is gaining traction. In India, some use-cases for it could be in the environmental disaster area.
Mark Zuckerberg recently said, while talking about AI for Metaverse, that to drive inclusion, it is crucial to break down language barriers to enable people to access the internet in their own language. How is Microsoft’s work with Indic languages shaping up?
We have been working for quite a long time on Indian languages. Windows 10 and 11 support 22 constitutionally recognised languages. Similarly, Microsoft Teams is localised in eight Indian languages on mobile and on PC, and LinkedIn has been localised in Hindi. But the idea is going beyond that.
Translation is just one of the many elements. There are other elements of language, like transliteration, speech-to-text and text-to-speech. For instance, in India, the use of mobile devices is high even among those who are not literate, so you cannot assume people are literate if you want to be inclusive. You have to be able to offer access to knowledge and information opportunities to everyone.
At present, Microsoft Translator supports 12 Indian languages. Assamese was the latest to be added. This capability allows us to translate conversations, captions, even menus and street signs, which also means the ability to translate content. When it comes to speech, we are supporting some seven Indic languages for speech-to-text and six for text-to-speech. What that means is that if I am able to query a device through voice, and get that information back as a synthesised voice using text-to-speech, then I have a different level of inclusion.
The other big focus area for us is Indian sign language. Under the initiative AI for Bharat, Microsoft Research is collaborating with IIT Madras because, again, Indian sign language is not the same as the sign languages followed globally; it has nuances that other languages do not have. We are trying to ensure that the disability divide is addressed in sign languages, too.
Because of the nuances in Indian languages, creating the right tools or experience is still flawed. How far along is Microsoft in its journey to solve these challenges?
There are a few aspects to this. One is our own first-party products, like Windows 11 and Teams, where the technology is being developed and then integrated into our products.
The other capability is making these services available as tools to developers, which can then be used to create or build applications. For example, call centre analytics are using these tools. We use a lot of Hinglish in our conversation in India, so how can the agent or the manager understand whether the agent is using all the right salutations and all the right checks and balances in terms of managing the customer experience in real time?
A very interesting area where research is happening globally, and this may come to India as well in future, is transfer learning. This means one can take what a model has learnt from one language that has a lot of data and apply it to low-resource languages -- meaning those where the training data is not as large. Many Indian languages have a similar internal structure. Research is ongoing, and this can be useful in many languages that are spoken by few people.
For us, the focus has been to ensure that we are capturing the main languages and making sure that the fidelity and accuracy of the models are maintained, because the more users use this, the better the feedback.
What are some of the scale-level AI language applications used in the Indian context?
The chatbot used by the Indian Railways is an amazing instance of scale. They use Azure but have developed their own system as well. Similarly, when Covid happened, the government came up with the Saathi App for access to information. India’s Digital India Corporation MyGov used AI and cognitive services to provide information in multiple languages. I think such instances are going to keep increasing. With more and more citizen-driven initiatives coming on board, such services will just increase. I think the potential is huge.
How significant is language going to be when one talks about the Digital India initiative?
The Digital India initiative envisages that citizens at the zilla level also have access to all services. So, access to these services in the local language will be crucial, which is where NLP also comes in. We are committed to it, and that’s why our own research efforts and product development are moving to make sure that we contribute to the government’s efforts.
Where do you see AI in language evolving?
We’ve come a long way, especially in the last few years, with the explosion of data and some of the deep-learning models that have emerged. With the work we are doing and the Indic languages that we are supporting, we have covered at least 90 per cent of the population. Now it’s a question of adopting and integrating these capabilities into more and more services, products and solutions. This is where we are working with the startup community and small and medium businesses. We will keep adding capabilities on the platform, such as dialects, sign language and emotion.
What are some of the top tech innovations at Microsoft India, and the top challenges you see?
Among the top transformative innovations, one is definitely the work on languages. You will continue to see us make inroads. The second area is health care, especially AI for health care. The third area is the use of AI by some startups in sustainability. Use of AI for water conservation and sustainable farming is very promising.
In terms of challenge, it will be the overall quality and scale of skilling. AI is an emerging and fast-moving technology, and hence different from our earlier IT revolution.
There is a lot of discussion about trust, ethical use of AI and the need to make sure AI systems do not create biases. How do you look at this issue?
First, the AI system appears biased because it uses data based on earlier decisions by people who were biased. AI is not magic; it is an algorithm that learns from data that has human biases. This is important to grasp, because it makes us ask how biased or unbiased we are as an organisation and a society in our allocation of resources, our hiring processes, our education processes, our planning processes, our health care processes.
Now, of course we have tools, open-source tools which can ascertain the level of bias in an algorithm. But the first step is to acknowledge that there is bias in the decision process in the first place.
So, how do we remove these biases? First, by creating awareness. Second, through tools, guidelines, and frameworks that we constantly research and put out there. Many of these toolkits are now open source and available on platforms like GitHub.
Third, by testing -- testing systems to check the level of bias. The last is by working with policymakers. For the last two-and-a-half years, we have been working with the NITI Aayog to make sure that we bring global best practices into how India is shaping its thinking and policy environment more responsibly.
The other aspect of responsible AI is also privacy. Please comment.
From Microsoft’s perspective, privacy is a fundamental principle that we work with, and it comes through how we look at our customers’ data. In the Indian context, as the policy and regulatory environment is evolving on privacy, we are clearly engaging with the government. We are preparing our technology stack for a world where privacy will become fundamental, because it is becoming important globally. There’s much happening in that space.
We see data becoming an integral part of the economy, and data-driven products and services becoming more prevalent. Confidential computing and certain areas around data governance are big investment topics of both research and product development for us at a global level, because we believe that the value of data is in unlocking it. So, we can have a General Data Protection Regulation which talks about protection, but that is not the same as unlocking the value of data.