Reading, watching, and listening to the heated discourse on all this, I get worried, but not for the reasons that you may imagine, dear reader.
I get worried because our country, India, has seen social upheaval whenever a major technological wave has arrived. We are a country where a smart Bombay-based, London-educated lawyer saw a political opportunity when the indigo farms in the remote Champaran region of Bihar collapsed. He attributed their collapse to the British owners’ exploitation of their Indian workers, whereas the true reason was that German scientists, using the then young science of synthetic chemistry, had begun producing indigo at a fraction of the price of the Champaran crop. We all know the rest of this story: who this smart lawyer was and how he used this opportunity to drive the British out of India.
Whenever I bring this up in a discussion, I am told: “All that was in the India of 1917; we have moved on since then.” I then point out that the flag of our Independence movement for decades carried a charkha, celebrating the hand-spinning wheel used by Indian cotton spinners and opposing the spinning and weaving machines, even though these machines made cloth affordable for the citizens of Bharat. And that’s the reason I am a worried man nowadays.
The revolution under way today, commonly headlined “ChatGPT”, is a revolution in computer science that lets us synthetically create human-like answers to almost any kind of question. Its techniques, called “NLP”, for Natural Language Processing, extract patterns from textual data, much as chemists in the late 19th century extracted patterns from naturally occurring substances and thus created indigo synthetically.
But why should this worry us?
On my way to work every day in Bombay, I get a jolt that reminds me of another trait of our Indian culture. This happens when I pass the Shitala Devi Temple in Mahim.
For centuries in India, diseases like smallpox caused deaths of the order of 1.5 million every year. All we could do was pray to the ever cool and graceful goddess Shitala Devi to escape the terror of smallpox. Apart from the Shitala Devi Mandir at Mahim in Mumbai, there are numerous such temples all over northern and eastern India. What arrested the smallpox epidemics in India, however, was not these prayers but the scientific breakthrough called “vaccination”, developed at the end of the 18th century and deployed in mass campaigns in the 20th.
However, the Shitala Devi temples in India continue to prosper.
Medical solutions to cholera, plague, and a hundred other major illnesses were found when scientists systematically decoded the underlying patterns of what caused them.
What is under way in Natural Language Processing is a similar wave of experimentation and innovation. This experimentation has reached the production stage with a group of techniques called GPT, which stands for “Generative Pre-trained Transformer”. The key concept is “Pre-trained”. Here is a parallel example: when you, a person who holds a driving licence, sit down to drive a new model of car, you are using your “pre-trained” driving skills. In other words, learning how to drive has “pre-trained” you on when to press the accelerator, when to press the brake, when to change gear, and so on.
The computer science world has in the past two or three years learned to “pre-train” its algorithms to look at a sequence of words, detect a pattern, and anticipate the words most likely to follow. To learn these patterns, the algorithms have been fed, for example, all the content of Wikipedia and other collections on the internet. Thus, they have been trained to “generate” new content based on these patterns, just as, having learned to drive a car, you can generate suggestions in your mind even when you are driving in a foreign country. The word “Generative” in GPT comes from this.
ChatGPT, for example, which everyone is oohing and aahing about, has similarly “learnt” (as you learned driving) the patterns in human language, common word sequences, and patterns of word usage from internet data such as Wikipedia.
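For readers who like to see an idea in code, here is a deliberately tiny sketch of “look at a sequence of words and anticipate what follows”. It is not how ChatGPT works internally: a real GPT uses a large neural network and looks at long stretches of text, whereas this toy merely counts which word most often follows another. The four-sentence corpus and the function names (`train_bigrams`, `predict_next`, `generate`) are made up purely for illustration.

```python
from collections import Counter, defaultdict

# A made-up, four-sentence "corpus" standing in for Wikipedia-scale text.
CORPUS = (
    "the farmer grows indigo . "
    "the farmer grows cotton . "
    "the farmer grows indigo . "
    "the weaver spins cotton ."
)


def train_bigrams(text):
    """'Pre-training': count, for every word, which words follow it."""
    follows = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    if word not in follows:
        return None
    return follows[word].most_common(1)[0][0]


def generate(follows, start, max_words=10):
    """'Generation': keep appending the likeliest next word.

    Stops at a full stop, at an unseen word, or after max_words words.
    """
    out = [start]
    while len(out) < max_words:
        nxt = predict_next(follows, out[-1])
        if nxt is None or nxt == ".":
            break
        out.append(nxt)
    return " ".join(out)


if __name__ == "__main__":
    model = train_bigrams(CORPUS)
    print(predict_next(model, "grows"))  # the follower seen most often
    print(generate(model, "the"))
```

Because “indigo” follows “grows” twice in this corpus and “cotton” only once, the toy model completes “the farmer grows” with “indigo”. It has learned nothing about farming; it has only learned what the data it was fed tends to say.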
The benefits and costs of GPTs will be similar to the benefits and costs of spinning and weaving machines, and of vaccines and gene therapy. Just as spinning and weaving machines made cloth affordable to the common man of Bharat and vaccines freed him from disease, GPTs will make things like education, legal services, health services, and even computer programming services affordable and of benefit to the common man of Bharat. But it could also mean lower incomes and fewer job opportunities for certain professions like teachers, lawyers, doctors, and, hold your breath, computer programmers.
If there is anything we need to worry about during the NLP revolution that we are living through right now, it is this: by the very nature of the way chatbots and similar systems are built, they can come embedded with lots of biases learned from the data they are trained on. These could be racial biases in, for example, crime prediction, gender discrimination in recruitment, and, in the Indian case specifically, caste biases. If unchecked, these could create social upheavals and lead to the kind of anti-technology movements that we have seen in the past in India.