New language translation tool unveiled at the 20th International WWW conference
Former President APJ Abdul Kalam today launched machine translation (MT) systems, also called content multiplier tools, at the 20th International World Wide Web conference in Hyderabad. The systems were developed with funding from the Technology Development for Indian Languages (TDIL) programme.
Based on the Computational Paninian grammar (CPG), which works very well for free word-order languages, Indian languages in particular, the tools are available in three modules — Sampark (Indian to Indian), AnglaMT (English to Bengali, Malayalam, Punjabi and Urdu) and Anvadaksh (English to Hindi, Bengali, Marathi, Oriya, Urdu and Tamil).
“India has more than 122 languages of which 22 are official. More than a billion people all over the world speak either Hindi, Bengali, Telugu, Marathi, Tamil or Urdu. With the availability of e-content and development of language technology, it is now possible to overcome the language barrier,” Rajeev Sangal, director of IIIT-Hyderabad (one of the 17 institutions that participated in the development of the tools), told mediapersons.
Sangal said three consortia comprising 17 academic and research institutions were involved in building systems for 26 language pairs. Twelve pairs are available now, and the plan is to release more every three to four months, he added.
“An amount of Rs 13 crore went into the whole exercise and about 200 students were directly involved in the development of these tools. An additional 200 students worked on the project for their theses and their algorithms were embedded directly into the systems,” Sangal said.
The MT quality, Sangal said, is better for translation between Indian languages because they are similar in many ways, in both grammar and vocabulary. Translation between English and Indian languages is harder, so the output quality is likely to be lower.
“We are now asking users to try using the tool as an experiment. Our only focus now is on improving quality,” he said.
Stating that the MT technology is currently freely accessible, Sangal said the MT initiative would be monetised when special needs for further research arise.
“Company-based efforts can also be initiated. We are already working with translation houses and publishers,” he said, adding that students at IIT-Hyderabad were working on a text-to-speech system and that at some point the two technologies (text-to-speech and MT) would be combined.