Microsoft has built a new artificial intelligence (AI) system that can perform text-to-speech in an 'almost' unsupervised setting.
The system, which combines text-to-speech and automatic speech recognition (ASR), was trained on only 200 paired speech and text samples, about 20 minutes of transcribed audio, to generate realistic speech, the research paper explained.
The method achieved a word-level intelligibility rate of 99.84 per cent, paving the way for more accessible text-to-speech systems.