Think you know the ins and outs of Australian English? Amazon is hiring. “Amazon is seeking a linguist with an Australian background to join our data team,” reads a job description on its website. The job will involve transcribing and annotating speech and language data, and it is one of several openings for linguists, including those familiar with Canadian French, American English and Canadian English.
Though the job posting didn’t go into specifics, the work is very likely to involve research for Amazon’s voice-recognition device, the Amazon Echo, which helps users check the weather, stream music and plan to-do lists through a personal assistant named Alexa. The inability of some software to recognise accents is a well-documented frustration among people who speak English with a non-American accent.
A BBC Scotland comedy bit from 2010 involving two Scottish men trapped in a voice-activated elevator has received over a million views. (“Please remain calm,” the operator intones as the men begin screaming in frustration.)
The effort by the Seattle-based Amazon to hire more linguists with backgrounds outside American English may point to an acknowledgement that such systems must adapt to all types of voices if they are to appeal to multiple markets.
“If people like Google, Apple and Amazon are putting money into this stuff, localisation needs to be a part of it,” said Simon Musgrave, a lecturer at Monash University in Melbourne.
Current voice-recognition software, he explained, draws from a training database of sounds and uses statistical modelling to match audio to vowels, consonants and words.
“They’re going to need a body of recordings from Australians with good annotations” that connect audio to meaning, Musgrave said.
The Australian accent, for example, is distinct from American English in that it is non-rhotic: the “r” sound is not pronounced unless a vowel follows it. Australian English also has a lot of diphthongs and triphthongs, or “multiple vowels within the same space,” said Howard Manns, a lecturer in linguistics at Monash University. “And that might be difficult for a computer or voice recognition to make sense of,” Manns said.
The software is getting better. In May, Sundar Pichai, the chief executive of Google, announced that the company’s voice-recognition software had a word error rate of less than five per cent, an improvement on the 23 per cent error rate it had in 2013.
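For readers curious how a figure like that is calculated: word error rate is conventionally the number of word substitutions, deletions and insertions needed to turn the software's transcript into the correct one, divided by the number of words in the correct transcript. The Python sketch below shows that standard calculation. The function name and the sample sentences are made up for illustration, and nothing here is Google's own measurement code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    `reference` is the correct transcript; `hypothesis` is what the
    software heard. Both names are illustrative assumptions.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word was dropped
            insertion = curr[j - 1] + 1  # extra word was heard
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


# Invented example: one wrong word out of five gives a 20 per cent error rate.
rate = word_error_rate("turn on the kitchen light", "turn on the chicken light")
```

Because insertions count as errors, the rate can exceed 100 per cent when the software hears many words that were never spoken.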
Jobs that combine linguistics and technology are becoming more common.
For example, Appen, a company based in Australia that collects machine-learning data for technology companies, employs more than 70 linguists and has thousands of other linguists whom it can consult.
Linguists annotate recordings of people speaking, down to the pauses in their speech, and the annotated recordings are then fed into an algorithm that connects the audio with its meaning, said Phil Hall, the senior vice-president of language resources at Appen.