The iPhone 4S with the Siri voice command system has been out for a while. As expected, there are now several examples available of how well it works with real people using real accents. Here is one with a Scottish accent.
And CNET did a methodical test with several East Asian accents.
Given the technical challenges of voice recognition, I am not surprised that it has some trouble. I am impressed that it works as well as it does. For example, with a standard accent, the recognition is clearly and substantially better, as shown in this video from CNET.
The difference in how well Siri understands various accents is most obvious in this almost painful video. Siri specifically asks the user to choose between “home” and “work”, but over and over replies that it cannot understand the Japanese-accented speaker’s answer, eventually coming up with a profane interpretation of the user’s attempt to say “work”. Or perhaps Siri, after a number of failed attempts, begins to presume that the user is swearing at it! (Warning: the clip contains profanity.)
Here is the ethical dilemma. We used to send schoolchildren to speech therapists to get rid of their ‘funny’ accents. In business, somebody for whom saying ‘work’ sounded a little too much like ‘wok’ was branded as ‘doesn’t communicate well’, even though they were understood well enough. Accent elimination tutoring was common for both children and adults because ‘speaking like an American’ was considered a prerequisite for success. Now this practice is considered the language equivalent of forcing left-handed children to write with their right hand. No legitimate speech therapist would ever recommend accent elimination classes, and they are vigilant never to misdiagnose an accent as a problem requiring treatment. Even when therapists are teaching people to speak again after brain trauma, they are expected to retrain them with their culturally unique features intact.
Now there is Siri. One person gets to just say ‘send a message to my mom saying I’ll be home late’ while another shouts ‘Work’ over and over again trying to be understood. If it were a person treating these two people so differently, for example in a retail store, would we tolerate it? Is it acceptable if it is one of our machines discriminating against those who do not conform to the accepted standard? It is not an excuse to say that Siri is only sold to customers in the US, England, and Australia. I hear the accents in the videos (plus many others) daily in Los Angeles, and residents of most other large US cities would as well.
Are we going to see a resurgence in accent elimination classes so people can use their phone as well as ‘real Americans’?
Siri has been compared to ‘the computers on Star Trek’. In the original series, the computer understood everybody better than the people did, but in the 2009 version J. J. Abrams had a little fun with Chekov:
But consider this. If people had been having this kind of problem for three centuries, would there still be Russian-accented English?
I’m not sure, given the best of current technology, where the answer lies. But contrary to what some of the Siri-defending commenters on these videos say, the answer is not “They just need to learn to speak clearer.” If we let technology force us all to speak alike, then we are allowing technology to take from us an important part of our human diversity. Technology does not deserve it.