Google has recently announced a number of updates to the technologies that underpin its Contact Center AI solutions, in particular Automatic Speech Recognition (ASR).
ASR goes under several names, including computer transcription, digital dictation and, quite wrongly, voice recognition. Speech recognition is all about what someone is saying, whereas voice recognition is about who is saying it. The distinction has been further confused by the launch of products such as Apple’s Siri and Amazon’s Alexa, which are described as voice assistants. However, we shall stick to the term ASR.
What’s the point of ASR?
The applications for ASR are potentially massive – there are many situations where it can provide either significant financial or ergonomic benefit. With computers costing far less than people, it’s easy to see why ASR might be a cost-effective way to input data. For a start, most people can speak three times faster than they can type, so in any job involving text input the gains come chiefly from saving the time of salaried employees. A lawyer might charge his or her time out at hundreds of pounds per hour, and a good proportion of billed time will be taken up writing documentation. Lawyers are rarely fast and accurate typists, so dictating to a computer that costs perhaps £2.00 per hour to run provides an obvious benefit. Increasingly, companies of all sizes are finding that ASR helps enormously with productivity.
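The arithmetic above can be sketched as a quick back-of-envelope calculation. The 3x speaking-vs-typing speed and the roughly £2.00/hour machine cost come from the figures mentioned here; the £300/hour billing rate and two hours of daily document writing are purely illustrative assumptions, not figures from the article.

```python
# Back-of-envelope: dictation vs. typing for a fee-earning lawyer.
# SPEEDUP and ASR_COST_PER_HOUR reflect figures quoted in the text;
# BILLING_RATE and TYPING_HOURS are illustrative assumptions only.

BILLING_RATE = 300.0      # £/hour billed (assumed)
TYPING_HOURS = 2.0        # hours/day spent writing documents (assumed)
SPEEDUP = 3.0             # speaking is ~3x faster than typing
ASR_COST_PER_HOUR = 2.0   # running cost of the ASR system, £/hour

dictation_hours = TYPING_HOURS / SPEEDUP          # same output, less time
time_saved = TYPING_HOURS - dictation_hours       # hours freed per day
asr_cost = dictation_hours * ASR_COST_PER_HOUR    # machine running cost
net_daily_saving = time_saved * BILLING_RATE - asr_cost

print(f"Hours freed per day: {time_saved:.2f}")
print(f"Net daily saving:    £{net_daily_saving:.2f}")
```

Under these assumptions the lawyer frees up about an hour and twenty minutes of billable time a day for a machine cost of just over a pound, which is the kind of margin that makes the business case straightforward.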
But there are also situations where speech is the only available control medium. This can range from systems that require input from users with disabilities, to people who are already mentally overloaded, such as fighter pilots.
ASR isn’t easy
But recognising speech is not easy. As with many other Artificial Intelligence (AI) problems, some researchers have tackled ASR by studying the way that humans speak and understand speech. Others have tried to solve the problem by
