Amazon’s Alexa continues to learn new party tricks, with the latest being a “newscaster style” speaking voice that will be launching on enabled devices in a few weeks’ time.
You can listen to samples of the speaking style below, and the results, well, they speak for themselves. The voice can’t be mistaken for a human, but it does incorporate stresses into sentences in the same way you’d expect from a TV or radio newscaster. According to Amazon’s own surveys, users prefer it to Alexa’s regular speaking style when listening to articles (though getting news from smart speakers still has lots of other problems).
Amazon says the new speaking style is enabled by the company’s development of “neural text-to-speech” technology, or NTTS. This is the next generation of speech synthesis, which uses machine learning to generate expressive voices more quickly. Currently, Alexa uses concatenative speech synthesis, a method that’s been around for decades. This involves breaking up speech samples into distinct sounds (known as phonemes) and then stitching them back together to form new words and sentences.
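
To make the idea of concatenative synthesis concrete, here is a toy sketch in Python. It is not Amazon’s implementation: the unit inventory, the phoneme labels, and the sample values are all invented for illustration, and a production system would store real recorded waveforms and smooth the joins between units rather than simply appending them.

```python
from typing import Dict, List

# Hypothetical unit inventory: each phoneme label maps to a short list of
# audio samples. The labels and numbers are placeholders, not real recordings.
UNITS: Dict[str, List[float]] = {
    "HH": [0.1, 0.2, 0.1],
    "AY": [0.5, 0.6, 0.5, 0.4],
}


def synthesize(phonemes: List[str], units: Dict[str, List[float]]) -> List[float]:
    """Stitch stored units together in the order the phonemes are given."""
    waveform: List[float] = []
    for phoneme in phonemes:
        # Raises KeyError if a phoneme has no stored unit.
        waveform.extend(units[phoneme])
    return waveform


# "Hi" as a two-phoneme sequence (an assumed transcription for this sketch).
waveform = synthesize(["HH", "AY"], UNITS)
```

The resulting waveform is just the stored fragments laid end to end, which hints at why the approach can sound stilted: each fragment keeps the intonation it was recorded with, whatever sentence it ends up in.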