The latest iteration of Amazon’s Alexa voice assistant could sound eerily familiar.
The company announced on Wednesday during its annual re:MARS conference, which focuses on artificial intelligence innovation, that it’s working on an update to its Alexa system that would allow the technology to mimic any voice, even a deceased family member.
In a video shown on stage, Amazon (AMZN) demonstrated how, instead of Alexa’s signature voice reading a story to a young boy, it was his grandmother’s voice.
Rohit Prasad, an Amazon senior vice president, said the updated system will be able to collect enough voice data from less than a minute of audio to make personalization like this possible, rather than having someone spend hours in a recording studio, as was done in the past. Prasad did not elaborate on when this feature could launch. Amazon declined to comment on a timeline.
The concept stems from Amazon looking at new ways to add more “human attributes” to artificial intelligence, especially “in these times of the ongoing pandemic, when so many of us have lost someone we love,” Prasad said. “While AI can’t eliminate that pain of loss, it can definitely make their memories last.”
Amazon has long used recognizable voices, such as the real voices of Samuel L. Jackson, Melissa McCarthy and Shaquille O’Neal, to voice Alexa. But AI recreations of people’s voices have also steadily improved over the past few years, particularly with the use of deepfake technology. For example, three lines in the Anthony Bourdain documentary “Roadrunner” were generated by AI, even though they sounded as if they were spoken by the late media personality. (This particular case raised a stir because it was not made clear in the movie that the dialog was AI-generated and had not been approved by Bourdain’s estate.) “We can have a documentary-ethics panel about it later,” director Morgan Neville told The New Yorker when the film debuted last year.
More recently, actor Val Kilmer, who lost his voice to throat cancer, partnered with startup Sonantic to create an AI-driven speaking voice for him in the new “Top Gun: Maverick” film. The company used archival audio footage of Kilmer to teach an algorithm how to speak like the actor, according to Variety.
Adam Wright, a senior analyst at IDC Research, said he sees the value in Amazon’s effort.
“I think Amazon is interested in doing this because they have the capability and technology, and they are always searching for ways to elevate the smart assistant and smart home experience,” Wright said. “Whether it drives a deeper connection with Alexa, or just becomes a skill that some folks dabble with from time to time remains to be seen.”
Amazon’s foray into personalized Alexa voices may struggle most with the uncanny valley effect: a recreated voice that is very similar to a loved one’s but isn’t quite right can lead listeners to reject it.
“There are certainly some risks, such as if the voice and resulting AI interactions doesn’t match well with the loved ones’ memories of that individual,” said Michael Inouye of ABI Research. “For some, they will view this as creepy or outright terrible, but for others it could be viewed in a more profound way such as the example given by allowing a child to hear their grandparent’s voice, perhaps for the first time and in a way that isn’t a strict recording from the past.”
He believes, however, the varying reactions to announcements like this speak to how society will have to adjust to the promise of innovations and their eventual reality in the years ahead.
“We’ll definitely see more of these types of experiments and trials — and at least until we get a higher comfort level or these things become more mainstream, there will still be a wide range of responses,” he said.