“Creepy.” “Morbid.” “Monstrosity.” Those were just some of the reactions that poured in over social media when Amazon.com Inc.’s Alexa digital assistant impersonated a grandmother reading an excerpt from “The Wonderful Wizard of Oz.”
It all started innocently enough, with Alexa chief scientist Rohit Prasad trying to demonstrate the digital assistant’s humanlike mien during a company presentation Wednesday. Prasad said he’d been surprised by the companionable relationship users develop with Alexa and wanted to explore that. Human characteristics like “empathy and affect” are key for building trust with people, he added.
“These attributes have become more important in these times of the ongoing pandemic, when so many of us have lost someone we love,” he said. “While AI can’t eliminate that pain of loss, it can definitely make their memories last.”
The presentation left the impression that Amazon was pitching the service as a tool for digitally raising the dead. Prasad walked that back a bit in a subsequent interview on the sidelines of Amazon’s re:MARS technology conference in Las Vegas, saying the service wasn’t primarily designed to simulate the voices of dead people.
“It’s not about people who aren’t with you anymore,” he said. “But it’s about, your grandma, if you want your kid to listen to grandma’s voice you can do that, if she is not available. Personally I would want that.”
As the presentation ricocheted around the web, the creep factor dominated the discourse. But more serious concerns emerged as well. One was the potential for deploying the technology to create deepfakes, in this case using a legitimate recording to make it sound as though a person said something they never actually said.
Siwei Lyu, a University at Buffalo professor of computer science and engineering whose research involves deepfakes and digital media forensics, said he was concerned about the development.
“There are certainly benefits of voice conversion technologies to be developed by Amazon, but we should be aware of the potential misuses,” he said. “For instance, a predator can masquerade as a family member or a friend in a phone call to lure unaware victims, and a falsified audio recording of a high-level executive commenting on her company’s financial situation could send the stock market awry.”
While Amazon didn’t say when the new Alexa feature would be rolled out, similar technology could eventually make such mischief a lot easier. Prasad said Amazon had learned to simulate a voice based on less than a minute of that person’s speech. Pulling that off previously required hours in a studio.