Along with questions of disclosure and consent, the incident reignited the debate over voice cloning technology, not just in entertainment, but also in politics and in a business sector focused on turning text into realistic speech.
By Matt O’Brien and Barbara Ortutay / AP
The revelation that a documentary filmmaker used voice cloning software to make the late chef Anthony Bourdain say words he never spoke has drawn criticism and raised ethical concerns about the use of the powerful technology.
The movie Roadrunner: A Film About Anthony Bourdain hit theaters on Friday and mainly features live footage of the beloved celebrity chef and globetrotting TV host before his death in 2018.
However, its director, Morgan Neville, told The New Yorker that a dialogue snippet was created using artificial intelligence (AI) technology.
This has reignited a debate about the future of voice cloning technology, not only in the entertainment world, but also in politics and a rapidly growing business sector dedicated to transforming text into realistic human speech.
“Unapproved voice cloning is a slippery slope,” Andrew Mason, founder and CEO of voice generator Descript Inc, wrote in a blog post on Friday. “As soon as you step into a world where you make a subjective judgment on whether specific cases can be ethical, it won’t be long before something happens.”
Prior to this week, most of the public controversy around these technologies focused on the creation of hard-to-detect deepfakes using audio and / or video simulations and their potential to fuel disinformation and political conflict.
Still, Mason, who previously founded and led Groupon, said in an interview that Descript has repeatedly rejected requests to bring back a voice, including from “people who have lost someone and are grieving.”
“It’s not even that we want to pass judgment,” he said. “We’re just saying you have to have clear lines in what’s right and what’s wrong.”
Angry and uncomfortable reactions to voice cloning in the Bourdain case reflect expectations and issues with disclosure and consent, said Sam Gregory, program director at Witness, a nonprofit working on the use of video technology for human rights.
Obtaining consent and disclosing the technology at work would have been appropriate, he said.
Instead, viewers were stunned – first by the fact of the fake audio, then by the director’s apparent rejection of any ethical issues – and expressed their displeasure online.
“It also touches on our fears of death and our ideas of how people could take control of our digital likeness and make us say or do things with no way to stop it,” Gregory said.
Neville did not identify the tool he used to recreate Bourdain’s voice, but said he used it for a few sentences Bourdain wrote but never said out loud.
“With the blessing of his estate and literary agent, we used AI technology,” Neville said in a written statement. “It was a modern storytelling technique that I used in a few places where I felt it was important to bring Tony’s words to life.”
Neville also told GQ magazine that he got the approval of Bourdain’s widow and literary executor.
“I was definitely NOT the one who said Tony would have been cool with this,” the chef’s widow, Ottavia Busia, wrote on Twitter.
Although tech giants such as Microsoft Corp, Alphabet Inc’s Google and Amazon.com Inc have dominated text-to-speech research, there are now also a number of start-ups such as Descript that offer voice cloning software. Uses range from talking customer service chatbots to video games and podcasts.
Many of these voice cloning companies feature an ethics policy on their website that explains the terms of service. Of nearly a dozen companies contacted by The Associated Press, many said they had not recreated Bourdain’s voice and would not have done so if asked.
Others did not respond.
“We have pretty strict policies regarding what can be done on our platform,” said Zohaib Ahmed, founder and CEO of Resemble AI, a Toronto-based company that sells a personalized AI voice generator service. “When you create a voice clone, it requires the consent of whoever’s voice it is.”
Ahmed said the rare occasions he allowed posthumous voice cloning were for academic research, including a project working with the voice of former British Prime Minister Winston Churchill, who died in 1965.
Ahmed said a more common commercial use is to edit a TV commercial recorded by real voice actors and then customize it for a region by adding a local reference. It is also used to dub animated films and other videos, taking a voice in one language and having it speak a different language, he said.
He compared it to past innovations in the entertainment industry, from stuntmen to green screen technology.
A few seconds or minutes of recorded human speech can help teach an AI system to generate its own synthetic speech, although getting it to capture the clarity and rhythm of Anthony Bourdain’s voice probably took a lot more training, said Rupal Patel, a professor at Northeastern University who is the founder and CEO of another voice generation company, VocaliD Inc, which focuses on customer service chatbots.
“If you wanted it to really speak like him, you would need a lot, maybe 90 minutes of good, clean data,” she said. “You are building an algorithm that learns to speak like Bourdain spoke.”
Neville is an acclaimed documentary filmmaker who also directed the Fred Rogers documentary Won’t You Be My Neighbor? and the Oscar-winning film 20 Feet From Stardom.
He began making his latest film in 2019, more than a year after Bourdain’s death by suicide.