Treatment for a tumor left a woman unable to speak. Artificial intelligence gave her voice back

Young American Alexis “Lexi” Bogan had an exuberant voice before undergoing life-saving and life-changing surgery.

She loved singing songs by Taylor Swift and Zach Bryan in the car. She was always laughing, even when picking up misbehaving preschoolers or discussing politics with friends. She was a soprano in the choir at school.

Then overnight that voice disappeared.

In August last year, doctors removed a tumor located near the back of her brain. When the breathing tube came out a month later, Bogan had trouble swallowing and had to force herself to say “hello” to her family.

Months of rehabilitation helped her recover, but her speech remains slurred. Friends, strangers, and her own family members have difficulty understanding what she is trying to tell them.

In April, the 21-year-old regained her old voice. It’s not the real thing but a voice clone, created by artificial intelligence (AI) technology from ChatGPT maker OpenAI, that can be summoned from a phone app.

Fatigue over AI deepfakes

Trained on a 15-second time capsule of her teenage voice, taken from a cooking demonstration video she recorded for a school project, her synthetic yet highly realistic-sounding AI voice can now say almost anything she wants.

She types a few words or sentences into her phone and the app instantly reads them out loud.

“Hi, can I please have a large iced brown sugar oat milk shaken espresso?” Bogan’s AI voice said as she held the phone out her car window at a Starbucks drive-thru.

Experts have warned that rapidly developing AI voice cloning technology could fuel phone scams, disrupt democratic elections and violate the dignity of people (living or dead) who have never consented to having their voices recreated to say things they have never spoken.

The technology has already been used to generate deepfake robocalls to New Hampshire voters that impersonated US President Joe Biden.

Authorities in the US state of Maryland recently accused a high school athletic director of using artificial intelligence to create a fake audio clip of the school principal making racist remarks.

But Bogan and a team of doctors at Rhode Island’s Lifespan hospital group believe they’ve found a use for it that justifies the risks.

Alexis Bogan answers a journalist’s question with an application that mimics her lost voice. -Josh Reynolds/AP

Recreating lost sounds

Bogan is one of the first people, and the only one with her condition, to recreate a lost voice with OpenAI’s new Voice Engine.

Some other AI providers, such as startup ElevenLabs, have tested similar technology for people with speech impediments and speech loss, including a lawyer who now uses a voice clone in the courtroom.

“As technology evolves, we hope Lexi will lead the way,” said Dr. Rohaid Ali, a neurosurgery resident at Brown University’s medical school and Rhode Island Hospital.

Millions of people with debilitating strokes, throat cancer or neurodegenerative diseases could benefit, he said.

“We must be aware of the risks, but we cannot forget the patient and social benefit,” said Fatima Mirza, another resident doctor working on the pilot project. “We are able to help give Lexi her true voice back and she can speak in the expressions that are truest to her.”

Mirza and Ali, who are married, caught the attention of ChatGPT maker OpenAI for their previous research project at Lifespan, which used the AI chatbot to simplify medical consent forms for patients.

The San Francisco company reached out to them earlier this year as it sought promising medical applications for its new AI voice generator.

Slow recovery

Bogan was still slowly recovering from surgery.

The illness began last summer with headaches, blurred vision and a droopy face, alarming doctors at Hasbro Children’s Hospital in Providence.

They discovered a vascular tumor the size of a golf ball that was pressing on the brainstem and tangled up in blood vessels and cranial nerves.

“It was a fight to control the bleeding and remove the tumor,” said pediatric neurosurgeon Dr. Konstantina Svokos.

Svokos said the location and severity of the tumor and the complexity of the 10-hour surgery damaged Bogan’s control over her tongue muscles and vocal cords, hindering her ability to eat and speak.

“When I lost my voice, it was almost like a piece of my identity was taken away,” Bogan said.

The feeding tube came out this year. Speech therapy continues and she is able to speak clearly in a quiet room, but there is no sign that she will regain the full clarity of her natural voice.

“At one point I was starting to forget what my voice sounded like,” Bogan said. “I’m so used to how my voice sounds now.”

‘Training’ on how to talk to AI

Whenever her phone rang at the family’s home in suburban North Smithfield, she would hand it to her mother to answer her calls.

She felt like she was a burden to her friends when they went to a noisy restaurant. Her father, who has hearing loss, had difficulty understanding her.

At the hospital, doctors were looking for a pilot patient to test OpenAI technology.

“The first person that came to Dr. Svokos’ mind was Lexi,” Ali said. “We reached out to Lexi to see if she would be interested, not knowing what her response would be. She was ready to try it and see how it would work.”

Bogan had to go back several years to find a suitable recording of her voice in order to “train” the AI system on how she spoke. It was a video explaining how to make pasta salad.

Her doctors deliberately fed the AI system only a 15-second clip. Cooking sounds made other parts of the video glitchy. It was also all that OpenAI needed, an improvement over previous technology that required much longer samples.

They also knew that getting something useful in 15 seconds could be vital for future patients who have no trace of their voice on the internet. A short voicemail left for a relative may be sufficient.

‘I get emotional every time I hear her voice’

Everyone was blown away by the quality of the voice clone when they tested it for the first time. Occasional glitches (a mispronounced word, a missing intonation) often went unnoticed.

In April, doctors equipped Bogan with a custom-made phone app that only she could use.

“I get so emotional every time I hear her voice,” her mother, Pamela Bogan, said with tears in her eyes.

“I think it’s great to be able to have that voice again,” Lexi Bogan said, adding that it helped “somewhat bring my confidence back to where it was before all this happened.”

She now uses the app about 40 times a day and sends feedback that she hopes will help future patients.

One of her first experiments was talking to children at the kindergarten where she worked as a teacher’s assistant.

She typed “ha ha ha ha”, expecting a robotic response. To her surprise, it sounded like her old laugh.

She used it to ask where to find items at Target and Marshalls. It helped her reconnect with her father. And it made it easier for her to order fast food.

Bogan’s doctors have begun cloning the voices of other willing Rhode Island patients and hope to bring the technology to hospitals around the world.

OpenAI said it was cautious about expanding use of the Voice Engine, which is not yet publicly available.

A number of small AI startups are already selling voice cloning services to entertainment studios or making them more widely available.

Most audio creation providers say they prohibit impersonation or abuse, but they vary in how they enforce their terms of use.

Alexis Bogan (center) and her mother Pamela Bogan (right) react to hearing a recreation of her lost voice from a prompt written by Dr. Fatima Mirza (left). -Josh Reynolds/AP

Wider access to voice cloning with artificial intelligence

“We want to make sure that everyone whose voice is used in the service is consistently consenting,” said Jeff Harris, product lead at OpenAI.

“We want to make sure it’s not used in political contexts. So we’ve taken the approach of being very limited in who we give the technology to.”

OpenAI’s next step involves developing a secure “voice authentication” tool so that users can only replicate their own voice, Harris said. But that could be “limiting for a patient like Lexi who suddenly loses their ability to speak,” he said.

“So we think we need to build high-trust relationships, especially with medical providers, to provide a little more unrestricted access to technology.”

Bogan impressed her doctors with her ideas about how the technology could help others with similar or more severe speech impediments.

“One of the things she was doing throughout this whole process was thinking of ways to fix this and change it,” Mirza said. “She has been a great inspiration to us.”

For now, Bogan has to fiddle with her phone to make the voice engine talk, but she dreams of an AI voice engine that improves on older treatments for speech recovery (like a robotic-sounding electrolarynx or a voice prosthesis) by merging with the human body or translating words in real time.

She is less certain about what will happen as she gets older and the AI continues to sound the way she did in her youth. Maybe the technology could “age” her AI voice, she said.

For now, she said, “Even though I don’t get my voice back completely, I have something that will help me find my voice again.”
