Scientists have developed a system that translates brain activity directly into speech, offering a vast improvement in communication for people who have lost the ability to speak due to conditions such as throat cancer, Parkinson’s disease, and ALS, according to The Guardian.
Currently, many of these patients rely on speech synthesizers that spell out words one letter at a time, driven by eye or facial-muscle movements. These methods allow speech at a rate of around 8 words per minute, compared with as many as 150 in natural speech.
The late physicist Stephen Hawking, who suffered from ALS, used a variety of such systems over the course of his life.
The research was published Wednesday in the journal Nature.
Edward Chang, senior author of the study and professor of neurological surgery at the University of California, San Francisco (UCSF), said:
“For the first time…we can generate entire spoken sentences based on an individual’s brain activity. This is an exhilarating proof of principle that, with technology that is already within reach, we should be able to build a device that is clinically viable in patients with speech loss.”
Unlike past attempts, which focused on how the sounds of speech are represented in the brain, the new research concentrates on the brain signals that move the tongue, lips, jaw, and throat to produce speech.
“We reasoned that if these speech centers in the brain are encoding movements rather than sounds, we should try to do the same in decoding those signals,” said the paper’s first author, Gopala Anumanchipalli, a speech scientist at UCSF.
The researchers temporarily implanted electrodes in the brains of patients who were undergoing neurosurgery for epilepsy. The patients read hundreds of sentences aloud while the researchers recorded activity from brain regions involved in speech. The team tracked how the brain’s signals translated into vocal movements, then drew on previously compiled data showing how vocal movements produce speech sounds. Using a machine learning algorithm, the researchers linked patterns of brain signals to vocal movements.
The technology, which the researchers call a “virtual vocal tract,” synthesizes speech under direct control of the brain. First, electrodes pick up signals from the brain. These are decoded into approximations of vocal movements, and those movements are then converted into synthesized speech.
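The two-stage pipeline described above — brain signals decoded into vocal-tract movements, then movements into sound — can be sketched in a few lines of Python. This is only an illustrative toy: the dimensions and data are made up, and a plain least-squares fit stands in for the study’s machine learning decoder, which was far more sophisticated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not from the study): 5000 time steps of
# 256-channel neural recordings, 33 articulatory (vocal-tract) features,
# and 32 acoustic features describing the speech sounds.
T, N_NEURAL, N_ARTIC, N_ACOUSTIC = 5000, 256, 33, 32

# Synthetic training data standing in for the recorded sessions:
# neural activity, the corresponding vocal-tract movements, and the
# acoustic features of the sentences the patients read aloud.
neural = rng.normal(size=(T, N_NEURAL))
artic = neural @ rng.normal(size=(N_NEURAL, N_ARTIC)) * 0.1
acoustic = artic @ rng.normal(size=(N_ARTIC, N_ACOUSTIC)) * 0.1

# Stage 1: learn the mapping from brain signals to vocal movements
# (least squares as a stand-in for the study's learned decoder).
W1, *_ = np.linalg.lstsq(neural, artic, rcond=None)

# Stage 2: learn the mapping from vocal movements to speech sounds,
# analogous to the previously compiled articulation-to-sound data.
W2, *_ = np.linalg.lstsq(artic, acoustic, rcond=None)

# Inference chains the two stages: brain activity in, sound features out.
decoded = (neural @ W1) @ W2
err = float(np.mean((decoded - acoustic) ** 2))
print(f"mean squared reconstruction error: {err:.6f}")
```

The key design point the toy preserves is the indirection: rather than decoding sounds from the brain in one step, the system decodes movements first and only then turns movements into sound, mirroring the researchers’ reasoning that the speech centers encode movements rather than sounds.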
Listeners transcribed the synthetic speech perfectly 43 percent of the time, with some sounds more reliably intelligible than others. The scientists note that people become familiar with the unique speech patterns of others over time, so imperfect clarity would likely not be a barrier to communication in the long run.
Next, researchers will determine whether the system can be used by people who can’t speak, and therefore can’t train the system using their own voice. If successful, the system could revolutionize speech synthesis.