With a new, non-invasive system, researchers have succeeded in translating the thoughts of test subjects into language, at least in broad strokes. The subjects' brain activity was recorded with fMRI while they listened to stories. After appropriate training, the language AI GPT-1, a forerunner of the current ChatGPT, generated coherent text from the brain scans alone. Although this text did not correspond word for word to the stories heard, it clearly reflected their meaning. The technology is still far from everyday use, but the researchers hope that the system can one day help people with locked-in syndrome to communicate again.
For decades, researchers have been working on brain-computer interfaces intended to convert the brain activity of paralyzed people directly into speech or movement signals. The most successful systems so far are invasive: electrodes implanted in the subjects' brains record brain activity with high spatial and temporal resolution. However, electrodes cannot yet be implanted at all relevant sites in the brain, and the invasive procedure itself poses a considerable risk. Non-invasive systems, on the other hand, such as those based on EEG measurements of brain waves on the scalp, have the disadvantage of being very imprecise and can so far only recognize individual words or phrases.
Evaluation by GPT
A team led by Jerry Tang from the University of Texas at Austin has now tested a new system on three subjects that requires no implanted electrodes and is still able to convert at least the gist of thoughts into continuous text. To do this, the researchers combined functional magnetic resonance imaging (fMRI), which shows blood flow and thus activity in the brain, with artificial intelligence for decoding speech.
In the preparation phase, the subjects listened to stories for a total of 16 hours while their brain activity was recorded in the MRI scanner. With this data, Tang and his team trained the software GPT-1, the forerunner of the chatbot ChatGPT. The aim was not to read thoughts word for word, but to capture their meaning and translate it into language. And indeed: when the subjects listened to a previously unused story in the actual experiment, GPT was able to reconstruct, from the brain scans alone, a story that recognizably resembled the one actually heard.
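To make the approach more tangible, here is a minimal Python sketch of how such a decoder can work in principle: a language model proposes candidate word sequences, a trained encoding model predicts the brain response each candidate should evoke, and a beam search keeps the candidates whose predicted responses best match the measured scan. Every name and value below (the embedding, the proposal function, the random stand-in data) is hypothetical and illustrates the idea only; it is not the authors' code.

    # Minimal sketch of encoding-model-guided decoding (hypothetical, not the study's code).
    import numpy as np

    rng = np.random.default_rng(0)
    N_VOXELS, EMB_DIM, BEAM = 500, 64, 5

    W = rng.normal(size=(EMB_DIM, N_VOXELS))   # stand-in for a trained encoding model
    measured = rng.normal(size=N_VOXELS)       # stand-in for a recorded fMRI response

    def embed(text: str) -> np.ndarray:
        """Hypothetical text embedding; a real system would use GPT features."""
        vec = np.zeros(EMB_DIM)
        for i, ch in enumerate(text.encode()):
            vec[i % EMB_DIM] += ch
        return vec / (np.linalg.norm(vec) + 1e-9)

    def propose_continuations(prefix: str) -> list[str]:
        """Hypothetical language-model step; GPT-1 would rank likely next words here."""
        return [prefix + w for w in (" she", " had", " not", " started", " driving")]

    def score(candidate: str) -> float:
        """Higher = predicted brain response is closer to the measured one."""
        predicted = embed(candidate) @ W
        return float(np.dot(predicted, measured) /
                     (np.linalg.norm(predicted) * np.linalg.norm(measured) + 1e-9))

    beams = [""]
    for _ in range(6):                         # decode six words via beam search
        candidates = [c for b in beams for c in propose_continuations(b)]
        beams = sorted(candidates, key=score, reverse=True)[:BEAM]

    print(beams[0])                            # best-matching word sequence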
Promising but error-prone
"For a non-invasive method, this is a real leap forward compared to what has been done previously, which is typically single words or short sentences," says Tang's colleague Alexander Huth. "We get the model to decode continuous speech over longer periods of time with complicated ideas." In some cases, the sentences decoded from the brain data were surprisingly close to what was actually heard. From "I don't have a driver's license yet" the system made "She hadn't started learning to drive yet".
In many cases, however, the translation missed the original meaning. It became even more imprecise when the subjects did not hear the story but merely imagined it, or when they watched an animated silent film. "The decoder was successful in that many of the decoded phrases for new, untrained stories contained words from the original text, or at least had a similar meaning," explains Maastricht University neuroscientist Rainer Goebel, who was not involved in the study. "But there were also quite a few errors, which is very bad for a full-fledged BCI, since for critical applications, such as communication with locked-in patients, it is particularly important not to generate false statements."
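As a toy illustration of the kind of comparison Goebel describes, one can check how many words a decoded phrase shares with the original, as in the short Python snippet below. The word-overlap measure used here is a deliberately simple stand-in; the study evaluated similarity with more sophisticated language-based metrics.

    # Compare a decoded phrase with the original heard sentence (illustrative only).
    original = "i don't have a driver's license yet"
    decoded = "she hadn't started learning to drive yet"

    orig_words = set(original.split())
    dec_words = set(decoded.split())

    overlap = orig_words & dec_words              # words both sentences share
    jaccard = len(overlap) / len(orig_words | dec_words)

    print(f"shared words: {overlap}")             # {'yet'}
    print(f"word-overlap similarity: {jaccard:.2f}")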
Reading thoughts in everyday life?
Tang and his team nevertheless believe the results could lay the foundation for further research that may one day actually enable locked-in patients to communicate with their surroundings. An fMRI system is not suitable for this, however, because the person being scanned has to lie inside the huge and expensive machine. But the technique could potentially be transferred to portable systems such as functional near-infrared spectroscopy (fNIRS). "fNIRS measures where in the brain there is more or less blood flow at different points in time, which turns out to be exactly the same kind of signal that fMRI measures," says Huth. However, its resolution is lower, so considerably more advanced decoding software would be necessary.
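The shared signal type Huth refers to can be illustrated with a standard textbook model: brief neural events are smeared into a slow blood-flow response by the so-called hemodynamic response function, and it is this sluggish signal that both fMRI and fNIRS pick up. The Python sketch below uses common default parameters for the response function, not values from the study.

    # Simulate the slow blood-flow signal that both fMRI and fNIRS measure
    # (textbook double-gamma hemodynamic response; parameters are generic defaults).
    import numpy as np
    from scipy.stats import gamma

    dt = 0.5                                    # sampling interval in seconds
    t = np.arange(0, 30, dt)                    # 30 s of hemodynamic response
    hrf = gamma.pdf(t, 6) - 0.35 * gamma.pdf(t, 16)   # peak ~5 s, undershoot ~15 s
    hrf /= hrf.max()

    events = np.zeros(240)                      # two minutes of recording
    events[[20, 80, 150]] = 1.0                 # three brief neural events

    bold = np.convolve(events, hrf)[:len(events)]     # sluggish blood-flow signal
    print(f"signal peaks ~{(bold.argmax() - 20) * dt:.1f} s after the first event")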
The research team also addressed the question of whether the technology could be misused to read out thoughts against a person's will. The experiments showed, however, that the decoding only works for the person the system was trained on for hours, and only with that person's active cooperation. Even then, it produced meaningful results only when the subject actively thought about the story during the measurement. As soon as the subject let their mind wander, the software was no longer able to read the thoughts. "We take very seriously the concerns that it could be used for bad purposes, and have worked to avoid that," says Tang. "We want to make sure people only use these types of technologies when they want to and that it helps them."
Source: Jerry Tang (University of Texas at Austin) et al., Nature Neuroscience, doi: 10.1038/s41593-023-01304-9