AI re-creates what people see by reading their brain scans

As neuroscientists struggle to demystify how the human brain converts what our eyes see into mental images, artificial intelligence (AI) has been getting better at mimicking that feat. A recent study, scheduled to be presented at an upcoming computer vision conference, demonstrates that AI can read brain scans and re-create largely realistic versions of images a person has seen. As this technology develops, researchers say, it could have numerous applications, from exploring how various animal species perceive the world to perhaps one day recording human dreams and aiding communication in people with paralysis.

Many labs have used AI to read brain scans and re-create images a subject has recently seen, such as human faces and photos of landscapes. The new study marks the first time an AI algorithm called Stable Diffusion, developed by a German group and publicly released in 2022, has been used to do this. Stable Diffusion is similar to other text-to-image “generative” AIs such as DALL-E 2 and Midjourney, which produce new images from text prompts after being trained on billions of images associated with text descriptions.

For the new study, a group in Japan gave the standard Stable Diffusion system additional training, linking text descriptions of thousands of photos to the brain patterns elicited when participants in brain scan studies viewed those photos.

Previous efforts to decipher brain scans with AI algorithms had to be trained on large data sets. By incorporating photo captions into the algorithm, the Stable Diffusion approach was able to get more out of less training for each participant. It’s a novel approach that incorporates textual and visual information to “decipher the brain,” says Ariel Goldstein, a cognitive neuroscientist at Princeton University who was not involved with the work.

The AI algorithm makes use of information gathered from different regions of the brain involved in image perception, such as the occipital and temporal lobes, according to Yu Takagi, a systems neuroscientist at Osaka University who worked on the experiment. The system interpreted information from functional magnetic resonance imaging (fMRI) brain scans, which detect changes in blood flow to active regions of the brain. When people look at a photo, the temporal lobes predominantly register information about the contents of the image (people, objects, or scenery), whereas the occipital lobe predominantly registers information about layout and perspective, such as the scale and position of the contents. All of this information is recorded by the fMRI as it captures peaks in brain activity, and these patterns can then be reconverted into an imitation image using AI.

In the new study, the researchers gave the Stable Diffusion algorithm additional training using an online data set provided by the University of Minnesota, which consisted of brain scans from four participants as they each viewed a set of 10,000 photos. A portion of the brain scans from those same four participants was withheld from training and used later to test the AI system.
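The idea of withholding some scans for testing can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' code: the function name `split_trials`, the 10 percent test share, and the fixed random seed are all assumptions, since the article does not say how many scans were withheld or how they were chosen.

```python
import random


def split_trials(n_trials, test_fraction=0.1, seed=0):
    """Split trial indices into disjoint training and test lists.

    test_fraction and seed are illustrative assumptions, not values
    reported for the study.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    ids = list(range(n_trials))
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(ids)
    n_test = max(1, int(round(n_trials * test_fraction)))
    test_ids = sorted(ids[:n_test])
    train_ids = sorted(ids[n_test:])
    return train_ids, test_ids


# One participant who viewed 10,000 photos.
train_ids, test_ids = split_trials(10_000, test_fraction=0.1)
print(len(train_ids), "scans for training,", len(test_ids), "withheld for testing")
```

Because the two lists never overlap, any image the system re-creates from a test scan comes from a scan it did not see during training.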

Each AI-generated image starts out as noise, reminiscent of TV static. The Stable Diffusion algorithm then replaces the noise with distinguishable features as it compares a person’s brain activity patterns from viewing a photo with the patterns in its training data set. The system effectively generates an image depicting the contents, layout, and perspective of the photo being viewed. Takagi says the new system was more efficient than previous ones, required less fine-tuning, and could be trained with a smaller data set.
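The "noise first, features later" process can be illustrated with a toy loop. This is a deliberately simplified sketch and not Stable Diffusion itself: a real diffusion model uses a trained neural network to estimate what to remove at each step, whereas here a hypothetical `guide` list of pixel values stands in for whatever the brain-derived signal steers the image toward. The names `toy_denoise` and `mean_abs_error` are made up for this example.

```python
import random


def mean_abs_error(image, guide):
    """Average absolute difference between two equal-length pixel lists."""
    return sum(abs(x - g) for x, g in zip(image, guide)) / len(guide)


def toy_denoise(guide, steps=50, seed=0):
    """Start from random noise and refine it step by step toward `guide`.

    `guide` is an assumed stand-in for the conditioning signal; a real
    diffusion model learns its refinement steps from training data.
    """
    rng = random.Random(seed)
    # The image begins as pure Gaussian noise, like TV static.
    image = [rng.gauss(0.0, 1.0) for _ in guide]
    history = []
    for step in range(steps):
        remaining = steps - step
        # Close a share of the gap to the guide; the share grows as
        # fewer steps remain, so the last step closes the gap fully.
        image = [x + (g - x) / remaining for x, g in zip(image, guide)]
        history.append(mean_abs_error(image, guide))
    return image, history


guide = [0.0, 0.5, 1.0, 0.5, 0.0, 0.25, 0.75, 1.0]  # hypothetical 8-pixel "photo"
image, history = toy_denoise(guide)
print("error after first step:", history[0])
print("error after last step:", history[-1])
```

The error shrinks at every step, which mirrors how recognizable features gradually emerge from static as the algorithm refines the image.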

Brain activity, predominantly in the occipital lobe, provided enough information to re-create the layout and perspective of the photos being viewed, the researchers found. But the algorithm struggled to recapitulate objects, such as a clock tower, from the real photo and instead created abstract figures. One approach to tackling this problem would be to use larger training data sets that could train the algorithm to predict more details, but the fMRI data set was too limited for this, the Japanese team says.

Instead, the researchers circumvented this issue by harnessing keywords from image captions that accompanied the photos in the Minnesota fMRI data set. If, for example, one of the training photos contained a clock tower, the pattern of brain activity from the scan would be associated with that object. This meant that if the same brain pattern was exhibited once more by the study participant during the testing stage, the system would feed the object’s keyword into Stable Diffusion’s normal text-to-image generator and a clock tower would be incorporated into the re-created image, following the layout and perspective indicated by the brain pattern, resulting in a convincing imitation of the real photo.
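The keyword step can be pictured as a lookup: find the training brain pattern most similar to the new scan, then reuse the keywords from that photo's caption as a text prompt. The sketch below uses a nearest-match search with cosine similarity as a stand-in; the article does not describe the decoder the team actually used, and the four-number "brain patterns," the keywords, and the function names are invented for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length lists of numbers."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def decode_keywords(pattern, training_set):
    """Return the caption keywords of the most similar training pattern."""
    best = max(training_set, key=lambda item: cosine(pattern, item[0]))
    return best[1]


# Hypothetical training data: (brain pattern, caption keywords) pairs.
training_set = [
    ([0.9, 0.1, 0.0, 0.2], ["clock", "tower"]),
    ([0.1, 0.8, 0.3, 0.0], ["airplane"]),
    ([0.0, 0.2, 0.9, 0.1], ["train"]),
]

new_scan = [0.8, 0.2, 0.1, 0.1]  # invented test-stage pattern
prompt = " ".join(decode_keywords(new_scan, training_set))
print(prompt)  # text that would be passed to the text-to-image generator
```

A lookup like this can only ever return keywords that appear in its training pairs, so a scan of something unfamiliar would still be matched to the closest known object.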

Stable Diffusion can re-create photos (left) seen by study participants. Using patterns of brain activity alone, it correctly reproduces the layout and perspective (middle), but with the addition of textual information, it can also correctly re-create the object in the photo (right). Creative Commons

Importantly, the Stable Diffusion algorithm doesn’t receive a text prompt directly from the test data; it can only infer that an object is present if the brain pattern matches one seen in the training data. This limits the objects it can re-create to those present in the photos used during training.

Finally, the researchers tested their system on additional brain scans from the same participants when they viewed a separate set of photos, including a toy bear, airplane, clock, and train. By comparing the brain patterns from those images with those produced by the photos in the training data set, the AI system was able to produce convincing imitations of the novel photos. (The team posted a preprint of its work in December 2022.)

“The accuracy of this new method is impressive,” says Iris Groen, a neuroscientist at the University of Amsterdam who was not involved with the work.

However, the AI system was only tested on brain scans from the same four people who provided the training brain scans, and expanding it to other individuals would require retraining the system on their brain scans. So, it may take a while for this technology to become widely accessible. Nonetheless, Groen argues that “these diffusion models have [an] unprecedented ability to generate realistic images,” and could create new opportunities for cognitive neuroscience research.

Shinji Nishimoto, another systems neuroscientist at Osaka University who worked on the study, hopes that with further refinements the technology could be used to intercept imagined thoughts and dreams, or could allow scientists to understand how differently other animals perceive reality.
