High-resolution image reconstruction from brain activity using latent diffusion models
Reconstructing visual experiences from brain activity offers a unique way to understand how the brain perceives the world and how computer vision models relate to the visual system. Although deep generative models have recently been applied to this task, reconstructing realistic images with high semantic fidelity remains a difficult problem. We propose a new method based on a diffusion model (DM) to reconstruct images from brain activity measured by functional magnetic resonance imaging (fMRI). More specifically, we rely on a latent diffusion model (LDM) called Stable Diffusion. This model reduces the computational cost of DMs while preserving their high generative performance. We also characterize the inner mechanisms of the LDM by studying how its components (such as the latent vector of the image Z, the conditioning inputs C, and the different elements of the denoising U-Net) relate to distinct brain functions. Our proposed method reconstructs high-resolution, high-fidelity images in a straightforward manner, without any additional training or fine-tuning of complex deep-learning models. From a neuroscientific point of view, we also provide a quantitative interpretation of the different LDM components. Overall, our study proposes a promising method for reconstructing images from human brain activity and provides a framework for understanding DMs. Visit our website at https://www.acs.org/.
The authors have not declared any competing interests.