Mind Video

Use case: Thought to video
Tags: generator, Stable Diffusion, thought, video
No Pricing

Mind Video: From Brain Activity to Visual Experiences

Mind Video is an AI tool that aims to reconstruct high-quality videos from brain activity recorded with continuous functional magnetic resonance imaging (fMRI). Building on Mind-Vis, it takes on the challenge of translating non-invasive brain recordings into video.

Mind Video uses a two-module pipeline that bridges image and video brain decoding. The first module learns general visual fMRI features through large-scale unsupervised learning with masked brain modeling and spatiotemporal attention. It then distills semantic features using multimodal contrastive learning on an annotated dataset.
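The two objectives named above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not Mind Video's actual code: the function names (`mask_patches`, `contrastive_loss`), the mask ratio, the temperature, and the idea of treating an fMRI scan as a flat list of patches are all hypothetical choices made for the example. The contrastive part is a standard symmetric InfoNCE loss, which may differ from the loss the project uses.

```python
import math
import random


def mask_patches(patches, mask_ratio, rng):
    """Hide a random subset of patches, as in masked modeling.

    Returns the visible patches and the sorted indices that were masked.
    A model would be trained to reconstruct the masked patches.
    """
    n_mask = int(len(patches) * mask_ratio)
    masked = set(rng.sample(range(len(patches)), n_mask))
    visible = [p for i, p in enumerate(patches) if i not in masked]
    return visible, sorted(masked)


def _normalize(vec):
    # Scale a vector to unit length (guarding against the zero vector).
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def _diag_cross_entropy(logits):
    # Mean cross-entropy where row i's correct class is column i.
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def contrastive_loss(fmri_emb, target_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    fmri_emb[i] and target_emb[i] are assumed to describe the same
    stimulus; every other pairing in the batch acts as a negative.
    """
    a = [_normalize(v) for v in fmri_emb]
    b = [_normalize(v) for v in target_emb]
    logits = [
        [sum(x * y for x, y in zip(u, v)) / temperature for v in b]
        for u in a
    ]
    logits_t = [list(col) for col in zip(*logits)]
    return 0.5 * (_diag_cross_entropy(logits) + _diag_cross_entropy(logits_t))
```

With a batch of correctly paired embeddings the loss is close to zero, and it grows when the pairs are shuffled, which is the signal that pulls fMRI features toward the semantics of the matching annotation.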

The second module fine-tunes the learned features by co-training the encoder with an augmented Stable Diffusion model, which is adapted for video generation and guided by fMRI data.

Mind Video's main strength is its flexible pipeline: the fMRI encoder and the augmented Stable Diffusion model are trained separately and then fine-tuned together. A progressive learning scheme lets the encoder acquire brain features over multiple stages. The resulting videos are reported to be semantically accurate and to capture motion and scene dynamics, outperforming previous state-of-the-art approaches.
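One way to picture the staged training described above is as a schedule that records which component is updated at each step. The sketch below is a hypothetical illustration only: the stage names, the module labels (`fmri_encoder`, `video_diffusion`), and the exact ordering of the separate phases are assumptions, not details taken from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str         # hypothetical label for the stage
    objective: str    # what the stage optimizes, per the description
    trainable: tuple  # modules whose weights are updated in this stage


# Assumed ordering: separate training phases first, joint fine-tuning last.
SCHEDULE = (
    Stage("pretrain", "masked brain modeling", ("fmri_encoder",)),
    Stage("semantic", "multimodal contrastive learning", ("fmri_encoder",)),
    Stage("adapt_diffusion", "video generation", ("video_diffusion",)),
    Stage("co_train", "fMRI-guided video generation",
          ("fmri_encoder", "video_diffusion")),
)


def trainable_modules(stage_name):
    """Return the modules that are updated during the named stage."""
    for stage in SCHEDULE:
        if stage.name == stage_name:
            return stage.trainable
    raise KeyError(stage_name)


def joint_stages():
    """Names of stages in which more than one module is trained together."""
    return [s.name for s in SCHEDULE if len(s.trainable) > 1]
```

Keeping the schedule as data makes the separation explicit: only the final stage touches both the encoder and the diffusion model.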

Analysis of attention in the transformers that decode the fMRI data shows that the visual cortex plays the dominant role in processing visual spatiotemporal information. It also reveals a hierarchy across the encoder's layers, which extract features ranging from structural to abstract. With each training stage, the fMRI encoder becomes better at capturing nuanced semantic information, which improves the video reconstruction.

Mind Video uses data from the Human Connectome Project, and its developers acknowledge the collaborators and supporters who contributed to the tool.
