REVIEW ON HUMAN VISION RECONSTRUCTION USING BRAIN ACTIVITY
K. Rathi (1), ORCID 0000-0003-4000-6338, and Dr. V. Gomathi (2), ORCID 0000-0003-3639-485X
1 Research Scholar, National Engineering College, Kovilpatti, Tamil Nadu, India
2 Head of Department/CSE, National Engineering College, Kovilpatti, Tamil Nadu, India
Reconstructing perceived visual content from human brain activity recorded with functional magnetic resonance imaging (fMRI) has advanced considerably in recent years. In traditional studies, the image or stimulus is decoded from noisy fMRI data, and the decoding has high computational complexity. This review surveys methods that visualize perceptual content from recorded brain activity through voxel/EEG-to-pixel mapping, and considers how parallel algorithms can reduce the high computational complexity.
Cognitive neuroscience is an interdisciplinary field spanning psychology and neuroscience. An interesting
research direction in this domain is building mathematical models of how psychological activities correlate
with the physiological neural circuitry of humans.
EEG is a traditional, non-invasive way of monitoring the electrical activity of the brain, with electrodes
placed according to the International 10-20 system. EEG signals are normally recorded with dedicated sensors
such as the EMOTIV wireless kit, the Cognionics wearable EEG cap, or BioSemi. The temporal resolution is
expressed as the data sampling rate (at least 128 to 512 samples/sec), and between 2 and 256 electrodes serve
as EEG recording channels. Each electrode placement site has a letter that identifies the lobe or area of the
brain it reads from: Pre-frontal (Fp), Frontal (F), Temporal (T), Parietal (P), Occipital (O), and Central (C).
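As a small illustration of this naming convention, the following Python sketch maps an electrode label to the region named by its letter prefix. The function name `electrode_region` and the lookup table are illustrative assumptions and do not belong to any EEG toolkit. The pre-frontal prefix is written `Fp`, as in standard 10-20 labels such as Fp1, and a trailing `z` marks a midline site.

```python
import re

# Letter prefix -> brain region, following the 10-20 naming convention.
REGIONS = {
    "Fp": "pre-frontal",
    "F": "frontal",
    "T": "temporal",
    "P": "parietal",
    "O": "occipital",
    "C": "central",
}

def electrode_region(label: str) -> str:
    """Return the region named by the letter prefix of a 10-20 label."""
    # "Fp" is listed before "F" so the two-letter prefix is tried first.
    match = re.fullmatch(r"(Fp|F|T|P|O|C)(z|\d+)", label)
    if match is None:
        raise ValueError(f"unrecognized electrode label: {label!r}")
    return REGIONS[match.group(1)]
```

For example, `electrode_region("O2")` looks up the prefix `O` and reports the occipital region.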
Researchers working with EEG data have devoted considerable effort to modeling the affect domain, building
cognitive neuro-feedback systems, and solving motor-imagery tasks. Their work mainly focuses on
human emotion analysis, cognitive and brain disorders, linguistic modeling, etc.
In addition, functional magnetic resonance imaging (fMRI) measures brain activity through changes in
blood flow, recorded as voxels, which makes it possible to reconstruct the perceived stimuli directly from fMRI.
2. Related work
Multiscale local images with predefined shapes were used to reconstruct the lower-order information of binary
contrast patterns [1]. Handwritten characters were reconstructed with a straightforward linear Gaussian
approach [2]. In one proposed reconstruction model, visual image reconstruction has limited representation
power: the model acts as a linear observation model for the visual image and is estimated by Bayesian canonical
correlation analysis (BCCA) [3].
To improve reconstruction accuracy, posterior regularization helps constrain the testing instances to be
close to their neighbors in the training set [4]. A nonlinear extension of BCCA
was formulated by means of a deep generative multi-view model (DGMM) [5]. Technical innovations in deep
neural networks (DNNs) help in understanding hierarchical visual processing in computational neuroscience
[6]. Decoders have been developed that predict the DNN features of viewed images from fMRI activity patterns
[7]. Encoding and decoding models are the basic approach for reconstructing an image (a low-level base image or
an exemplar image) from human brain activity. Even sophisticated encoding and decoding models, however, are
not suited to combining features from multiple hierarchical levels, so such a method still needs to be developed [8].
Instead of the hierarchical neural representations of the human visual system, DNN visual features are used to
reconstruct an image from human brain activity. In this process the fMRI pattern is decoded into DNN
features, which produce a similar output [9]. The early visual cortex shows a lower BOLD response to
faces that have already been presented in a different view than to novel faces [10]. fMRI has been used to localize
regions in the monkey brain that respond more strongly to faces than to other objects, and these
regions were then selected for electrophysiological analysis [11].
The right anterior temporal lobe (ATL) and the fusiform gyrus form a set of ventral-stream regions identified
from the BOLD response to the same face with different expressions, after averaging. These regions carry
information about individual face images [12]. Functional magnetic resonance imaging investigations of face
identification are homologous, which is the main reason the cortical source of this information is attributed to the fusiform gyrus.
Visual features of a fusiform-based face space are used for facial image reconstruction, but these processes do not
consider the temporal aspect of face processing [13].
3. Methodology of Human Vision Reconstruction Using Brain Activity
Fig. 1. Block Diagram representation of human vision reconstruction using brain activity
While the subject views a stimulus, fMRI/EEG responses are recorded by the scanner. fMRI measures
the BOLD (Blood Oxygenation Level Dependent) signal as brain images. This image data is divided
into a training set and a testing set. A Convolutional Neural Network (CNN) is a deep feed-forward artificial neural
network, and the decoder predicts the CNN features of the stimulus. These CNN techniques have many
The training images (32 × 32 × 3) are randomly selected from the database and matched with the test images.
Both sets of images are gathered from the same scan sessions. The training images are used to fit mathematical
models that visualize the feature maps of the CNN layers. The features are mapped to the images in the training set
to obtain an accurate reconstruction. The main goal is to propose a new image reconstruction method in which
pixel information is correlated with a deep learning approach based on latent-variable distributions. This
reconstruction method depends mainly on brain activity patterns observed through the physiological
modality fMRI (functional Magnetic Resonance Imaging).
[Fig. 1 block labels: fMRI / EEG, Brain Scanner, Decoder]
4.1 Deep Generative Model [14]:
Visual images and fMRI activity patterns are denoted x and y respectively, and a shared latent variable z is
introduced with the prior
$$P(z) = \prod_{i=1}^{N} \mathcal{N}(z_i \mid 0, I)$$
When zero-mean noise with a diagonal covariance matrix is assumed on the observed images, the Gaussian
distribution of the images given the latent variable is
$$p_{\theta}(x \mid z) = \prod_{i=1}^{N} \mathcal{N}\left(x_i \mid \mu_x(z_i),\ \operatorname{diag}\left(\sigma_x^2(z_i)\right)\right)$$
For fMRI activity, nonlinear transformations are more powerful and are used to suppress noise and predict
information. With a projection matrix B and a covariance matrix Ψ for the fMRI activity pattern, the likelihood function is
$$P(y \mid z) = \prod_{i=1}^{N} \mathcal{N}\left(y_i \mid B^{\top} z_i,\ \Psi\right)$$
In this case the fMRI voxels are highly correlated, and inferring the high-dimensional covariance matrix Ψ is costly.
An auxiliary latent variable $\bar{z}$ is therefore introduced as a low-rank assumption that decreases the computational complexity:
$$P(\bar{z}) = \prod_{i=1}^{N} \mathcal{N}(\bar{z}_i \mid 0, I)$$
$$P(y \mid z, \bar{z}) = \prod_{i=1}^{N} \mathcal{N}\left(y_i \mid B^{\top} z_i + H^{\top} \bar{z}_i,\ \tau^{-1} I\right)$$
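The saving from the low-rank assumption can be made explicit by integrating out the auxiliary variable. The derivation below is a sketch under the Gaussian assumptions of this section. It writes the projection matrix as B, the loading matrix of the auxiliary variable as H, the noise precision as τ and the auxiliary variable as $\bar{z}_i$. The symbols D (number of voxels) and K' (dimension of the auxiliary variable) are introduced here for illustration only.

```latex
\begin{aligned}
p(y_i \mid z_i)
  &= \int \mathcal{N}\!\left(y_i \mid B^{\top} z_i + H^{\top}\bar{z}_i,\ \tau^{-1} I\right)
          \mathcal{N}(\bar{z}_i \mid 0, I)\, d\bar{z}_i \\
  &= \mathcal{N}\!\left(y_i \mid B^{\top} z_i,\ H^{\top} H + \tau^{-1} I\right)
\end{aligned}
```

Under these assumptions the full D × D covariance is replaced by $H^{\top} H + \tau^{-1} I$, which has K'D + 1 free parameters instead of D(D + 1)/2.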
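The KL divergence measures the difference between two probability distributions over the same variables. For a
variational distribution Q(z) that approximates the posterior P(z | x), the KL divergence is
$$D_{KL}(Q \,\|\, P) = \int Q(z) \log \frac{Q(z)}{P(z \mid x)}\, dz$$
By Bayes' rule,
$$P(z \mid x) = \frac{P(x \mid z)\, P(z)}{\int P(x, z)\, dz} = \frac{P(z, x)}{P(x)}$$
Substituting this into the KL divergence and rearranging gives
$$D_{KL}(Q \,\|\, P) = \int Q(z) \log \frac{Q(z)}{P(z, x)}\, dz + \log P(x)$$
$$\log P(x) = D_{KL}(Q \,\|\, P) - \int Q(z) \log \frac{Q(z)}{P(z, x)}\, dz$$
$$\log P(x) = E_{Q}\left[-\log \frac{Q(z)}{P(z, x)}\right] + D_{KL}(Q \,\|\, P)$$
$$\log P(x) = E_{Q}\left[\log P(z, x) - \log Q(z)\right] + D_{KL}(Q \,\|\, P)$$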
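The decomposition of log P(x) into an expectation term (the evidence lower bound) plus a KL term can be checked numerically. The sketch below is a toy example: it uses a discrete latent variable, so the integrals become sums, and all probabilities are made-up illustrative values rather than quantities from the reviewed model.

```python
import math

# Toy discrete model (all numbers are illustrative assumptions):
# joint P(z, x) for one fixed observation x and a latent z in {0, 1, 2}.
joint = [0.10, 0.25, 0.05]
evidence = sum(joint)                        # P(x) = sum_z P(z, x)
posterior = [p / evidence for p in joint]    # P(z | x) by Bayes' rule

# An arbitrary variational distribution Q(z).
q = [0.2, 0.5, 0.3]

# Expectation term: E_Q[log P(z, x) - log Q(z)]
elbo = sum(qz * (math.log(pzx) - math.log(qz)) for qz, pzx in zip(q, joint))

# KL(Q || P(z | x)) = E_Q[log Q(z) - log P(z | x)]
kl = sum(qz * (math.log(qz) - math.log(pz)) for qz, pz in zip(q, posterior))

log_evidence = math.log(evidence)
```

Because the KL term is non-negative, the expectation term never exceeds `log_evidence`, which is why maximizing it over Q tightens the approximation.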
The continuous version of the KL divergence between densities p(x) and q(x) is
$$D_{KL}(P \,\|\, Q) = \int p(x) \ln \frac{p(x)}{q(x)}\, dx$$
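The visual image to be predicted is denoted $x_{pred}$ and the corresponding brain activity $y_{pred}$. With $z_{pred}$ the associated latent variable, the predictive distribution is
$$P(x_{pred} \mid y_{pred}) = \int p(x_{pred} \mid z_{pred})\, p(z_{pred} \mid y_{pred})\, dz_{pred}$$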
4.2 Linear Reconstruction Model [15]:
In the Gaussian decoding model, the parameters are estimated under different forms of regularization.
A stimulus-response pair is denoted (x, y).
The stimulus prior is a multivariate Gaussian with zero mean and covariance matrix R, and the forward encoding model is
$$P(y \mid x) \propto \exp\left(-\tfrac{1}{2}\,(y - B^{\top} x)^{\top} \Sigma^{-1} (y - B^{\top} x)\right)$$
From the canonical form of the multivariate Gaussian distribution, the reconstruction is
$$x = \left(R - R B\left(\Sigma + B^{\top} R B\right)^{-1} B^{\top} R\right) B\, \Sigma^{-1} y$$
where B is the matrix of regression coefficients and Σ is the noise covariance.
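The reconstruction formula for a Gaussian prior and a linear Gaussian encoding model can be written in two equivalent ways, related by the Woodbury matrix identity. The NumPy sketch below checks this equivalence on random matrices. The sizes, the random covariances, and the names `R` (prior covariance), `Sigma` (noise covariance) and `B` (regression coefficients, pixels by voxels) are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
p, q = 6, 4  # illustrative sizes: p image pixels, q voxels

# Random symmetric positive-definite covariances R (prior) and Sigma (noise).
A = rng.normal(size=(p, p))
R = A @ A.T + p * np.eye(p)
C = rng.normal(size=(q, q))
Sigma = C @ C.T + q * np.eye(q)

B = rng.normal(size=(p, q))   # regression coefficients (pixels x voxels)
y = rng.normal(size=q)        # one observed voxel response

# Direct posterior mean: (R^-1 + B Sigma^-1 B^T)^-1 B Sigma^-1 y
direct = np.linalg.solve(
    np.linalg.inv(R) + B @ np.linalg.solve(Sigma, B.T),
    B @ np.linalg.solve(Sigma, y),
)

# Woodbury form: (R - R B (Sigma + B^T R B)^-1 B^T R) B Sigma^-1 y
inner = np.linalg.solve(Sigma + B.T @ R @ B, B.T @ R)
woodbury = (R - R @ B @ inner) @ (B @ np.linalg.solve(Sigma, y))
```

The Woodbury form only solves a system of size q (voxels) rather than p (pixels), which can matter when the image has many more pixels than the number of selected voxels.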
The prior covariance matrix is estimated from N example images $x_i$ as
$$R = \frac{1}{N-1} \sum_{i} x_i\, x_i^{\top}$$
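The regression coefficients are obtained by solving a regularized least-squares minimization problem.
The closed form of the regression coefficients for all voxels is
$$\hat{B} = \left(X^{\top} X + \lambda I\right)^{-1} X^{\top} Y$$
where X holds the stimuli, Y the voxel responses, and λ is the regularization strength.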
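A closed-form regularized regression of this kind can be sketched in a few lines of NumPy. The example below assumes the standard ridge form (XᵀX + λI)⁻¹XᵀY. The function name `ridge_coefficients`, the synthetic data and the value of `lam` are illustrative assumptions, not the settings used in [15].

```python
import numpy as np

def ridge_coefficients(X: np.ndarray, Y: np.ndarray, lam: float) -> np.ndarray:
    """Closed-form ridge solution (X^T X + lam * I)^-1 X^T Y for all voxels."""
    n_features = X.shape[1]
    gram = X.T @ X + lam * np.eye(n_features)
    # Solving the linear system is more stable than forming the inverse.
    return np.linalg.solve(gram, X.T @ Y)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))        # 50 stimuli, 5 features (assumed sizes)
B_true = rng.normal(size=(5, 3))    # ground-truth weights for 3 voxels
Y = X @ B_true + 0.01 * rng.normal(size=(50, 3))
B_hat = ridge_coefficients(X, Y, lam=1e-3)
```

Each column of `B_hat` holds the coefficients of one voxel, so all voxels are fitted with a single linear solve.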
The explained variance for each voxel k is calculated as
$$r_k^2 = \frac{\operatorname{var}(y_k) - \operatorname{var}(y_k - \hat{y}_k)}{\operatorname{var}(y_k)}$$
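The per-voxel variance measure, read as (var(y) − var(y − ŷ)) / var(y) with y the observed and ŷ the predicted response, can be illustrated with the standard library alone. The function name `explained_variance` and the toy responses are assumptions for this sketch.

```python
from statistics import pvariance

def explained_variance(observed, predicted):
    """(var(y) - var(y - y_hat)) / var(y) for one voxel's responses."""
    residuals = [y - y_hat for y, y_hat in zip(observed, predicted)]
    total = pvariance(observed)
    return (total - pvariance(residuals)) / total

observed = [1.0, 2.0, 3.0, 4.0]
# A perfect prediction leaves no residual variance.
perfect = explained_variance(observed, observed)
# A constant prediction leaves all of the variance unexplained.
constant = explained_variance(observed, [2.5, 2.5, 2.5, 2.5])
```

Voxels with a value near one are well predicted by the encoding model, while values near zero indicate that the model explains little of the response.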
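4.3 Visual image reconstruction using local image decoders [16]:
Each local decoder predicts the mean contrast of a local image element. The discriminant function of contrast class k in a local
decoder is expressed as
$$y_k(r) = \sum_{d=1}^{D} w_k^{d}\, r^{d} + w_k^{0}$$
where r is the voxel activity pattern, $w_k^{d}$ is the weight of voxel d, and $w_k^{0}$ is the bias.
Using the softmax function, the probability of contrast class k is
$$p(k \mid r) = \frac{\exp\left[y_k(r)\right]}{\sum_{j} \exp\left[y_j(r)\right]}$$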
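The two steps of a local decoder, a linear discriminant per contrast class followed by a softmax, can be sketched as follows. The function names, weights and inputs are illustrative assumptions. The softmax subtracts the largest score before exponentiating, a standard trick that leaves the result unchanged and avoids overflow.

```python
import math

def discriminant(weights, bias, r):
    """Linear discriminant: sum_d w_d * r_d + w_0 for one contrast class."""
    return sum(w * x for w, x in zip(weights, r)) + bias

def softmax(scores):
    """Convert discriminant values y_k(r) into class probabilities p(k | r)."""
    # Subtracting the maximum leaves the result unchanged and avoids overflow.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy voxel pattern and two contrast classes (all values are assumptions).
r = [0.5, -1.0, 2.0]
scores = [
    discriminant([0.2, 0.1, 0.4], 0.0, r),
    discriminant([-0.3, 0.5, 0.1], 0.1, r),
]
probabilities = softmax(scores)
```

The class with the largest discriminant value receives the highest probability, and the probabilities sum to one.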
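Each weight parameter has a zero-mean normal prior with a variance whose inverse is treated as a
hyperparameter:
$$P\left(w_k^{d} \mid \alpha_k^{d}\right) = \mathcal{N}\left(0,\ \frac{1}{\alpha_k^{d}}\right)$$
where $p(\alpha_k^{d}) = \dfrac{1}{\alpha_k^{d}}$, so that
$\alpha_k^{d}$ is treated as a random variable.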
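The output of the local image decoders is combined as
$$\hat{I}(x \mid r) = \sum_{m} \lambda_m\, C_m(r)\, \phi_m(x)$$
where $\phi_m(x)$ is the m-th local image element, $C_m(r)$ is its predicted contrast, and $\lambda_m$ is its combination coefficient.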
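The combination of local decoders can be read as a weighted sum: each local image element contributes its basis pattern, scaled by the contrast its decoder predicts and by a combination coefficient. The sketch below shows that sum on a toy 2 × 2 image. The element shapes, contrasts and coefficients are made-up values, and the function name `reconstruct` is an assumption.

```python
def reconstruct(contrasts, bases, coefficients):
    """Sum coefficient * predicted contrast * basis over all local elements."""
    n_pixels = len(bases[0])
    image = [0.0] * n_pixels
    for c, basis, lam in zip(contrasts, bases, coefficients):
        for pixel, value in enumerate(basis):
            image[pixel] += lam * c * value
    return image

# Toy 2x2 image flattened to 4 pixels; all values are illustrative.
bases = [
    [1, 0, 0, 0],   # 1x1 element covering pixel 0
    [1, 1, 0, 0],   # 1x2 element covering the top row
    [1, 1, 1, 1],   # 2x2 element covering the whole image
]
contrasts = [1.0, 0.5, 0.2]        # contrast predicted by each local decoder
coefficients = [0.5, 0.25, 0.25]   # combination coefficients
image = reconstruct(contrasts, bases, coefficients)
```

Pixels covered by several overlapping elements accumulate contributions from each scale, which is how coarse and fine elements are blended into one image.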
4.4 DNN feature decoding [17]:
In this method, a sparse linear regression algorithm is used to select the most relevant voxels for decoding.
Reconstruction from a single DNN layer l reduces to the optimization problem
$$v^{*} = \arg\min_{v}\ \frac{1}{2} \sum_{i=1}^{I_l} \left(\phi_i^{(l)}(v) - y_i^{(l)}\right)^2$$
Combining DNN features from multiple layers gives
$$v^{*} = \arg\min_{v}\ \frac{1}{2} \sum_{l \in L} \beta_l \left\| \Phi^{(l)}(v) - y^{(l)} \right\|_2^2$$
This cost function is minimized by the limited-memory BFGS (L-BFGS) algorithm, a quasi-Newton method for
unconstrained nonlinear optimization problems.
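A feature-matching cost of this form can be minimized with an off-the-shelf L-BFGS routine. The sketch below uses `scipy.optimize.minimize` with `method="L-BFGS-B"`, which behaves as L-BFGS when no bounds are given. As a loudly stated simplification, the DNN layer is replaced by a fixed linear map W, so the gradient can be written by hand; a real network would be nonlinear and differentiated automatically. All sizes and values are assumptions.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n_pixels, n_features = 8, 12

# Stand-in for one DNN layer: a fixed linear map Phi(v) = W v (an assumption;
# a real network would be nonlinear and differentiated automatically).
W = rng.normal(size=(n_features, n_pixels))
v_true = rng.normal(size=n_pixels)
y = W @ v_true                      # stands in for the decoded target features

def loss_and_grad(v):
    """Return 0.5 * ||W v - y||^2 and its gradient W^T (W v - y)."""
    residual = W @ v - y
    return 0.5 * residual @ residual, W.T @ residual

# jac=True tells SciPy that the function returns (loss, gradient).
result = minimize(loss_and_grad, x0=np.zeros(n_pixels), jac=True,
                  method="L-BFGS-B")
v_hat = result.x
```

Starting from a blank image, the optimizer moves `v_hat` until its features match the target features, which mirrors the role L-BFGS plays in the reconstruction.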
5. Comparative Analysis
The deep generative model addresses the perceived image reconstruction problem by deriving a predictive
distribution that reconstructs visual images from brain activity, and it also handles encoding tasks. This method
has high computational complexity.
The linear reconstruction model provides high-quality reconstructions of stimuli by properly inverting an
encoding model. Both encoding and decoding rely on regression analysis.
Visual image reconstruction with local image decoders is constraint-free and based on local images at
multiple scales. This method provides information representations by discovering multivoxel patterns in
human brain activity. Discriminant functions are used to solve the reconstruction problem.
The DNN feature decoding method minimizes a cost function over layers using the L-BFGS algorithm,
which solves unconstrained nonlinear optimization problems.
All of these models have high computational complexity. Our team will therefore work
with GPUs (Graphics Processing Units) to reduce the time needed to reconstruct the stimuli.
1. Yoichi Miyawaki, Hajime Uchida, Okito Yamashita, Masa-aki Sato, Yusuke Morito, Hiroki C. Tanabe, Norihiro Sadato, and Yukiyasu Kamitani. Visual image reconstruction from human brain activity using a combination of multiscale local image decoders. Neuron, 60(5):915-929, 2008.
2. Sanne Schoenmakers, Markus Barth, Tom Heskes, and Marcel van Gerven. Linear reconstruction of perceived images from human brain activity. NeuroImage, 83:951-961, 2013.
3. Yusuke Fujiwara, Yoichi Miyawaki, and Yukiyasu Kamitani. Modular encoding and decoding models derived from Bayesian canonical correlation analysis. Neural Computation, 25(4):979-1005, 2013.
4. Jun Zhu, Ning Chen, and Eric P. Xing. Bayesian inference with posterior regularization and applications to infinite latent SVMs. Journal of Machine Learning Research, 15(1):1799-1847, 2014.
5. Diederik P. Kingma and Max Welling. Auto-encoding variational Bayes. In ICLR, 2014.
6. Yamins, D. L., & DiCarlo, J. J. Using goal-driven deep learning models to understand sensory cortex. Nat. Neurosci.
7. Nguyen, A., Dosovitskiy, A., Yosinski, J., Brox, T., & Clune, J. Synthesizing the preferred inputs for neurons in neural networks via deep generator networks. Adv. Neural Inf. Process. Syst. 29, 3387-3395, 2016.
8. Horikawa, T., & Kamitani, Y. Generic decoding of seen and imagined objects using hierarchical visual features. Nat.
9. Mur M, Ruff DA, Bodurka J, Bandettini PA, Kriegeskorte N. Face-identity change activation outside the face system: "Release from adaptation" may not always indicate neuronal selectivity. Cereb Cortex, 20(9):2027-2042,
10. Freiwald WA, Tsao DY. Functional compartmentalization and viewpoint generalization within the macaque face-processing system. Science, 330:845-851, 2010.
11. Nestor A, Plaut DC, Behrmann M. Unraveling the distributed neural code of facial identity through spatiotemporal pattern analysis. Proc Natl Acad Sci USA, 108:9998-10003, 2011.
12. Anzellotti S, Fairhall SL, Caramazza A. Decoding representations of face identity that are tolerant to rotation. Cereb Cortex, 24:1988-1995, 2014.
13. Nestor A, Plaut DC, Behrmann M. Feature-based face representations and image reconstruction from behavioral and neural data. Proc Natl Acad Sci USA, 113:416-421, 2016.
14. Changde Du, Changying Du, Huiguang He. Sharing deep generative representation for perceived image reconstruction from human brain activity.
15. Sanne Schoenmakers, Markus Barth, Tom Heskes, Marcel van Gerven. Linear reconstruction of perceived images from human brain activity. NeuroImage, 2013.
16. Hajime Uchida, Okito Yamashita, Masa-aki Sato. Visual image reconstruction from human brain activity using a combination of multiscale local image decoders. Neuron, 60:915-929, 2008.
17. Guohua Shen, Tomoyasu Horikawa, Kei Majima, and Yukiyasu Kamitani. Deep image reconstruction from human