Mood-aware visual question answering. - Citegraph

Paper Info

Title
Mood-aware visual question answering.

Abstract
Visual Question Answering (VQA) has recently attracted the attention of many machine learning researchers. Various attention models have been proposed for VQA to address the need to focus on local regions of an image. This paper proposes Mood-Aware Visual Question Answering (MAVQA), which uses novel long short-term memory (LSTM) and convolutional neural network (CNN) attention models that combine the local image features, the question, and the mood detected in a particular region of the image to produce a mood-based answer using a pre-processed image dataset. The attention mechanisms enable the VQA model to focus only on the parts of the image that are relevant to both the detected mood and the key words in the question. The irrelevant parts of the image are ignored, which improves classification accuracy by reducing the chance of predicting wrong answers. Whereas previous efforts have used CNNs mostly for embedding images and text, we formulate a CNN attention algorithm for the image, question and mood. When the number of views and the kernel length are optimized, the more direct convolutional attention operation is more efficient and effective than the winding recurrent LSTM attention operation. The experimental results show that MAVQA is effectively mood-aware: the accuracy of our LSTM attention model is well within the range of previous conventional VQA benchmarks, while our novel CNN attention model outperforms the previous baselines in several instances. The additional attention on the mood not only improves classification accuracy but also contributes substantially to the analysis and comprehension of image features, a key development in modern artificial intelligence.

Attribute	Value
Year	2019
DOI	10.1016/j.neucom.2018.11.049
Venue	Neurocomputing
Keywords	Mood-aware, Visual question answering, Attention model, Long short term memory, Convolutional neural network
Field	Kernel (linear algebra), Mood, Question answering, Embedding, Convolutional neural network, Feature (computer vision), Attention model, Artificial intelligence, Comprehension, Machine learning, Mathematics
DocType	Journal
Volume	330
ISSN	0925-2312
Citations	2
PageRank	0.35
References	42
Authors	5

Authors (5 rows)

Name	Order	Citations	PageRank
Nelson Ruwa	1	4	1.73
Qirong Mao	2	261	34.29
Liangjun Wang	3	7	1.79
Jianping Gou	4	116	24.01
Ming Dong	5	105	13.70
