Object Detection task in Visual Question Answering

Date

2023-06-10

Publisher

University of M'sila

Abstract

This research proposes a novel approach to Visual Question Answering (VQA) that feeds object detection features into the model as image features, in place of traditional CNN features. The aim is to give the model specific information about the objects present in the image and so improve performance on the VQA task. In the experiments, the model reached an accuracy of 76% on Yes/No questions, 43% on counting questions, and 47% on other questions. The study concludes that incorporating object detection features improves how visual information is understood and processed, and with it the accuracy of answers to questions about images.
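
As a rough illustration of the idea in the abstract, the sketch below treats an image as a set of per-object feature vectors, as an object detector would produce, rather than as a single CNN feature vector, and pools them with question-guided attention. The attention step, the function name `attend_over_objects`, and the toy vectors are assumptions made for illustration only; the abstract does not say how the model combines object features with the question.

```python
import math


def attend_over_objects(question_vec, object_feats):
    """Pool per-object features with softmax dot-product attention.

    question_vec: list of floats, a (hypothetical) question embedding.
    object_feats: list of per-object feature vectors, one per detected
                  object, each the same length as question_vec.
    Returns (weights, pooled): one attention weight per object and the
    weighted sum of the object features.
    """
    # Relevance of each detected object to the question (dot product).
    scores = [
        sum(q * f for q, f in zip(question_vec, feat))
        for feat in object_feats
    ]

    # Numerically stable softmax over the objects.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of object features: the image representation that a
    # downstream answer classifier would consume.
    dim = len(object_feats[0])
    pooled = [
        sum(w * feat[d] for w, feat in zip(weights, object_feats))
        for d in range(dim)
    ]
    return weights, pooled


# Toy example: three detected objects with 2-dimensional features.
question = [1.0, 0.0]
objects = [[2.0, 0.0], [0.0, 2.0], [0.5, 0.5]]
weights, pooled = attend_over_objects(question, objects)
print(weights, pooled)
```

In this toy setup the first object aligns best with the question vector, so it receives the largest attention weight and dominates the pooled representation.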

Keywords

Visual Question Answering (VQA), object detection, image features, CNN, VQA v2 dataset.
