Application of Ensemble Learning in Visual Question Answering
Date
2023-06-10
Publisher
University of M'sila
Abstract
Visual Question Answering (VQA) is a field that combines two disciplines:
computer vision and natural language processing (NLP). Computer vision is used to process the
image or video, and NLP is used to process the natural-language question. A VQA system
automatically answers a question based on the context of an image or video. VQA is
one of the vision-language tasks that require a high level of both language and image
understanding, which makes it a difficult and complex problem. In this dissertation, we explore
and apply an ensemble of diverse VQA models, combined using a weighted-average technique,
to increase answer accuracy.
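
As a rough illustration of the weighted-average idea mentioned in the abstract, the sketch below combines the answer probabilities of several models into a single prediction. It is not the dissertation's implementation: the function names (weighted_average_ensemble, predict_answer), the three-answer vocabulary, the probabilities, and the weights are all hypothetical, and treating VQA as classification over a fixed answer set is an assumption.

```python
from typing import List, Sequence


def weighted_average_ensemble(
    model_probs: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Return the weighted average of per-model answer distributions.

    model_probs[m][a] is the probability model m assigns to answer a.
    Weights are normalised to sum to 1 before averaging.
    """
    if len(model_probs) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_answers = len(model_probs[0])
    if any(len(p) != n_answers for p in model_probs):
        raise ValueError("all models must score the same answer set")

    combined = [0.0] * n_answers
    for probs, w in zip(model_probs, weights):
        for i, p in enumerate(probs):
            combined[i] += (w / total) * p
    return combined


def predict_answer(
    model_probs: Sequence[Sequence[float]],
    weights: Sequence[float],
    answers: Sequence[str],
) -> str:
    """Pick the answer with the highest ensembled probability."""
    combined = weighted_average_ensemble(model_probs, weights)
    best = max(range(len(combined)), key=combined.__getitem__)
    return answers[best]


if __name__ == "__main__":
    # Hypothetical outputs of three VQA models for one (image, question) pair.
    answers = ["yes", "no", "two"]
    probs = [
        [0.6, 0.3, 0.1],  # model A
        [0.2, 0.7, 0.1],  # model B
        [0.5, 0.4, 0.1],  # model C
    ]
    # Hypothetical weights, e.g. chosen from validation accuracy.
    print(predict_answer(probs, [0.5, 0.3, 0.2], answers))  # yes
    # Equal weights reduce to a plain average and flip the decision here.
    print(predict_answer(probs, [1.0, 1.0, 1.0], answers))  # no
```

In this toy case the weighted ensemble answers "yes" while a plain average answers "no", which shows why the choice of weights matters. In practice the weights would typically be tuned on a held-out validation set.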
Keywords
Deep learning, CNN, LSTM, VQA, Ensemble learning, ResNet, Computer vision, Natural language processing