Application of ensemble Learning in visual question-answering

Khoudour, aya nor elhouda; Nasri, nesrine; Supervisor: Debbi, Hicham

Files

28_application of ensemble learning in vqa.pdf (3.16 MB)

Date

2023-06-10

Authors

Khoudour, aya nor elhouda

Nasri, nesrine

Supervisor: Debbi, Hicham

Publisher

University of M'sila

Abstract

Visual Question Answering (VQA) is a field that combines two disciplines: computer vision and natural language processing (NLP). Computer vision is used to process the image or video, and NLP is used to process the natural-language question. A VQA system automatically answers a question based on the context of images or videos. VQA is one of the vision-language tasks that require a high level of both language and image understanding, which makes it a difficult and complex problem. In this dissertation, we explore and apply an ensemble of diverse VQA models, combined using weighted-average techniques, to increase accuracy.
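
As an illustration of the weighted-average idea mentioned in the abstract, the following is a minimal sketch and not the thesis's actual implementation. It assumes each VQA model outputs a probability for every candidate answer; the function names, the example answers, the probabilities, and the weights are all hypothetical.

```python
def weighted_average_ensemble(model_probs, weights):
    """Combine per-model answer probabilities with a weighted average.

    model_probs: one probability list per model, all over the same answers.
    weights: one non-negative weight per model (hypothetical values here;
    in practice they might be derived from validation accuracy).
    """
    if len(model_probs) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    n_answers = len(model_probs[0])
    combined = [0.0] * n_answers
    for probs, w in zip(model_probs, weights):
        if len(probs) != n_answers:
            raise ValueError("all models must score the same answer set")
        for i, p in enumerate(probs):
            # Normalise the weights so the result is still a distribution.
            combined[i] += (w / total) * p
    return combined


def predict_answer(model_probs, weights, answers):
    """Return the answer with the highest ensembled probability."""
    combined = weighted_average_ensemble(model_probs, weights)
    best = max(range(len(combined)), key=combined.__getitem__)
    return answers[best]


# Hypothetical outputs of three models for one (image, question) pair.
answers = ["yes", "no", "2"]
model_probs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.5, 0.4, 0.1],
]
weights = [0.5, 0.2, 0.3]

combined = weighted_average_ensemble(model_probs, weights)
prediction = predict_answer(model_probs, weights, answers)
print(prediction)
```

With these assumed numbers, the second model alone would answer "no", but the ensemble follows the two more heavily weighted models. Normalising the weights inside the function means they do not have to sum to one beforehand.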

Keywords

Deep learning, CNN, LSTM, VQA, Ensemble learning, ResNet, Computer vision, Natural language processing.

URI

http://dspace.univ-msila.dz:8080//xmlui/handle/123456789/40704

Collections

Master Thesis
