Khoudour, aya nor elhouda
Nasri, nesrine
Supervisor: Debbi, Hicham
2023-07-23
2023-07-23
2023-06-10
http://dspace.univ-msila.dz:8080//xmlui/handle/123456789/40704

Visual Question Answering (VQA) is a field that combines two techniques: computer vision and natural language processing (NLP). Computer vision is used to process the image or video, and NLP is used to process the natural language. A VQA system automatically answers a question based on the content of images or videos. VQA is one of the vision-language tasks that require a high level of both language and image understanding, which makes it a difficult and complex problem. In this dissertation, we explore and apply an ensemble of diverse VQA models combined with weighted-average techniques to increase accuracy.

en
Deep learning, CNN, LSTM, VQA, Ensemble learning, ResNet, Computer vision, Natural language processing
Application of ensemble Learning in visual question-answering
Thesis
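
The abstract names weighted averaging as the way the ensemble combines its VQA models. A minimal sketch of that idea follows, assuming each model outputs a probability for every candidate answer. The function names, the example answers, and the weights are all hypothetical; the record does not state which models or weights the dissertation actually uses.

```python
from typing import Dict, List


def weighted_average_ensemble(
    model_probs: List[Dict[str, float]], weights: List[float]
) -> Dict[str, float]:
    """Combine per-model answer probabilities with a normalized weighted average."""
    if len(model_probs) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    combined: Dict[str, float] = {}
    for probs, weight in zip(model_probs, weights):
        for answer, p in probs.items():
            # Each model contributes in proportion to its normalized weight.
            combined[answer] = combined.get(answer, 0.0) + (weight / total) * p
    return combined


def predict(model_probs: List[Dict[str, float]], weights: List[float]) -> str:
    """Return the answer with the highest combined probability."""
    combined = weighted_average_ensemble(model_probs, weights)
    return max(combined, key=combined.get)


# Hypothetical outputs of three VQA models for one (image, question) pair.
example_probs = [
    {"cat": 0.6, "dog": 0.4},
    {"cat": 0.3, "dog": 0.7},
    {"cat": 0.2, "dog": 0.8},
]
# Assumed weights, e.g. chosen from validation accuracy; the first model dominates.
example_weights = [8.0, 1.0, 1.0]

print(predict(example_probs, example_weights))
```

With equal weights the two weaker models would outvote the first one, so the choice of weights can change the final answer. In practice the weights would typically be tuned on a held-out validation set.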