Benchmark - Multi3Generation

The ViLMA benchmark was produced through a collaboration between Multi3Generation members. The paper and related resources are available here:

Paper: https://openreview.net/forum?id=liuqDwmbQJ
Arxiv: https://arxiv.org/abs/2311.07022
Website: https://cyberiada.github.io/ViLMA
Codebase: https://github.com/ilkerkesen/ViLMA

Published as a conference paper at ICLR 2024

Reference: Ilker Kesen, Andrea Pedrotti, Mustafa Dogan, Michele Cafagna, Emre Can Acikgoz, Letitia Parcalabescu, Iacer Calixto, Anette Frank, Albert Gatt, Aykut Erdem, Erkut Erdem.
ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models.