WG 2 – Efficient Machine Learning algorithms, methods, and applications to language generation

WG2 focuses on the efficient ML machinery behind state-of-the-art Language Generation (LG) models, since many of the improvements now observed across LG tasks are due to the application of varied deep neural network architectures. This involves, among others:

  • multi-task learning
  • transfer learning
  • representation learning
  • structured prediction
  • generative models

Moreover, WG2 investigates integration strategies for multi-modal data, which is of critical importance for Multi3Generation.

This working group will work on:

  • a repository of open source software for processing language and visual content, including a directory of sources of materials or components
  • a survey on efficient use of ML for LG
  • the organization of training schools

If you would like to join this working group, please contact the WG leader and co-leader: Aykut Erdem and Elena Lloret.

Publications and preprints

Journal Articles

  • Ales Zamuda and Elena Lloret. Optimizing Data Driven Models for Summarization as Parallel Tasks, Journal of Computational Science, Vol. 42, April 2020.
  • Begum Citamak, Ozan Caglayan, Menekse Kuyu, Erkut Erdem, Aykut Erdem, Pranava Madhyastha, Lucia Specia. MSVD-Turkish: A Comprehensive Multimodal Video Dataset for Integrated Vision and Language Research in Turkish, Machine Translation, Vol. 35, 265-288.
  • Emre Boran, Aykut Erdem, Nazli Ikizler-Cinbis, Erkut Erdem, Pranava Madhyastha, Lucia Specia. Leveraging auxiliary image descriptions for dense video captioning, Pattern Recognition Letters, Vol. 146, June 2021.
  • Erkut Erdem, Menekse Kuyu, Semih Yagcioglu, Anette Frank, Letitia Parcalabescu, Andrii Babii, Oleksii Turuta, Aykut Erdem, Iacer Calixto, Barbara Plank, Elena Lloret, Elena-Simona Apostol, Ciprian-Octavian Truică, Branislava Šandrih, Sanda Martinčić-Ipšić, Gábor Berend, Albert Gatt, Grazina Korvel, “Neural Natural Language Generation: A Survey on Multilinguality, Multimodality, Controllability and Learning”, Journal of Artificial Intelligence Research, Vol. 73, pp. 1131-1207, April 2022.

Conference Papers

  • M. Sercan Amac, Semih Yagcioglu, Aykut Erdem, and Erkut Erdem. “Procedural Reasoning Networks for Understanding Multimodal Procedures”, In 23rd Conference on Computational Natural Language Learning (CoNLL).
  • J. Phang, I. Calixto, PM. Htut, Y. Pruksachatkun, H. Liu, C. Vania, K. Kann, SR. Bowman (2020). English intermediate-task training improves zero-shot cross-lingual transfer too. In AACL 2020.
  • V. Milewski, MF. Moens and I. Calixto (2020). Are scene graphs good enough to improve image captioning? In: AACL 2020.
  • Ozan Caglayan, Menekse Kuyu, Mustafa Sercan Amac, Pranava Madhyastha, Erkut Erdem, Aykut Erdem and Lucia Specia. Cross-lingual Visual Pre-training for Multimodal Machine Translation, Conference of the European Chapter of the Association for Computational Linguistics (EACL 2021), Short Papers, 2021.
  • Rob van der Goot, Ahmet Üstün, Alan Ramponi, Ibrahim Sharaf and Barbara Plank. Massive Choice, Ample Tasks (MaChAmp): A Toolkit for Multi-task Learning in NLP. In EACL 2021. Received the EACL 2021 outstanding paper award (demo track).
  • Ilker Kesen, Ozan Arkan Can, Erkut Erdem, Aykut Erdem, Deniz Yuret, “Modulating Bottom-Up and Top-Down Visual Processing via Language-Conditional Filters”, 5th Multimodal Learning and Applications Workshop (MULA 2022) – in conjunction with CVPR 2022 (Best Paper Award), New Orleans, USA, June 2022.
  • Anabela Barreiro, José GC de Souza, Albert Gatt, Mehul Bhatt, Elena Lloret, Aykut Erdem, Dimitra Gkatzia, Helena Moniz, Irene Russo, Fabio Kepler, Iacer Calixto, Marcin Paprzycki, François Portet, Isabelle Augenstein, Mirela Alhasani, “Multi3Generation: Multitask, Multilingual, Multimodal Language Generation”, 23rd Annual Conference of the European Association for Machine Translation (EAMT 2022), Ghent, Belgium, June 2022. 
  • Tayfun Ates, Muhammed Samil Atesoglu, Cagatay Yigit, Ilker Kesen, Mert Kobas, Erkut Erdem, Aykut Erdem, Tilbe Goksun, Deniz Yuret, “CRAFT: A Benchmark for Causal Reasoning About Forces and inTeractions”, Findings of the Association for Computational Linguistics (ACL 2022), May 2022. 
  • Parcalabescu, L., Cafagna, M., Muradjan, L., Frank, A., Calixto, I. and Gatt, A., 2022, May. VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 8253-8280). 
  • Patrick Fernandes, António Farinhas, Ricardo Rei, José G. C. de Souza, Perez Ogayo, Graham Neubig, Andre Martins. Quality-Aware Decoding for Neural Machine Translation. Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL), July 2022.
  • Ricardo Rei, Ana C Farinha, José G.C. de Souza, Pedro G. Ramos, André F.T. Martins, Luisa Coheur, Alon Lavie. Searching for COMETINHO: The Little Metric That Could. In Proceedings of the 23rd Annual Conference of the European Association for Machine Translation, pages 61–70, Ghent, Belgium. European Association for Machine Translation.


Dataset name and brief description, including purpose | Authors/creators | Link
MSVD-Turkish: The first large-scale video captioning dataset for the Turkish language, obtained by carefully translating the English descriptions of the videos in the MSVD (Microsoft Research Video Description Corpus) dataset into Turkish. | Begum Citamak, Ozan Caglayan, Menekse Kuyu, Erkut Erdem, Aykut Erdem, Pranava Madhyastha, and Lucia Specia | MSVD-Turkish


Name and/or brief description | Authors/creators | Link
Procedural Reasoning Networks | Amac, M. Sercan, Yagcioglu, Semih, Erdem, Aykut, and Erdem, Erkut |

Open source repository

Neural Natural Language Generation (GitHub)

