
Article Information

  • Title: Adaptive Fusion Techniques for Multimodal Data
  • Authors: Gaurav Sahu; Olga Vechtomova
  • Journal: Conference of the European Chapter of the Association for Computational Linguistics (EACL)
  • Year: 2021
  • Volume: 2021
  • Pages: 3156-3166
  • DOI: 10.18653/v1/2021.eacl-main.275
  • Language: English
  • Publisher: ACL Anthology
  • Abstract: Effective fusion of data from multiple modalities, such as video, speech, and text, is challenging due to the heterogeneous nature of multimodal data. In this paper, we propose adaptive fusion techniques that aim to model context from different modalities effectively. Instead of defining a deterministic fusion operation, such as concatenation, for the network, we let the network decide “how” to combine a given set of multimodal features more effectively. We propose two networks: 1) Auto-Fusion, which learns to compress information from different modalities while preserving the context, and 2) GAN-Fusion, which regularizes the learned latent space given context from complementing modalities. A quantitative evaluation on the tasks of multimodal machine translation and emotion recognition suggests that our lightweight, adaptive networks can better model context from other modalities than existing methods, many of which employ massive transformer-based networks.
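
The abstract contrasts a fixed fusion operation (concatenation) with a learned one that compresses multimodal features while preserving context. One possible reading of that idea is an autoencoder-style bottleneck: concatenate the per-modality features, project them to a smaller fused vector, and penalize the error of reconstructing the concatenation from it. The sketch below shows only the forward pass under that assumption. The function names, feature sizes, tanh activation, and mean-squared-error loss are all hypothetical choices for illustration, not details taken from this record, and GAN-Fusion is not covered.

```python
import math
import random


def make_linear(in_dim, out_dim, rng):
    """Return (weights, bias) for a dense layer with small random weights."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def apply_linear(layer, vector):
    """Compute W @ vector + b for a single input vector."""
    weights, bias = layer
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def auto_fusion_forward(features, encoder, decoder):
    """Hypothetical forward pass: compress concatenated features, then reconstruct.

    Returns the fused vector and the mean squared reconstruction error,
    which a training loop would minimize alongside the task loss.
    """
    # Baseline fusion: plain concatenation of all modality features.
    concat = [x for feat in features for x in feat]
    # Learned fusion: project to a smaller latent vector (assumed tanh activation).
    fused = [math.tanh(v) for v in apply_linear(encoder, concat)]
    # Reconstruct the concatenation so the latent vector is pushed to keep context.
    recon = apply_linear(decoder, fused)
    loss = sum((r - c) ** 2 for r, c in zip(recon, concat)) / len(concat)
    return fused, loss


rng = random.Random(0)
# Assumed feature sizes for one example: text=16, speech=8, video=12.
text = [rng.gauss(0.0, 1.0) for _ in range(16)]
speech = [rng.gauss(0.0, 1.0) for _ in range(8)]
video = [rng.gauss(0.0, 1.0) for _ in range(12)]

total_dim = len(text) + len(speech) + len(video)
latent_dim = 10  # assumed bottleneck size
encoder = make_linear(total_dim, latent_dim, rng)
decoder = make_linear(latent_dim, total_dim, rng)

fused, loss = auto_fusion_forward([text, speech, video], encoder, decoder)
print(len(fused), loss)
```

In this sketch the downstream task would consume `fused` (10 values) instead of the 36-value concatenation, and the encoder and decoder weights would be trained rather than left at their random initial values.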