Basic Article Information

Title: Multi-Layer Contextual Passage Term Embedding for Ad-Hoc Retrieval
Authors: Weihong Cai; Zijun Hu; Yalan Luo et al.
Journal: Information
Electronic ISSN: 2078-2489
Publication year: 2022
Volume: 13
Issue: 5
Page: 221
DOI: 10.3390/info13050221
Language: English
Publisher: MDPI Publishing
Abstract: Pre-trained language models such as Bidirectional Encoder Representations from Transformers (BERT) are becoming a basic building block in Information Retrieval tasks. Nevertheless, applying BERT to the query-document matching task has several limitations: (1) relevance assessments are made at the document level, and documents often contain more tokens than BERT's maximum input length; (2) applying BERT to long documents consumes a great deal of memory and run time, owing to the computational cost of the interactions between tokens. This paper explores a novel multi-layer contextual passage architecture that leverages text summarization extraction to generate passage-level evidence for the pre-selected document passages, thus bringing new possibilities for the long-document relevance task. Experiments were conducted on two standard ad-hoc retrieval collections with different characteristics: the Text Retrieval Conference (TREC) 2004 Robust Track collection (Robust04) and ClueWeb09. Experimental results show that our approach significantly outperforms the strong baselines and that, even compared with other BERT-based models, the precision of our method is on par with state-of-the-art neural ranking models.