Journal: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Electronic ISSN: 1817-3195
Year: 2019
Volume: 97
Issue: 23
Pages: 3536-3544
Publisher: Journal of Theoretical and Applied
Abstract: Measuring semantic relatedness between sentences has long been a major topic of discussion among NLP researchers. Semantic relatedness measures are key components of text intelligence applications such as paraphrase detection, short answer grading, and information retrieval. This work highlights the effect of combining multiple similarity features by presenting a hybrid multi-layer system in which each layer outputs a different, independent similarity feature; the features are then merged by a simple machine learning model to predict a text relatedness score. The system layers cover string-oriented, corpus-oriented, knowledge-oriented, and sentence-embedding similarity measures. The proposed model was tested on the SICK dataset, which contains 9840 English sentence pairs. Experiments confirmed that using multiple similarity features is significantly better than applying each measure separately.
Keywords: Semantic Relatedness; Sentence Embeddings; Text Similarity; Skip-Thought Vector; InferSent
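
The layered design described in the abstract (several independent similarity features merged by a simple learned model) can be illustrated with a minimal sketch. This is not the authors' implementation: the two string-oriented features, the helper names (`jaccard`, `char_ratio`, `features`, `fit_linear`, `predict`), the choice of linear regression, and the toy sentence pairs with their scores are all assumptions made for illustration. Corpus-oriented, knowledge-oriented, and sentence-embedding layers would each contribute further entries to the feature vector.

```python
from difflib import SequenceMatcher


def jaccard(a, b):
    """Word-overlap (string-oriented) similarity in [0, 1]."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def char_ratio(a, b):
    """Character-level similarity from difflib, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def features(a, b):
    # Each "layer" contributes one independent similarity feature.
    # Only two string-oriented layers are sketched here (an assumption).
    return [jaccard(a, b), char_ratio(a, b)]


def fit_linear(X, y, lr=0.1, epochs=2000):
    """Fit y ~ w.x + b by batch gradient descent on squared error."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) + b - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def predict(w, b, x):
    """Merge the feature vector into a single relatedness score."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


# Toy pairs with made-up relatedness scores (illustrative only, not SICK data).
pairs = [
    ("A man is playing a guitar", "A man is playing a guitar", 5.0),
    ("A man is playing a guitar", "A man is playing an instrument", 4.0),
    ("A dog runs in the park", "The stock market fell today", 1.0),
    ("A woman is cooking", "Two kids are swimming", 1.0),
]
X = [features(a, b) for a, b, _ in pairs]
y = [score for _, _, score in pairs]
w, b = fit_linear(X, y)

for (s1, s2, gold), x in zip(pairs, X):
    print(f"gold={gold:.1f} predicted={predict(w, b, x):.2f} | {s1} / {s2}")
```

Keeping each layer behind a function that returns a single number means a new similarity measure can be added by appending one entry in `features` without changing the merging model.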