Journal: International Journal of Electrical and Computer Engineering
Electronic ISSN: 2088-8708
Publication year: 2020
Volume: 10
Issue: 3
Pages: 2934-2943
DOI: 10.11591/ijece.v10i3.pp2934-2943
Publisher: Institute of Advanced Engineering and Science (IAES)
Abstract: Explicit Semantic Analysis (ESA) is an approach for measuring the semantic relatedness between terms or documents based on their similarity to the documents of a reference corpus, usually Wikipedia. ESA has received tremendous attention in natural language processing (NLP) and information retrieval. However, ESA relies on a huge Wikipedia index matrix in its interpretation step: a large matrix is multiplied by a term vector to produce a high-dimensional concept vector. Consequently, both the interpretation and the similarity steps of ESA are computationally expensive, and much of the time is spent on unnecessary operations. This paper proposes an enhancement to ESA, called optimize-ESA, that reduces the dimension at the interpretation stage by computing the semantic similarity within a specific domain. The experimental results show clearly that the proposed method correlates much better with human judgement than the full-version ESA approach.
Keywords: Explicit semantic analysis (ESA); Natural language processing (NLP); Semantic relatedness; Semantic similarity
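
The two ESA steps named in the abstract (interpretation as a term-by-concept product, then a similarity between concept vectors) can be illustrated with a small sketch. This is not the paper's implementation: the index values, the concept names, and the helpers `interpret`, `cosine`, and `relatedness` are all hypothetical, and restricting the concept list to a domain is only an assumed reading of "reducing the dimension in a specific domain".

```python
import math

# Toy term-by-concept weights (hypothetical values standing in for a
# TF-IDF index built over Wikipedia articles).
INDEX = {
    "bank":  {"Finance": 0.9, "River": 0.4, "Football": 0.0},
    "money": {"Finance": 0.8, "River": 0.0, "Football": 0.1},
    "water": {"Finance": 0.0, "River": 0.9, "Football": 0.0},
}
ALL_CONCEPTS = ["Finance", "River", "Football"]


def interpret(terms, concepts=ALL_CONCEPTS):
    """Interpretation step: sum the index rows of the given terms,
    keeping only the columns listed in `concepts`."""
    vec = [0.0] * len(concepts)
    for term in terms:
        row = INDEX.get(term, {})  # unknown terms contribute nothing
        for i, concept in enumerate(concepts):
            vec[i] += row.get(concept, 0.0)
    return vec


def cosine(u, v):
    """Similarity step: cosine of the angle between two concept vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def relatedness(terms_a, terms_b, concepts=ALL_CONCEPTS):
    """Relatedness of two term bags over the chosen concept space."""
    return cosine(interpret(terms_a, concepts), interpret(terms_b, concepts))


# Full concept space versus an assumed domain-restricted space.
full = relatedness(["bank"], ["money"])
finance_only = relatedness(["bank"], ["money"], concepts=["Finance"])
```

In this toy setting the restricted call touches one column instead of three; with a Wikipedia-scale index the same restriction would shrink both the interpretation product and the vectors compared in the similarity step, which is the cost the abstract identifies.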