Journal: APSIPA Transactions on Signal and Information Processing
Print ISSN: 2048-7703
Electronic ISSN: 2048-7703
Publication year: 2019
Volume: 8
Pages: 1-14
DOI: 10.1017/ATSIP.2019.12
Publisher: Cambridge University Press
Abstract: This work presents an extensive evaluation of a large number of word embedding models for language processing applications. First, we introduce popular word embedding models and discuss the desired properties of word models and of evaluation methods (or evaluators). Then, we categorize evaluators into two types: intrinsic and extrinsic. Intrinsic evaluators test the quality of a representation independently of specific natural language processing tasks, while extrinsic evaluators use word embeddings as input features to a downstream task and measure changes in performance metrics specific to that task. We report experimental results of intrinsic and extrinsic evaluators on six word embedding models. The results show that different evaluators focus on different aspects of word models, and that some are more correlated with natural language processing tasks than others. Finally, we adopt correlation analysis to study the performance consistency of extrinsic and intrinsic evaluators.
Keywords: Natural language processing; Word embedding; Word embedding evaluation