Publisher: The Japanese Society for Artificial Intelligence
Abstract: Bootstrapping has a tendency, called semantic drift, to select instances unrelated to the seed instances as the iteration proceeds. We demonstrate that the semantic drift of Espresso-style bootstrapping has the same root as the topic drift of Kleinberg's HITS, using a simplified graph-based reformulation of bootstrapping. We confirm that two graph-based algorithms, the von Neumann kernels and the regularized Laplacian, can reduce the effect of semantic drift in the task of word sense disambiguation (WSD) on the Senseval-3 English Lexical Sample Task. The proposed algorithms outperform Espresso and previous graph-based WSD methods, even though they have fewer parameters and are easy to calibrate.
Keywords: Bootstrapping; Link Analysis; HITS; Regularized Laplacian; von Neumann Kernel; Word Sense Disambiguation; Semi-supervised Learning