Publisher: The Japanese Society for Artificial Intelligence
Abstract: Graph construction is an important step in graph-based semi-supervised classification. While k-nearest neighbor graphs have been the de facto standard for graph construction, this paper advocates using the less well-known mutual k-nearest neighbor graphs for high-dimensional natural language data. To evaluate the quality of the graphs independently of classification algorithms, we measure their assortativity. In addition, to compare the performance of the two graph construction methods, we run semi-supervised classification methods on both graphs in word sense disambiguation and document classification tasks. The experimental results show that mutual k-nearest neighbor graphs, when combined with maximum spanning trees, consistently outperform k-nearest neighbor graphs. We attribute the better performance of the mutual k-nearest neighbor graph to its greater resistance to forming hub vertices. The mutual k-nearest neighbor graphs also perform as well as or better than the state-of-the-art b-matching graph construction, despite their lower computational complexity.
Keywords: semi-supervised learning; classification; graph construction; document classification; word sense disambiguation
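
The difference between the two graph constructions named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the dense similarity matrix `sim`, and the toy data are all assumptions made for illustration, and the maximum spanning tree step mentioned in the abstract is not shown. A k-nearest neighbor graph keeps an edge when either endpoint lists the other among its k nearest neighbors, whereas a mutual k-nearest neighbor graph keeps it only when both do, so no vertex can have more than k edges.

```python
import numpy as np

def directed_knn(sim, k):
    """Boolean matrix D with D[i, j] True when j is among the k most similar points to i."""
    s = sim.astype(float)           # astype returns a copy, so sim is not modified
    np.fill_diagonal(s, -np.inf)    # a point is never its own neighbor
    nearest = np.argsort(-s, axis=1, kind="stable")[:, :k]
    n = s.shape[0]
    d = np.zeros((n, n), dtype=bool)
    d[np.repeat(np.arange(n), k), nearest.ravel()] = True
    return d

def knn_graph(sim, k):
    """Undirected k-NN graph: edge if either endpoint selects the other."""
    d = directed_knn(sim, k)
    return d | d.T

def mutual_knn_graph(sim, k):
    """Undirected mutual k-NN graph: edge only if both endpoints select each other."""
    d = directed_knn(sim, k)
    return d & d.T

# Hypothetical toy data: four points on a line, similarity = negative distance.
points = np.array([0.0, 1.0, 2.5, 10.0])
sim = -np.abs(points[:, None] - points[None, :])

print(knn_graph(sim, 1).sum(axis=1))         # vertex degrees in the k-NN graph
print(mutual_knn_graph(sim, 1).sum(axis=1))  # vertex degrees in the mutual k-NN graph
```

In the mutual graph every vertex degree is bounded by k, which is one way to read the abstract's remark about hub vertices. The sketch also shows that the mutual graph can leave vertices isolated, which is presumably why the abstract combines it with maximum spanning trees.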