摘要:AbstractWord intuition is speakers’ intuitive knowledge on wordhood. Collective word intuition is the word intuition of the whole language community. Given this definition, the optimal word segmentation result in Chinese NLP should reflect collective word intuition. It is also believed that an ideal definition of Chinese word should accord with the collective word intuition of Chinese speakers. To test the validity and feasibility of modeling collective word intuition, it is important to know to what extent Chinese speakers agree with each other on what is a word. In this study, we measured word intuition agreement using Mechanical Turk-based Chinese word segmentation experiment. Three metrics were used: proportionate agreement, Cohen’s kappa, and Fleiss’ kappa. The results show that Chinese speakers agree with each other almost perfectly on what is a word. And we found no evidence to support an effect of semantic transparency on word intuition agreement. Such high word intuition agreement among Chinese speakers supports the psychological reality of Chinese word and also suggests that that it is quite feasible to formulate a definition of Chinese word by modeling the collective word intuition of Chinese speakers.
关键词:Word intuition;Word segmentation;Word intuition agreement;Semantic transparency;Mechanical Turk