Article information

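Title: 表層語形から品詞はどれぐらい正確に予測できるか？ ―英語形態論とチェコ語形態論の比較から― (How accurately can parts of speech be predicted from surface word forms? A comparison of English and Czech morphology)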
Author: 黒田航
Journal: 認知科学
Print ISSN: 1341-7924
Online ISSN: 1881-5995
Year: 2015
Volume: 22
Issue: 4
Pages: 621-637
DOI: 10.11225/jcss.22.621
Publisher: Japanese Cognitive Science Society
Abstract:
This study evaluated how reliably the major parts of speech (PoS), i.e., Noun, Verb, Adjective, and Adverb, can be predicted from surface word forms (more concretely, from the ending character n-grams, n = 2, 3, 4, of surface word forms) in English and Czech, and compared the results for the two languages. It had two objectives. First, it sought to establish the hypothesis that the degree to which PoS can be reliably predicted from surface word forms varies drastically among languages, although no effective measure of this predictability has been implemented yet. (If a language has a high degree of predictability of PoS from surface word forms, we can say it has high form-function transparency in PoS recognition.) Second, it sought to show that English is a language whose vocabulary is relatively hard to acquire, on the admittedly unconfirmed hypothesis that good predictability of PoS from word forms facilitates vocabulary acquisition, other things being equal. Results of Formal Concept Analysis (Ganter and Wille 1999) applied to the English and Czech data suggest that the ending character n-grams of English words were noticeably less predictive of the major PoS (N, V, Adj, and Adv) than those of Czech words, because these endings are highly confusing in English. This means that vocabulary acquisition can be significantly harder in English than in Czech, other things being equal. The results also suggest that English is one of those languages in which effective PoS recognition requires a multi-word processing strategy.
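Keywords: Formal Concept Analysis; form-function transparency; PoS; automatic PoS recognition; comparative morphology; character n-grams; efficiency of vocabulary acquisition; language distance; learnability of languages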
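
The abstract's notion of ending character n-grams (n = 2, 3, 4) can be illustrated with a minimal Python sketch. It is not the author's method: the paper uses Formal Concept Analysis, whereas this sketch substitutes a much simpler majority-class share per ending as a rough stand-in for predictability. The function names `ending_ngrams` and `suffix_purity` and the toy lexicon are hypothetical and do not come from the paper.

```python
from collections import Counter, defaultdict


def ending_ngrams(word, sizes=(2, 3, 4)):
    """Return the word-final character n-grams of the given sizes."""
    # Words shorter than n have no ending n-gram of that size.
    return [word[-n:] for n in sizes if len(word) >= n]


def suffix_purity(lexicon, sizes=(2, 3, 4)):
    """Map each ending n-gram to (most frequent PoS, share of that PoS).

    A share near 1.0 means the ending almost always signals one PoS;
    a low share means the ending is ambiguous between several PoS.
    This is an assumed, simplified measure, not the paper's FCA analysis.
    """
    counts = defaultdict(Counter)
    for word, pos in lexicon:
        for gram in ending_ngrams(word, sizes):
            counts[gram][pos] += 1
    result = {}
    for gram, pos_counts in counts.items():
        pos, n = pos_counts.most_common(1)[0]
        result[gram] = (pos, n / sum(pos_counts.values()))
    return result


# Hypothetical toy lexicon for illustration only; not data from the paper.
TOY_LEXICON = [
    ("quickly", "Adv"),
    ("slowly", "Adv"),
    ("friendly", "Adj"),
    ("walking", "V"),
    ("building", "N"),
]

purity = suffix_purity(TOY_LEXICON)
```

In this toy lexicon the ending `ly` is shared by two adverbs and one adjective, and `ing` by one verb and one noun, which is the kind of overlap the abstract describes for English.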