Article Information

Title: Towards the sense disambiguation of Afan Oromo words using hybrid approach (unsupervised machine learning and rule based)
Authors: Workineh Tesema; Debela Tesfaye; Teferi Kibebew et al.
Journal: Ethiopian Journal of Education and Sciences
Print ISSN: 1998-8907
Year of publication: 2016
Volume: 12
Issue: 1
Pages: 61-77
Publisher: African Journals Online
Abstract:
This study investigates Word Sense Disambiguation (WSD) for Afan Oromo. WSD is a Natural Language Processing task whose goal is to identify the sense in which an ambiguous word is used in a particular context. A word may have multiple senses, and the problem is to determine which sense is appropriate in a given context. The study presents a WSD strategy that combines an unsupervised approach, which exploits sense information in a corpus, with manually crafted rules. The idea behind the approach is to overcome the training-data bottleneck. The context of a given word is captured using term co-occurrences within a defined window of words. Similar contexts of a given sense of an ambiguous word are clustered using hierarchical and partitional clustering, with each cluster representing a unique sense. The number of senses per ambiguous word ranges from two to five. The optimal window sizes for extracting semantic context are 1 and 2 words to the left and right of the ambiguous word. The results show that WSD achieves an accuracy of 56.2% with unsupervised machine learning and 65.5% with the hybrid approach. This indicates that integrating deep linguistic knowledge with machine learning improves disambiguation accuracy. The result is encouraging given the approach's low resource requirements. However, further experiments that extend this work with different approaches are needed to achieve better performance.
Keywords: Afan Oromo; Ambiguous Word; Hybrid; Rule Based; Word Sense Disambiguation
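
The context-extraction and clustering steps summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not specify the features, the similarity measure, the exact clustering algorithms, or the hand-written rules. The function names (`context_window`, `cosine`, `cluster_contexts`), the use of cosine similarity with average-link agglomerative clustering, and the toy English sentences are all assumptions made for illustration; no data from the study is used.

```python
from collections import Counter
from math import sqrt


def context_window(tokens, index, size):
    """Return the words within `size` positions left and right of tokens[index]."""
    left = tokens[max(0, index - size):index]
    right = tokens[index + 1:index + 1 + size]
    return left + right


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster_contexts(contexts, k):
    """Average-link agglomerative clustering of contexts into k clusters.

    Returns a list of clusters, each a list of indices into `contexts`.
    """
    vectors = [Counter(c) for c in contexts]
    clusters = [[i] for i in range(len(vectors))]

    def similarity(c1, c2):
        pairs = [(i, j) for i in c1 for j in c2]
        return sum(cosine(vectors[i], vectors[j]) for i, j in pairs) / len(pairs)

    while len(clusters) > k:
        # Find and merge the two most similar clusters.
        a, b = max(
            ((x, y) for x in range(len(clusters)) for y in range(x + 1, len(clusters))),
            key=lambda p: similarity(clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Toy English placeholder sentences (hypothetical, not from the study's corpus).
target = "bank"
sentences = [
    "she sat on the river bank watching water".split(),
    "fish swim near the river bank water".split(),
    "he opened a bank account money today".split(),
    "her bank account holds money".split(),
]

# Window size 2: two words to the left and right of the ambiguous word.
contexts = [context_window(s, s.index(target), 2) for s in sentences]

# Assume two senses for the toy word; each resulting cluster stands for one sense.
clusters = cluster_contexts(contexts, k=2)
print(clusters)
```

In this sketch each cluster of context indices plays the role of one induced sense. A hybrid system of the kind the abstract describes would additionally apply hand-written rules, which are not modeled here.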