期刊名称:International Journal of Computer Science Issues
印刷版ISSN:1694-0784
电子版ISSN:1694-0814
出版年度:2012
卷号:9
期号:2
出版社:IJCSI Press
摘要:This paper presents an approach for automatically annotating document segments within information rich texts using a domain ontology. The work exploits the logical structure of input documents in order to achieve its task. The underlying assumption behind this work is that segments in such documents embody self contained informative units. Another assumption is that segment headings coupled with a documents hierarchical structure offer informal representations of segment content; and that matching segment headings to concepts in an ontology/thesaurus can result in the creation of formal labels/meta-data for these segments. A series of experiments was carried out using the presented approach on a set of Arabic agricultural extension documents. The results of carrying out these experiments demonstrate that the proposed approach is capable of automatically annotating segments with concepts that describe a segments content with a high degree of accuracy.