Article Information

Title: Document Indexing with a Concept Hierarchy
Authors: Gelbukh, Alexander; Sidorov, Grigori; Guzmán-Arenas, Adolfo; et al.
Journal: Computación y Sistemas
Print ISSN: 1405-5546
Publication year: 2005
Volume: 8
Issue: 4
Pages: 281-292
Publisher: Universidad Nacional Autónoma de México, Servicios de Cómputo Académico
Abstract:
Given a large hierarchical concept dictionary (a thesaurus or ontology), the task of selecting the concepts that describe the contents of a given document is considered. A statistical method of document indexing driven by such a dictionary is proposed. The method is insensitive to inaccuracies in the dictionary, which allows for semi-automatic translation of the hierarchy into different languages. The problem of handling non-terminal and especially top-level nodes in the hierarchy is discussed. Common-sense-compliant methods of automatically assigning weights to the nodes and links in the hierarchy are presented. The application of the method in the Classifier system is discussed.
Keywords: Document Characterization; Document Comparison; Ontology; Statistical Methods.