Basic Article Information

Title: Survey of Hyponym Relation Extraction from Hyperlinks Using Motif Patterns with Feature Combination Extraction Model
Authors: G. Keerthiga; A. Gokilavani
Journal: International Journal of Innovative Research in Computer and Communication Engineering
Print ISSN: 2320-9798
Online ISSN: 2320-9801
Publication year: 2019
Volume: 7
Issue: 2
Pages: 1054-1060
DOI: 10.15680/IJIRCCE.2019.0702092
Publisher: S&S Publications
Abstract: This paper presents a method for measuring the semantic similarity between concepts in Knowledge Graphs (KGs) such as WordNet and DBpedia. Previous work on semantic similarity has focused either on the structure of the semantic network between concepts (e.g., path length and depth) or only on the Information Content (IC) of concepts. We propose a semantic similarity method, named wpath, that combines these two approaches by using IC to weight the shortest path length between concepts. Conventional corpus-based IC is computed from the distribution of concepts over a textual corpus, which requires preparing a domain corpus containing annotated concepts and has a high computational cost. Because instances are already extracted from textual corpora and annotated with concepts in KGs, graph-based IC is proposed to compute IC from the distribution of concepts over instances. Measuring the similarity between documents is an important operation in text processing, and this project proposes a new similarity measure. Discovering hyponym relations among domain-specific terms is a fundamental task in taxonomy learning and knowledge acquisition. However, the great diversity of domain corpora and the lack of labeled training sets make this task very challenging for conventional methods based on text content. In this study, the hyperlink structure of Wikipedia article pages was found to contain recurring network motifs that indicate the probability of a hyperlink being a hyponym hyperlink. Hence, a novel hyponym relation extraction approach based on the network motifs of Wikipedia hyperlinks is proposed. This approach automatically constructs motif-based features from the hyperlink structure of a domain; every hyperlink is mapped to a 13-dimensional feature vector based on the 13 types of three-node motifs.
Keywords: Motif Relationship; Hyponym Relationship; Semantic Similarity
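
Illustrative sketch (not from the paper): the abstract states that every hyperlink is mapped to a 13-dimensional vector based on the 13 types of three-node motifs, but it does not give the exact feature definition. The Python sketch below shows one plausible reading, under the assumption that each component counts the induced three-node subgraphs of a given motif type that contain the hyperlink. The names `edge_motif_features`, `MOTIFS`, and `MOTIF_INDEX` are hypothetical, and the motif ordering is an arbitrary sorted order, not the numbering used by the authors.

```python
from itertools import permutations, product

# The six possible directed edges among three nodes labeled 0, 1, 2.
PAIRS = [(0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1)]


def canonical(edges):
    """Canonical form of a 3-node digraph: smallest edge-bit tuple over all relabelings."""
    best = None
    for perm in permutations(range(3)):
        relabeled = {(perm[a], perm[b]) for a, b in edges}
        code = tuple(int(p in relabeled) for p in PAIRS)
        if best is None or code < best:
            best = code
    return best


def is_connected(edges):
    """Three nodes are weakly connected iff at least two unordered pairs are linked."""
    return len({frozenset(e) for e in edges}) >= 2


def motif_classes():
    """Enumerate all connected 3-node digraphs up to isomorphism (13 classes)."""
    classes = set()
    for bits in product([0, 1], repeat=len(PAIRS)):
        edges = {p for p, b in zip(PAIRS, bits) if b}
        if is_connected(edges):
            classes.add(canonical(edges))
    return sorted(classes)


MOTIFS = motif_classes()
MOTIF_INDEX = {m: i for i, m in enumerate(MOTIFS)}


def edge_motif_features(links):
    """Map each (source, target) hyperlink to a vector of per-motif counts.

    Assumption: component i counts the induced 3-node subgraphs of motif
    type i that contain the hyperlink; the paper may define it differently.
    """
    links = {(u, v) for u, v in links if u != v}  # drop self-links
    nodes = {n for e in links for n in e}
    features = {}
    for u, v in links:
        vec = [0] * len(MOTIFS)
        for w in nodes - {u, v}:
            trio = (u, v, w)
            # Induced subgraph on {u, v, w}, relabeled to 0, 1, 2.
            induced = {(a, b) for a in range(3) for b in range(3)
                       if a != b and (trio[a], trio[b]) in links}
            vec[MOTIF_INDEX[canonical(induced)]] += 1
        features[(u, v)] = vec
    return features


if __name__ == "__main__":
    # Toy hyperlink graph with made-up page names.
    toy_links = [("Algorithm", "Sorting"), ("Sorting", "Quicksort"),
                 ("Algorithm", "Quicksort")]
    for link, vec in edge_motif_features(toy_links).items():
        print(link, vec)
```

Vectors of this kind could then be passed to any standard classifier to decide whether a hyperlink expresses a hyponym relation; which classifier and training data the authors use is not stated in the abstract.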