
Basic article information

  • Title: Extraction of Turkish Semantic Relation Pairs using Corpus Analysis Tool
  • Authors: Gurkan Sahin ; Banu Diri ; Tugba Yildiz
  • Journal: International Journal of Computer and Information Technology
  • Print ISSN: 2279-0764
  • Year of publication: 2016
  • Volume: 5
  • Issue: 6
  • Pages: 491-499
  • Publisher: International Journal of Computer and Information Technology
  • Abstract: In this study, we developed a Turkish semantic relation extraction tool. The tool takes an unparsed corpus as input and, for given target words, outputs hyponym, meronym and antonym words with their reliability scores. The corpus is parsed by the Turkish morphological parser Zemberek, and Word2Vec word vectors are created for each unique word in the corpus. To extract relation patterns, hyponymy, holonymy and antonymy pairs, called initial seeds, are prepared; all possible relation patterns are then extracted using these seeds. The reliability of the patterns is calculated using corpus statistics and various association metrics. Reliable patterns are selected to extract new semantic pairs from the parsed corpus. To determine the correctness of the extracted pairs, total pattern frequency, different pattern frequency and Word2Vec vector cosine similarity are used. In our experiments, we obtained average precisions of 83%, 63%-86%, and 85% for the hyponymy, holonymy and antonymy relations, respectively.
  • Keywords: hyponymy; holonymy; antonymy; Word2Vec; semantic relation; pattern-based approach
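
The abstract names three signals for judging an extracted pair: total pattern frequency, different pattern frequency, and the cosine similarity of the two words' Word2Vec vectors. The following is a minimal sketch of how those three values could be computed, not the authors' implementation. The `(pattern, word_a, word_b)` match format, the pattern identifiers `P1` and `P2`, the toy words, and the two-dimensional vectors are all hypothetical, and how the paper combines the signals into a final score is not stated in this record.

```python
import math
from collections import defaultdict


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def pair_features(matches, vectors):
    """Compute the three signals for every candidate pair.

    matches: iterable of (pattern, word_a, word_b) tuples, one per
             pattern occurrence in the corpus (hypothetical format).
    vectors: dict mapping each word to its embedding (a list of floats).
    """
    total = defaultdict(int)      # occurrences across all patterns
    patterns = defaultdict(set)   # distinct patterns seen per pair
    for pattern, a, b in matches:
        total[(a, b)] += 1
        patterns[(a, b)].add(pattern)

    features = {}
    for (a, b), count in total.items():
        features[(a, b)] = {
            "total_pattern_freq": count,
            "different_pattern_freq": len(patterns[(a, b)]),
            "cosine": cosine_similarity(vectors[a], vectors[b]),
        }
    return features


# Toy data, invented for illustration only.
matches = [
    ("P1", "apple", "fruit"),
    ("P1", "apple", "fruit"),
    ("P2", "apple", "fruit"),
    ("P1", "apple", "table"),
]
vectors = {
    "apple": [1.0, 2.0],
    "fruit": [2.0, 4.0],
    "table": [-2.0, 1.0],
}
features = pair_features(matches, vectors)
```

In this toy input, the pair matched by two different patterns also has the more similar vectors, which is the kind of agreement between signals the abstract relies on; a real pipeline would take the vectors from a trained Word2Vec model rather than a hand-written dictionary.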