
Basic article information

  • Title: ARABIC TERM EXTRACTION USING COMBINED APPROACH ON ISLAMIC DOCUMENT
  • Authors: ALI MASHAAN ABED ; SABRINA TIUN ; MOHAMMED ALBARED
  • Journal: Journal of Theoretical and Applied Information Technology
  • Print ISSN: 1992-8645
  • Electronic ISSN: 1817-3195
  • Year of publication: 2013
  • Volume: 58
  • Issue: 3
  • Publisher: Journal of Theoretical and Applied
  • Abstract: While a wide range of methods has been applied to English terminology extraction, relatively few studies have addressed Arabic term extraction in Islamic corpora. In this paper, we present an efficient approach for automatic extraction of Arabic terminology, covering both single-word terms (SWTs) and multi-word terms (MWTs). The approach relies on two main filtering steps: a linguistic filter, where a simple part-of-speech (POS) tagger is used to extract candidate MWTs matching given syntactic patterns, and a statistical filter, where several statistical methods (PMI, Kappa, chi-square, t-test, Piatetsky-Shapiro and rank aggregation) are used to rank candidate MWTs, while TF-IDF is applied to rank the SWT candidates. Our approach extracted bigram candidates of Islamic MWTs from the corpus and evaluated the association measures (for SWTs and MWTs) using the n-best evaluation method.
  • Keywords: Term Extraction; SWTs; MWTs; Association measures; n-best evaluation
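
The statistical filter and the n-best evaluation named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It covers only one of the listed association measures (PMI), estimates probabilities with plain relative frequencies, and scores every adjacent word pair, so it assumes that a linguistic (POS-pattern) filter would be applied separately. The function names `bigram_pmi` and `n_best_precision`, the `min_count` parameter, and the toy tokens are hypothetical.

```python
import math
from collections import Counter


def bigram_pmi(tokens, min_count=1):
    """Score adjacent word pairs with pointwise mutual information (PMI).

    Assumption: probabilities are plain relative frequencies over the
    token list; the paper's exact estimation is not given in the abstract.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # skip pairs too rare to score reliably
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


def n_best_precision(scores, gold_terms, n):
    """Fraction of the n highest-scored candidates found in a gold term list."""
    # Sort by descending score; break ties by the pair itself for determinism.
    ranked = sorted(scores, key=lambda pair: (-scores[pair], pair))
    top = ranked[:n]
    if not top:
        return 0.0
    return sum(1 for pair in top if pair in gold_terms) / len(top)


# Toy tokens standing in for a tokenized corpus (hypothetical data).
tokens = ["a", "b", "a", "b", "c", "d"]
scores = bigram_pmi(tokens)
gold = {("c", "d")}  # hypothetical reference term list
precision_at_1 = n_best_precision(scores, gold, 1)
precision_at_4 = n_best_precision(scores, gold, 4)
```

Swapping `bigram_pmi` for another measure (for example a t-test or chi-square score over the same counts) leaves `n_best_precision` unchanged, which is what makes the n-best method convenient for comparing measures on one candidate list.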