
Article Information

  • Title: A Hybrid Approach for Urdu Sentence Boundary Disambiguation
  • Authors: Zobia Rehman; Waqas Anwar
  • Journal: The International Arab Journal of Information Technology
  • Print ISSN: 1683-3198
  • Publication year: 2012
  • Volume: 9
  • Issue: 3
  • Publisher: Zarqa Private University
  • Abstract: Sentence boundary identification is a preliminary step in preparing a text document for Natural Language Processing tasks such as machine translation, POS tagging, and text summarization. We present a hybrid approach to Urdu sentence boundary disambiguation that combines a unigram statistical model with a rule-based algorithm. With separate training and test data, the approach achieved 99.48% precision, 86.35% recall, and 92.45% F1-measure; with the same data used for both training and testing, it achieved 99.36% precision, 96.45% recall, and 97.89% F1-measure.
  • Keywords: Sentence boundary disambiguation; unigram model
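
The F1-measure quoted in the abstract is conventionally the harmonic mean of precision and recall, F1 = 2PR / (P + R). The sketch below is only a check of that standard formula against the reported figures; it assumes the authors used this definition, and the function name `f1_score` is hypothetical rather than anything taken from the paper. Under that assumption both reported F1 values are reproduced to within about 0.01 percentage points, a gap that rounding of the published precision and recall can account for.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (result is in the same units as the inputs)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the abstract, in percent.
f1_separate_data = f1_score(99.48, 86.35)  # training and test data differ
f1_same_data = f1_score(99.36, 96.45)      # same data for training and testing

print(f"separate data: {f1_separate_data:.2f}")
print(f"same data:     {f1_same_data:.2f}")
```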