Article information

  • Title: FST Based Morphological Analyzer for Hindi Language
  • Authors: Deepak Kumar; Manjeet Singh; Seema Shukla
  • Journal: International Journal of Computer Science Issues
  • Print ISSN: 1694-0784
  • Electronic ISSN: 1694-0814
  • Year of publication: 2012
  • Volume: 9
  • Issue: 4
  • Publisher: IJCSI Press
  • Abstract: Because Hindi is a highly inflectional language, a Finite State Transducer (FST) based approach is the most efficient way to develop a morphological analyzer for it. The work presented in this paper uses the SFST (Stuttgart Finite State Transducer) tool to generate the FST. A lexicon of root words is created, and rules are then added to generate inflected and derived word forms from these roots. The morphological analyzer was used in a part-of-speech (POS) tagger based on the Stanford POS Tagger. The system was first trained on a manually tagged corpus, and the Maximum Entropy (MAXENT) approach of the Stanford POS Tagger was then used to tag input sentences. The morphological analyzer gives approximately 97% correct results. The POS tagger achieves approximately 87% accuracy on sentences whose words are known to the trained model file, and approximately 80% accuracy on sentences containing words unknown to it.
  • Keywords: Morphological Analyzer; Finite State Transducer; POS Tagger; Lexicon Generator
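
The abstract describes a lexicon of root words combined with rules that produce inflected forms. The Python sketch below is only a toy illustration of that idea, not the paper's SFST grammar: the two-word lexicon, the `masc_aa` paradigm name, the tag strings, and the functions `generate` and `analyze` are all assumptions made for this example. It covers a single well-known Hindi pattern (masculine nouns ending in -ā) and analyzes a surface form by generating every form and matching, which a real transducer would do far more efficiently.

```python
# Toy illustration only: a root lexicon plus suffix rules, run forwards to
# generate forms and backwards to analyze them. Not SFST, not the paper's rules.

# Hypothetical lexicon: root -> (part-of-speech tag, paradigm class)
LEXICON = {
    "कमरा": ("N", "masc_aa"),  # "room"
    "बेटा": ("N", "masc_aa"),  # "son"
}

# Hypothetical paradigm table: (root ending, surface ending, feature tags)
PARADIGMS = {
    "masc_aa": [
        ("ा", "ा", "<sg><direct>"),
        ("ा", "े", "<pl><direct>"),
        ("ा", "े", "<sg><oblique>"),
        ("ा", "ों", "<pl><oblique>"),
    ],
}


def generate(root):
    """Return (surface form, analysis string) pairs for a root in the lexicon."""
    pos, paradigm = LEXICON[root]
    forms = []
    for root_end, surface_end, tags in PARADIGMS[paradigm]:
        if root.endswith(root_end):
            stem = root[: len(root) - len(root_end)]
            forms.append((stem + surface_end, f"{root}<{pos}>{tags}"))
    return forms


def analyze(surface):
    """Return every analysis whose generated surface form equals the input."""
    analyses = []
    for root in LEXICON:
        for form, analysis in generate(root):
            if form == surface:
                analyses.append(analysis)
    return analyses


if __name__ == "__main__":
    for form, analysis in generate("कमरा"):
        print(form, "->", analysis)
    print(analyze("कमरे"))
```

In this sketch, analyzing the form ending in -e yields two analyses (direct plural and oblique singular), which is the kind of ambiguity a downstream POS tagger would have to resolve from context. An unknown word simply returns an empty list.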