
Article information

  • Title: Discovering suffixes: A Case Study for Marathi Language
  • Authors: Mudassar M. Majgaonker; Tanveer J Siddiqui
  • Journal: International Journal on Computer Science and Engineering
  • Print ISSN: 2229-5631
  • Electronic ISSN: 0975-3397
  • Publication year: 2010
  • Volume: 2
  • Issue: 8
  • Pages: 2716-2720
  • Publisher: Engg Journals Publications
  • Abstract: Suffix stripping is a pre-processing step required in a number of natural language processing applications. A stemmer is a tool used to perform this step. This paper presents and evaluates a rule-based and an unsupervised Marathi stemmer. The rule-based stemmer uses a set of manually extracted suffix-stripping rules, whereas the unsupervised approach learns suffixes automatically from a set of words extracted from raw Marathi text. The performance of both stemmers is compared on a test dataset consisting of 1500 manually stemmed words.
  • Keywords: component; Marathi morphology; Marathi stemmer; Unsupervised stemmer; Rule-based stemmer; Natural language processing
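
The rule-based approach mentioned in the abstract can be illustrated with a minimal longest-match suffix stripper. This is only a sketch under stated assumptions: the suffix list, the function name `strip_suffix`, and the `min_stem_len` guard are hypothetical and are not taken from the paper, whose actual rules are not given in this record.

```python
# Illustrative suffix list only; these are NOT the rules used in the paper.
EXAMPLE_SUFFIXES = ["ात", "त"]


def strip_suffix(word, suffixes=EXAMPLE_SUFFIXES, min_stem_len=2):
    """Remove the longest matching suffix, if the remaining stem is long enough.

    Lengths are counted in Unicode code points, which is a simplification
    for Devanagari text (vowel signs count as separate code points).
    """
    # Try longer suffixes first so that a short suffix does not pre-empt
    # a longer one that also matches.
    for suffix in sorted(suffixes, key=len, reverse=True):
        stem_len = len(word) - len(suffix)
        if word.endswith(suffix) and stem_len >= min_stem_len:
            return word[:stem_len]
    # No rule applied: return the word unchanged.
    return word


if __name__ == "__main__":
    print(strip_suffix("घरात"))
```

Trying the longest suffix first is one common design choice for rule-based stemmers; the minimum stem length is an assumed safeguard against stripping very short words down to nothing.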