Article Information

  • Title: A Survey of Common Stemming Techniques and Existing Stemmers for Indian Languages
  • Authors: Gupta, Vishal; Lehal, Gurpreet Singh
  • Journal: Journal of Emerging Technologies in Web Intelligence
  • Print ISSN: 1798-0461
  • Publication year: 2013
  • Volume: 5
  • Issue: 2
  • Pages: 157-161
  • DOI: 10.4304/jetwi.5.2.157-161
  • Language: English
  • Publisher: Academy Publisher
  • Abstract: Stemming is an operation that relates morphological variants of a word. Its purpose is to obtain the stem, or radix, of words that are not found in a dictionary. If the stemmed word is present in the dictionary, it is a genuine word; otherwise it may be a proper name or an invalid word. Stemming reduces inflected, or sometimes derived, words to their stem, base or root form, generally a written word form. The stem need not be identical to the morphological root of the word; it is usually sufficient that related words map to the same stem, even if this stem is not in itself a valid root. Stemming is used in Information Retrieval systems to improve performance. The design of stemmers is language specific and requires some to significant linguistic expertise in the language, as well as an understanding of the needs of a spelling checker for that language. A stemmer’s performance and effectiveness in applications such as spelling checkers vary across languages. A typical simple stemmer algorithm removes suffixes using a list of frequent suffixes, while a more complex one uses morphological knowledge to derive a stem from the words. This paper presents a survey of common stemming techniques and existing stemmers for Indian languages.
  • Keywords: stemmer; stemming techniques; Indian stemmers; suffix removal
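
The simple suffix-removal approach mentioned in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' method: the English suffix list, the tiny dictionary, the minimum stem length, and the function names `strip_suffix` and `is_known` are all assumptions made for the example, and a stemmer for an Indian language would need its own language-specific suffix list.

```python
# Hypothetical list of frequent suffixes (English, for illustration only;
# not taken from the paper).
SUFFIXES = ["ations", "ation", "ings", "ing", "ers", "er", "ed", "es", "s"]

# Hypothetical dictionary of valid stems (illustration only).
DICTIONARY = {"walk", "connect"}


def strip_suffix(word, suffixes=SUFFIXES, min_stem_len=3):
    """Remove the longest matching suffix, keeping at least min_stem_len letters."""
    word = word.lower()
    # Try longer suffixes first so "ings" is preferred over "s".
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    # No suffix applies: the word is returned unchanged.
    return word


def is_known(word, dictionary=DICTIONARY):
    """Check whether the stem of a word appears in the dictionary."""
    return strip_suffix(word) in dictionary


if __name__ == "__main__":
    for w in ["walking", "walked", "walkers", "sing", "Delhi"]:
        print(w, strip_suffix(w), is_known(w))
```

The minimum stem length is a common guard against over-stemming short words (for example, it keeps "sing" from being reduced to "s"). A word whose stem is missing from the dictionary, such as a proper name, is simply reported as unknown.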