
Article Information

  • Title: DYNAMIC STOPLIST GENERATOR FROM TRADITIONAL INDONESIAN CUISINE WITH STATISTICAL APPROACH
  • Authors: SETYAWAN WIBISONO; MARDI SISWO UTOMO
  • Journal: Journal of Theoretical and Applied Information Technology
  • Print ISSN: 1992-8645
  • Electronic ISSN: 1817-3195
  • Publication year: 2016
  • Volume: 87
  • Issue: 1
  • Publisher: Journal of Theoretical and Applied
  • Abstract: A stoplist is one of the inputs to an information retrieval system and can affect retrieval quality. Words that carry no meaning can degrade retrieval. A standard dictionary-based stoplist also has problems when applied to a corpus in a specific domain. For example, the word "recipe" is not normally a stopword, but in the cuisine domain it appears in almost every document. We build a dynamic stoplist from Indonesian recipe documents; these documents contain stopwords that are absent from the standard dictionary-based stoplist, which makes them interesting to study. This paper uses three methods to generate the stoplist: Poisson and binomial probability distribution approaches, and a simple frequency distribution approach for classifying candidate stopwords. To measure the results, we also employ the recently proposed RAKE algorithm. All three methods share the same weakness: the stoplist can be generated properly only after the entire corpus vocabulary has been processed, unlike a dictionary stoplist, which can already detect stopwords at the pre-processing stage. The frequency distribution approach gives better results than the other methods, but it requires a longer process than the Poisson and negative binomial methods.
  • Keywords: Keyword Extraction; Indonesian Cuisine; Auto Generated Stoplist
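
The abstract's "recipe" example, a word that is not a general stopword but occurs in almost every document of a domain corpus, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `document_frequency_stoplist`, the `min_doc_ratio` threshold, and the four toy Indonesian documents are all assumptions made for illustration, and the sketch uses plain document frequency rather than the Poisson or binomial models named in the abstract.

```python
from collections import Counter


def document_frequency_stoplist(docs, min_doc_ratio=0.8):
    """Return words that occur in at least `min_doc_ratio` of the documents.

    Hypothetical helper: the threshold is an illustrative assumption,
    not a value taken from the paper.
    """
    doc_freq = Counter()
    for text in docs:
        # Count each word once per document (document frequency,
        # not raw term frequency).
        doc_freq.update(set(text.lower().split()))
    threshold = min_doc_ratio * len(docs)
    return sorted(word for word, df in doc_freq.items() if df >= threshold)


# Toy corpus invented for this example ("resep" means "recipe").
docs = [
    "resep nasi goreng dengan bumbu",
    "resep soto ayam dengan kunyit",
    "resep rendang sapi dengan santan",
    "resep gado gado saus kacang",
]

stoplist = document_frequency_stoplist(docs, min_doc_ratio=0.75)
print(stoplist)
```

The whole corpus has to be scanned before any word can be classified, which matches the weakness the abstract attributes to all three methods.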