
Article Information

  • Title: Preprocessing and Morphological Analysis in Text Mining
  • Authors: Mohbey, K. K.; Tiwari, S.
  • Journal: International Journal of Electronics Communication and Computer Engineering
  • Print ISSN: 2249-071X
  • Electronic ISSN: 2278-4209
  • Publication year: 2011
  • Volume: 2
  • Issue: 2
  • Pages: 116-122
  • Publisher: IJECCE
  • Abstract: This paper surveys the preprocessing activities that software or language translators perform before mining algorithms are applied to large volumes of data. Text mining is an important area of data mining and plays a vital role in extracting useful information from large databases or data warehouses. Before text mining or information extraction can be applied, however, preprocessing is essential, because the given data set typically contains noisy, incomplete, inconsistent, dirty, and unformatted data. In this paper we collect the necessary requirements for preprocessing. Once preprocessing is complete, knowledge-bearing information can be extracted more easily with a mining strategy. The paper also covers analysis steps such as tokenization and stemming, and semantic analysis steps such as phrase recognition and parsing. Finally, it collects the procedures for preprocessing data, i.e. it describes how stemming, tokenization, and parsing are applied.
  • Keywords: morphological analysis; parsing; stemming; tokenization
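
The abstract names tokenization and stemming as preprocessing steps but this record does not reproduce the paper's procedures. As a rough illustration only, the following Python sketch shows one way the two steps could be chained. The suffix list, the `min_stem` threshold, and the function names `tokenize`, `naive_stem`, and `preprocess` are all assumptions made for this example; the stemmer is a naive suffix stripper, not the Porter algorithm and not necessarily the method the authors describe.

```python
import re

# Hypothetical suffix list, chosen for illustration only.
# This is a naive suffix stripper, not the Porter stemmer.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the text and return its runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def naive_stem(token, min_stem=3):
    """Strip the first matching suffix, keeping at least min_stem characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= min_stem:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize the text, then stem every token."""
    return [naive_stem(token) for token in tokenize(text)]


if __name__ == "__main__":
    print(preprocess("Parsing the mined texts"))
    # prints ['pars', 'the', 'min', 'text']
```

The output shows why such a crude stemmer is only a starting point: "mined" collapses to "min" rather than "mine", which a rule-based stemmer with context-sensitive rules would be more likely to handle.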