
Article information

  • Title: Data Mining: a Healthy Tool for Your Information Retrieval and Text Mining
  • Authors: Santosh Kumar Rath; Manobendu Kesari Jena; Tapaswini Nayak
  • Journal: International Journal of Computer Science and Information Technologies
  • Electronic ISSN: 0975-9646
  • Publication year: 2011
  • Volume: 2
  • Issue: 5
  • Pages: 2042-2045
  • Publisher: TechScience Publications
  • Abstract: Data warehousing and data mining are widely used in industries such as banking, insurance, healthcare and security, yet very little work has been done on text mining. Text mining involves the application of techniques from areas such as information retrieval, natural language processing, information extraction and data mining. In this paper we describe text mining as a truly interdisciplinary method drawing on information retrieval, machine learning, statistics, computational linguistics and especially data mining. We first give a short sketch of these methods and then define text mining in relation to them. Later sections survey state-of-the-art approaches for the main analysis tasks: preprocessing, classification, clustering, information extraction and visualization. The last section exemplifies text mining in the context of a number of successful applications. Text mining offers a solution to this problem by replacing or supplementing the human reader with automatic systems undeterred by the text explosion. It involves analyzing a large collection of documents to discover previously unknown information. The information might be relationships or patterns that are buried in the document collection and that would otherwise be extremely difficult, if not impossible, to discover. Text mining can be used to analyze natural language documents about any subject, although much of the interest at present is coming from the biological sciences.