Article Information

Title: HOW TO APPROACH DATA ANALYSIS OF TEXTS
Author: Mladenič, Dunja
Journal: Journal of Information and Organizational Sciences
Print ISSN: 1846-3312
Electronic ISSN: 1846-9418
Publication year: 2004
Volume: 28
Issue: 1-2
Pages: 123-134
Publisher: Faculty of Organization and Informatics, University of Zagreb
Abstract: Analysis of large text data sets is gaining popularity, providing users with insights into their own (potentially very unstructured) data sets that were difficult to obtain using standard methods. This kind of data analysis differs from standard analysis in three respects: (1) the methods used differ from standard statistical methods, (2) the data being analyzed have different characteristics than standard, structured databases, and (3) the users of the analysis results have different needs and requirements than the usual users of common analytical services (statistics, data mining, OLAP). This paper gives a brief overview of the area addressing this kind of data analysis, commonly referred to as text mining. It is a growing area at the intersection of Information Retrieval (IR), Data Mining (DM), Machine Learning (ML), and Natural Language Processing (NLP). The problems usually addressed in text mining include topic detection and tracking, document categorization, visualization of document collections, user profiling, information extraction, construction and updating of hierarchical indices and document collections, and intelligent search.
Keywords: text data analysis; data mining; example applications of text mining; personalized information delivery