Journal: International Journal of Computer Science & Technology
Print ISSN: 2229-4333
Electronic ISSN: 0976-8491
Publication year: 2013
Volume: 4
Issue: 4
Pages: 135-140
Language: English
Publisher: Ayushmaan Technologies
Abstract: Information Extraction (IE) is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents. In most cases this involves processing human-language text by means of natural language processing (NLP). Recent work in multimedia document processing, such as automatic annotation and content extraction from images, audio, and video, can also be seen as information extraction. Incremental clustering has rarely been applied to message extraction from machine event logs. Event logs are large data sets that grow rapidly as the system runs, so the clusters must be updated in real time as the log grows, and the set of extracted data must grow with it. This work proposes applying incremental clustering to extract data from the event log according to characteristics provided by the users of the system. Incremental clustering requires initial clusters to be decided in advance, i.e., they must exist before processing begins, and there are several ways to fix them. The proposed algorithm is a novel algorithm that decides the initial clusters dynamically.
Keywords: Incremental Clustering; Data Mining; System Log; Clustering; Message Type Mining
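
Illustrative note: the abstract does not give the proposed algorithm, so the following is only a minimal sketch of the general idea of incremental clustering over a growing event log, not the authors' method. It assumes a simple leader-style scheme in which each incoming message joins the most similar existing cluster or, failing that, starts a new one, so the initial clusters emerge from the first messages instead of being fixed by hand. The names `tokenize`, `jaccard`, `Cluster`, `IncrementalLogClusterer`, and the `threshold` parameter are all hypothetical.

```python
from dataclasses import dataclass, field


def tokenize(message: str) -> frozenset[str]:
    """Keep only purely alphabetic, lowercased tokens.

    Assumption: variable fields such as IP addresses and counters
    contain digits or punctuation and are therefore dropped.
    """
    return frozenset(t for t in message.lower().split() if t.isalpha())


def jaccard(a: frozenset[str], b: frozenset[str]) -> float:
    """Jaccard similarity of two token sets (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


@dataclass
class Cluster:
    tokens: frozenset[str]                 # token set of the first member
    members: list[str] = field(default_factory=list)


class IncrementalLogClusterer:
    """Leader-style incremental clustering of log messages (hypothetical)."""

    def __init__(self, threshold: float = 0.5):
        self.threshold = threshold
        self.clusters: list[Cluster] = []

    def add(self, message: str) -> int:
        """Assign one new message and return the index of its cluster."""
        toks = tokenize(message)
        best_idx, best_sim = -1, 0.0
        for i, cluster in enumerate(self.clusters):
            sim = jaccard(toks, cluster.tokens)
            if sim > best_sim:
                best_idx, best_sim = i, sim
        if best_idx >= 0 and best_sim >= self.threshold:
            self.clusters[best_idx].members.append(message)
            return best_idx
        # No sufficiently similar cluster: this message seeds a new one.
        self.clusters.append(Cluster(toks, [message]))
        return len(self.clusters) - 1
```

Each call to `add` touches only the existing cluster representatives, so the clustering can be updated as new log lines arrive without reprocessing the whole log. In this sketch a cluster's representative stays fixed at its first member's tokens; a fuller version might refine it as members accumulate.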