Basic Article Information

Title: Performance Analysis of Classifying Unlabeled Data from Multiple Data Sources
Authors: M. Jeevan Babu; K. Suvarna Vani
Journal: International Journal of Computer Science and Information Technologies
Electronic ISSN: 0975-9646
Publication year: 2011
Volume: 2
Issue: 4
Pages: 1729-1734
Publisher: TechScience Publications
Abstract: In many real-time applications, data acquired about the same entity by different sources is generally partially redundant and partially complementary, since each source has different characteristics and different physical interaction mechanisms. The information provided by a single source is incomplete, which results in misclassification. Fusion with redundant data can help reduce ambiguity, and fusion with complementary data can provide a more complete description. In both cases, classification[3] results should be better. In many domains, large amounts of unlabeled data are available in real-world data-mining tasks. However, labeled data are often limited and time-consuming to generate, since labeling typically requires human expertise. Given these requirements, we have to classify unlabeled data acquired about the same entity by different sources. But existing data mining techniques (supervised learning, unsupervised learning (associate clustering), and co-training)[2] do not give good results under these requirements. To overcome the difficulties of present data mining techniques, we introduce a novel method that predicts the classification[3] of data from multiple sources[1] without class labels in each source, which is called learning classification from multiple sources of unlabeled data, or simply cooperative unsupervised learning. In this project I am going to work on two different datasets for "classification of unlabeled data of multiple data sources"[3].
Keywords: new solutions for multiple data source mining; learning from multiple sources of data; learning classifications from unlabeled data of multiple sources