Journal: International Journal on Computer Science and Engineering
Print ISSN: 2229-5631
Electronic ISSN: 0975-3397
Year: 2010
Volume: 2
Issue: 3
Pages: 782-785
Publisher: Engg Journals Publications
Abstract: This paper proposes a statistical approach that uses a modified Markov chain process model and an entropy function to analyze a large data set. The basic idea is to use entropy and conditional entropy to measure information content. In the analysis of large data sets, including signal and image processing, unsupervised partitioning of the data is required to build classes or clusters of similar items. The aim is to identify each data item unambiguously as a member of a particular class or cluster. Partitioning is viewed as an information-theoretic problem, and it is shown that minimizing the partitioning entropy may be used to evaluate the most probable set of data items. The data set used for the simulation consists of scanned OMR application forms of candidates applying to various courses at a university. Classes are defined, and their interdependence is measured on the basis of the Markov process model and entropy analysis.
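The quantities named in the abstract can be illustrated with a minimal Python sketch. It uses the textbook definitions of Shannon entropy and of the conditional entropy of a first-order Markov chain, not the paper's modified model, which the abstract does not specify. The function names, the example probabilities, and the definition of partition entropy as the mean entropy of per-item class-membership probabilities are all assumptions made for illustration.

```python
import math


def shannon_entropy(probs):
    """H(X) = -sum(p * log2(p)), in bits; zero-probability terms are skipped."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def conditional_entropy(state_probs, transition):
    """H(X_next | X_current) for a first-order Markov chain.

    state_probs[i] is the probability of being in state i, and
    transition[i][j] is the probability of moving from state i to state j.
    """
    return sum(p * shannon_entropy(row) for p, row in zip(state_probs, transition))


def partition_entropy(memberships):
    """Mean entropy of per-item class-membership probabilities (assumed definition).

    A value near 0 means every item is assigned to one class almost
    unambiguously; larger values mean fuzzier assignments.
    """
    return sum(shannon_entropy(m) for m in memberships) / len(memberships)


if __name__ == "__main__":
    # Hypothetical two-state chain: state 0 moves at random, state 1 always moves to 0.
    state_probs = [0.5, 0.5]
    transition = [[0.5, 0.5], [1.0, 0.0]]
    print(shannon_entropy(state_probs))
    print(conditional_entropy(state_probs, transition))

    # Hypothetical memberships: one crisp item and one maximally ambiguous item.
    print(partition_entropy([[1.0, 0.0], [0.5, 0.5]]))
```

Under these assumptions, a partition with lower partition entropy assigns items to classes less ambiguously, which is the sense in which minimizing it favors a crisp clustering.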