Journal: International Journal of Computer Science and Information Technologies
Electronic ISSN: 0975-9646
Publication year: 2011
Volume: 2
Issue: 2
Pages: 614-620
Publisher: TechScience Publications
Abstract: DNA microarrays are widely used in biological studies such as cancer classification, cancer prognosis, and the identification of cell cycle-regulated genes in yeast, because they accommodate a large number of genes in a small size. However, they often produce missing expression values for various reasons, and these gaps significantly affect the performance of any subsequent data analysis. A primary concern of classifier learning is prediction accuracy, and incomplete information significantly degrades both the performance and the accuracy of a classifier. A complete matrix is therefore needed before classification, so the missing values should be estimated (imputed) in the preprocessing step. This survey paper reviews existing estimation methods for missing values, including KNNimpute, SVDimpute, LSimpute, LLSimpute, IFRAA, and principal curves, among others, and describes the basic principles behind the different imputation approaches. It also attempts to report the performance of each method on the datasets used and to suggest directions for future research.
Keywords: missing value imputation; gene classification; gene expression data