Journal: International Journal of Computer Science & Technology
Print ISSN: 2229-4333
Electronic ISSN: 0976-8491
Publication year: 2016
Volume: 7
Issue: 2
Pages: 123-127
Language: English
Publisher: Ayushmaan Technologies
Abstract: Gene expression data very often contain missing values. Effective missing value estimation methods are therefore needed, because many algorithms for gene expression data analysis require a complete matrix of gene array values. In this paper, local least squares (LLS) imputation and weighted k-nearest neighbors (KNN) imputation are proposed to estimate missing values in gene expression data. The proposed LLS imputation method represents a target gene that has missing values as a linear combination of similar genes. The similar genes are selected as the k nearest neighbors or as the k coherent genes that have the largest Pearson correlation coefficients. In our experiments, the proposed KNN and LLS imputation methods were applied to an E. coli bacteria dataset, producing the percentages of missing values in the data.
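
The two estimators described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The function names (`knn_impute`, `lls_impute`, `_nearest`), the use of Euclidean distance for neighbor selection, the inverse-distance weights, and the requirement that the candidate genes have no missing values are all assumptions made for this example. Selection by Pearson correlation, which the abstract also mentions, is not shown.

```python
import numpy as np

def _nearest(target, complete, known, k):
    """Pick the k rows of `complete` closest to `target` on its known columns.

    Assumption: Euclidean distance; the abstract also allows Pearson correlation.
    """
    dists = np.linalg.norm(complete[:, known] - target[known], axis=1)
    order = np.argsort(dists)[:k]
    return order, dists[order]

def knn_impute(target, complete, k):
    """Weighted KNN: fill missing entries with a distance-weighted average
    of the same columns in the k most similar genes."""
    known = ~np.isnan(target)
    idx, dists = _nearest(target, complete, known, k)
    weights = 1.0 / (dists + 1e-12)  # assumed weighting; epsilon avoids division by zero
    filled = target.copy()
    filled[~known] = weights @ complete[idx][:, ~known] / weights.sum()
    return filled

def lls_impute(target, complete, k):
    """Local least squares: express the target's known entries as a linear
    combination of the k similar genes, then reuse the coefficients on the
    columns where the target is missing."""
    known = ~np.isnan(target)
    idx, _ = _nearest(target, complete, known, k)
    A = complete[idx][:, known]    # neighbors on the known columns (k x q)
    B = complete[idx][:, ~known]   # neighbors on the missing columns (k x m)
    w = target[known]
    # Least-squares solution of A^T x = w.
    x, *_ = np.linalg.lstsq(A.T, w, rcond=None)
    filled = target.copy()
    filled[~known] = B.T @ x
    return filled

# Toy data (hypothetical): three complete genes and one target gene
# whose last value is missing (NaN).
genes = np.array([[1.0, 0.0, 2.0, 1.0],
                  [0.0, 1.0, 1.0, 3.0],
                  [1.0, 1.0, 0.0, 2.0]])
target = np.array([2.0, 1.0, 5.0, np.nan])

print(knn_impute(target, genes, k=2))
print(lls_impute(target, genes, k=3))
```

Both functions handle one target gene at a time. For a full expression matrix, one would loop over every row that contains missing values and draw the candidate genes from the rows that are complete.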