Journal: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Electronic ISSN: 1817-3195
Publication year: 2012
Volume: 43
Issue: 1
Pages: 094-102
Publisher: Journal of Theoretical and Applied
Abstract: The development of microarray technology produces massive gene expression data sets. A major task for the experimentalist is to understand the structure in these huge data sets. Data generated by a scientific experiment always contain random noise, and the problem is especially severe in biology, so statistical methods must be used to interpret large-scale experimental data accurately. Microarray data are represented as a matrix whose rows correspond to individual genes and whose columns correspond to conditions or experiments, and the data are very heterogeneous in nature. Because many proteins have unknown functions, and because many genes are active all the time in all kinds of cells, researchers usually use microarrays to make comparisons between similar cell types. As a result, we need to develop our ability to "see" the information in the massive tables of quantitative measurements that these approaches produce. This research addresses a suitable microarray data clustering algorithm that can be used to rearrange the gene expression profiles of microarray data for easy observation and knowledge discovery. A Lorenz Information Measure (LIM) based algorithm will be used to order the microarray data; after ordering, Principal Component Analysis (PCA) will be used to find the principal components of the ordered data. The data will then be clustered using a special kind of neural network called the Self Organizing Map (SOM). The microarray data displayed after grouping will show meaningful structure: after clustering, genes with similar expression patterns are grouped together under the related set of conditions. The proposed model will be implemented in MATLAB 6.5 under the Windows operating system. The performance of the system will be tested and evaluated with suitable gene expression data available for this kind of research.
Keywords: Microarray; Lorenz Information Measure (LIM); Principal Component Analysis (PCA); Self Organizing Maps (SOM)
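
The three-stage pipeline named in the abstract (LIM ordering, then PCA, then SOM clustering) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' MATLAB implementation. The abstract does not define the LIM, so the code assumes it is the area under the Lorenz curve of each gene's value histogram. All function names (`lim`, `pca_scores`, `train_som`, `cluster_genes`) and parameters (bin count, number of components, number of map nodes, learning rate schedule) are hypothetical choices made for the example.

```python
import math
import random


def lim(values, bins=4):
    """Assumed LIM: area under the Lorenz curve of the value histogram."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0          # avoid zero width for constant rows
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    area, cum = 0.0, 0.0
    for p in sorted(c / len(values) for c in counts):
        area += (2 * cum + p) / (2 * bins)   # trapezoid of width 1/bins
        cum += p
    return area


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pca_scores(rows, k=2, iters=200, seed=0):
    """Project rows onto up to k principal axes (power iteration + deflation)."""
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    axes = []
    for _ in range(k):
        v = [rng.uniform(-1, 1) for _ in range(d)]
        for _ in range(iters):
            w = [dot(row, v) for row in cov]
            norm = math.sqrt(dot(w, w))
            if norm < 1e-12:                 # no variance left to explain
                break
            v = [x / norm for x in w]
        else:
            lam = dot(v, [dot(row, v) for row in cov])
            # Deflate: remove the component just found from the covariance.
            cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(d)]
                   for a in range(d)]
            axes.append(v)
    return [[dot(c, ax) for ax in axes] for c in centered]


def nearest(weights, p):
    """Index of the map node whose weight vector is closest to p."""
    return min(range(len(weights)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(weights[i], p)))


def train_som(points, nodes=2, epochs=50, seed=0):
    """Train a 1-D SOM; nodes start spread along the first coordinate."""
    rng = random.Random(seed)
    ranked = sorted(points)
    step = (len(ranked) - 1) / max(nodes - 1, 1)
    weights = [list(ranked[round(i * step)]) for i in range(nodes)]
    for e in range(epochs):
        decay = 1 - e / epochs
        lr, radius = 0.5 * decay, max(nodes / 2 * decay, 0.5)
        for p in rng.sample(points, len(points)):
            win = nearest(weights, p)
            for i, w in enumerate(weights):
                # Gaussian neighborhood: nodes near the winner move more.
                h = math.exp(-((i - win) ** 2) / (2 * radius ** 2))
                for j in range(len(w)):
                    w[j] += lr * h * (p[j] - w[j])
    return weights


def cluster_genes(data, nodes=2):
    """Order genes by LIM, reduce with PCA, then assign each to a SOM node."""
    order = sorted(range(len(data)), key=lambda i: lim(data[i]))
    scores = pca_scores([data[i] for i in order])
    weights = train_som(scores, nodes=nodes)
    return [(i, nearest(weights, s)) for i, s in zip(order, scores)]
```

Calling `cluster_genes(matrix)` on a list of gene rows (one expression value per condition) returns `(gene_index, node_index)` pairs in LIM order; genes sharing a node index would be displayed together. A one-dimensional map is used here only to keep the sketch short.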