Journal: International Journal of Computer Science, Engineering and Applications (IJCSEA)
Print ISSN: 2231-0088
Electronic ISSN: 2230-9616
Year of publication: 2012
Volume: 2
Issue: 1
Publisher: Academy & Industry Research Collaboration Center (AIRCC)
Abstract: Feature selection is an effective method used in text categorization, the task of sorting a set of documents into a fixed number of predefined categories. It improves the efficiency and accuracy of text categorization algorithms by removing redundant or irrelevant terms from the corpus. A genome contains the complete genetic information in the chromosomes of an organism, including its genes and DNA sequences. In this paper, a hierarchical clustering technique is used to categorize the features from genome documents, and a framework is proposed for genomic feature set selection. Filter-based feature selection methods, namely the χ² statistic and the CHIR statistic, are used to select the feature set. The selected feature set is evaluated using the F-measure and validated for biological relevance using the BLAST tool.
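
As a rough illustration of the filter-based selection mentioned in the abstract, the sketch below scores terms against a category with the standard χ² statistic on a 2x2 term/category contingency table. It is not the paper's implementation: the function names `chi_square` and `score_terms`, the set-of-terms document representation, and the toy corpus are all assumptions made for this example, and the CHIR variant is not shown.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for one term/category pair.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # A degenerate table (an empty row or column) carries no evidence.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def score_terms(docs, labels, category):
    """Return {term: chi-square score} for a hypothetical toy corpus.

    docs is a list of sets of terms; labels holds one category per document.
    """
    vocabulary = set().union(*docs)
    scores = {}
    for term in vocabulary:
        a = b = c = d = 0
        for doc, label in zip(docs, labels):
            in_category = label == category
            has_term = term in doc
            if has_term and in_category:
                a += 1
            elif has_term:
                b += 1
            elif in_category:
                c += 1
            else:
                d += 1
        scores[term] = chi_square(a, b, c, d)
    return scores


# Toy data, invented for illustration only.
docs = [{"gene", "dna"}, {"gene", "protein"}, {"stock", "market"}, {"stock", "dna"}]
labels = ["bio", "bio", "fin", "fin"]
scores = score_terms(docs, labels, "bio")
# Keep the highest-scoring terms as the selected feature set.
selected = sorted(scores, key=scores.get, reverse=True)[:2]
```

In this toy corpus, a term that appears only in one category (such as "gene") scores high, while a term spread evenly across categories (such as "dna") scores zero, which is the behavior a filter method relies on when ranking features.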