Journal: International Journal of Electronics Communication and Computer Engineering
Print ISSN: 2249-071X
Electronic ISSN: 2278-4209
Year: 2014
Volume: 5
Issue: 1
Pages: 216-219
Publisher: IJECCE
Abstract: Turning genomic data into knowledge is a challenging task. Classifying grass genome sequences serves to predict their function and structure. This paper develops a technique that reduces the time and space required to cluster grass genome sequences. Clustering divides the large dataset into logically or notationally meaningful subgroups of grass genome sequences. The generated motifs and gene classification were subjected to extensive and systematic downstream analysis to obtain biological insights. A leader algorithm with local alignment generates non-overlapping segments (motifs) that serve as features of the sequences. The motifs produced by segmenting the sequences are then clustered using the cosine similarity and Jaccard index measures. The experimental results are compared with those of a leader algorithm that uses global alignment. Time and space requirements are reduced in both the training and testing phases without affecting classification accuracy.
Keywords: Leader Global Alignment (LGA); Jaccard Coefficient Similarity Alignment (JCA); Cosine Similarity Alignment (CSA); Classification Accuracy (CA)
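
The abstract names three ingredients: non-overlapping segments used as sequence features, the Jaccard index and cosine similarity as measures, and a leader algorithm for clustering. The sketch below illustrates how these pieces might fit together. It is not the authors' implementation: fixed-length segments stand in for the motifs the paper derives by local alignment, and the function names, the segment length `k`, and the `threshold` value are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def segments(seq, k=3):
    """Split a sequence into non-overlapping length-k segments.

    Assumption: fixed-length chunks stand in for alignment-derived motifs.
    Any trailing remainder shorter than k is dropped.
    """
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, k)]


def jaccard(a, b):
    """Jaccard index of two segment lists, treated as sets."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def cosine(a, b):
    """Cosine similarity of two segment lists, treated as count vectors."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[s] * cb[s] for s in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def leader_cluster(seqs, similarity, threshold=0.5, k=3):
    """Single-pass leader clustering.

    Each sequence joins the first cluster whose leader is at least
    `threshold` similar; otherwise it becomes the leader of a new cluster.
    """
    leaders = []   # segment features of each cluster leader
    clusters = []  # indices of the sequences assigned to each cluster
    for idx, seq in enumerate(seqs):
        feats = segments(seq, k)
        for c, lead in enumerate(leaders):
            if similarity(feats, lead) >= threshold:
                clusters[c].append(idx)
                break
        else:
            leaders.append(feats)
            clusters.append([idx])
    return clusters


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    toy = ["ACGTACGTA", "ACGTACGTT", "TTTGGGCCC"]
    print(leader_cluster(toy, jaccard))
    print(leader_cluster(toy, cosine))
```

Because the leader algorithm makes a single pass and compares each sequence only against the current leaders, its cost grows with the number of clusters rather than with the number of sequence pairs, which is consistent with the time and space savings the abstract reports.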