Article information

  • Title: Extraction of Motif Patterns from Protein Sequences using SVD with Rough K-Means Algorithm
  • Authors: E.Elayaraja ; K.Thangavel ; Chitralegha
  • Journal: International Journal of Computer Science Issues
  • Print ISSN: 1694-0784
  • Electronic ISSN: 1694-0814
  • Publication year: 2012
  • Volume: 9
  • Issue: 6
  • Publisher: IJCSI Press
  • Abstract: Discovering protein sequence motif information is one of the most crucial tasks in bioinformatics research. In this work, we try to obtain recurring protein patterns that are universally conserved across protein family boundaries. To generate higher-quality protein sequence motif information from the Protein Sequence Culling Server (PISCES) dataset, we tried several advanced clustering algorithms, such as hierarchical clustering and Self-Organizing Maps (SOM). However, since the dataset contains more than 660,000 segments, each with 180 dimensions, any clustering algorithm requiring more than O(n) complexity is not applicable. Therefore, the first step of our research is to reduce the segments. The results suggest that the Singular Value Decomposition (SVD) technique is better suited to reducing segments. The Rough K-Means clustering algorithm is then applied to the reduced segments. Our experiments indicate that the Rough K-Means algorithm yields a higher percentage of sequence segments belonging to clusters with high structural similarity than K-Means. The experimental results suggest that SVD with the Rough K-Means algorithm may be applied to other areas of bioinformatics research in order to explore the underlying relationships between data samples more effectively.
  • Keywords: Clustering; Motif; Protein Sequence; SVD; HSSP; DSSP; BLOSUM62.
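
The two-stage pipeline named in the abstract (SVD reduction, then Rough K-Means) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say how SVD is used to "reduce segments", so the sketch assumes a projection onto the top-k right singular vectors, which may differ from the paper. The clustering step follows the commonly described Lingras-West form of Rough K-Means. The function names and the parameters `w_lower` and `ratio` are hypothetical, and the dense distance matrix is only suitable for small toy data, not for 660,000 segments.

```python
import numpy as np

def svd_reduce(X, k):
    """Assumed reduction step: project centered rows onto the top-k right singular vectors."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def rough_kmeans(X, n_clusters, w_lower=0.7, ratio=1.2, n_iter=50, seed=0):
    """Rough K-Means sketch: each cluster has a lower and an upper approximation.

    w_lower and ratio (>= 1) are illustrative defaults, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    n = len(X)
    centroids = X[rng.choice(n, n_clusters, replace=False)].astype(float)
    for _ in range(n_iter):
        # Distance from every point to every centroid (n x n_clusters).
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        nearest = d.argmin(axis=1)
        d_min = d[np.arange(n), nearest]
        # Upper approximation: all clusters almost as close as the nearest one.
        upper = d <= ratio * d_min[:, None]
        ambiguous = upper.sum(axis=1) > 1
        # Lower approximation: only points that belong to exactly one cluster.
        lower = np.zeros_like(upper)
        lower[np.arange(n), nearest] = ~ambiguous
        boundary = upper & ambiguous[:, None]
        new = centroids.copy()
        for j in range(n_clusters):
            lo, bd = X[lower[:, j]], X[boundary[:, j]]
            if len(lo) and len(bd):
                new[j] = w_lower * lo.mean(axis=0) + (1 - w_lower) * bd.mean(axis=0)
            elif len(lo):
                new[j] = lo.mean(axis=0)
            elif len(bd):
                new[j] = bd.mean(axis=0)
        if np.allclose(new, centroids):
            break
        centroids = new
    # lower/upper are the memberships from the last assignment step.
    return centroids, lower, upper
```

A point that falls in the lower approximation of a cluster definitely belongs to it, while a point that appears only in upper approximations sits on the boundary between two or more clusters. That boundary handling is what distinguishes this scheme from plain K-Means.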