Article information

  • Title: A Methodology for Reliable Code Plagiarism Detection Using Complete and Language Agnostic Code Clone Classification
  • Authors: Sanjay B. Ankali; Latha Parthiban
  • Journal: International Journal of Modern Education and Computer Science
  • Print ISSN: 2075-0161
  • Electronic ISSN: 2075-017X
  • Year of publication: 2021
  • Volume: 13
  • Issue: 3
  • Pages: 34-56
  • DOI: 10.5815/ijmecs.2021.03.04
  • Publisher: MECS Publisher
  • Abstract: Code clone detection plays a vital role in both industry and academia. The last three decades have seen more than 250 clone detection techniques, yet no single framework detects and classifies all four basic types of code clones with high precision. This gap particularly affects universities and online learning platforms, which cannot reliably validate the projects or coding assignments submitted online. In this paper, we propose a complete and language-agnostic technique to detect and classify all four clone types in C, C++, and Java programs. The method first generates the parse tree and then extracts the functional tree, which eliminates the need for the preprocessing stage employed by previous clone detection techniques. The generated parse tree contains all the information necessary for detecting code clones. We employ TF-IDF cosine similarity to classify the clone types. The proposed technique achieves a precision of 100% in detecting the first two clone types (type-1 and type-2) and 98% in detecting type-3 and type-4 clones for small C, C++, and Java programs with an average line count of 5. It outperforms existing tree-based clone detection tools, with an average precision of 98.07% on C, C++, and Java programs crawled from GitHub with an average line count of 15. This indicates that a cosine similarity measure on the ANTLR functional tree accurately detects all four types of small clones and can act as a validation tool for identifying the learning level shown in submitted programming assignments.
  • Keywords: Clone types; functional tree; TF-IDF; cosine similarity; code plagiarism.
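
The abstract names TF-IDF cosine similarity as the measure used to compare programs. The sketch below illustrates that general technique only; it is not the authors' implementation. The node labels are hypothetical stand-ins for labels read off a parse tree, the function names `tfidf_vectors` and `cosine_similarity` are invented for this example, and the smoothed IDF formula is an assumption, since the abstract does not state which TF-IDF variant the paper uses.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per token list in docs.

    Uses a smoothed IDF, log((1 + N) / (1 + df)) + 1, which is an
    assumption of this sketch; it keeps terms shared by every
    document from collapsing to zero weight.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0
           for t, df in doc_freq.items()}
    return [{t: count * idf[t] for t, count in Counter(tokens).items()}
            for tokens in docs]


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical node-label sequences for three small functions.
prog_a = ["functionDefinition", "declaration", "assignment", "return"]
prog_b = ["functionDefinition", "declaration", "assignment", "return"]
prog_c = ["functionDefinition", "forLoop", "call"]

vec_a, vec_b, vec_c = tfidf_vectors([prog_a, prog_b, prog_c])
sim_ab = cosine_similarity(vec_a, vec_b)  # identical label sequences
sim_ac = cosine_similarity(vec_a, vec_c)  # only one shared label
```

Here `sim_ab` comes out at 1 (up to floating-point rounding) because the two label sequences are identical, while `sim_ac` is lower because the programs share only one label. How similarity scores map to the four clone types is not stated in the abstract, so no thresholds are suggested here.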