Basic article information

  • Title: Cross-Language Source Code Re-Use Detection Using Latent Semantic Analysis
  • Authors: Enrique Flores; Alberto Barrón-Cedeño; Lidia Moreno
  • Journal: Journal of Universal Computer Science
  • Print ISSN: 0948-6968
  • Year of publication: 2015
  • Volume: 21
  • Issue: 13
  • Pages: 1708-1725
  • Publisher: Graz University of Technology and Know-Center
  • Abstract: Nowadays, the Internet is the main source of information: blogs, encyclopedias, discussion forums, source code repositories, and many other resources are just one click away. The temptation to re-use these materials is very high. Even source code is easily available through a simple Web search, so there is a need to detect potential instances of source code re-use. Source code re-use detection has usually been approached by comparing source codes in their compiled versions. When dealing with cross-language source code re-use, traditional approaches can handle only the programming languages supported by the compiler. We assume that source code is a piece of text, with its own syntax and structure, so we aim to apply models for free-text re-use detection to source code. In this paper we compare a Latent Semantic Analysis (LSA) approach with previously used text re-use detection models for measuring cross-language similarity in source code. The LSA-based approach shows slightly better results than the other models and is able to distinguish between re-used and related source codes with high performance.