
Article Information

  • Title: Extracting High-Level Concepts from Open-Source Systems
  • Author: Mamdouh Alenezi
  • Journal: International Journal of Software Engineering and Its Applications
  • Print ISSN: 1738-9984
  • Publication year: 2015
  • Volume: 9
  • Issue: 1
  • Pages: 183-190
  • DOI: 10.14257/ijseia.2015.9.1.16
  • Publisher: SERSC
  • Abstract: Analyzing the unstructured information in source code (that is, the comments and identifiers) is based on the idea that this information reveals, to some extent, the concepts of the software's problem domain. It adds a new layer of semantic information to the source code and captures the domain semantics of the software. Developers use identifiers, method names, and comments to incorporate components of the solution domain of the software. Topic models reveal topics from a corpus by analyzing words that frequently co-occur, and these topics embody real-world concepts. Topics have been found to be effective mechanisms for describing the major themes spanning a corpus. Recently, software engineering researchers established that topic models can be effective in structuring various software artifacts, such as bug reports and requirements documents. In this paper, we extract topic models from the textual content of source code by conducting a case study on the source code of four Java-based open-source systems: ArgoUML, Checkstyle, JHotDraw, and jEdit. The paper investigates the effectiveness of LDA in comprehending large open-source software systems.
  • Keywords: Open source; Source code; LDA; Topic Extraction
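
The abstract describes treating comments and identifiers as the textual content of source code. As a rough illustration only, the sketch below turns one Java source string into a bag of words, which is the kind of input a topic model such as LDA usually consumes. It is not the paper's pipeline: the function names, the stop-word list, and the minimum word length are all assumptions made for this example, and the model fitting step itself is left out.

```python
import re
from collections import Counter

# Hypothetical stop list (an assumption for this sketch): a few Java
# keywords and common English words that carry no domain meaning.
STOP_WORDS = {
    "public", "private", "class", "void", "int", "return", "new",
    "static", "final", "import", "the", "and", "for", "this",
}


def split_identifier(identifier):
    """Split a camelCase or snake_case identifier into lowercase words."""
    # First alternative keeps acronyms together (XML in XMLParser);
    # second matches an optionally capitalized lowercase word.
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", identifier)
    return [part.lower() for part in parts]


def source_to_words(java_source):
    """Return the domain words found in a Java source string.

    Simplification: every word-like token is treated alike, so comment
    words, identifiers, and string literals are not distinguished.
    """
    words = []
    for token in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", java_source):
        for word in split_identifier(token):
            # Drop very short words and stop words.
            if len(word) >= 3 and word not in STOP_WORDS:
                words.append(word)
    return words


example = """
// Draws the figure outline on the canvas.
public class FigureCanvas {
    void drawFigureOutline(Figure selectedFigure) { }
}
"""

# One Counter per source file gives a document-term representation.
counts = Counter(source_to_words(example))
print(counts.most_common(3))
```

Words that recur across comments and identifiers, such as the word for the drawn shape in this toy class, are the ones a topic model would tend to group into a topic.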