Article information

  • Title: Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study
  • Authors: Filipe Dwan Pereira; Samuel C. Fonseca; Elaine H. T. Oliveira
  • Journal: Revista Brasileira de Informática na Educação
  • Print ISSN: 1414-5685
  • Publication year: 2020
  • Volume: 28
  • Pages: 723-748
  • DOI: 10.5753/rbie.2020.28.0.723
  • Publisher: Revista Brasileira de Informática na Educação
  • Abstract: Introductory programming may be complex for many students, and these courses have high failure and dropout rates. A potential way to tackle this problem is to predict student performance at an early stage, as it facilitates human-AI collaboration towards prescriptive analytics, in which instructors and monitors are told how to intervene and support students, and in which early intervention is crucial. However, the literature states that there is no reliable predictor yet for programming students' performance, since even large-scale analyses of multiple features have yielded only limited predictive power. Deep Learning (DL) can provide high-quality results for large amounts of data and complex problems. We therefore employed DL for early prediction of students' performance using data collected in the first two weeks of introductory programming courses offered to a total of 2058 students over 6 semesters (longitudinal study). We compared our results with the state of the art, an Evolutionary Algorithm (EA) that automatically creates and optimises machine learning pipelines. Our DL model achieved an average accuracy of 82.5%, which is statistically superior to the model constructed and optimised by the EA (p-value << 0.05 even with Bonferroni correction). In addition, we adapted the DL model into a stacking ensemble for continuous prediction. As a result, our regression model explained ~62% of the final grade variance. In closing, we provide results on the interpretation of our regression model to understand the leading factors of success and failure in introductory programming.
  • Keywords: Online judges; Deep Learning; CS1; introductory programming; prediction
  • Other keywords: Educational Data Mining; Learning Analytics; CS1; Blended Learning; Machine Learning; Interpreting Predictive Model