Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
Print ISSN: 2158-107X
Online ISSN: 2156-5570
Year: 2019
Volume: 10
Issue: 9
Pages: 561-570
Publisher: Science and Information Society (SAI)
Abstract: Automation has made it possible to gather and preserve students' data, and modern data science mines this data to predict performance, to the benefit of both tutors and tutees. Academic excellence results from a complex set of factors rooted in psychology, habits and, according to this study, lifestyle and preferences, which makes machine learning well suited to classifying academic standing. In this paper, data on computer science majors were collected by survey, with the participants' consent, at Ahsanullah University in Bangladesh. Visually aided exploratory analysis revealed interesting tendencies that serve as features, whose significance was further substantiated by Chi-squared (χ²) independence tests for categorical variables and independent-samples t-tests for continuous variables, on median/mode-imputed data. The initially relaxed p-value threshold retained all of the features examined in the exploratory analysis, but gradually tightening it exposed the most powerful features, which were used to fit neural networks of decreasing complexity, i.e., with 24, 20 and finally 12 hidden neurons. Statistical inference thus helped shed weak features prior to training, saving the time and the generally large computational power needed to train expensive predictive models. The k-fold cross-validated, hyperparameter-tuned, robust models achieved average accuracies ranging from 90% to 96% and an average F1-score of 89.21% on the optimal model, with the incremental improvement across models confirmed by ANOVA.
Keywords: Educational Data Mining (EDM); Exploratory Data Analysis (EDA); median and mode imputation; inferential statistics; t-test; Chi-squared independence test; ANOVA test
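
The feature-screening step described in the abstract (median/mode imputation, then Chi-squared independence tests under a progressively stricter p-value threshold) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and does not come from the paper: the feature names, the toy survey answers, the thresholds 0.2, 0.05 and 0.01, and the restriction to 2x2 tables (one degree of freedom, no continuity correction), which allows the p-value to be computed with the standard library alone. The t-test for continuous variables is omitted.

```python
import math
from collections import Counter
from statistics import median


def impute(values, kind):
    """Fill None entries with the median (continuous) or the mode (categorical)."""
    present = [v for v in values if v is not None]
    if kind == "continuous":
        fill = median(present)
    else:
        fill = Counter(present).most_common(1)[0][0]
    return [fill if v is None else v for v in values]


def chi2_independence_2x2(feature, label):
    """Pearson Chi-squared independence test for two binary variables.

    Returns (statistic, p_value). With 1 degree of freedom the upper-tail
    probability has the closed form erfc(sqrt(statistic / 2)).
    """
    rows, cols = sorted(set(feature)), sorted(set(label))
    assert len(rows) == 2 and len(cols) == 2, "this sketch handles 2x2 tables only"
    n = len(feature)
    observed = Counter(zip(feature, label))
    row_total, col_total = Counter(feature), Counter(label)
    stat = 0.0
    for r in rows:
        for c in cols:
            expected = row_total[r] * col_total[c] / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    return stat, math.erfc(math.sqrt(stat / 2))


def screen(features, label, alpha):
    """Return the names of features whose p-value is below alpha."""
    return [name for name, column in features.items()
            if chi2_independence_2x2(column, label)[1] < alpha]


# Hypothetical survey of 20 students: 10 labelled "good", 10 labelled "poor".
label = ["good"] * 10 + ["poor"] * 10
features = {
    # One missing answer (None) is filled with the mode before testing.
    "studies_daily": impute(
        ["yes"] * 8 + [None, "no"] + ["yes"] * 2 + ["no"] * 8, "categorical"),
    "sleeps_early": ["yes"] * 8 + ["no"] * 2 + ["yes"] * 3 + ["no"] * 7,
    "drinks_tea": ["yes"] * 6 + ["no"] * 4 + ["yes"] * 3 + ["no"] * 7,
}

# Median imputation for a continuous column (hypothetical values).
gpa = impute([3.1, None, 3.5, 2.9], "continuous")

# Tighten the threshold step by step and record which features survive.
kept = {alpha: screen(features, label, alpha) for alpha in (0.2, 0.05, 0.01)}
for alpha, names in kept.items():
    print(alpha, names)
```

On this toy data the relaxed threshold keeps all three features and each stricter threshold drops the weakest remaining one, mirroring the idea of discarding weak features before fitting progressively smaller networks.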