Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
Print ISSN: 2158-107X
Online ISSN: 2156-5570
Year: 2022
Volume: 13
Issue: 1
DOI: 10.14569/IJACSA.2022.0130113
Language: English
Publisher: Science and Information Society (SAI)
Abstract: Malware is a prime tool for exploiting software vulnerabilities and thereby threatening security. The volume of malware generated as exploitation tools calls for effective detection methods, and machine learning methods are effective at detecting malware. The effectiveness of a machine learning model can be increased by analyzing how the features that build the model contribute to detection. The model can be made more robust by gaining insight into how the features contribute to the prediction for each sample fed to the trained model. In this paper, a boosting model based on LightGBM is enhanced with Shapley values to measure the contribution of the top nine features, both for correct classifications (true positives and true negatives) and for misclassifications (false positives and false negatives). This insight into the model can be used for effective and robust malware detection and to avoid wrong detections such as false positives and false negatives. Comparing the top features and their Shapley-value contributions across each category of sample reveals the reasons for misclassification and supports inductive learning about the model. This inductive learning can be transformed into rules. Predictions made by the trained model can then be re-evaluated against these rules to ensure effective and robust prediction and to avoid misclassification. Under 10-fold cross-validation, the performance of the models ranges from a minimum of 97.45 to a maximum of 98.48.
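
The per-category analysis described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the scorer `toy_score`, the three feature names, the zero baseline, the 0.5 threshold, and the four samples are all hypothetical stand-ins for a trained LightGBM model and its data. The sketch computes exact Shapley values by enumerating feature subsets (feasible only for a handful of features), assigns each sample to a confusion category, and ranks the features by absolute contribution within that category.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x; absent features take baseline values."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for a coalition of this size that excludes feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                present = set(subset)
                without_i = [x[j] if j in present else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def confusion_category(label, predicted):
    """Map a (true label, predicted label) pair to TP, FP, TN, or FN."""
    if predicted == 1:
        return "TP" if label == 1 else "FP"
    return "TN" if label == 0 else "FN"


def toy_score(features):
    """Hypothetical malware score; stands in for a trained model's output."""
    entropy, imports, packed = features
    return 0.5 * entropy + 0.3 * packed - 0.2 * imports + 0.2 * entropy * packed


FEATURES = ["entropy", "imports", "packed"]  # assumed feature names
BASELINE = [0.0, 0.0, 0.0]                   # assumed reference input
THRESHOLD = 0.5                              # assumed decision threshold

# (feature vector, true label) pairs; 1 = malware, 0 = benign. Invented data.
SAMPLES = [
    ([1.0, 0.0, 1.0], 1),
    ([0.2, 1.0, 0.0], 0),
    ([1.0, 0.0, 0.5], 0),
    ([0.4, 1.0, 1.0], 1),
]

by_category = {}
for x, label in SAMPLES:
    predicted = 1 if toy_score(x) >= THRESHOLD else 0
    category = confusion_category(label, predicted)
    by_category[category] = shapley_values(toy_score, x, BASELINE)

for category, phi in by_category.items():
    # Rank features by the magnitude of their contribution to this prediction.
    ranked = sorted(zip(FEATURES, phi), key=lambda pair: abs(pair[1]), reverse=True)
    print(category, [(name, round(value, 3)) for name, value in ranked])
```

For each sample the contributions sum to the difference between the model's score and the baseline score, which is what makes it meaningful to compare them across the true-positive, true-negative, false-positive, and false-negative groups. With a real tree ensemble and many features, subset enumeration is impractical, and a tree-specific Shapley approximation would normally be used instead.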