Article Information

Title: Comparison of Machine Learning Algorithms for Sentiment Classification on Fake News Detection
Authors: Yuzi Mahmud; Noor Sakinah Shaeeali; Sofianita Mutalib et al.
Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
Print ISSN: 2158-107X
Online ISSN: 2156-5570
Publication year: 2021
Volume: 12
Issue: 10
DOI: 10.14569/IJACSA.2021.0121072
Language: English
Publisher: Science and Information Society (SAI)
Abstract: With the wide usage of the World Wide Web (WWW) and social media platforms, fake news could become rampant among users, who tend to create and share news without knowing its authenticity. The resulting dissemination of false information would become one of the most critical issues in society. In that regard, fake news needs to be detected as early as possible to avoid negative influences on people who may rely on such information while making important decisions. The aim of this paper is to develop an automated sentiment classifier model that could help individuals or readers understand the sentiment of fake news immediately. The Cross-Industry Standard Process for Data Mining (CRISP-DM) process model was applied as the research methodology. The fake news detection dataset was collected from the Kaggle website. The models were trained, tested, and validated on this dataset with cross-validation and sampling methods. Then, the performance of four machine learning algorithms, namely Naïve Bayes, Logistic Regression, Support Vector Machine and Random Forest, was compared to investigate which algorithm is the most effective for sentiment text classification. A comparison between 1000 and 2500 instances from the fake news dataset was analyzed using 200 and 500 tokens. The results showed that Random Forest (RF) achieved the highest accuracy of the four algorithms.
Keywords: Data mining; fake news; sentiment classification; supervised machine learning; text mining