
Article Information

  • Title: THE DEVELOPMENT OF BAHASA INDONESIA CORPORA FOR MACHINE LEARNING MODEL IN COMBATING CYBER BULLYING: A CASE STUDY OF THE INDONESIAN 2017 CAPITAL CITY GOVERNOR ELECTION
  • Authors: MUHAMMAD IKHWAN JAMBAK ; PUTRI SANGGABUANA SETIAWAN
  • Journal: Journal of Theoretical and Applied Information Technology
  • Print ISSN: 1992-8645
  • Electronic ISSN: 1817-3195
  • Publication year: 2018
  • Volume: 96
  • Issue: 7
  • Publisher: Journal of Theoretical and Applied
  • Abstract: Cyber bullying is considered a serious problem, and its impact has been shown to be greater than that of the physical oppression in traditional bullying. One way to reduce cyber bullying is to detect it early, so that uploaded bullying content can be removed by the application concerned. However, such detection is not easy, for several reasons. First, the language of cyber bullying usually does not follow formal language structure. As a result, content cannot be assessed word by word; instead, the assessment should cover the whole sentence, including punctuation marks, emoticons, and tagging. In other words, one cannot rely on filtering rude or dirty words to combat cyber bullying, but needs a tool that understands the bullying context within a sentence. Second, every language differs in how it is used to express its users' mood. Thus, although there are reports of machine learning being used with some success to combat cyber bullying in English, this does not mean that the same approach can be used in other languages. In fact, a machine learning algorithm should work on top of corpora: large, structured sets of texts in a given language that can be stored and processed electronically. The ultimate goal is to propose a machine learning algorithm for combating cyber bullying in Bahasa Indonesia. Bahasa Indonesia corpora have been developed and will be tested using existing machine learning algorithms that work in Indonesian, English, and Hindi. The data were scraped and derived from social media during the 2017 Indonesian capital city governor election.
  • Keywords: Machine Learning; Cyber Bullying; Social Media; Corpora; Feature Space Design; Classification