Journal: iSys - Revista Brasileira de Sistemas de Informação
Print ISSN: 1984-2902
Publication year: 2018
Volume: 11
Issue: 1
Pages: 103-132
Language: Portuguese
Publisher: iSys - Revista Brasileira de Sistemas de Informação
Abstract: Spam filtering in online instant messages and SMS is a challenging problem. The messages are often very short and rife with slang, idioms, symbols, emoticons, and abbreviations, which hamper prediction and knowledge discovery. To address this problem, we evaluated a simple, fast, scalable, multiclass, online text classification method based on the minimum description length principle. Experiments on a real, public dataset demonstrate that our method is effective for instant messaging and SMS spam filtering in both online and offline learning contexts.
Keywords: Online learning; Occam's razor; Text categorization; Machine learning