Publisher: Academy & Industry Research Collaboration Center (AIRCC)
Abstract: Twitter is a popular microblogging service where users create status messages (called "tweets"). These tweets sometimes express opinions about different topics and are presented to the user in chronological order. This format is useful because the latest tweets are rich in recent news, which is generally more interesting than tweets about events that occurred long ago. However, merely presenting tweets in chronological order may be overwhelming to the user, especially if they have many followers. Therefore, there is a need to separate the tweets into different categories and then present the categories to the user. Text Categorization (TC) is becoming increasingly significant, especially for the Arabic language, which is one of the most complex languages. In this paper, in order to improve the accuracy of tweet categorization, a system based on Rough Set Theory is proposed to enrich the document's representation. The effectiveness of our system was evaluated and compared in terms of the F-measure of the Naïve Bayesian classifier and the Support Vector Machine classifier.
Keywords: Arabic Language; Text Categorization; Rough Set Theory; Twitter; Tweets.