Journal: American Journal of Economics and Business Administration
Print ISSN: 1945-5488
Electronic ISSN: 1945-5496
Publication year: 2011
Volume: 3
Issue: 1
Pages: 213-218
DOI:10.3844/ajebasp.2011.213.218
Publisher: Science Publications
Abstract: Problem statement: With the rapid development of the World Wide Web (WWW), a huge amount of information is now accessible to web users. This has encouraged academic users to publish their research papers online and to download and share academic papers with one another through the WWW. Categorizing documents manually can take up a considerable amount of a user's time, because the user has to read each document to decide which category it belongs to. Approach: Our study proposes using a set of terms stored in a database to categorize computer science papers. The categorizer agent focuses on categorizing a text document into predetermined categories based on the extracted keywords. Results: We evaluated our document categorizer agent on a number of computer science papers. Categorization is done by parsing the document, calculating the frequency of each term and matching the terms against those found in the database. Conclusion: The Categorizer Agent proposed in this paper was evaluated as a good approach to categorizing electronic papers. Moreover, the results indicate that the use of this term database is a sustainable way to categorize computer science electronic documents.
Keywords: Artificial intelligence; information retrieval; document categorization; data mining; Support Vector Machine (SVM); Artificial Neural Networks (ANN); Generalized Instance Set (GIS); Self Organizing Map (SOM)
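
The three steps named in the abstract (parsing the document, calculating term frequencies, matching terms against a database) might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' implementation: the in-memory `TERM_DB` dictionary, its categories and terms, the regex tokenizer, and the sum-of-frequencies scoring rule are all hypothetical stand-ins, since the abstract does not describe the actual database contents or scoring.

```python
import re
from collections import Counter

# Hypothetical term database (category -> set of terms). The abstract does
# not list the real database contents, so these entries are placeholders.
TERM_DB = {
    "machine learning": {"classifier", "training", "neural", "kernel"},
    "databases": {"query", "transaction", "index", "schema"},
}


def term_frequencies(text):
    """Parse the text into lowercase alphabetic tokens and count each one."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def categorize(text, term_db=TERM_DB):
    """Score each category by the summed frequency of its matching terms.

    Returns (best_category, scores). best_category is None when no term
    from the database occurs in the text. The scoring rule is an assumption.
    """
    freqs = term_frequencies(text)
    scores = {
        category: sum(freqs[term] for term in terms)
        for category, terms in term_db.items()
    }
    best = max(scores, key=scores.get)
    return (best if scores[best] > 0 else None), scores
```

Calling `categorize(text)` on the plain text of a paper returns the highest-scoring category together with the per-category scores, which makes it easy to inspect why a document was assigned where it was. A real system would likely add stop-word removal and stemming before matching.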