
Basic Article Information

  • Title: APPLICATION OF DOCUMENTS CATEGORIZATION AND SIMILARITY LEVEL MEASUREMENT USING KEYWORDS ON SCIENTIFIC WRITING AT UNIVERSITY OF GUNADARMA
  • Authors: Adhit Herwansyah ; Sulistyo Puspitodjati
  • Journal: Faculty of Computer Science and Information Technology
  • Publication year: 2009
  • Volume: 0
  • Issue: 0
  • Language: English
  • Publisher: Faculty of Computer Science and Information Technology
  • Keywords: Document Categorization, Document Similarity, Text Mining, TF-IDF, Vector Space Model
  • Abstract: As Gunadarma University grows, its students produce a large volume of scientific writing. A human can easily categorize a piece of scientific writing by hand, but doing the same by computer raises problems of its own. The same holds for document similarity: a person can readily judge whether one document resembles another. This study therefore builds a tool that categorizes a document and measures its level of similarity to other documents automatically. Text mining techniques are used to categorize the scientific writing. The similarity between a document and other documents is then computed from the keywords produced by the categorization step, using the TF-IDF (Term Frequency - Inverse Document Frequency) algorithm and the Vector Space Model. The aim is for the computerized categorization to agree with the results of manual categorization, and for the similarity measurement to show how similar a document is to other documents.
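
The abstract names TF-IDF weighting and the Vector Space Model but gives no implementation details. The following is a minimal Python sketch of the general technique, not the authors' tool. It assumes whitespace tokenization, length-normalized term frequency, an unsmoothed logarithmic IDF, and cosine similarity as the vector-space measure. The function names `tokenize`, `tf_idf_vectors`, and `cosine_similarity` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace; the paper's
    # preprocessing (stemming, stop-word removal) is not described.
    return text.lower().split()


def tf_idf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    # Assumption: plain idf = log(N / df), with no smoothing.
    idf = {term: math.log(n_docs / count) for term, count in df.items()}

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        # Term frequency is normalized by document length.
        vectors.append({t: (c / total) * idf[t] for t, c in counts.items()})
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Under these assumptions, two documents that share weighted keywords score above zero, while documents with no terms in common score exactly zero. A term that occurs in every document receives an IDF of zero and so does not contribute to the similarity.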