
Article Information

  • Title: Classification of Web Search Results Using Modified Naïve Bayesian Approach
  • Authors: Bhagyesh P. Asatkar; Prof. K. P. Wagh; Dr. P. N. Chatur
  • Journal: International Journal of Innovative Research in Computer and Communication Engineering
  • Print ISSN: 2320-9798
  • Electronic ISSN: 2320-9801
  • Publication year: 2017
  • Volume: 5
  • Issue: 5
  • Page: 9785
  • DOI: 10.15680/IJIRCCE.2017.0505184
  • Publisher: S&S Publications
  • Abstract: The primary purpose of data mining is to extract information from huge amounts of raw data, so obtaining the useful data from the large amount available is necessary. Web document classification includes the classification of web snippets into different categories based on their content. The classes into which the pages are classified are predefined. The web snippets from the first three pages of Google results are extracted and preprocessed. Preprocessing includes tokenisation and the reduction of redundant and irrelevant data. After the preprocessing of the web snippets, a Modified Naïve Bayesian approach is used to classify the snippets into the predefined categories. The Modified Naïve Bayes classifier calculates the probability of each word with respect to each class, and each page is assigned to the predefined class with the highest posterior probability. By using snippets as input, we managed to reduce the required classification time by up to 49.04%, with an F-measure of 93.79% and an accuracy of up to 96.01%. An analysis of the system reveals that the snippet classification system works well even when the number of snippets is increased.
  • Keywords: Modified Naïve Bayesian Classifier; Quick Reduct Algorithm; Tokenization; F-measure
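
The pipeline described in the abstract (tokenise each snippet, estimate the probability of each word per class, assign the class with the highest posterior) can be sketched roughly as follows. This is a minimal illustration using a standard multinomial Naïve Bayes with Laplace smoothing, not the authors' modified classifier, whose details this record does not give. The Quick Reduct step is omitted, and the stopword list, the function names `tokenize`, `train` and `classify`, and the sample snippets are all assumptions made for the example.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stopword list (assumption for illustration).
STOPWORDS = {"a", "an", "the", "of", "and", "in", "to", "for", "is", "on"}


def tokenize(snippet):
    """Lowercase the snippet, split on non-letters, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", snippet.lower()) if t not in STOPWORDS]


def train(labeled_snippets):
    """Count snippets per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_snippets:
        class_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def classify(snippet, model):
    """Return the class with the highest log posterior for the snippet."""
    class_counts, word_counts, vocab = model
    total_snippets = sum(class_counts.values())
    tokens = [t for t in tokenize(snippet) if t in vocab]  # ignore unseen words
    best_label, best_score = None, -math.inf
    for label in class_counts:
        # Log prior: fraction of training snippets in this class.
        score = math.log(class_counts[label] / total_snippets)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            # Laplace-smoothed estimate of P(word | class).
            score += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training snippets with two predefined classes.
training = [
    ("Football match score goal", "sports"),
    ("Team wins the league cup", "sports"),
    ("New smartphone processor released", "technology"),
    ("Software update for laptop", "technology"),
]
model = train(training)
print(classify("League football goal", model))
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied, and the add-one smoothing keeps a word that never appeared in a class from driving that class's posterior to zero.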