
Article Information

  • Title: Survey on Web Page Noise Cleaning for Web Mining
  • Authors: S. S. Bhamare; Dr. B. V. Pawar
  • Journal: International Journal of Computer Science and Information Technologies
  • Electronic ISSN: 0975-9646
  • Publication year: 2013
  • Volume: 4
  • Issue: 6
  • Pages: 766-770
  • Publisher: TechScience Publications
  • Abstract: Web page noise cleaning is a new research area concerned with removing noise patterns from web pages so that web mining becomes more effective. The World Wide Web contains a large number of web pages that are accessible to users. Compared with conventional data or text, web pages generally contain a large amount of noise, that is, information that is not part of a page's main content, e.g., advertisement banners, navigation bars, and disclaimer/copyright notices. The main objective of this area is to remove such irrelevant information (i.e., web page noise, or local noise), which can seriously harm web mining tasks such as clustering and classification. This paper reviews and discusses the major research work done in this area and identifies its challenges and open issues.
  • Keywords: WWW; Web Page Cleaning; Noise Block; DOM Tree; Web Mining; Web pages.
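
To make the idea of local noise concrete, the following is a minimal sketch of block-level cleaning, not a method taken from the surveyed paper. It uses Python's standard `html.parser` and drops any subtree whose tag or `id`/`class` value looks like a navigation bar, advertisement banner, or copyright notice. The names `NOISE_TAGS`, `NOISE_HINTS`, `MainTextExtractor`, and `clean_page` are hypothetical, and the keyword heuristics are assumptions chosen only for illustration. The sketch also assumes well-nested markup: an unclosed tag inside a noise block would throw off the depth counter.

```python
from html.parser import HTMLParser

# Assumed heuristics (illustrative only): tags and id/class hints treated as noise.
NOISE_TAGS = {"nav", "footer", "aside", "script", "style"}
NOISE_HINTS = ("advert", "banner", "copyright", "disclaimer", "navbar")
# Void elements never get an end tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class MainTextExtractor(HTMLParser):
    """Collects text that lies outside any block flagged as noise."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0   # > 0 while inside a noise subtree
        self.chunks = []      # text fragments kept as main content

    def _is_noise(self, tag, attrs):
        if tag in NOISE_TAGS:
            return True
        labels = " ".join(
            value.lower() for name, value in attrs
            if name in ("id", "class") and value
        )
        return any(hint in labels for hint in NOISE_HINTS)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        # Once inside a noise block, every nested element deepens the skip.
        if self.skip_depth or self._is_noise(tag, attrs):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self.skip_depth:
            self.chunks.append(text)


def clean_page(html):
    """Return the page text with flagged noise blocks removed."""
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


if __name__ == "__main__":
    page = (
        '<html><body><div class="navbar"><a href="/">Home</a></div>'
        '<p>Main <b>story</b> text.</p>'
        '<div id="ad-banner"><img src="x.png">Buy now</div>'
        '<footer>Copyright 2013</footer></body></html>'
    )
    print(clean_page(page))
```

Keyword matching of this kind is only a rough stand-in for the DOM tree based techniques named in the keywords; it is shown because it fits in a few lines and makes the distinction between main content and noise blocks visible.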