
Basic Article Information

  • Title: Web Page Classification Using Relational Learning Algorithm and Unlabeled Data
  • Authors: Li, Yanjuan; Guo, Maozu
  • Journal: Journal of Computers
  • Print ISSN: 1796-203X
  • Publication year: 2011
  • Volume: 6
  • Issue: 3
  • Pages: 474-479
  • DOI: 10.4304/jcp.6.3.474-479
  • Language: English
  • Publisher: Academy Publisher
  • Abstract: This paper investigates applying relational tri-training (R-tri-training for short) to web page classification. R-tri-training, a new relational semi-supervised learning algorithm, is well suited to learning in web page classification. Its semi-supervised component allows it to exploit unlabeled web pages to improve learning performance, and its relational component is able to describe how neighboring web pages are related to each other by hyperlinks. Experiments on the Web-Kb dataset show that: 1) R-tri-training can use a large amount of unlabeled web pages (the unlabeled data) to improve the performance of the learned hypothesis; 2) R-tri-training performs better than the other algorithms it was compared with.
  • Keywords: web page classification; relational tri-training; relational learning; tri-training; co-training
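
The abstract's semi-supervised idea can be illustrated with a minimal sketch of a generic tri-training loop. This is not the authors' R-tri-training: the relational learner is replaced by a toy nearest-centroid classifier over numeric feature vectors, the error-rate checks of the original tri-training algorithm are omitted, and the bootstrap is stratified by class to keep the toy example stable. All names (`NearestCentroid`, `tri_train`, `predict`), the feature vectors, and the labels are hypothetical.

```python
import random
from collections import Counter


class NearestCentroid:
    """Toy base learner: an assumed stand-in, not the paper's relational learner."""

    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for features, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(features))
            for i, value in enumerate(features):
                acc[i] += value
        self.centroids = {
            label: [s / counts[label] for s in acc] for label, acc in sums.items()
        }
        return self

    def predict(self, features):
        def sq_dist(label):
            return sum((a - b) ** 2 for a, b in zip(features, self.centroids[label]))
        return min(self.centroids, key=sq_dist)


def stratified_bootstrap(y, rng):
    """Resample indices with replacement inside each class, so no class vanishes."""
    by_label = {}
    for i, label in enumerate(y):
        by_label.setdefault(label, []).append(i)
    return [rng.choice(idx) for idx in by_label.values() for _ in idx]


def tri_train(X_lab, y_lab, X_unlab, max_rounds=10, seed=0):
    rng = random.Random(seed)
    # Start with three learners trained on different resamples of the labeled data.
    learners = []
    for _ in range(3):
        idx = stratified_bootstrap(y_lab, rng)
        learners.append(
            NearestCentroid().fit([X_lab[i] for i in idx], [y_lab[i] for i in idx])
        )

    previous = None
    for _ in range(max_rounds):
        preds = [[h.predict(x) for x in X_unlab] for h in learners]
        pseudo = []
        for i in range(3):
            j, k = [m for m in range(3) if m != i]
            # Learner i receives the unlabeled examples on which the other two agree.
            pseudo.append(
                [(x, pj) for x, pj, pk in zip(X_unlab, preds[j], preds[k]) if pj == pk]
            )
        if pseudo == previous:
            break  # the pseudo-labeled sets stopped changing
        previous = pseudo
        # Retrain each learner on all labeled data plus its own pseudo-labeled set.
        learners = [
            NearestCentroid().fit(
                X_lab + [x for x, _ in extra], y_lab + [lab for _, lab in extra]
            )
            for extra in pseudo
        ]
    return learners


def predict(learners, features):
    """Final prediction: majority vote of the three learners."""
    votes = Counter(h.predict(features) for h in learners)
    return votes.most_common(1)[0][0]


# Hypothetical two-class toy data; real web page features would differ.
X_lab = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
y_lab = ["course"] * 3 + ["faculty"] * 3
X_unlab = [[0.5, 0.5], [1, 1], [10.5, 10.5], [9, 9]]
learners = tri_train(X_lab, y_lab, X_unlab)
print(predict(learners, [0.2, 0.3]))
```

The point of the sketch is only the data flow: each learner is retrained on labels that the other two agree on, which is how unlabeled pages can contribute to the final hypothesis.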