Article Information

  • Title: A HVS MODEL FOR REPRESENTATION OF DOMAIN-ORIENTED WEB PAGE TOPIC FEATURES
  • Authors: XIANGHUA WU ; QIAO GUO ; LEI LA
  • Journal: Journal of Theoretical and Applied Information Technology
  • Print ISSN: 1992-8645
  • Electronic ISSN: 1817-3195
  • Publication year: 2012
  • Volume: 45
  • Issue: 2
  • Pages: 710-715
  • Publisher: Journal of Theoretical and Applied
  • Abstract: Domain-oriented web page extraction is a new and practical direction in the field of information extraction. This paper focuses on the representation of domain-oriented web page topic features and puts forward a hierarchical vector space (HVS) model. By taking into account the hierarchical characteristics of the web page itself, the HVS model expresses the page's topic features more effectively in terms of both page structure and content. The problem of identifying topic-related pages is then addressed through similarity calculation. Experimental results show that the system achieves good accuracy and applicability for domain-oriented web extraction.
  • Keywords: Domain-Oriented; Hierarchical Vector Space Model; Information Extraction
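
The abstract describes representing a page's topic features at several structural levels and identifying topic-related pages by similarity calculation. This record does not give the paper's actual formulas, so the following is only a minimal sketch of one way such a scheme could look. The level names (`title`, `headings`, `body`), the weights, and the functions `term_vector`, `cosine`, and `hvs_similarity` are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical structural levels and weights (assumptions, not from the paper).
LEVEL_WEIGHTS = {"title": 0.5, "headings": 0.3, "body": 0.2}


def term_vector(text):
    """Build a bag-of-words term-frequency vector for one level of a page."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity of two sparse term vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def hvs_similarity(page, topic, weights=LEVEL_WEIGHTS):
    """Weighted sum of per-level cosine similarities between a page and a topic.

    `page` and `topic` map level names to raw text; missing levels count as empty.
    """
    return sum(
        w * cosine(term_vector(page.get(level, "")), term_vector(topic.get(level, "")))
        for level, w in weights.items()
    )
```

Under these assumptions, a page would be treated as topic-related when `hvs_similarity(page, topic)` exceeds some chosen threshold. Weighting the title more heavily than the body is one plausible design choice, not a result reported in this record.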