
Article Information

  • Title: Research on discovering deep web entries
  • Authors: Wang Ying; Li Huilai; Zuo Wanli
  • Journal: Computer Science and Information Systems
  • Print ISSN: 1820-0214
  • Electronic ISSN: 2406-1018
  • Publication year: 2011
  • Volume: 8
  • Issue: 3
  • Pages: 779-799
  • DOI: 10.2298/CSIS100322028W
  • Publisher: ComSIS Consortium
  • Abstract:

    Ontology plays an important role in locating Domain-Specific Deep Web content. This paper therefore presents WFF, a novel framework for efficiently locating Domain-Specific Deep Web databases based on focused crawling and ontology. WFF arranges three classifiers hierarchically: a Web Page Classifier (WPC), a Form Structure Classifier (FSC), and a Form Content Classifier (FCC). First, the WPC discovers potentially interesting pages using an ontology-assisted focused crawler. Next, the FSC analyzes these pages and determines, from structural characteristics, whether they contain searchable forms. Finally, the FCC identifies the searchable forms that belong to a given domain at the semantic level and stores the URLs of these Domain-Specific searchable forms in a database. A detailed experimental evaluation shows that the WFF framework not only simplifies the discovery process but also effectively identifies Domain-Specific databases.

  • Keywords: Deep Web; ontology; WPC; FSC; FCC
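
The hierarchical filtering described in the abstract (WPC, then FSC, then FCC) can be pictured as a three-stage cascade. The following is only a minimal sketch under loud assumptions, not the authors' implementation: the `Page` and `Form` types, the flat `DOMAIN_TERMS` set standing in for an ontology, the thresholds, and the rule-based checks are all hypothetical placeholders for the trained classifiers in the paper.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class Form:
    input_types: List[str]  # e.g. ["text", "submit"]
    text: str               # labels and option values visible in the form


@dataclass
class Page:
    url: str
    text: str
    forms: List[Form] = field(default_factory=list)


# Hypothetical stand-in for a domain ontology: a flat set of terms.
DOMAIN_TERMS = {"book", "author", "isbn", "publisher", "title"}


def term_overlap(text: str) -> int:
    """Count distinct domain terms that occur as whole words in the text."""
    return len(DOMAIN_TERMS & set(text.lower().split()))


def wpc(page: Page) -> bool:
    """Web Page Classifier stand-in: is the page topically relevant?"""
    return term_overlap(page.text) >= 2  # assumed threshold


def fsc(form: Form) -> bool:
    """Form Structure Classifier stand-in: does the form look searchable?

    Assumed heuristic: a text input and no password field.
    """
    return "text" in form.input_types and "password" not in form.input_types


def fcc(form: Form) -> bool:
    """Form Content Classifier stand-in: does the form belong to the domain?"""
    return term_overlap(form.text) >= 1  # assumed threshold


def locate_entries(pages: Iterable[Page]) -> List[str]:
    """Run the cascade; cheaper page-level filtering comes first."""
    found = []
    for page in pages:
        if not wpc(page):
            continue  # off-topic pages never reach the form classifiers
        if any(fsc(f) and fcc(f) for f in page.forms):
            found.append(page.url)  # stands in for storing the URL in a database
    return found
```

The ordering mirrors the abstract: a page must pass the page-level check before its forms are inspected, and a form must look searchable before its content is compared against the domain vocabulary.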