Article Information

Title: Extraction of Query Interfaces for Domain-Specific Hidden Web Crawler
Author: Nupur Gupta
Journal: International Journal of Computer Science and Network Security
Print ISSN: 1738-7906
Publication year: 2016
Volume: 16
Issue: 2
Pages: 124-127
Publisher: International Journal of Computer Science and Network Security
Abstract: Web databases are now pervasive. Such a database can be accessed only through its query interface, an HTML query form. Extracting HTML query forms is therefore a major task in Deep Web research. This task can be accomplished by two methods: (a) locating HTML forms on the Web, and (b) recognizing domain-specific forms. HTML forms are located using HTML tags on the PIW (Publicly Indexable Web). Recognizing query forms is essential because many forms are not query forms; non-query forms are used for data access and data collection. This paper presents a novel approach for extracting web query interfaces using query condition rules. Query condition rules are formed from the group labels and form elements in a query form. I have implemented the proposed approach.
Keywords: Hidden Web database; query form extraction; domain-specific search.