Article Information

  • Title: An Efficient Technique for Tag Extraction and Content Retrieval from Web Pages
  • Authors: S. Sathya; Dr. B. Srinivasan
  • Journal: International Journal of Computer Trends and Technology
  • Electronic ISSN: 2231-2803
  • Publication year: 2013
  • Volume: 4
  • Issue: 9-2
  • Publisher: Seventh Sense Research Group
  • Abstract: Web databases contain a large amount of information in the form of structured objects called data records. Web database data extraction automatically extracts the data records encoded in a query result page. These records are important because they present the essential information of their host pages, e.g., lists of products or services. A query result page contains not only the actual data but also other information, such as navigational panels, advertisements, comments, and information about hosting sites. The goal of web database data extraction is to remove irrelevant information from the query result page, extract the query result records (QRRs) from the page, and align the extracted QRRs into a table such that data values belonging to the same attribute are placed in the same table column. The proposed technique handles both attribute-based and content-based retrieval of values from web pages containing structured and unstructured data.
  • Keywords: Web data records; data region identification; record alignment; wrapper; information integration
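
The three goals named in the abstract (discard irrelevant page content, extract the query result records, and align them into a table) can be illustrated with a minimal Python sketch. This is not the authors' technique, which this record does not describe in detail. It assumes a hypothetical page layout in which each record is a `<div class="record">` and each value is a `<span>` carrying a `data-attr` attribute; the class name, the attribute name, and the functions `align_records` and `extract_table` are all invented for illustration.

```python
from html.parser import HTMLParser


class RecordExtractor(HTMLParser):
    """Collects one dict per <div class="record">; the markup names are hypothetical."""

    def __init__(self):
        super().__init__()
        self.records = []     # extracted query result records (QRRs)
        self._current = None  # record being built, or None when outside a record
        self._field = None    # attribute name of the currently open <span>, if any
        self._depth = 0       # <div> nesting depth inside the current record

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self._current is None:
            # Outside a record: everything except a record container is ignored,
            # which drops navigation panels, advertisements, and similar noise.
            if tag == "div" and attrs.get("class") == "record":
                self._current = {}
                self._depth = 1
            return
        if tag == "div":
            self._depth += 1
        elif tag == "span" and "data-attr" in attrs:
            self._field = attrs["data-attr"]

    def handle_endtag(self, tag):
        if self._current is None:
            return
        if tag == "span":
            self._field = None
        elif tag == "div":
            self._depth -= 1
            if self._depth == 0:
                self.records.append(self._current)
                self._current = None

    def handle_data(self, data):
        if self._current is not None and self._field is not None:
            text = data.strip()
            if text:
                previous = self._current.get(self._field, "")
                self._current[self._field] = previous + text


def align_records(records):
    """Place values of the same attribute in the same column; gaps become None."""
    columns = []
    for record in records:
        for name in record:
            if name not in columns:
                columns.append(name)
    rows = [[record.get(name) for name in columns] for record in records]
    return columns, rows


def extract_table(html):
    parser = RecordExtractor()
    parser.feed(html)
    parser.close()
    return align_records(parser.records)


# A toy query result page mixing records with irrelevant content.
PAGE = """
<div class="nav">Home | Login</div>
<div class="record"><span data-attr="title">Book A</span><span data-attr="price">$10</span></div>
<div class="ad">Buy now!</div>
<div class="record"><span data-attr="title">Book B</span><span data-attr="author">Lee</span></div>
"""

if __name__ == "__main__":
    columns, rows = extract_table(PAGE)
    print(columns)
    for row in rows:
        print(row)
```

Real query result pages rarely label their attributes this explicitly, so a practical wrapper would need to infer data regions and attribute alignment from repeated tag structure, as the keywords suggest; the sketch only shows the shape of the input and output.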