Article Information

  • Title: A Survey on HTML Structure Aware and Tree Based Web Data Scraping Technique
  • Authors: Vinayak B. Kadam; Ganesh K. Pakle
  • Journal: International Journal of Computer Science and Information Technologies
  • Electronic ISSN: 0975-9646
  • Year of publication: 2014
  • Volume: 5
  • Issue: 2
  • Pages: 1655-1658
  • Publisher: TechScience Publications
  • Abstract: A vast amount of information is available on the web. Data analysis applications, such as extracting mutual fund information from a website or extracting a stock's daily opening and closing prices from a web page, involve web data extraction. Many researchers have worked to automate the process of web data scraping. Many techniques depend on the structure of the web page, i.e. its HTML structure or DOM tree structure, to scrape data from the page. In this paper we present a survey of HTML-aware web scraping techniques.
  • Keywords: DOM tree; HTML structure; semi-structured; web pages; web scraping; web data extraction