Journal: International Journal of Computer Science and Network
Print ISSN: 2277-5420
Year: 2016
Volume: 5
Issue: 2
Pages: 271-275
Publisher: IJCSN publisher
Abstract: Web scraping is a software technique for extracting information from websites. Such programs simulate human browsing or exploration of the World Wide Web by issuing low-level Hypertext Transfer Protocol (HTTP) requests. Web scraping collects information from the WWW using a web crawler, and it is a common technique used by most Application Programming Interfaces (APIs). This paper develops an efficient web crawler to fetch recent publications and patents related to the pharmaceutical industry. The crawler fetches all possible information on publications and patents from different pharmaceutical industry websites, based on their recent news, meetings and innovations.
Keywords: Web Crawler; Web Scraping; Hypertext Transfer Protocol.
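
The abstract describes the general pattern (fetch a page over HTTP, extract its links, keep those about publications or patents) without giving an implementation. The following is only a minimal sketch of that pattern using the Python standard library, not the paper's crawler. The names `LinkExtractor`, `extract_links`, `filter_by_keywords` and `fetch`, the keyword list, and the `example-crawler/0.1` user agent are all assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class LinkExtractor(HTMLParser):
    """Collect (absolute URL, anchor text) pairs from an HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []      # finished (url, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Join the fragments and collapse runs of whitespace.
            text = " ".join("".join(self._text).split())
            self.links.append((self._href, text))
            self._href = None


def extract_links(html, base_url):
    """Parse an HTML string and return its links as (url, text) pairs."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


def filter_by_keywords(links, keywords=("publication", "patent")):
    """Keep links whose URL or anchor text mentions any keyword (hypothetical list)."""
    lowered = [k.lower() for k in keywords]
    return [
        (url, text)
        for url, text in links
        if any(k in url.lower() or k in text.lower() for k in lowered)
    ]


def fetch(url, timeout=10):
    """Issue a plain HTTP GET and return the decoded response body."""
    req = Request(url, headers={"User-Agent": "example-crawler/0.1"})
    with urlopen(req, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

A crawl step under these assumptions would be `filter_by_keywords(extract_links(fetch(url), url))`, repeated over a queue of unvisited URLs. A real crawler would also need to respect each site's robots.txt and rate limits.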