Basic article information

  • Title: An Automated Analytics Engine for College Program Selection using Machine Learning and Big Data Analysis
  • Authors: Jinhui Yu; Xinyu Luan; Yu Sun
  • Journal: Computer Science & Information Technology
  • Electronic ISSN: 2231-5403
  • Publication year: 2021
  • Volume: 11
  • Issue: 14
  • Language: English
  • Publisher: Academy & Industry Research Collaboration Center (AIRCC)
  • Abstract: Because school websites differ in structure and content, international applicants often find it difficult to obtain each school's application information in time, and they spend a lot of time collecting and sorting it manually. Because schools may update this information constantly, what an applicant has collected can quickly become inaccurate. To solve this problem, we designed a tool with three main steps: crawling links, processing web pages, and building the display pages. The system is written mainly in Python and stores the crawled data in JSON format [4]. First, we use a crawler, with Beautiful Soup parsing the HTML, to fetch all the links related to admission information on a school's official website. Then we traverse these links and apply the noise_remove [5] method to the corresponding page contents, narrowing them down to the useful information, and save the processed contents in JSON files. Finally, we use the Flask framework to integrate these contents into the front-end page, so that the tool can both aggregate and display the information.
  • Keywords: Data Crawler; Data Processing; Web Framework
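
The first two steps named in the abstract (collecting admission-related links, then stripping noise from each page before saving it as JSON) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the standard-library `html.parser` stands in for Beautiful Soup so the sketch has no dependencies, `strip_noise` is a hypothetical stand-in for the paper's noise_remove method (whose details the abstract does not give), and the keyword list, sample HTML, and `university.example` URL are invented for the example. The Flask display step is left out.

```python
import json
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical keywords; the abstract does not say how admission links are recognised.
ADMISSION_KEYWORDS = ("admission", "apply", "application")


class LinkCollector(HTMLParser):
    """Collects (absolute URL, anchor text) pairs from <a href=...> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


class TextExtractor(HTMLParser):
    """Keeps visible text and skips tags that rarely hold page content."""

    SKIPPED = {"script", "style", "nav", "footer"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def find_admission_links(html, base_url):
    """Step 1: return de-duplicated links whose URL or text mentions admissions."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    found = []
    for url, text in collector.links:
        haystack = (url + " " + text).lower()
        if any(k in haystack for k in ADMISSION_KEYWORDS) and url not in found:
            found.append(url)
    return found


def strip_noise(html):
    """Step 2 (stand-in for noise_remove): reduce a page to its visible text."""
    extractor = TextExtractor()
    extractor.feed(html)
    return " ".join(extractor.chunks)


def build_record(url, html):
    """Package one processed page as a JSON-serialisable dict."""
    return {"url": url, "content": strip_noise(html)}


# Invented sample page; a real run would download each page first.
SAMPLE = """<html><body><nav><a href="/about">About</a></nav>
<a href="/admissions/international">International Admissions</a>
<a href="/news">News</a>
<script>var x = 1;</script><p>Deadline: January 15.</p></body></html>"""

if __name__ == "__main__":
    links = find_admission_links(SAMPLE, "https://university.example")
    print(links)
    print(json.dumps(build_record("https://university.example/", SAMPLE), indent=2))
```

In a real crawler each link returned by `find_admission_links` would be downloaded and passed through `build_record`, and the resulting records written to a JSON file for the web front end to read.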