Article Information

  • Title: Automated Business Goal Extraction from E-mail Repositories to Bootstrap Business Understanding
  • Authors: Marco Spruit; Marcin Kais; Vincent Menger
  • Journal: Future Internet
  • Electronic ISSN: 1999-5903
  • Publication year: 2021
  • Volume: 13
  • Issue: 10
  • Pages: 243
  • DOI: 10.3390/fi13100243
  • Language: English
  • Publisher: MDPI Publishing
  • Abstract: The Cross-Industry Standard Process for Data Mining (CRISP-DM), despite being the most popular data mining process for more than two decades, is known to leave organizations that lack operational data mining experience puzzled and unable to start their data mining projects. This is especially apparent in the first phase, Business Understanding, at the conclusion of which the data mining goals of the project at hand should be specified, which arguably requires at least a conceptual understanding of the knowledge discovery process. We propose to bridge this knowledge gap from a Data Science perspective by applying Natural Language Processing (NLP) techniques to organizations’ e-mail exchange repositories to extract explicitly stated business goals from the conversations, thus bootstrapping the Business Understanding phase of CRISP-DM. Our NLP-Automated Method for Business Understanding (NAMBU) generates a list of business goals, which can subsequently be used for further specification of data mining goals. Validation against the results of manual business goal extraction from the Enron corpus demonstrates the usefulness of our NAMBU method when applied to large datasets.