
Basic article information

  • Title: Architecting an Enterprise Data Lake, A Covid19 Case Study
  • Authors: Bushra ; Mohsin Ali Memon ; Salahuddin Saddar
  • Journal: Journal of Software
  • Print ISSN: 1796-217X
  • Publication year: 2021
  • Volume: 16
  • Issue: 4
  • Pages: 174-181
  • DOI: 10.17706/jsw.16.4.174-181
  • Language: English
  • Publisher: Academy Publisher
  • Abstract: Data is increasing at an enormous rate every day. Traditionally, data has resided in silos across an organization, so it is difficult to get a complete picture for data-driven business decision making. A data lake addresses the rapid growth of data by providing “schema on read”, better integration, and cheaper storage. It also solves the data silo problem by providing a central platform for a variety of data housing needs. However, implementing a data lake is challenging because the implementation must address additional needs such as metadata management, data discovery, data governance, data lifecycle management, security, and centralized access control mechanisms. This paper intends to provide a comprehensive data lake architecture that addresses these challenges. We have also conducted and documented experiments with publicly available datasets about COVID19 to validate the design and applicability of the proposed architecture for business analytics purposes.
  • Keywords: Big data; data lake; data governance; data lake management; serverless architecture.