Article Information

  • Title: Hadoop as a Service: Integration of a Company’s Heterogeneous Data to a Remote Hadoop Infrastructure
  • Authors: Yordan Kalmukov; Milko Marinov
  • Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
  • Print ISSN: 2158-107X
  • Online ISSN: 2156-5570
  • Year of publication: 2022
  • Volume: 13
  • Issue: 4
  • DOI: 10.14569/IJACSA.2022.0130406
  • Language: English
  • Publisher: Science and Information Society (SAI)
  • Abstract: Data analysis is very important for the development of any business today. It helps to identify organizational bottlenecks, optimize business processes, and foresee customers’ demands and behavior, and it provides summarized data that could help reduce costs and increase profits. Having this information when designing new products or services greatly increases the chances of their success, and thus provides an additional competitive advantage over other businesses. However, having a single data analyst with a computer is far from enough in the era of big data. There are powerful data analytical software tools, but they are either expensive or hard to deploy, and they require multiple high-performance servers to run. Buying expensive hardware and software, and hiring highly qualified IT experts, is not affordable for all companies, especially for smaller ones and start-ups. Therefore, this article proposes an architecture for integrating a company’s heterogeneous data (stored within a database of any type, or in the file system) into a remote Hadoop cluster, providing powerful data analytical services on demand. This is an affordable and cost-effective cloud-based solution, suitable for a company of any size. Businesses are not required to buy any hardware or software; instead, they use the data analytical services on demand, paying a small processing fee per request or by subscription.
  • Keywords: Hadoop integration; data analytical tools; heterogeneous data integration; Hadoop distributed file system (HDFS); HBase; Hive