
Basic article information

  • Title: Enactment of Medium and Small Scale Enterprise ETL (MaSSEETL): an Open Source Tool
  • Authors: Rupali Gill; Jaiteg Singh
  • Journal: International Journal of Computer Science and Information Technologies
  • Electronic ISSN: 0975-9646
  • Publication year: 2015
  • Volume: 6
  • Issue: 1
  • Pages: 141-147
  • Publisher: TechScience Publications
  • Abstract: Data quality is a major concern in a data warehouse environment. ETL tools focus on detecting and correcting the data quality problems that affect the success of a data warehouse. Data imported from source systems into the data warehouse often differs in quality, format, coding, and so on. Extraction-transformation-loading (ETL) tools are used to bring all of this data together in a standard, homogeneous environment. Proprietary tools used for data cleaning have very limited functionality, and Small and Medium Scale Enterprises (SME) and Small Scale Enterprises (SSE) cannot afford the licensing cost of these paid tools. MaSSEETL, an open source data quality tool, addresses these data quality problems by dealing with naming conflicts, structural conflicts, date conversions, missing values, and changing dimensions. The tool solves the integrity issues faced by various available GPL tools, and it resolves the relevant errors with an appropriate level of warning. This paper presents the implementation of MaSSEETL. The tool provides increased ease of use in a data warehouse environment. General Terms: data warehousing, data cleansing, quality data, dirty data, surrogate keys
  • Keywords: data inconsistency; identification of errors; organization growth; ETL; data quality