首页    期刊浏览 2024年12月01日 星期日
登录注册

文章基本信息

  • 标题:Optimization approach of large downloadable data files from SCADA systems.
  • 作者:Boca, Loredana ; Croitoru, Bogdan ; Ileana, Ioan
  • 期刊名称:Annals of DAAAM & Proceedings
  • 印刷版ISSN:1726-9679
  • 出版年度:2009
  • 期号:January
  • 语种:English
  • 出版社:DAAAM International Vienna
  • 摘要:Nowadays, process control performance monitoring software has become an important tool in the control engineer's toolbox. Still, the number of performance tests and statistics that can be calculated for any given control loop can be overwhelming. One concerning problem is the permanently increasing number of data. These data become even more complex to be gathered and the management is also very difficult. Even more difficult is to make results readable and meaningful. For example a SCADA system produce more than 900 million data gathered from less than 100 sensors in a few years (Widom, 2008).
  • 关键词:Data collection;Data entry;Downloading;Mathematical optimization;Optimization theory

Optimization approach of large downloadable data files from SCADA systems.


Boca, Loredana ; Croitoru, Bogdan ; Ileana, Ioan 等


1. INTRODUCTION

Nowadays, process control performance monitoring software has become an important tool in the control engineer's toolbox. Still, the number of performance tests and statistics that can be calculated for any given control loop can be overwhelming. One concerning problem is the permanently increasing number of data. These data become even more complex to be gathered and the management is also very difficult. Even more difficult is to make results readable and meaningful. For example a SCADA system produce more than 900 million data gathered from less than 100 sensors in a few years (Widom, 2008).

The present paper describes the management of SCADA downloadable data and how can be possible to optimize the process of recording data from sensors. We propose an optimization method for data management that gives us the possibility to monitor the evolution of a sensor for a long period of time. Every sensor from the SCADA system is measuring different environmental parameters and the recorded values are stored in CSV or Excel files. These kinds of files are difficult to manage because each file contains millions of data (usually around 1.5 -3 millions of values). Also, each file contains data from many sensors (typically up to 40 sensors). Sampling rate is the average value of 1000 readings/ min (0.06 Hz), but we have some situations where 10 or 100 Hz sampling rates are requested. According to this situation, in a month a sensor has stored (produced) ~44000 records. The average number of values in every file is ~1.5 million, resulting in a very large CSV file (~7 MB) (McGowan, 2008).

To control the evolution process of one sensor on a period of time is very difficult, because we need a graphic tool for this. Each file contains sensor data for a month, but a sensor can be found in multiple Excel files, so, it's even more difficult to monitor the evolution of that sensor during months of working. In our case, each Excel file consists in: a header containing sensor name, engineering units, description, status, data type, size (bytes/value), number of values; and the recorded sensors values, date, time and status for each value (Fig. 1).

We have found a solution for data evolution management and for monitoring sensors behavior. We chose to work with Access Databases, because we considered that querying parameters are powerful tools for extracting relevant data of any sensor and data monitoring.

[FIGURE 1 OMITTED]

The Excel files were converted in Access Databases following some criteria. Because we are dealing with two types of Excel files, we are forced to use different criteria for each type. MDB files offer the possibility to use queries for extracting certain details that we need (Ye & Heidemann, 2006).

2. RELATED WORK

Three study cases (Fernandez, 1994; Heidemann & Ye, 2006; Ye & Heidemann, 2006), tried to solve, using different approaches, some S.C.A.D.A problems regarding the management of large data files, data which had been gathered through telemetry for system evaluation and maintenance management. To handle the large volume of data collection, evaluation, and report, they have expanded its existing software as part of an S.C.A.D.A Wide Area Network that is operating on several mini computers Module. This provides capability of quickly access in semi real-time data mode.

Another related research describes the openness and support for disclosure of future S.C.A.D.A data used to improve safety, reliability, and security for networked embedded control of physical systems. Open protocols, with new approaches to access control and in-network processing are needed to make greater openness and viable data sharing (Widom, 2008; Youqiang, 2009).

3. IMPLEMENTATION

We have designed a test panel using Excel files and Access Database files for measuring computing resources, because we want to understand the need of building a high level application, which solves the problem of computing resource and large data management (Tab. 1.).

We have implemented two VB.NET applications that convert Excel files into MDB files for solving large data files management.

The first application (Tag Converter--Fig. 2) is creating, as a first step, four working directories. The internal activity of the application's algorithms is based on full paths of the files involved in the conversion process (Fig. 2).

Non_Processed_Excel_Files directory will contain the non-processed large data Excel files ready to be converted.

Processed_Excel_Files directory will contain Excel files that have been processed. Every processed Excel file is moved and renamed from Non_Processed_Excel_Files directory into Processed_Excel_Files directory. The renaming files process is made by extracting the first data (year and month) of the first record from every Excel file.

Archive_MDB directory contains a list of full paths of Excel files that will be converted into MDB files.

Database_MDB directory contains the final MDB files as a result of Excel files conversion process. The conversion process ratio is 1:1, which means one MDB file is corresponding to one Excel file. Each MDB file contains five fields of data corresponding to: tag name (or sensor name), description of the sensor, data, value and status.

The conversion process takes at least one hour and a half to be performed, but the advantage is the possibility to apply querying parameters to MDB files to analyze the sensor evolution. A 70 MB Excel file containing ~4 million of records was converted in ~2 hours.

The second VB.NET application is Tag_Graph (Fig. 3). This application is loading an MDB database, previously created by Tag_Converter (Fig. 2) application. By applying a complex querying algorithm to database, the application displays a graph of sensor values on a specified period of time (Fig. 4) for observing the sensor evolution. The complex query consists in: starting date, ending date, starting time, ending time, and tag name by description.

[FIGURE 2 OMITTED]

[FIGURE 3 OMITTED]

4. CONCLUSIONS

One concern of SCADA systems is the management of the millions of gathered data. We fulfilled the main goal of the paper, which is an approach in optimization and management of downloaded SCADA sensor data, by implementing, in a reliable mode, two software applications that helps to optimize the work with large data files. Also, with the designed management system of data we are able to easily monitor the evolution of a specified sensor on a period of time using a complex graphic.

There are also disadvantages: converting data from Excel files to MDB files is taking lot of time and powerful hardware resources are needed; the conversion and querying operations are a hardware resources consuming. We propose to be used parallel or multi-server computation to reduce the processing time and speed up files conversion and querying processes.

As a future work we propose two goals: one is to optimize the number, efficiency and relevance of SCADA gathered data by using new sensors sampling rates based on Nyquist principles, and the second goal is the implementation of a software application for parallel computation (multi server) to reduce data processing time.

5. REFERENCES

Fernandez, R. B. (1994). SCADA data integration with facilities management in Miami-Dade, Available from: http://libraries.maine.edu/Spatial/gisweb/spatdb/amfm/am9 4015.html Accessed: 2009-04-25

Heidemann, J. & Ye, W. (2006). Towards Full-disclosure: Broadening Access to SCADA Data to Improve Safety, Reliability, and Security, Available from: http://www.isi.edu/~johnh/PAPERS/Heidemann06e.pdf Accessed: 2009-03-14

McGowan R. (2008). AMPLA--Turning SCADA Data into Information, Goulburn Valley Water, Available from: http://www.wioa.org.au/conference_papers/08_vic/ documents/RyanMcGowan.pdf Accessed: 2009-05-20

Widom, J. (2008). TRIO--A system for data, uncertainty, and lineage, Available from: http://ilpubs.stanford.edu:8090/843/1/2008-27.pdf Accessed: 2009-05-10

Ye, W. & Heidemann, J. (2006). Enabling Interoperability and Extensibility of Future SCADA Systems, Available from: http://www.isi.edu/~johnh/PAPERS/Ye06c.pdf Accessed: 2009-04-28

Youqiang Guo, Zijun Zhang, Xuezhu Pei, A Research on Instability of Small Flow in SCADA and an Optimizing Design for Control, In: Computational Science and Its Applications--ICCSA 2009, Gervasi, O., 653-663, Springer Berlin / Heidelberg, ISBN 978-3-642-02456-6, Berlin
Tab. 1. Allocated Memory & Required CPU usage regarding
different size of Excel and MDB files

Sensor DATA Allocated CPU Usage
Capacity [kBytes] Memory (KB)

200kB Excel 28.460 40%
Application

2 MB Excel 33.072 43%
Application

200MB Excel 375.845 99%
Application

200kB MDB File 10.520 13%

2 MB MDB File 11.080 14%

200 MB MDB File 11.112 19%
联系我们|关于我们|网站声明
国家哲学社会科学文献中心版权所有