Journal: International Journal of Software Engineering and Its Applications
Print ISSN: 1738-9984
Publication year: 2015
Volume: 9
Issue: 2
Pages: 201-210
DOI: 10.14257/ijseia.2015.9.2.17
Publisher: SERSC
Abstract: Previous load balancing methods for Flume depend entirely on a user-specified threshold and therefore do not adapt to changes in the performance of the overall log processing system at runtime. Furthermore, their task-transfer algorithms aggravate the performance degradation of an overloaded node, because the excess data must be transferred to another node by the overloaded node itself. In this paper, we propose a new load balancing method for Apache Flume that automatically and dynamically adjusts the threshold on node load status according to the runtime performance of the system. This is realized by monitoring, in a decentralized manner, both the rate of increase of incoming log information in the queue of each collector agent and the queue's occupancy rate, at the request of an overloaded or under-loaded collection node. The proposed method considerably reduces the additional overhead incurred by task migration and makes the load across the entire system as fair as possible by selecting the optimal task migration destination based on the current load-state values of the collector agents, unlike previous round-robin and random selection methods.
Keywords: Data intensive processing; Data collection; Apache Flume; Agent; Load balancing
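
Illustrative sketch: the abstract names three ingredients (a load state derived from queue occupancy and queue growth, a threshold that follows the system's runtime state, and a load-aware choice of migration destination) but gives no formulas. The Python below is only a minimal sketch of how such pieces might fit together, not the paper's algorithm and not Flume code. Every name and rule in it is an assumption made for illustration: the `Collector` class, the `growth_weight` and `margin` parameters, the "occupancy plus weighted growth" score, and the "mean load plus margin" threshold. It also evaluates all collectors in one place for brevity, whereas the abstract describes a decentralized, request-driven exchange.

```python
from dataclasses import dataclass


@dataclass
class Collector:
    """Hypothetical snapshot of one collector agent's queue."""
    name: str
    queue_len: int       # events currently waiting in the queue
    capacity: int        # maximum number of events the queue can hold
    prev_queue_len: int  # queue length at the previous sample

    @property
    def occupancy(self) -> float:
        # Fraction of the queue that is currently in use.
        return self.queue_len / self.capacity

    @property
    def growth(self) -> float:
        # Change in queue length since the last sample, as a fraction of capacity.
        return (self.queue_len - self.prev_queue_len) / self.capacity

    def load(self, growth_weight: float = 0.5) -> float:
        # Assumed load score: occupancy plus weighted growth (shrinking queues add nothing).
        return self.occupancy + growth_weight * max(self.growth, 0.0)


def adaptive_threshold(collectors, margin=0.1):
    # Assumed rule: the overload threshold tracks the current mean load of the
    # system instead of being a fixed, user-specified constant.
    loads = [c.load() for c in collectors]
    return sum(loads) / len(loads) + margin


def pick_destination(collectors, source):
    # Load-aware choice: the least-loaded collector other than the source,
    # in contrast to round-robin or random selection.
    candidates = [c for c in collectors if c is not source]
    return min(candidates, key=lambda c: c.load())


def plan_migration(collectors):
    # Return (overloaded collector, chosen destination) name pairs.
    threshold = adaptive_threshold(collectors)
    return [
        (c.name, pick_destination(collectors, c).name)
        for c in collectors
        if c.load() > threshold
    ]


collectors = [
    Collector("c1", queue_len=900, capacity=1000, prev_queue_len=700),
    Collector("c2", queue_len=200, capacity=1000, prev_queue_len=200),
    Collector("c3", queue_len=400, capacity=1000, prev_queue_len=450),
]
print(plan_migration(collectors))  # [('c1', 'c2')]
```

With these sample values only `c1` exceeds the derived threshold, and the least-loaded collector `c2` is chosen as its destination. The sketch does not model the cost of the transfer itself, which the abstract identifies as a burden on the overloaded node.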