期刊名称:International Journal of Advanced Computer Science and Applications(IJACSA)
印刷版ISSN:2158-107X
电子版ISSN:2156-5570
出版年度:2020
卷号:11
期号:9
DOI:10.14569/IJACSA.2020.0110977
出版社:Science and Information Society (SAI)
摘要:The real-time monitoring and tracking systems play a critical role in the healthcare field. Wearable medical devices with sensors, mobile applications, and health cloud have continuously generated an enormous amount of data, often called streaming big data. Due to the higher speed of the streaming data, it is difficult to ingest, process, and analyze such huge data in real-time to make real-time actions in case of emergencies. Using traditional methods that are inadequate and time-consuming. Therefore, there is a significant need for real-time big data stream processing to guarantee an effective and scalable solution. So, we proposed a new system for online prediction to predict health status using Spark streaming framework. The proposed system focuses on applying streaming machine learning models (i.e. streaming linear regression with SGD) on streaming health data events ingested to spark streaming through Kafka topics. The experimental results are done on the historical medical datasets (i.e. diabetes dataset, heart disease dataset, and breast cancer dataset) and generated dataset which is simulated to wearable medical sensors. The historical datasets have shown that the accuracy improvement ratio obtained using the diabetes disease dataset is the highest one with respect to the other two datasets with an accuracy of 81%. For generated datasets, the online prediction system has achieved accuracy with 98% at 5 seconds window size. Beyond this, the experimental results have proofed that the online prediction system can online learn and update the model according to the new data arrival and window size.