Journal title: International Journal of Advanced Computer Science and Applications (IJACSA)
Print ISSN: 2158-107X
Electronic ISSN: 2156-5570
Publication year: 2017
Volume: 8
Issue: 7
DOI: 10.14569/IJACSA.2017.080715
Publisher: Science and Information Society (SAI)
Abstract: Real-time analytics is the capacity to extract valuable insights from data that arrives continuously from web activity or network sensors. It is widely used in web-based business to drive decisions based on users' experiences, such as dynamic pricing and personalized advertising. Many universities have adopted web-based learning in their teaching. They use data-mining techniques to better understand students' behavior, but most of the tools developed rely on historical, stored data and do not allow real-time reactivity. Learners' online activities generate, at high speed, a huge amount of data in the form of user interactions, which has all the characteristics of Big Data. Dealing with the volume and velocity of this data, so that decision-makers are informed and able to act at the right time, leads us to use new methods to capture e-learning data and process it in real time. This paper focuses on the design and implementation of a modern, hybrid real-time data pipeline architecture that uses Apache Flume to collect data, Apache Spark as a unified computation engine for performing analytics on student activity data, and Apache Hive as a data warehouse for storing the processed data and serving various reporting tools. To design this platform, we conduct an experiment using a Moodle database as the data source.
Keywords: Real time analytics; e-learning; big data; Hadoop; Spark; Moodle; change data capture; streaming; data visualization clustering