Journal: International Journal of Software Engineering and Its Applications
Print ISSN: 1738-9984
Publication year: 2016
Volume: 10
Issue: 1
Pages: 35-42
DOI: 10.14257/ijseia.2016.10.1.04
Publisher: SERSC
Abstract: Predictive modeling is the process of building a statistical model from data in order to predict future behavior. In recent years, the amount of available data has increased exponentially, and "Big Data Analysis" is expected to be at the core of most future innovations. Because the field of data analysis has developed so rapidly, there is still no consensus on how predictive modeling problems should be approached in general. Another innovation in predictive modeling is the use of data analysis competitions for model selection. This competitive approach is interesting and seems fruitful, but one could ask whether the framework provided by, for example, the Gane Project, which is based on a big data framework, faithfully resembles real-world predictive modeling problems. In this thesis, we state and test a set of hypotheses about predictive modeling, both in general and in the scope of data analysis competitions. We then describe a conceptual big data framework for approaching predictive modeling problems. To test the validity and usefulness of this framework, we participate in a series of predictive modeling competitions on the platform provided by Gane and describe our approach to these competitions.
Keywords: Conceptual Predictive Modeling; Gane System; Big Data Framework