Publisher: Academy & Industry Research Collaboration Center (AIRCC)
Abstract: This paper uses a case-based study, “product sales estimation” on real-time data, to understand the applicability of linear and non-linear models. We take a systematic approach to the problem of estimating sales for a given product, applying both linear and non-linear techniques to a data set of features selected from the original data set. Feature selection is a process that reduces the dimensionality of the data set by eliminating those features that contribute minimally to the prediction of the dependent variable. The next step is training the models, which is done using two techniques, one from the linear and one from the non-linear domain, each among the best in its respective area. Data remodeling is then performed to extract new features by changing the structure of the data set, and the performance of the models is checked again. Data remodeling often plays a crucial role in improving classifier accuracy by changing the properties of the data set. We then analyze the reasons why one model proves better than the other and thereby try to develop an understanding of the applicability of linear and non-linear models. While this is our primary goal, we also aim to find the classifier with the best possible accuracy for product sales estimation in the given scenario.
Keywords: Machine Learning; Prediction; Linear and Non-linear models; Linear Regression; Random Forest; Dimensionality Reduction; Feature Selection; Homoscedasticity.
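
The feature selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which selection method the authors used, so this example assumes a simple correlation filter: a feature is kept only if its absolute Pearson correlation with the dependent variable reaches a threshold. The column names (`price`, `shelf_area`, `noise`), the synthetic sales formula, and the threshold of 0.3 are all hypothetical and chosen only for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        # A constant column carries no information about the target.
        return 0.0
    return cov / math.sqrt(vx * vy)


def select_features(columns, target, threshold=0.3):
    """Return names of features whose |correlation| with target >= threshold."""
    return [
        name
        for name, values in columns.items()
        if abs(pearson(values, target)) >= threshold
    ]


# Synthetic, hypothetical data: sales depend on price and shelf area only.
rng = random.Random(0)
n = 500
price = [rng.uniform(1, 10) for _ in range(n)]
shelf_area = [rng.uniform(0, 1) for _ in range(n)]
noise = [rng.gauss(0, 1) for _ in range(n)]  # unrelated to sales by construction
sales = [100 - 8 * p + 60 * s + rng.gauss(0, 2) for p, s in zip(price, shelf_area)]

columns = {"price": price, "shelf_area": shelf_area, "noise": noise}
selected = select_features(columns, sales)
print(selected)
```

A correlation filter only detects linear relationships, so it may discard features that matter to a non-linear model such as a random forest; that limitation is one reason the choice of selection method can affect a linear versus non-linear comparison.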