Journal: International Journal of Statistics and Applications
Print ISSN: 2168-5193
Electronic ISSN: 2168-5215
Year of publication: 2018
Volume: 8
Issue: 3
Pages: 144-152
DOI: 10.5923/j.statistics.20180803.05
Language: English
Publisher: Scientific & Academic Publishing Co.
Abstract: One important problem in data analysis is identifying nuisance variable(s) in a data set that increase the variability within groups in an experiment. One way to address this issue is to reduce the dimension of the data set. In this study we compare two widely used dimension-reduction methods: the method of principal components (PC), a statistical technique that uses an orthogonal transformation to convert a set of possibly correlated variables into a new set of uncorrelated variables, and the method of clustering on variables, which aims to put variables carrying similar information in the same group or cluster. The comparison considers two well-known data sets from the literature: a leukemia data set and a breast cancer data set.
Keywords: Acute lymphoblastic leukemia (ALL); Breast cancer; Clustering on variables; Dimension reduction; Scree plot; Correlation matrix; Cumulative variance proportion; Principal component analysis
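
The two methods named in the abstract can be illustrated with a minimal sketch. It is not the authors' code and does not use the leukemia or breast cancer data: the data below are synthetic, and the function names, the 0.5 correlation threshold, and the single-linkage grouping rule are all assumptions made for illustration. The first function obtains principal components from the correlation matrix and reports the cumulative variance proportion; the second groups variables whose absolute correlation exceeds a threshold, which is equivalent to cutting a single-linkage hierarchy at dissimilarity 1 - threshold.

```python
import numpy as np


def pca_from_correlation(X):
    """PCA via eigen-decomposition of the correlation matrix of X (rows = observations)."""
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)        # eigh: R is symmetric
    order = np.argsort(eigvals)[::-1]           # sort components by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    cum_prop = np.cumsum(eigvals) / eigvals.sum()
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardized variables
    scores = Z @ eigvecs                        # uncorrelated component scores
    return eigvals, eigvecs, cum_prop, scores


def cluster_variables(X, threshold=0.5):
    """Group variables linked by |correlation| > threshold (single-linkage style).

    The threshold value is an illustrative assumption, not taken from the paper.
    """
    R = np.abs(np.corrcoef(X, rowvar=False))
    p = R.shape[0]
    parent = list(range(p))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(p):
        for j in range(i + 1, p):
            if R[i, j] > threshold:
                parent[find(i)] = find(j)

    # Relabel roots as consecutive cluster ids in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(p)]


# Synthetic example: six variables driven by two independent latent factors.
rng = np.random.default_rng(0)
n = 200
f1, f2 = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack(
    [f1 + 0.1 * rng.normal(size=n) for _ in range(3)]
    + [f2 + 0.1 * rng.normal(size=n) for _ in range(3)]
)

eigvals, eigvecs, cum_prop, scores = pca_from_correlation(X)
labels = cluster_variables(X, threshold=0.5)

print("eigenvalues (scree plot heights):", np.round(eigvals, 3))
print("cumulative variance proportion:", np.round(cum_prop, 3))
print("variable cluster labels:", labels)
```

In this toy setting both approaches point to the same two-dimensional structure: the first two components carry almost all of the variance, and the variables fall into two clusters. On real data the two summaries need not agree, which is the kind of comparison the abstract describes.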