Publisher: Asociatia Generala a Economistilor din Romania - AGER
Abstract: Principal components analysis (PCA) is a multivariate data analysis technique whose main purpose is to reduce the dimension of the observations and thus simplify the analysis and interpretation of data, as well as facilitate the construction of predictive models. A rigorous definition of PCA was given by Bishop (1995): PCA is a linear dimensionality reduction technique that identifies orthogonal directions of maximum variance in the original data and projects the data into a lower-dimensional space formed by a subset of the highest-variance components. PCA is commonly used in economic research, as well as in other fields. When faced with the complexity of economic and financial processes, researchers have to analyze a large number of variables (or indicators), which often proves troublesome because it is difficult to collect such a large amount of data and perform calculations on it. In addition, there is a good chance that the initial data are strongly correlated; in that case, the significance of individual variables is seriously diminished and it is virtually impossible to establish causal relationships between variables. Researchers thus require a simple yet powerful analytical tool to solve these problems and perform a coherent and conclusive analysis. This tool is PCA.
The essence of PCA consists of transforming the space of the initial data into another space of lower dimension while maximising the quantity of information recovered from the initial space. Mathematically speaking, PCA is a method of determining a new space (called principal component space or factor space) onto which the original space of variables can be projected. The axes of the new space (called factor axes) are defined by the principal components determined as a result of PCA. Principal components (PCs) are standardized linear combinations (SLCs) of the original variables and are uncorrelated.
Theoretically, the number of PCs equals the number of initial variables, but the whole point of PCA is to extract as few factors as possible without compromising the variability of the original space. An important property of the PCs is that the first PC is extracted so as to recover the variance from the initial space to the maximum possible extent. The remaining variance is recovered by the next PCs at a declining rate: the variance of the second PC is greater than the variance of the third PC, the variance of the third PC is greater than the variance of the fourth PC and so on.
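The properties described above (unit-norm linear combinations, uncorrelated components, variances in declining order) can be illustrated with a minimal sketch. It is not taken from the paper: the function name `pca`, the choice to standardize the variables and work on their correlation matrix, and the synthetic data set are all assumptions made for illustration only.

```python
import numpy as np

def pca(data, n_components):
    """Illustrative PCA via eigendecomposition of the correlation matrix."""
    X = np.asarray(data, dtype=float)
    # Standardize each variable (column) to zero mean and unit sample variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # The covariance of the standardized data is the correlation matrix.
    R = np.cov(Z, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(R)
    # Reorder so that the first component carries the largest variance.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Each column of eigvecs has unit norm, so the scores are standardized
    # linear combinations of the (standardized) original variables.
    scores = Z @ eigvecs[:, :n_components]
    # Share of total variance recovered by each component.
    explained = eigvals / eigvals.sum()
    return scores, eigvals, eigvecs, explained

# Hypothetical data: three strongly correlated indicators plus one independent one.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
columns = [latent + 0.1 * rng.normal(size=(200, 1)) for _ in range(3)]
columns.append(rng.normal(size=(200, 1)))
data = np.hstack(columns)

scores, eigvals, eigvecs, explained = pca(data, n_components=2)
print("eigenvalues (component variances):", eigvals)
print("share of variance per component:", explained)
```

In this sketch the eigenvalues are the variances of the successive components, so sorting them in descending order reproduces the declining-variance ordering, and keeping only the first few columns of `eigvecs` is the dimension reduction step.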
Keywords: causal space; principal components; eigen values; variance; insurance; factor matrix; generalized variance.