ocorrigan
12-15-2009, 12:28 PM
I've read several different things about appropriate sample size in PCA, with some saying that 50 is an absolute bare minimum number of cases, and others saying that the importance of sample size is attenuated provided that communalities and loadings are both high (MacCallum et al., 1999 [in journal 'Psychological Methods']).
My problem is this: I'm doing a comparative study on a set of 16 European countries. The sample size has to be this small because these are the only countries that meet the criteria defined by my research question. The data I need (9 categories of 10 variables each, 90 variables in total) exist for only one time point, so I can't pool several years.
What I need to do is: run a separate PCA for each of the 9 categories across the 16 countries, i.e. each individual analysis uses 10 or fewer variables across 16 cases, with the aim of reducing the dimensionality of the vector space within each category to one or two dimensions.
Questions: is it acceptable to perform this kind of analysis given the size of the sample AND given that the exercise is not directed towards probabilistic inference from a sample to a population but, rather, is describing that population itself? KMO in most of the categories is within the acceptable range; performing the analysis and dropping some variables with low communalities generates satisfactory solutions of one or two dimensions. Basically, the analysis works and looks fine to me, but I'm concerned that an examiner might question the viability/feasibility/acceptability of performing PCA on 16 cases. Please advise.
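To make the setup concrete, here is a minimal sketch of one per-category analysis, assuming PCA is run on the correlation matrix of a 16 x 10 data matrix. The data are synthetic random numbers standing in for the real indicators, and the function name `pca_category` is hypothetical; it illustrates the procedure and is not the output of any particular statistics package.

```python
import numpy as np

def pca_category(X, n_components=2):
    """PCA on the correlation matrix of X (rows = cases, columns = variables)."""
    X = np.asarray(X, dtype=float)
    # Standardize each variable so the analysis is based on correlations.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(Z, rowvar=False)
    # eigh returns eigenvalues in ascending order; reorder to descending.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals = np.clip(eigvals[order], 0.0, None)  # guard against tiny negatives
    eigvecs = eigvecs[:, order]
    # Loadings: correlations between variables and retained components.
    loadings = eigvecs[:, :n_components] * np.sqrt(eigvals[:n_components])
    # Communality: share of each variable's variance captured by the retained components.
    communalities = (loadings ** 2).sum(axis=1)
    # Proportion of total variance explained by each retained component.
    explained = eigvals[:n_components] / eigvals.sum()
    # Component scores for each case.
    scores = Z @ eigvecs[:, :n_components]
    return loadings, communalities, explained, scores

# Hypothetical data: 16 cases x 10 variables, one latent dimension plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(16, 1))
X = latent @ rng.normal(size=(1, 10)) + 0.5 * rng.normal(size=(16, 10))

loadings, communalities, explained, scores = pca_category(X)
print("variance explained:", np.round(explained, 3))
print("communalities:", np.round(communalities, 3))
```

Variables with low values in `communalities` are the ones that would be candidates for dropping before re-running the analysis for that category.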