There are two categories of general recommendations for minimum sample size in factor analysis. One category says that the absolute number of cases (N) is important, while the other says that the subject-to-variable ratio (p) is important. Arrindell and van der Ende (1985), Velicer and Fava (1998), and MacCallum, Widaman, Zhang and Hong (1999) have reviewed many of these recommendations.
Sample size
Rule of 100: Gorsuch (1983) and Kline (1979, p. 40) recommended at least 100 (MacCallum, Widaman, Zhang & Hong, 1999). No sample should be less than 100 even if the number of variables is less than 20 (Gorsuch, 1974, p. 333; in Arrindell & van der Ende, 1985, p. 166).
Hatcher (1994) recommended that the number of subjects should be the larger of 5 times the number of variables, or 100. Even more subjects are needed when communalities are low and/or few variables load on each factor (in David Garson, 2008).
Rule of 150: Hutcheson and Sofroniou (1999) recommend at least 150 to 300 cases, more toward the 150 end when there are a few highly correlated variables, as would be the case when collapsing highly multicollinear variables (in David Garson, 2008).
Rule of 200. Guilford (1954, p. 533) suggested that N should be at least 200 cases (in MacCallum, Widaman, Zhang & Hong, 1999, p. 84; in Arrindell & van der Ende, 1985, p. 166).
Rule of 250. Cattell (1978) claimed the minimum desirable N to be 250 (in MacCallum, Widaman, Zhang & Hong, 1999, p. 84).
Rule of 300. There should be at least 300 cases (Norušis, 2005, p. 400, in David Garson, 2008).
Significance rule. Lawley and Maxwell (1971) suggested 51 more cases than the number of variables, to support chi-square testing (in David Garson, 2008).
Rule of 500. Comrey and Lee (1992) thought that 100 = poor, 200 = fair, 300 = good, 500 = very good, 1,000 or more = excellent. They urged researchers to obtain samples of 500 or more observations whenever possible (in MacCallum, Widaman, Zhang & Hong, 1999, p. 84).
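Several of the rules above reduce to simple arithmetic on the number of variables or cases. The following is a minimal sketch of three of them, not a definitive implementation. The function names are hypothetical, and reading the Comrey and Lee labels as lower bounds of bands (for example, 250 cases counted as "fair") is an assumption that the source does not state.

```python
def hatcher_min_n(n_variables: int) -> int:
    """Hatcher (1994): the larger of 5 times the number of variables, or 100."""
    return max(5 * n_variables, 100)


def lawley_maxwell_min_n(n_variables: int) -> int:
    """Lawley and Maxwell (1971): 51 more cases than the number of variables."""
    return n_variables + 51


def comrey_lee_rating(n_cases: int) -> str:
    """Comrey and Lee (1992) labels, treated here as lower bounds (an assumption)."""
    scale = [
        (1000, "excellent"),
        (500, "very good"),
        (300, "good"),
        (200, "fair"),
        (100, "poor"),
    ]
    for threshold, label in scale:
        if n_cases >= threshold:
            return label
    return "below the scale"


if __name__ == "__main__":
    n_vars = 30  # hypothetical number of variables
    print(hatcher_min_n(n_vars))         # 150
    print(lawley_maxwell_min_n(n_vars))  # 81
    print(comrey_lee_rating(250))        # fair
```

With 30 variables, Hatcher's rule asks for 150 cases while the significance rule asks for only 81, which illustrates how far apart these recommendations can be for the same instrument.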
Subjects-to-variables (STV) ratio
A ratio of 20:1 (Hair, Anderson, Tatham, & Black, 1995, in Hogarty, Hines, Kromrey, Ferron, & Mumford, 2005).
Rule of 10. There should be at least 10 cases for each item in the instrument being used (David Garson, 2008; Everitt, 1975, Nunnally, 1978, p. 276, in Arrindell & van der Ende, 1985, p. 166; Kunce, Cook, & Miller, 1975, Marascuilo & Levin, 1983, in Velicer & Fava, 1998, p. 232).
Rule of 5. The subjects-to-variables ratio should be no lower than 5 (Bryant and Yarnold, 1995, in David Garson, 2008; Gorsuch, 1983, in MacCallum, Widaman, Zhang & Hong, 1999; Everitt, 1975, in Arrindell & van der Ende, 1985; Gorsuch, 1974, in Arrindell & van der Ende, 1985, p. 166)
A ratio of 3(:1) to 6(:1) of STV is acceptable if the lower limit of the variables-to-factors ratio is 3 to 6. However, the absolute minimum sample size should not be less than 250 (Cattell, 1978, p. 508, in Arrindell & van der Ende, 1985, p. 166).
Ratio of 2. "[T]here should be at least twice as many subjects as variables in factor-analytic investigations. This means that in any large study on this account alone, one should have to use more than the minimum 100 subjects" (Kline, 1979, p. 40).
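The ratio rules above can be turned into minimum sample sizes for a given instrument. The sketch below is illustrative only: the dictionary keys and function names are hypothetical labels, and Cattell's 3:1 to 6:1 rule is left out because it also depends on the variables-to-factors ratio and on an absolute floor of 250.

```python
# Cases required per variable under each ratio rule listed above.
STV_RULES = {
    "Hair et al. (1995)": 20,
    "Rule of 10": 10,
    "Rule of 5": 5,
    "Kline (1979)": 2,
}


def min_n_for_ratio(n_variables: int, ratio: int) -> int:
    """Smallest N that reaches the given subjects-to-variables ratio."""
    return n_variables * ratio


def rules_satisfied(n_cases: int, n_variables: int) -> list:
    """Names of the ratio rules that a study with these dimensions meets."""
    return [
        name
        for name, ratio in STV_RULES.items()
        if n_cases >= min_n_for_ratio(n_variables, ratio)
    ]


if __name__ == "__main__":
    n_vars = 20  # hypothetical 20-item instrument
    for name, ratio in STV_RULES.items():
        print(name, min_n_for_ratio(n_vars, ratio))
    # A hypothetical study with 150 cases meets only the two weakest rules.
    print(rules_satisfied(150, n_vars))  # ['Rule of 5', 'Kline (1979)']
```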
Statistical Research Findings on Minimum Sample Size
Little statistical research in the fields of education and behavioural science has shed light on the issue of establishing a minimum desirable sample size (MacCallum, Widaman, Zhang & Hong, 1999). The studies that exist used either artificial or empirical data to investigate the minimum sample size or STV ratio required to recover the population factor structure. In this section, I will summarize the minimum sample sizes and STV ratios that these studies examined.
Barrett and Kline (1981, in MacCallum, Widaman, Zhang & Hong, 1999) used two large empirical data sets to investigate this issue. They drew sub-samples of various sizes from the original full samples and performed factor analysis on each sub-sample, comparing the sub-sample results with the full-sample results. They obtained good recovery:
from a sub-sample of N = 48 for one data set that has 16 variables, which represents an STV ratio of 3.0;
and from a sub-sample of N = 112 for another data set that has 90 variables, which represents an STV ratio of 1.2. Note: this number was reported as 50 "to be the minimum to yield a clear, recognizable factor pattern" (p. 167) in Arrindell and van der Ende's paper (1985).
Arrindell and van der Ende (1985) used two large empirical data sets that have 1104 cases and 960 cases respectively to examine the minimum sample sizes and STV ratios that can produce stable factor structure. By drawing sub-samples from the two large data sets, the authors found that:
for the first data set, which had 76 variables, the minimum STV ratio (p) required to produce a clear, recognizable factor solution was 1.3 and the corresponding sample size (N) was 100;
for the second data set, which had 20 variables, the minimum STV ratio (p) was 3.9 and the corresponding sample size (N) was 78.
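The STV ratios quoted for these sub-sampling studies follow directly from the reported sample sizes and variable counts. As a quick arithmetic check (a sketch only; the labels and the function name are hypothetical, and the numbers are the ones given above):

```python
# (label, sub-sample size N, number of variables) as reported above.
REPORTED = [
    ("Barrett and Kline (1981), data set 1", 48, 16),
    ("Barrett and Kline (1981), data set 2", 112, 90),
    ("Arrindell and van der Ende (1985), data set 1", 100, 76),
    ("Arrindell and van der Ende (1985), data set 2", 78, 20),
]


def stv_ratio(n_cases: int, n_variables: int) -> float:
    """Subjects-to-variables ratio: cases divided by variables."""
    return n_cases / n_variables


if __name__ == "__main__":
    for label, n_cases, n_variables in REPORTED:
        ratio = stv_ratio(n_cases, n_variables)
        print(f"{label}: N = {n_cases}, variables = {n_variables}, STV = {ratio:.1f}")
```

All four ratios fall below the commonly cited 5:1 minimum, even though each study reported a recognizable factor structure at that sample size.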
MacCallum, Widaman, Zhang & Hong (1999) conducted a Monte Carlo study on sample size effects. They obtained an excellent recovery (100% convergence) of population factor structure with a sample size (N) of 60 and 20 variables. However, this result was obtained only when both the level of communality (over .7 on average) and the level of overdetermination (3 loaded factors) were high (Table 1 on page 93).
Preacher & MacCallum (2002) conducted a Monte Carlo study. Their conclusion is:
"N had by far the largest effect on factor recovery, which exhibited a sharp drop-off below Ns of 20 or so" (p. 157).
The Minimum Sample Size or STV Ratio Used in Practical Studies
Henson and Roberts (2006) reported a review of 60 exploratory factor analyses in four journals: Educational and Psychological Measurement, Journal of Educational Psychology, Personality and Individual Differences, and Psychological Assessment.
Minimum sample size reported: 42.
Minimum STV ratio reported: 3.25:1; 11.86% of reviewed studies used a ratio less than 5:1.
Fabrigar, Wegener, MacCallum, and Strahan (1999) reported a review of articles that used EFA in two journals: Journal of Personality and Social Psychology (JPSP) and Journal of Applied Psychology (JAP).
Sample size: 30 (18.9%) articles in JPSP and 8 (13.8%) in JAP used samples of 100 or fewer.
Ratio of variables to factors: 55 (24.6%) papers in JPSP and 20 (34.4%) in JAP had a ratio of 4:1 or less.
Ford, MacCallum, and Tait (1986) examined articles published in Journal of Applied Psychology, Personnel Psychology, and Organizational Behavior and Human Performance during the period of 1974 - 1984.
STV ratio: 27.3% of the studies had a ratio of less than 5:1, and 56% had a ratio of less than 10:1.
Factors Related to Sample Size
Research has demonstrated that the general rules of thumb for minimum sample size are neither valid nor useful (MacCallum, Widaman, Zhang, & Hong, 1999; Preacher & MacCallum, 2002). It is too simplistic to say whether absolute sample size or the STV ratio is what matters in factor analysis. The minimum level of N (sample size) depends on other aspects of design, such as:
Communality of the variables
The communality measures the percent of variance in a given variable explained by all the factors jointly and may be interpreted as the reliability of the indicator (Garson, 2008).
If communalities are high, recovery of population factors in sample data is normally very good, almost regardless of sample size, level of overdetermination, or the presence of model error (MacCallum, Widaman, Preacher, and Hong, 2001, p. 636)
MacCallum, Widaman, Zhang, and Hong (1999) suggested that communalities should all be greater than .6, or that the mean level of communality should be at least .7 (p. 96).
Item communalities are considered "high" if they are all .8 or greater - but this is unlikely to occur in real data (Costello & Osborne, 2005, p. 4).
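The communality guidelines above can be checked from a factor loading matrix. The sketch below assumes orthogonal (uncorrelated) factors, in which case a variable's communality is the sum of its squared loadings; that assumption, the function names, and the example loadings are all illustrative rather than taken from the sources cited.

```python
def communalities(loadings):
    """Sum of squared loadings per variable (assumes orthogonal factors)."""
    return [sum(value * value for value in row) for row in loadings]


def meets_maccallum_guideline(h2):
    """True if all communalities exceed .6, or their mean is at least .7."""
    all_above = all(value > 0.6 for value in h2)
    mean_ok = sum(h2) / len(h2) >= 0.7
    return all_above or mean_ok


if __name__ == "__main__":
    # Hypothetical loadings: 4 variables (rows) on 2 factors (columns).
    loadings = [
        [0.8, 0.1],
        [0.9, 0.0],
        [0.1, 0.8],
        [0.2, 0.7],
    ]
    h2 = communalities(loadings)
    print([round(value, 2) for value in h2])
    # One communality is below .6 and the mean is below .7.
    print(meets_maccallum_guideline(h2))  # False
```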
Degree of overdetermination of the factor (or number of factors/number of variables)
Overdetermination is the factor-to-variable ratio (Preacher & MacCallum, 2002).
Six or seven indicators per factor and a rather small number of factors are considered high overdetermination of factors if many or all communalities are under .50 (MacCallum, Widaman, Zhang, & Hong, 1999).
A minimum of 3 variables per factor is critical. This confirms the theoretical results of T. W. Anderson and Rubin (1956; also see McDonald & Krane, 1977, 1979, and Rindskopf, 1984) (Velicer & Fava, 1998, p. 243).
At least four measured variables for each common factor and perhaps as many as six (Fabrigar, Wegener, MacCallum, & Strahan, 1999, p. 282)
A factor with fewer than three items is generally weak and unstable (Costello & Osborne, 2005, p. 5).
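The variables-per-factor guidelines above can be checked against a loading matrix. In the sketch below, assigning each variable to the factor on which it has its largest absolute loading is an assumption made for illustration, since the sources do not prescribe an assignment rule. The function names and example loadings are hypothetical.

```python
def variables_per_factor(loadings):
    """Count variables per factor, assigning each to its largest absolute loading."""
    n_factors = len(loadings[0])
    counts = [0] * n_factors
    for row in loadings:
        best = max(range(n_factors), key=lambda j: abs(row[j]))
        counts[best] += 1
    return counts


def weak_factors(loadings, minimum=3):
    """Indices of factors with fewer than `minimum` assigned variables."""
    counts = variables_per_factor(loadings)
    return [j for j, count in enumerate(counts) if count < minimum]


if __name__ == "__main__":
    # Hypothetical loadings: 5 variables (rows) on 2 factors (columns).
    loadings = [
        [0.7, 0.1],
        [0.6, 0.2],
        [0.8, 0.0],
        [0.1, 0.7],
        [0.2, 0.6],
    ]
    print(variables_per_factor(loadings))  # [3, 2]
    print(weak_factors(loadings))          # [1]
```

Raising `minimum` to 4 would apply the stricter recommendation of at least four measured variables per common factor.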
Size of loading
Item loading magnitude accounted for significant unique variance in the expected direction in all but one case, and in most cases was the strongest unique predictor of congruence between sample and population (Osborne & Costello, 2004).
The sample-to-population pattern fit was very good for the high (.80) loading condition, moderate for the middle (.60) loading condition, and very poor for the low (.40) loading condition (Velicer & Fava, 1998).
5 or more strongly loading items (.50 or better) are desirable and indicate a solid factor (Costello & Osborne, 2005, p. 5).
If components possess four or more variables with loadings above .60, the pattern may be interpreted whatever the sample size used. Similarly, a pattern composed of many variables per component (10 to 12) but low loadings (= .40) should be an accurate solution at all but the lowest sample sizes (N < 150). If a solution possesses components with only a few variables per component and low component loadings, the pattern should not be interpreted unless a sample size of 300 or more observations has been used. (Guadagnoli & Velicer, 1988, p. 274)
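The Guadagnoli and Velicer (1988) guideline quoted above reads naturally as a three-step decision rule. The sketch below is one possible reading, not the authors' own procedure: the function name is hypothetical, and treating "many variables with low loadings" as ten or more loadings of at least .40 in absolute value is an assumption.

```python
def component_interpretable(component_loadings, n_cases):
    """One reading of the Guadagnoli and Velicer (1988) guideline."""
    magnitudes = [abs(value) for value in component_loadings]
    # Four or more loadings above .60: interpret at any sample size.
    if sum(value > 0.60 for value in magnitudes) >= 4:
        return True
    # Ten or more loadings around .40 (assumed cutoff): needs N of at least 150.
    if sum(value >= 0.40 for value in magnitudes) >= 10:
        return n_cases >= 150
    # Few variables and low loadings: needs N of at least 300.
    return n_cases >= 300


if __name__ == "__main__":
    print(component_interpretable([0.7, 0.7, 0.65, 0.62], n_cases=50))  # True
    print(component_interpretable([0.45] * 10, n_cases=120))            # False
    print(component_interpretable([0.45] * 3, n_cases=300))             # True
```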
Model fit (f)
It is defined in terms of the population root mean squared residual (RMSR) (Preacher & MacCallum, 2002).
RMSR = .00, .03, .06, respectively correspond to perfect, good, and fair model fit in the population (Preacher & MacCallum, 2002).
Lack of fit of the model in the population will not, on the average, influence recovery of population factors in analysis of sample data, regardless of degree of model error and regardless of sample size (MacCallum, Widaman, Preacher, & Hong, 2001, p. 611).
Model fit has little effect on factor recovery. It is probably very rare in practice to find factor models exhibiting simultaneously high communalities and poor fit (Preacher & MacCallum, 2002, p. 157).
The differences between (extraction) methods with respect to ability to reproduce the population pattern were generally minor (Velicer & Fava, 1998, p. 243).
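The RMSR used above to define model fit can be sketched as follows. The exact formula is not given in this summary, so the version below assumes a common definition: the square root of the mean squared difference between the off-diagonal elements of the population correlation matrix and those implied by the factor model. The function name and the example matrices are hypothetical.

```python
import math


def rmsr(population_corr, model_corr):
    """Root mean squared residual over the elements below the diagonal."""
    size = len(population_corr)
    if size < 2:
        raise ValueError("need at least two variables")
    residuals = [
        population_corr[i][j] - model_corr[i][j]
        for i in range(size)
        for j in range(i)
    ]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


if __name__ == "__main__":
    # Hypothetical 3-variable example in which every residual is .03.
    population = [
        [1.0, 0.5, 0.4],
        [0.5, 1.0, 0.3],
        [0.4, 0.3, 1.0],
    ]
    implied = [
        [1.0, 0.47, 0.37],
        [0.47, 1.0, 0.27],
        [0.37, 0.27, 1.0],
    ]
    print(round(rmsr(population, implied), 3))  # 0.03
```

In the terms used by Preacher and MacCallum (2002), a population RMSR of .03 like this one corresponds to good model fit.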