Factor analysis

A comparative overview of the types of factor analysis, together with topics common to all of them, can be found under fa

\[ x_1 = l_{11}f_1 + l_{12}f_2 + u_1 \]

\[ s_1^2 = l_{11}^2 + l_{12}^2 + var(u_1) \]
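The two equations can be checked with a small simulation. The following is only a sketch: it assumes standardized, mutually uncorrelated factors and a unique part that is uncorrelated with both. The loadings 0.7 and 0.5 and the sample size are arbitrary illustration values, not taken from any data set in this document.

```r
# Simulate one observed variable x1 from two uncorrelated standardized factors
set.seed(1)
n     <- 100000
l11   <- 0.7                    # hypothetical loading of x1 on factor 1
l12   <- 0.5                    # hypothetical loading of x1 on factor 2
var_u <- 1 - l11^2 - l12^2      # unique variance, chosen so that var(x1) is about 1

f1 <- rnorm(n)                  # factor 1, variance 1
f2 <- rnorm(n)                  # factor 2, variance 1
u1 <- rnorm(n, sd = sqrt(var_u))

# first equation: the observed value is a weighted sum of factors plus a unique part
x1 <- l11 * f1 + l12 * f2 + u1

# second equation: the variance splits into squared loadings plus unique variance
var_expected <- l11^2 + l12^2 + var_u
var_observed <- var(x1)
round(c(expected = var_expected, observed = var_observed), 3)
```

The observed variance only approximates the expected one, because the simulated factors are never exactly uncorrelated in a finite sample.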

Maximum likelihood (ML) analysis mathematically optimizes the principal-axis factor analysis towards the best-fitting model.

A few general remarks

FA has a different substantive goal than PCA: finding or capturing latent constructs (latent traits) that are not directly observable, by means of manifest, measurable and therefore observable variables. The question is: do the measured variables reflect a quantity that influences all of them jointly?

Everything said about PCA remains valid. PCA can be regarded as a special case of FA.

The core difference from PCA is that residual variances of the variables are allowed (uniquenesses). The total variance of a variable is decomposed into the common variance (which goes back to the underlying factor), the specific variance (which is specific to this item, e.g. because it captures one particular subcomponent of the construct, such as intelligence), and the measurement error variance, which is normal in psychological settings.

In contrast to FA, PCA always tries to explain the total variance, which is usually inadequate in psychological settings. PCA solutions therefore often turn out less stable than FA solutions, for example in cross-validations.

A structure that has been found is rotated to make the factors as interpretable as possible. Depending on the type of rotation, dependencies between the factors are also allowed.

Total n is far more important than rules of thumb about the ratio of participants to variables (10 to 15 per variable). According to Field (2013, p. 684), 300 participants are good, 100 are rather poor and 1000 are excellent, relatively independent of the number of variables to be factored. This shows in the stability of the factor structure. However, the size of the factor loadings and the number of variables with loadings above a given threshold also play a role. Four or more loadings > 0.6, or 10 loadings > 0.40, per factor lead to relatively stable and thus interpretable factors with as few as about 150 observations. The communalities matter as well: if all communalities are > .6, about 150 participants can already give relatively reliable results; with communalities consistently > .50, results can already be relatively stable with 100 to 200 participants.

The Kaiser-Meyer-Olkin (KMO) measure ranges between 0 and 1. Factoring is recommended only for values above 0.5 (Field, 2013). See also the notes on KMO in fa Rmd html

Bartlett's test checks whether the given correlation matrix differs significantly from the identity matrix (all off-diagonal correlations are zero).

Parallel analysis is one approach to determining the number of factors. See also the notes on parallel analysis in fa Rmd html

Field (2012, p. 755)

With orthogonal rotation, the weights of the variables on a factor (coefficients) equal the correlations of these variables with the factor. With oblique rotation, the correlations of the variables with the factor are found in the structure matrix, whereas the loading weights are found in the pattern matrix.

Factor scores are the individual values of the factors for one observation (person). When the (unweighted) b-weights (matrix A) are used, the scores depend on the scale of the variables and the correlations between the measured variables are not taken into account. Weighted coefficients can be formed in several ways. Multiplying matrix A by the inverse of the correlation matrix R, $B = R^{-1} A$, yields factor score coefficients that are adjusted for the correlations among the variables (each variable's unique relationship with the factor). This gives unweighted and weighted factor scores, respectively.
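The weighted variant can be sketched in base R. The example assumes simulated data (six hypothetical items, two factors) and uses stats::factanal() rather than psych::fa(); it computes the regression weights as R^{-1} A and compares the resulting scores with the regression scores that factanal() returns.

```r
# Regression factor scores: weights B = R^{-1} A, scores = Z B
set.seed(42)
n  <- 500
f1 <- rnorm(n)
f2 <- rnorm(n)
# six hypothetical items, three per factor, each with its own unique noise
X <- cbind(
  a1 = 0.8 * f1 + rnorm(n, sd = 0.6),
  a2 = 0.7 * f1 + rnorm(n, sd = 0.7),
  a3 = 0.6 * f1 + rnorm(n, sd = 0.8),
  b1 = 0.8 * f2 + rnorm(n, sd = 0.6),
  b2 = 0.7 * f2 + rnorm(n, sd = 0.7),
  b3 = 0.6 * f2 + rnorm(n, sd = 0.8)
)

# unrotated ML solution with regression scores
fit <- factanal(X, factors = 2, rotation = "none", scores = "regression")

A <- unclass(loadings(fit))   # loading matrix A (items x factors)
R <- cor(X)                   # correlation matrix R
B <- solve(R, A)              # B = R^{-1} A, without forming the inverse explicitly
Z <- scale(X)                 # standardized item values

scores_manual <- Z %*% B
max_diff <- max(abs(scores_manual - fit$scores))
max_diff                      # expected to be numerically zero
```

Using solve(R, A) instead of solve(R) %*% A is a numerical design choice only; both express the same formula.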

Factor scores can be used to describe differences between persons more parsimoniously than with the original variables. Factor scores can also be a remedy for collinearity problems (PCA).

Orthogonal rotation

The factors remain independent (orthogonal). This requires that the factors are also uncorrelated 'in reality'.

Varimax rotation: the variance of the squared factor loadings within each factor is maximized. The aim is to obtain, for each factor, as many loadings as possible that are either large in absolute value or close to zero. The factor is then easier to interpret as what the variables loading highly on it have in common.

Quartimax rotation: for each variable, as many high or zero loadings as possible should occur across the factors (between factors), so that the variables can be assigned clearly to the factors. It often results in a variable loading highly on only one factor.

Equamax is a hybrid of the two and is sometimes not recommended (Field, 2013).
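The varimax idea can be illustrated with stats::varimax() from base R. This is a sketch, not the procedure used by psych::fa(): the loading matrix is typed in from the unrotated ML output shown later in this document (rounded to two decimals), and normalize = FALSE is an assumption made here so that the raw criterion described above is the one being optimized.

```r
# Unrotated ML loadings of the 12 items as printed later in this document (two decimals)
L <- cbind(
  ML1 = c(0.26, 0.76, 0.28, -0.69, 0.27, -0.71, 0.21, 0.73, 0.31, -0.77, 0.26, -0.68),
  ML2 = c(0.50, -0.08, 0.55, 0.21, 0.41, 0.17, 0.22, 0.04, 0.49, 0.08, 0.45, 0.17)
)
rownames(L) <- paste0("i", 1:12)

# Varimax criterion: variance of the squared loadings within each factor, summed over factors
varimax_criterion <- function(M) sum(apply(M^2, 2, var))

rot   <- varimax(L, normalize = FALSE)   # raw varimax without Kaiser normalization
L_rot <- unclass(rot$loadings)

crit_before <- varimax_criterion(L)
crit_after  <- varimax_criterion(L_rot)
h2_before   <- rowSums(L^2)       # communalities before rotation
h2_after    <- rowSums(L_rot^2)   # communalities after rotation

round(c(before = crit_before, after = crit_after), 3)
round(L_rot, 2)
```

An orthogonal rotation only redistributes variance between the factors, so the communalities of the items stay the same while the criterion increases.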

Oblique rotation

In SPSS as well as in Statistica: direct oblimin and promax.

Promax is a so-called target rotation method, i.e. a desired criterion loading pattern can be specified. If no particular pattern is specified, promax rotates so that item groups that are as clearly separated as possible emerge (similar to varimax, but with potentially correlated factors).

Direct oblimin tries to minimize the covariances between the squared factor loadings of all pairs of factors.
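An oblique rotation and the resulting pattern and structure matrices can be sketched with stats::promax() from base R. The loadings are again typed in from the unrotated ML output shown later in this document (two decimals). The computation of the factor correlation matrix Phi from the rotation matrix follows what the print method of factanal() does; the results are illustrative and need not match those of psych::fa().

```r
# Unrotated ML loadings of the 12 items as printed later in this document (two decimals)
L <- cbind(
  ML1 = c(0.26, 0.76, 0.28, -0.69, 0.27, -0.71, 0.21, 0.73, 0.31, -0.77, 0.26, -0.68),
  ML2 = c(0.50, -0.08, 0.55, 0.21, 0.41, 0.17, 0.22, 0.04, 0.49, 0.08, 0.45, 0.17)
)
rownames(L) <- paste0("i", 1:12)

pm <- promax(L, m = 4)                # m = 4 is the default power
P  <- unclass(pm$loadings)            # pattern matrix (loading weights)

# factor correlation matrix implied by the oblique rotation matrix
Phi <- solve(crossprod(pm$rotmat))
dimnames(Phi) <- list(colnames(P), colnames(P))

S <- P %*% Phi                        # structure matrix (item-factor correlations)

round(Phi, 2)
round(P, 2)
round(S, 2)
```

With correlated factors the pattern and structure matrices differ; if Phi were the identity matrix they would coincide, which is the orthogonal case.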

Rotation in R

In R, from the help page of psych::fa(): “none”, “varimax”, “quartimax”, “bentlerT”, “geominT” and “bifactor” are orthogonal rotations. “promax”, “oblimin”, “simplimax”, “bentlerQ”, “geominQ” and “biquartimin” and “cluster” are possible rotations or transformations of the solution. The default is to do a oblimin transformation, although versions prior to 2009 defaulted to varimax.

R also has dedicated packages for factor rotation, e.g. library(GPArotation).

Factor loadings

Coefficients for each variable that express how strongly the variable contributes to forming the factor. In a PCA computed from the correlation matrix, they are the correlations of the variables with the factor scores (the persons' values on this factor). They can also be computed by multiplying the loadings (eigenvector) by the standard deviation of the respective factor.
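For the PCA case, this equivalence can be verified in a few lines of base R. The data are simulated for illustration only (four hypothetical variables, three of which share a common source).

```r
set.seed(7)
n <- 200
g <- rnorm(n)                                    # hypothetical common source
X <- cbind(v1 = g + rnorm(n), v2 = g + rnorm(n),
           v3 = g + rnorm(n), v4 = rnorm(n))

pca <- prcomp(X, center = TRUE, scale. = TRUE)   # PCA based on the correlation matrix

eigvec      <- pca$rotation                      # eigenvectors (unit length)
pc_loadings <- sweep(eigvec, 2, pca$sdev, `*`)   # eigenvector times sd of the component
pc_cors     <- cor(X, pca$x)                     # correlations of variables with component scores

round(pc_loadings, 3)
max_diff <- max(abs(pc_loadings - pc_cors))
max_diff                                         # expected to be numerically zero
```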

Factor scores

The values of the participants on the respective factor. In R: ergebnisobjekt$scores. Statistica calls them factor scores (Faktorwerte).

Eigenvalues of the principal components

The eigenvalue is the sum of the squared loadings (factor loadings) of a factor across all variables. The eigenvalue is the variance of this factor. Because the variance of each variable is set to 1, the eigenvalue divided by the number of variables is the proportion of the total variance of the observed variables that the factor explains. The eigenvalues are the squared standard deviations of the principal components that R reports in the PCA summary (ergebnisobjekt$sdev). In PCA, the sum of all eigenvalues equals the number of principal components, so the mean of the eigenvalues is 1.

Eigenvector - loadings

Statistica results dialog: Variables | Eigenvector corresponds to the R loadings ergebnisobjekt$loadings. For each eigenvector, the sum of the squared loadings across all variables is 1. The eigenvectors can be used to compute the predicted values (factor scores).
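The statements about eigenvalues and eigenvectors can be checked numerically. The sketch below uses simulated data with five hypothetical variables and compares eigen() on the correlation matrix with prcomp() on the standardized data.

```r
set.seed(11)
n <- 150
X <- matrix(rnorm(n * 5), ncol = 5)
X[, 2] <- X[, 1] + 0.5 * X[, 2]      # make variables 1 and 2 correlate
X[, 4] <- X[, 3] + 0.5 * X[, 4]      # make variables 3 and 4 correlate

R   <- cor(X)
ev  <- eigen(R)
pca <- prcomp(X, scale. = TRUE)

eigenvalues <- ev$values
sum(eigenvalues)                     # equals the number of variables (5)
mean(eigenvalues)                    # therefore 1
pca$sdev^2                           # squared component sds are the eigenvalues

unit_length <- colSums(ev$vectors^2) # each eigenvector has squared length 1

# loadings = eigenvector * sqrt(eigenvalue); their squared column sums give the eigenvalues back
pc_loadings <- ev$vectors %*% diag(sqrt(eigenvalues))
ss_loadings <- colSums(pc_loadings^2)

# proportion of total variance explained by each component
prop_var <- eigenvalues / ncol(X)
round(prop_var, 3)
```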

Communality

A parameter of the variables. The sum of the squared loadings of a variable across all factors gives the variance of this variable that is jointly explained by the factors. This quantity is called the communality $h_j^2$ of variable j.
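As a small worked example, the communalities of the first three items can be recomputed from the unrotated ML loadings printed further below in this document. The loadings are copied from that output (three decimals where available, two decimals for the second loading of i2), so the result only agrees with the printed h2 column up to rounding.

```r
# Unrotated loadings of items i1 to i3, copied from the output further below
L <- rbind(
  i1 = c(0.263,  0.496),
  i2 = c(0.762, -0.08),
  i3 = c(0.284,  0.547)
)
colnames(L) <- c("ML1", "ML2")

h2 <- rowSums(L^2)   # communality: sum of squared loadings per item
u2 <- 1 - h2         # uniqueness: the remaining (specific + error) variance

round(cbind(h2, u2), 3)
```

The printed output reports h2 values of 0.315, 0.587 and 0.379 for these items.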

Simple structure

The goal of rotation. Every variable should load highly on only one factor, if possible, and very low on all other factors. This is meant to make the interpretation and naming of the factors as good (simple) as possible.

Reduced correlation matrix as starting matrix in FA

The diagonal does not contain 1s, as in PCA, but estimates of the communalities. The SMC (squared multiple correlations) are commonly used. The idea: each item is predicted from all remaining items via multiple regression (MR); $R^2$ is the SMC value for this item.

This is demonstrated here with the werner-fa example, generated values of 15 items for 100 observations (see the example further below).

items <- read.delim(file="http://r.psych.bio.uni-goettingen.de/mv/data/div/werner-fa.txt")
head(round(items, 2))

##      V1    V2    V3    V4    V5    V6    V7    V8    V9  V10   V11   V12
## 1  2.03 -1.18 -0.29 -1.01 -0.31  0.36  0.42  2.07  0.90 0.56 -0.99 -0.74
## 2  0.34  0.51  1.13 -0.73  0.13 -0.25 -0.79  1.02  0.51 0.18 -0.83 -0.50
## 3  1.16  1.77  1.49  1.20  0.30  1.03  1.18  0.45  0.58 0.20 -1.80  0.33
## 4  0.08  0.50  0.78  2.43 -0.34 -1.34 -1.19 -0.96  1.34 0.07 -0.13  0.73
## 5 -0.26  1.15  1.46 -0.11  0.23 -0.56 -0.53 -0.47  0.06 0.31 -0.85 -0.83
## 6 -0.38  0.72 -0.07  0.77  1.09 -0.59  0.99 -2.13 -2.53 0.06  1.54 -0.16
##     V13   V14   V15
## 1 -0.68 -1.68 -0.10
## 2  1.09  1.14  0.10
## 3 -0.49  0.22  1.16
## 4  0.31 -0.49 -0.74
## 5 -1.83 -0.41 -0.53
## 6  0.69  0.88 -0.48

# take a look at correlation table
head(round(cor(items), 2))

##      V1   V2   V3   V4   V5   V6   V7   V8   V9  V10   V11  V12  V13  V14
## V1 1.00 0.32 0.26 0.32 0.30 0.13 0.21 0.18 0.24 0.12  0.00 0.01 0.02 0.09
## V2 0.32 1.00 0.41 0.26 0.34 0.03 0.03 0.03 0.15 0.16 -0.04 0.17 0.03 0.10
## V3 0.26 0.41 1.00 0.28 0.29 0.13 0.13 0.07 0.26 0.12  0.08 0.26 0.03 0.24
## V4 0.32 0.26 0.28 1.00 0.29 0.14 0.19 0.05 0.21 0.24  0.13 0.32 0.23 0.19
## V5 0.30 0.34 0.29 0.29 1.00 0.15 0.27 0.15 0.38 0.34  0.22 0.29 0.10 0.14
## V6 0.13 0.03 0.13 0.14 0.15 1.00 0.22 0.32 0.28 0.36  0.03 0.28 0.07 0.17
##     V15
## V1 0.19
## V2 0.15
## V3 0.12
## V4 0.15
## V5 0.22
## V6 0.18

# insert SMC
items.cors.reduced <- cor(items)
# put smc into the diagonal
require(psych)

## Loading required package: psych

diag(items.cors.reduced) <- smc(items)
# take a look at the resulting table
head(round(items.cors.reduced, 2))

##      V1   V2   V3   V4   V5   V6   V7   V8   V9  V10   V11  V12  V13  V14
## V1 0.29 0.32 0.26 0.32 0.30 0.13 0.21 0.18 0.24 0.12  0.00 0.01 0.02 0.09
## V2 0.32 0.29 0.41 0.26 0.34 0.03 0.03 0.03 0.15 0.16 -0.04 0.17 0.03 0.10
## V3 0.26 0.41 0.29 0.28 0.29 0.13 0.13 0.07 0.26 0.12  0.08 0.26 0.03 0.24
## V4 0.32 0.26 0.28 0.29 0.29 0.14 0.19 0.05 0.21 0.24  0.13 0.32 0.23 0.19
## V5 0.30 0.34 0.29 0.29 0.36 0.15 0.27 0.15 0.38 0.34  0.22 0.29 0.10 0.14
## V6 0.13 0.03 0.13 0.14 0.15 0.26 0.22 0.32 0.28 0.36  0.03 0.28 0.07 0.17
##     V15
## V1 0.19
## V2 0.15
## V3 0.12
## V4 0.15
## V5 0.22
## V6 0.18
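What smc() computes can also be reproduced by hand. The following sketch uses simulated data (four hypothetical items, not the werner-fa file) and base R only: the SMC of an item is obtained from the inverse of the correlation matrix and compared with the R^2 of the corresponding multiple regression.

```r
set.seed(3)
n <- 100
g <- rnorm(n)                       # hypothetical common source
items_sim <- data.frame(
  V1 = 0.6 * g + rnorm(n),
  V2 = 0.5 * g + rnorm(n),
  V3 = 0.7 * g + rnorm(n),
  V4 = rnorm(n)
)
R <- cor(items_sim)

# SMC of item j = 1 - 1 / (j-th diagonal element of the inverse correlation matrix)
smc_manual <- 1 - 1 / diag(solve(R))

# the same value as R^2 of a multiple regression of V1 on all other items
r2_v1 <- summary(lm(V1 ~ ., data = items_sim))$r.squared

# reduced correlation matrix: SMCs replace the 1s in the diagonal
R_reduced <- R
diag(R_reduced) <- smc_manual
round(R_reduced, 2)
```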

Chi^2 test of whether the number of factors is sufficient

See fa Rmd html

Factor analysis in R

Base R (package stats): factanal()

For a data matrix dd and two factors to be extracted: factanal(dd, factors=2). factanal() always uses maximum likelihood extraction, so no method argument is needed.

Better, more powerful and more flexible, and used throughout here: the package psych, library(psych), and its function fa().

Here is a brief overview of the most important parameters

library(psych)
fa(
  data,             # data matrix (or a correlation matrix; it is computed automatically from raw data)
  nfactors=3,       # number of factors
  fm="ml",          # extraction method (factoring method), here maximum likelihood
                    # fm="minres" will do a minimum residual (OLS),
                    # fm="wls" will do a weighted least squares (WLS) solution,
                    # fm="gls" does a generalized weighted least squares (GLS),
                    # fm="pa" will do the principal factor solution,
                    # fm="ml" will do a maximum likelihood factor analysis
  rotate="varimax",
                    # "none", "varimax", "quartimax", "bentlerT", and "geominT" are orthogonal rotations.
                    # "promax", "oblimin", "simplimax", "bentlerQ", and "geominQ" or "cluster" are possible rotations or transformations of the solution.
                    # The default is to do a oblimin transformation, although prior versions defaulted to varimax.
  SMC=TRUE,         # use squared multiple correlations (SMC=TRUE) or use 1 as initial communality estimate.
                    # Try using 1 if imaginary eigen values are reported.
  max.iter=100,     # maximum number of iterations
  scores=TRUE       # also compute factor scores (the default depends on the psych version)
  )

Example: perception of one's own and others' feelings (emotional intelligence research)

The example is based on a study by Tanja Lischetzke et al. (2001, Diagnostica, Vol. 47, No. 4, pp. 167-177):

From the abstract (translated from German): “Recognizing one's own feelings and the feelings of other people is an important competence in dealing with emotions and moods. The constructs of emotional self-attention and clarity about one's own feelings, which so far have been studied mainly in the English-speaking world, are presented, and the conceptual separation of the constructs is applied for the first time to the perception of others' feelings.”

The remarks here, however, only concern the part on the perception of one's own feelings.

A made-up data set in the usual tab-delimited text format is available at: fa-ei.txt

The questions/items are:

i1 I think about my feelings.
i2 I can name my feelings.
i3 I pay attention to my feelings.
i4 I am unclear about what I feel.
i5 I occupy myself with my feelings.
i6 I have difficulty describing my feelings.
i7 I think about how I feel.
i8 I know what I feel.
i9 I observe my feelings.
i10 I have difficulty giving my feelings a name.
i11 I pay attention to how I feel.
i12 I am unsure what I actually feel.

The items are scaled from 1 to 4 (almost never/sometimes/often/almost always).

The two latent constructs (latent traits) at issue here are:

emotional self-attention eSA
clarity about one's own feelings eC

Analysis

# get data
ddf <- read.delim("http://r.psych.bio.uni-goettingen.de/mv/data/virt/v_ei.txt")
# take a look at the data
head(ddf)

##   i1 i2 i3 i4 i5 i6 i7 i8 i9 i10 i11 i12
## 1  4  3  4  2  3  1  4  3  3   1   3   3
## 2  4  3  3  1  4  1  4  4  4   1   3   2
## 3  4  3  4  2  4  2  4  2  4   1   3   3
## 4  3  3  2  2  1  2  4  3  1   2   2   3
## 5  4  4  4  4  4  1  4  4  4   2   3   1
## 6  2  4  3  2  4  2  2  3  3   2   3   2

# we need library(psych)
require(psych)

# some descriptives
psych::describe(ddf)

##     vars   n mean   sd median trimmed  mad min max range  skew kurtosis
## i1     1 100 3.06 0.93      3    3.15 1.48   1   4     3 -0.56    -0.77
## i2     2 100 3.08 0.97      3    3.20 1.48   1   4     3 -0.68    -0.68
## i3     3 100 3.02 0.86      3    3.09 1.48   1   4     3 -0.50    -0.56
## i4     4 100 1.87 0.96      2    1.71 1.48   1   4     3  0.94    -0.10
## i5     5 100 3.00 0.93      3    3.09 1.48   1   4     3 -0.52    -0.75
## i6     6 100 1.91 0.96      2    1.77 1.48   1   4     3  0.78    -0.44
## i7     7 100 2.91 0.92      3    2.98 1.48   1   4     3 -0.36    -0.86
## i8     8 100 3.05 1.04      3    3.19 1.48   1   4     3 -0.74    -0.71
## i9     9 100 3.01 0.90      3    3.08 1.48   1   4     3 -0.42    -0.88
## i10   10 100 1.94 0.99      2    1.81 1.48   1   4     3  0.67    -0.74
## i11   11 100 2.90 0.75      3    2.91 0.00   1   4     3 -0.28    -0.27
## i12   12 100 1.86 0.92      2    1.76 1.48   1   4     3  0.66    -0.71
##       se
## i1  0.09
## i2  0.10
## i3  0.09
## i4  0.10
## i5  0.09
## i6  0.10
## i7  0.09
## i8  0.10
## i9  0.09
## i10 0.10
## i11 0.07
## i12 0.09

# the correlation matrix is the basis of the EFA; the correlation matrix reproduced by a solution is compared against it
# covariance matrix, wrapped in round() for better reading
round(var(ddf), 2)

##        i1    i2    i3    i4    i5    i6    i7    i8    i9   i10   i11
## i1   0.87  0.17  0.28 -0.05  0.15 -0.21  0.16  0.14  0.36 -0.08  0.21
## i2   0.17  0.94  0.12 -0.44  0.16 -0.49  0.19  0.57  0.16 -0.59  0.09
## i3   0.28  0.12  0.75 -0.06  0.24 -0.09  0.10  0.22  0.26 -0.21  0.21
## i4  -0.05 -0.44 -0.06  0.92 -0.11  0.58 -0.10 -0.45 -0.10  0.51 -0.10
## i5   0.15  0.16  0.24 -0.11  0.87 -0.04  0.15  0.30  0.22 -0.12  0.23
## i6  -0.21 -0.49 -0.09  0.58 -0.04  0.93 -0.04 -0.47 -0.16  0.56 -0.03
## i7   0.16  0.19  0.10 -0.10  0.15 -0.04  0.85  0.18  0.14 -0.14  0.09
## i8   0.14  0.57  0.22 -0.45  0.30 -0.47  0.18  1.08  0.21 -0.59  0.14
## i9   0.36  0.16  0.26 -0.10  0.22 -0.16  0.14  0.21  0.82 -0.17  0.17
## i10 -0.08 -0.59 -0.21  0.51 -0.12  0.56 -0.14 -0.59 -0.17  0.99 -0.13
## i11  0.21  0.09  0.21 -0.10  0.23 -0.03  0.09  0.14  0.17 -0.13  0.56
## i12 -0.09 -0.53 -0.04  0.48 -0.09  0.40 -0.02 -0.51 -0.11  0.44 -0.12
##       i12
## i1  -0.09
## i2  -0.53
## i3  -0.04
## i4   0.48
## i5  -0.09
## i6   0.40
## i7  -0.02
## i8  -0.51
## i9  -0.11
## i10  0.44
## i11 -0.12
## i12  0.85

# correlation matrix, wrap with round() for better reading
round(cor(ddf), 2)

##        i1    i2    i3    i4    i5    i6    i7    i8    i9   i10   i11
## i1   1.00  0.18  0.35 -0.06  0.17 -0.23  0.18  0.14  0.43 -0.08  0.30
## i2   0.18  1.00  0.14 -0.48  0.18 -0.52  0.21  0.57  0.18 -0.61  0.12
## i3   0.35  0.14  1.00 -0.07  0.30 -0.11  0.13  0.25  0.34 -0.25  0.33
## i4  -0.06 -0.48 -0.07  1.00 -0.12  0.63 -0.12 -0.45 -0.11  0.53 -0.15
## i5   0.17  0.18  0.30 -0.12  1.00 -0.04  0.18  0.31  0.26 -0.13  0.33
## i6  -0.23 -0.52 -0.11  0.63 -0.04  1.00 -0.04 -0.47 -0.18  0.58 -0.04
## i7   0.18  0.21  0.13 -0.12  0.18 -0.04  1.00  0.18  0.17 -0.15  0.13
## i8   0.14  0.57  0.25 -0.45  0.31 -0.47  0.18  1.00  0.23 -0.57  0.18
## i9   0.43  0.18  0.34 -0.11  0.26 -0.18  0.17  0.23  1.00 -0.19  0.26
## i10 -0.08 -0.61 -0.25  0.53 -0.13  0.58 -0.15 -0.57 -0.19  1.00 -0.17
## i11  0.30  0.12  0.33 -0.15  0.33 -0.04  0.13  0.18  0.26 -0.17  1.00
## i12 -0.11 -0.60 -0.05  0.54 -0.11  0.45 -0.03 -0.53 -0.13  0.48 -0.17
##       i12
## i1  -0.11
## i2  -0.60
## i3  -0.05
## i4   0.54
## i5  -0.11
## i6   0.45
## i7  -0.03
## i8  -0.53
## i9  -0.13
## i10  0.48
## i11 -0.17
## i12  1.00

# Bartlett test: is the correlation matrix an identity matrix (all off-diagonal elements are 0)? Should be significant
cortest.bartlett(ddf)

## R was not square, finding R from data

## $chisq
## [1] 374.6
## 
## $p.value
## [1] 1.073e-44
## 
## $df
## [1] 66

# alternatively, on the correlation matrix (n must then be supplied)
cortest.bartlett(cor(ddf), n=nrow(ddf))

## $chisq
## [1] 374.6
## 
## $p.value
## [1] 1.073e-44
## 
## $df
## [1] 66
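The test statistic can be recomputed by hand from the determinant of the correlation matrix. The sketch below plugs in the values reported in this document (n = 100 observations, 12 items, and the rounded determinant 0.01871 printed further below), so the result agrees with the output above only up to rounding.

```r
n     <- 100        # number of observations
p     <- 12         # number of items
det_R <- 0.01871    # determinant of cor(ddf) as printed further below (rounded)

# Bartlett's test of sphericity on a correlation matrix
chisq   <- -(n - 1 - (2 * p + 5) / 6) * log(det_R)
df      <- p * (p - 1) / 2
p_value <- pchisq(chisq, df, lower.tail = FALSE)

c(chisq = chisq, df = df, p.value = p_value)
```

The degrees of freedom equal the number of distinct off-diagonal correlations, here 66.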

# get kmo: are data good for factorization
psych::KMO(ddf)

## Kaiser-Meyer-Olkin factor adequacy
## Call: psych::KMO(r = ddf)
## Overall MSA =  0.81
## MSA for each item = 
##   i1   i2   i3   i4   i5   i6   i7   i8   i9  i10  i11  i12 
## 0.64 0.86 0.77 0.82 0.74 0.78 0.67 0.88 0.83 0.84 0.74 0.82
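The MSA values relate the squared correlations to the squared partial correlations. A minimal base R sketch of this computation follows, on simulated data with five hypothetical items; it is meant to show the principle, not to replace psych::KMO().

```r
set.seed(5)
n <- 200
g <- rnorm(n)                                 # hypothetical common source
X <- cbind(x1 = g + rnorm(n), x2 = g + rnorm(n), x3 = g + rnorm(n),
           x4 = g + rnorm(n), x5 = g + rnorm(n))

R <- cor(X)
Q <- -cov2cor(solve(R))   # partial correlations, each controlling for all other variables

R0 <- R; diag(R0) <- 0    # only off-diagonal elements enter the index
Q0 <- Q; diag(Q0) <- 0

# overall MSA: share of squared correlations in squared correlations plus squared partials
msa_overall <- sum(R0^2) / (sum(R0^2) + sum(Q0^2))
# MSA per item: the same ratio, computed column by column
msa_items   <- colSums(R0^2) / (colSums(R0^2) + colSums(Q0^2))

round(msa_overall, 2)
round(msa_items, 2)
```

Small partial correlations relative to the raw correlations push the index towards 1, which indicates that the variables share common variance.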

# the determinant should be larger than 0.00001; smaller values hint at (near-)singularity of the correlation matrix
det(cor(ddf))

## [1] 0.01871

# number of factors suggested by fa.parallel() is 2
items.parallel <- fa.parallel(ddf, fa="fa")

## Loading required package: parallel
## Loading required package: MASS

[Plot: scree plot of the parallel analysis produced by fa.parallel()]

## Parallel analysis suggests that the number of factors =  2  and the number of components =  2
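The logic behind parallel analysis can be sketched in base R. Note the assumptions: this is the simpler component-based variant (eigenvalues of the full correlation matrix), not the factor-based variant that fa.parallel(fa="fa") reports, and the data are simulated with two hypothetical factors and eight items.

```r
set.seed(123)
n  <- 300
f1 <- rnorm(n)
f2 <- rnorm(n)
# eight hypothetical items, four per factor
X <- cbind(
  sapply(1:4, function(i) 0.8 * f1 + rnorm(n, sd = 0.6)),
  sapply(1:4, function(i) 0.8 * f2 + rnorm(n, sd = 0.6))
)
p <- ncol(X)

# eigenvalues of the observed correlation matrix
obs_ev <- eigen(cor(X), only.values = TRUE)$values

# eigenvalues of random, uncorrelated data of the same size
n_sim  <- 200
sim_ev <- replicate(n_sim, {
  Z <- matrix(rnorm(n * p), nrow = n)
  eigen(cor(Z), only.values = TRUE)$values
})

# 95th percentile of the random eigenvalues at each position
threshold <- apply(sim_ev, 1, quantile, probs = 0.95)

# retain leading components whose eigenvalue exceeds the random threshold
n_components <- sum(cumprod(obs_ev > threshold))
n_components
```

Using the 95th percentile rather than the mean of the random eigenvalues is a common, slightly more conservative choice.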

# we go on with 2 factors
# we start with an unrotated solution using ML (maximum likelihood). SMC is inserted as estimates for communality.
m.ml.u <- fa(ddf, 
    nfactors=2,
    n.obs = nrow(ddf),
  SMC=TRUE,
  fm="ml",
  rotate="none",
  max.iter=100
  )

# get the quality
print(m.ml.u)

## Factor Analysis using method =  ml
## Call: fa(r = ddf, nfactors = 2, n.obs = nrow(ddf), rotate = "none", 
##     SMC = TRUE, max.iter = 100, fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##       ML1   ML2    h2   u2 com
## i1   0.26  0.50 0.315 0.68 1.5
## i2   0.76 -0.08 0.587 0.41 1.0
## i3   0.28  0.55 0.379 0.62 1.5
## i4  -0.69  0.21 0.522 0.48 1.2
## i5   0.27  0.41 0.241 0.76 1.7
## i6  -0.71  0.17 0.531 0.47 1.1
## i7   0.21  0.22 0.091 0.91 2.0
## i8   0.73  0.04 0.531 0.47 1.0
## i9   0.31  0.49 0.340 0.66 1.7
## i10 -0.77  0.08 0.594 0.41 1.0
## i11  0.26  0.45 0.268 0.73 1.6
## i12 -0.68  0.17 0.497 0.50 1.1
## 
##                        ML1  ML2
## SS loadings           3.57 1.33
## Proportion Var        0.30 0.11
## Cumulative Var        0.30 0.41
## Proportion Explained  0.73 0.27
## Cumulative Proportion 0.73 1.00
## 
## Mean item complexity =  1.4
## Test of the hypothesis that 2 factors are sufficient.
## 
## The degrees of freedom for the null model are  66  and the objective function was  3.98 with Chi Square of  374.6
## The degrees of freedom for the model are 43  and the objective function was  0.55 
## 
## The root mean square of the residuals (RMSR) is  0.05 
## The df corrected root mean square of the residuals is  0.06 
## 
## The harmonic number of observations is  100 with the empirical chi square  32.4  with prob <  0.88 
## The total number of observations was  100  with MLE Chi Square =  50.99  with prob <  0.19 
## 
## Tucker Lewis Index of factoring reliability =  0.96
## RMSEA index =  0.052  and the 90 % confidence intervals are  NA 0.084
## BIC =  -147
## Fit based upon off diagonal values = 0.97
## Measures of factor score adequacy             
##                                                 ML1  ML2
## Correlation of scores with factors             0.94 0.82
## Multiple R square of scores with factors       0.88 0.67
## Minimum correlation of possible factor scores  0.77 0.34

# take a closer look at factor loadings, which gives us the structure
print(m.ml.u$loadings)

## 
## Loadings:
##     ML1    ML2   
## i1   0.263  0.496
## i2   0.762       
## i3   0.284  0.547
## i4  -0.692  0.210
## i5   0.267  0.412
## i6  -0.709  0.168
## i7   0.206  0.221
## i8   0.728       
## i9   0.312  0.492
## i10 -0.767       
## i11  0.257  0.449
## i12 -0.684  0.172
## 
##                  ML1   ML2
## SS loadings    3.573 1.325
## Proportion Var 0.298 0.110
## Cumulative Var 0.298 0.408

# we might see the structure clearer by suppressing the output of low loadings by modifying cutoff, default is .1
print(m.ml.u$loadings, cutoff=.4)

## 
## Loadings:
##     ML1    ML2   
## i1          0.496
## i2   0.762       
## i3          0.547
## i4  -0.692       
## i5          0.412
## i6  -0.709       
## i7               
## i8   0.728       
## i9          0.492
## i10 -0.767       
## i11         0.449
## i12 -0.684       
## 
##                  ML1   ML2
## SS loadings    3.573 1.325
## Proportion Var 0.298 0.110
## Cumulative Var 0.298 0.408

# at least item i7 seems to stay unclear
# maybe rotation helps
# we do the same model with orthogonal rotation varimax
m.ml.r <- fa(ddf, 
    nfactors=2,
    n.obs = nrow(ddf),
  SMC=TRUE,
  fm="ml",
  rotate="varimax",
  max.iter=100
  )

## Loading required package: GPArotation

print(m.ml.r)

## Factor Analysis using method =  ml
## Call: fa(r = ddf, nfactors = 2, n.obs = nrow(ddf), rotate = "varimax", 
##     SMC = TRUE, max.iter = 100, fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##       ML1   ML2    h2   u2 com
## i1  -0.07  0.56 0.315 0.68 1.0
## i2  -0.74  0.20 0.587 0.41 1.1
## i3  -0.07  0.61 0.379 0.62 1.0
## i4   0.72 -0.05 0.522 0.48 1.0
## i5  -0.10  0.48 0.241 0.76 1.1
## i6   0.72 -0.10 0.531 0.47 1.0
## i7  -0.11  0.28 0.091 0.91 1.3
## i8  -0.66  0.30 0.531 0.47 1.4
## i9  -0.11  0.57 0.340 0.66 1.1
## i10  0.74 -0.20 0.594 0.41 1.1
## i11 -0.08  0.51 0.268 0.73 1.0
## i12  0.70 -0.08 0.497 0.50 1.0
## 
##                        ML1  ML2
## SS loadings           3.12 1.77
## Proportion Var        0.26 0.15
## Cumulative Var        0.26 0.41
## Proportion Explained  0.64 0.36
## Cumulative Proportion 0.64 1.00
## 
## Mean item complexity =  1.1
## Test of the hypothesis that 2 factors are sufficient.
## 
## The degrees of freedom for the null model are  66  and the objective function was  3.98 with Chi Square of  374.6
## The degrees of freedom for the model are 43  and the objective function was  0.55 
## 
## The root mean square of the residuals (RMSR) is  0.05 
## The df corrected root mean square of the residuals is  0.06 
## 
## The harmonic number of observations is  100 with the empirical chi square  32.4  with prob <  0.88 
## The total number of observations was  100  with MLE Chi Square =  50.99  with prob <  0.19 
## 
## Tucker Lewis Index of factoring reliability =  0.96
## RMSEA index =  0.052  and the 90 % confidence intervals are  NA 0.084
## BIC =  -147
## Fit based upon off diagonal values = 0.97
## Measures of factor score adequacy             
##                                                 ML1  ML2
## Correlation of scores with factors             0.93 0.83
## Multiple R square of scores with factors       0.86 0.70
## Minimum correlation of possible factor scores  0.71 0.39

# we might see the structure clearer by suppressing the output of low loadings by modifying cutoff, default is .1
print(m.ml.r$loadings, cutoff=.2)

## 
## Loadings:
##     ML1    ML2   
## i1          0.557
## i2  -0.740       
## i3          0.612
## i4   0.721       
## i5          0.481
## i6   0.722       
## i7          0.280
## i8  -0.663  0.302
## i9          0.572
## i10  0.744 -0.201
## i11         0.511
## i12  0.700       
## 
##                  ML1   ML2
## SS loadings    3.125 1.773
## Proportion Var 0.260 0.148
## Cumulative Var 0.260 0.408

# take a look at the factor scores of the first few observations
head(round(m.ml.r$scores, 2))

##        ML1   ML2
## [1,] -0.01  0.74
## [2,] -0.58  0.83
## [3,]  0.43  1.17
## [4,]  0.07 -1.39
## [5,] -0.06  1.35
## [6,] -0.14 -0.11

# we might want to continue using the factor scores and therefore add them to dataframe with verbose name
colnames(m.ml.r$scores) <-  c('e.unclear.r', 'e.attention')
ddf <- cbind(ddf, m.ml.r$scores)

# descriptives of factor scores might be of interest
psych::describe(m.ml.r$scores)

##             vars   n mean   sd median trimmed  mad   min  max range  skew
## e.unclear.r    1 100    0 0.93  -0.23   -0.10 0.85 -1.29 2.37  3.65  0.83
## e.attention    2 100    0 0.83   0.05    0.03 0.95 -2.17 1.57  3.75 -0.30
##             kurtosis   se
## e.unclear.r    -0.20 0.09
## e.attention    -0.58 0.08

# we might be interested in the person with the lowest score on the first factor (e.unclear.r), i.e. the highest emotional clarity
ddf[m.ml.r$scores[,1] == min(m.ml.r$scores[,1]),]

##    i1 i2 i3 i4 i5 i6 i7 i8 i9 i10 i11 i12 e.unclear.r e.attention
## 30  2  4  3  1  2  1  4  4  1   1   3   1      -1.288     -0.9508

# it is subject 30, who has extreme answers in almost all questions

# finally we might want to visualize the model
# we can use the usual plot() method
plot(m.ml.r)

[Plot: output of plot(m.ml.r)]

# or fa.diagram() of library(psych): parameter cut suppresses loadings below the given value, simple=TRUE shows only the largest loading per item (no cross-loadings)
fa.diagram(m.ml.r, simple=TRUE, cut=.2, digits=2)

[Plot: path diagram of the rotated solution produced by fa.diagram()]

All indices indicate that the data set is suitable in principle for factoring.
- Bartlett's test of whether the correlation matrix differs from an identity matrix is significant.
- The global KMO value for the suitability of the data for factoring (overall MSA) is .81 and thus 'in the good range'.
- Every single MSA value of the variables is above .5, most of them also 'in the good range'.
- The determinant gives no indication of singularity either.

Parallel analysis suggests a solution with 2 factors as optimal. This agrees with the conceptual considerations that went into the item wording.

An exploratory maximum likelihood factor analysis with 2 factors explains a total of 41% of the variance. In the varimax-rotated solution, the first factor accounts for 26% and the second for a further 15% of the original variance.

With 2 factors specified, the unrotated EFA shows no clear assignment, above all for item i7. After rotation, most primary loadings are higher than in the unrotated solution. The loading of item i7, however, remains comparatively low. Nevertheless, it tends to fit the intended content.

Thus a clear factor structure emerges in the rotated loading matrix. The items alternate between the two factors: the odd-numbered items load on one factor, the even-numbered items on the other. In terms of content, the items can be assigned clearly to the two intended dimensions eSA (emotional self-attention) and eC (emotional clearness, clarity about one's own feelings). The signs also agree with the item wording.

On the com column in the pattern matrix output:

According to the package author, William Revelle: The ‘com’ column is factor complexity using the index developed by Hofmann (1978). It is a row wise measure of item complexity.

or, from the help page:

complexity
Hoffman’s index of complexity for each item. This is just \[ \frac{(\sum a_i^2)^2}{\sum a_i^4} \] where a_i is the factor loading on the ith factor. From Hofmann (1978), MBR. See also Pettersson and Turkheimer (2010).

and from the Hofmann article:

The complexity index is a positive number indicating on the average how many factors are used to explain each variable in a factor solution.
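The index is easy to recompute. The sketch below applies the formula to three items of the varimax-rotated solution, with loadings copied from the printed output (two decimals), so the results match the com column only up to rounding.

```r
# varimax-rotated loadings of three items, copied from the output above
L_rot <- rbind(
  i1 = c(-0.07, 0.56),
  i7 = c(-0.11, 0.28),
  i8 = c(-0.66, 0.30)
)
colnames(L_rot) <- c("ML1", "ML2")

# Hofmann's complexity: (sum of squared loadings)^2 / sum of fourth powers, per item
complexity <- rowSums(L_rot^2)^2 / rowSums(L_rot^4)
round(complexity, 1)
```

A value of 1 means that an item loads on a single factor only; item i8, with its cross-loading of 0.30, comes out at about 1.4, as in the com column.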

Oblique rotation

ddf <- read.delim("http://r.psych.bio.uni-goettingen.de/mv/data/virt/v_ei.txt")
require(psych)

# fit orthogonal and oblique model
m.ml.orth <- fa(ddf, 
    nfactors=2,
    n.obs = nrow(ddf),
  SMC=TRUE,
  fm="ml",
  rotate="varimax",
  max.iter=100
  )
m.ml.obl <- fa(ddf, 
    nfactors=2,
    n.obs = nrow(ddf),
  SMC=TRUE,
  fm="ml",
  rotate="oblimin",
  max.iter=100
  )

# get the quality
print(m.ml.orth)

## Factor Analysis using method =  ml
## Call: fa(r = ddf, nfactors = 2, n.obs = nrow(ddf), rotate = "varimax", 
##     SMC = TRUE, max.iter = 100, fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##       ML1   ML2    h2   u2 com
## i1  -0.07  0.56 0.315 0.68 1.0
## i2  -0.74  0.20 0.587 0.41 1.1
## i3  -0.07  0.61 0.379 0.62 1.0
## i4   0.72 -0.05 0.522 0.48 1.0
## i5  -0.10  0.48 0.241 0.76 1.1
## i6   0.72 -0.10 0.531 0.47 1.0
## i7  -0.11  0.28 0.091 0.91 1.3
## i8  -0.66  0.30 0.531 0.47 1.4
## i9  -0.11  0.57 0.340 0.66 1.1
## i10  0.74 -0.20 0.594 0.41 1.1
## i11 -0.08  0.51 0.268 0.73 1.0
## i12  0.70 -0.08 0.497 0.50 1.0
## 
##                        ML1  ML2
## SS loadings           3.12 1.77
## Proportion Var        0.26 0.15
## Cumulative Var        0.26 0.41
## Proportion Explained  0.64 0.36
## Cumulative Proportion 0.64 1.00
## 
## Mean item complexity =  1.1
## Test of the hypothesis that 2 factors are sufficient.
## 
## The degrees of freedom for the null model are  66  and the objective function was  3.98 with Chi Square of  374.6
## The degrees of freedom for the model are 43  and the objective function was  0.55 
## 
## The root mean square of the residuals (RMSR) is  0.05 
## The df corrected root mean square of the residuals is  0.06 
## 
## The harmonic number of observations is  100 with the empirical chi square  32.4  with prob <  0.88 
## The total number of observations was  100  with MLE Chi Square =  50.99  with prob <  0.19 
## 
## Tucker Lewis Index of factoring reliability =  0.96
## RMSEA index =  0.052  and the 90 % confidence intervals are  NA 0.084
## BIC =  -147
## Fit based upon off diagonal values = 0.97
## Measures of factor score adequacy             
##                                                 ML1  ML2
## Correlation of scores with factors             0.93 0.83
## Multiple R square of scores with factors       0.86 0.70
## Minimum correlation of possible factor scores  0.71 0.39

print(m.ml.obl)

## Factor Analysis using method =  ml
## Call: fa(r = ddf, nfactors = 2, n.obs = nrow(ddf), rotate = "oblimin", 
##     SMC = TRUE, max.iter = 100, fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##       ML1   ML2    h2   u2 com
## i1   0.01  0.57 0.315 0.68 1.0
## i2  -0.75  0.05 0.587 0.41 1.0
## i3   0.02  0.62 0.379 0.62 1.0
## i4   0.75  0.10 0.522 0.48 1.0
## i5  -0.03  0.48 0.241 0.76 1.0
## i6   0.74  0.05 0.531 0.47 1.0
## i7  -0.08  0.27 0.091 0.91 1.2
## i8  -0.65  0.18 0.531 0.47 1.1
## i9  -0.04  0.57 0.340 0.66 1.0
## i10  0.75 -0.05 0.594 0.41 1.0
## i11 -0.01  0.52 0.268 0.73 1.0
## i12  0.72  0.06 0.497 0.50 1.0
## 
##                        ML1  ML2
## SS loadings           3.22 1.68
## Proportion Var        0.27 0.14
## Cumulative Var        0.27 0.41
## Proportion Explained  0.66 0.34
## Cumulative Proportion 0.66 1.00
## 
##  With factor correlations of 
##       ML1   ML2
## ML1  1.00 -0.33
## ML2 -0.33  1.00
## 
## Mean item complexity =  1
## Test of the hypothesis that 2 factors are sufficient.
## 
## The degrees of freedom for the null model are  66  and the objective function was  3.98 with Chi Square of  374.6
## The degrees of freedom for the model are 43  and the objective function was  0.55 
## 
## The root mean square of the residuals (RMSR) is  0.05 
## The df corrected root mean square of the residuals is  0.06 
## 
## The harmonic number of observations is  100 with the empirical chi square  32.4  with prob <  0.88 
## The total number of observations was  100  with MLE Chi Square =  50.99  with prob <  0.19 
## 
## Tucker Lewis Index of factoring reliability =  0.96
## RMSEA index =  0.052  and the 90 % confidence intervals are  NA 0.084
## BIC =  -147
## Fit based upon off diagonal values = 0.97
## Measures of factor score adequacy             
##                                                 ML1  ML2
## Correlation of scores with factors             0.94 0.85
## Multiple R square of scores with factors       0.88 0.72
## Minimum correlation of possible factor scores  0.76 0.44

# the structure may show more clearly if low loadings are suppressed via cutoff (default is .1)
print(m.ml.orth$loadings, cutoff=.2)

## 
## Loadings:
##     ML1    ML2   
## i1          0.557
## i2  -0.740       
## i3          0.612
## i4   0.721       
## i5          0.481
## i6   0.722       
## i7          0.280
## i8  -0.663  0.302
## i9          0.572
## i10  0.744 -0.201
## i11         0.511
## i12  0.700       
## 
##                  ML1   ML2
## SS loadings    3.125 1.773
## Proportion Var 0.260 0.148
## Cumulative Var 0.260 0.408

print(m.ml.obl$loadings, cutoff=.2)

## 
## Loadings:
##     ML1    ML2   
## i1          0.566
## i2  -0.748       
## i3          0.622
## i4   0.749       
## i5          0.479
## i6   0.744       
## i7          0.268
## i8  -0.652       
## i9          0.570
## i10  0.751       
## i11         0.515
## i12  0.722       
## 
##                  ML1   ML2
## SS loadings    3.194 1.650
## Proportion Var 0.266 0.137
## Cumulative Var 0.266 0.404

The loading matrices under orthogonal and oblique rotation differ only marginally.
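One way to quantify how little two loading matrices differ is Tucker's congruence coefficient between matching factors. The following is a minimal sketch on simulated data (not the EI data), using base R's factanal() rather than psych::fa(). The variable names, the raw loadings of 0.7 and the factor correlation of about .3 are assumptions made for illustration only.

```r
# Sketch: compare an orthogonal and an oblique rotation of the same ML solution.
# Simulated data; all names and parameter values are illustrative assumptions.
set.seed(1)
n  <- 500
f1 <- rnorm(n)
f2 <- 0.3 * f1 + sqrt(1 - 0.3^2) * rnorm(n)       # second factor, correlated about .3 with f1
item <- function(f) 0.7 * f + rnorm(n, sd = 0.7)  # one item: raw loading 0.7 plus unique part
x <- data.frame(a1 = item(f1), a2 = item(f1), a3 = item(f1), a4 = item(f1),
                b1 = item(f2), b2 = item(f2), b3 = item(f2))

fit.orth <- factanal(x, factors = 2, rotation = "varimax")  # orthogonal rotation
fit.obl  <- factanal(x, factors = 2, rotation = "promax")   # oblique rotation

# Tucker's congruence coefficient between all pairs of factors
congruence <- function(A, B) {
  A <- unclass(A); B <- unclass(B)
  crossprod(A, B) / sqrt(outer(colSums(A^2), colSums(B^2)))
}
cong <- congruence(fit.orth$loadings, fit.obl$loadings)

# the factor order may differ between the two solutions,
# so take the best match for each factor of the orthogonal solution
best.match <- apply(abs(cong), 1, max)
print(round(cong, 2))
print(round(best.match, 2))
```

Values of best.match close to 1 indicate that both rotations describe essentially the same factors.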

The EI data are analyzed with a PCA under PCA (Rmd, html).

The EI data are analyzed with a CFA under CFA (Rmd, html).

Example: Werner

Source

Generated values of 15 items for 100 observations are to be factor-analyzed.

# we need the psych package; library() stops with an error if it is not installed
library(psych)
items <- read.delim(file="http://r.psych.bio.uni-goettingen.de/mv/data/div/werner-fa.txt")
# take a look at the data
head(round(items, 2))

##      V1    V2    V3    V4    V5    V6    V7    V8    V9  V10   V11   V12
## 1  2.03 -1.18 -0.29 -1.01 -0.31  0.36  0.42  2.07  0.90 0.56 -0.99 -0.74
## 2  0.34  0.51  1.13 -0.73  0.13 -0.25 -0.79  1.02  0.51 0.18 -0.83 -0.50
## 3  1.16  1.77  1.49  1.20  0.30  1.03  1.18  0.45  0.58 0.20 -1.80  0.33
## 4  0.08  0.50  0.78  2.43 -0.34 -1.34 -1.19 -0.96  1.34 0.07 -0.13  0.73
## 5 -0.26  1.15  1.46 -0.11  0.23 -0.56 -0.53 -0.47  0.06 0.31 -0.85 -0.83
## 6 -0.38  0.72 -0.07  0.77  1.09 -0.59  0.99 -2.13 -2.53 0.06  1.54 -0.16
##     V13   V14   V15
## 1 -0.68 -1.68 -0.10
## 2  1.09  1.14  0.10
## 3 -0.49  0.22  1.16
## 4  0.31 -0.49 -0.74
## 5 -1.83 -0.41 -0.53
## 6  0.69  0.88 -0.48

# descriptives of the items to factorize
#summary(items)
psych::describe(items)

##     vars   n  mean   sd median trimmed  mad   min  max range  skew
## V1     1 100 -0.04 1.02  -0.03   -0.03 0.82 -2.81 2.40  5.21 -0.05
## V2     2 100 -0.02 1.02  -0.10   -0.06 1.04 -2.31 2.40  4.71  0.22
## V3     3 100 -0.21 0.94  -0.29   -0.23 1.07 -2.68 1.99  4.67  0.09
## V4     4 100 -0.16 1.07  -0.12   -0.13 1.00 -3.30 2.43  5.73 -0.30
## V5     5 100  0.04 1.04   0.03    0.05 1.02 -2.58 2.29  4.87 -0.09
## V6     6 100 -0.15 1.03  -0.22   -0.15 1.12 -2.76 2.13  4.89 -0.01
## V7     7 100 -0.21 1.00  -0.11   -0.21 0.86 -3.12 2.09  5.21 -0.07
## V8     8 100 -0.08 1.01  -0.04   -0.04 1.08 -3.09 2.82  5.91 -0.21
## V9     9 100  0.02 0.99   0.08    0.05 0.97 -2.53 1.89  4.42 -0.35
## V10   10 100  0.00 1.00   0.16    0.05 0.83 -2.85 2.13  4.98 -0.49
## V11   11 100 -0.25 1.01  -0.23   -0.26 1.03 -2.87 2.72  5.59  0.13
## V12   12 100  0.03 0.98   0.00    0.02 0.96 -2.29 2.17  4.47  0.07
## V13   13 100 -0.24 1.03  -0.28   -0.20 1.15 -2.93 1.78  4.71 -0.26
## V14   14 100  0.05 1.07   0.05    0.07 0.97 -2.60 2.74  5.34 -0.15
## V15   15 100 -0.10 0.88  -0.21   -0.17 0.81 -1.87 2.40  4.27  0.66
##     kurtosis   se
## V1      0.23 0.10
## V2     -0.41 0.10
## V3     -0.56 0.09
## V4      0.22 0.11
## V5     -0.53 0.10
## V6     -0.30 0.10
## V7     -0.15 0.10
## V8      0.19 0.10
## V9     -0.14 0.10
## V10     0.02 0.10
## V11     0.09 0.10
## V12    -0.66 0.10
## V13    -0.64 0.10
## V14    -0.32 0.11
## V15     0.29 0.09

# look at the scatterplots for potential problems in raw data
pairs.panels(items, pch=".")

[Figure: pairs.panels plot of all 15 items]

pairs.panels(items[ , 1:5], pch=".")

[Figure: pairs.panels plot of items V1 to V5]

pairs.panels(items[ , 6:10], pch=".")

[Figure: pairs.panels plot of items V6 to V10]

pairs.panels(items[ , 11:15], pch=".")

[Figure: pairs.panels plot of items V11 to V15]

# number of factors to extract: via scree plot
# library(psych) offers VSS.scree()
VSS.scree(items)

[Figure: scree plot of the items]

# fa.parallel() helps to decide how many factors to extract
# note: fa="fa" restricts the plot to the factor solution; fa="both" shows FA and PCA in one plot
# let us see what is recommended by fa.parallel()
items.parallel <- fa.parallel(items, fa="fa")

[Figure: parallel analysis scree plot from fa.parallel()]

## Parallel analysis suggests that the number of factors =  3  and the number of components =  3

# 3 factors are suggested; we continue with these
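The logic behind such a recommendation can be sketched in base R: the eigenvalues of the observed correlation matrix are compared with eigenvalues obtained from random data of the same size. This is a simplified, principal-component variant on simulated data, not a re-implementation of fa.parallel(). The simulated two-factor data, the number of replications and the 95% quantile are assumptions.

```r
# Sketch of Horn's parallel analysis (principal-component variant), base R only.
# Simulated data with two independent factors; all values are illustrative assumptions.
set.seed(42)
n <- 300; p <- 8
f1 <- rnorm(n); f2 <- rnorm(n)
x <- cbind(sapply(1:4, function(i) 0.7 * f1 + rnorm(n, sd = 0.7)),
           sapply(1:4, function(i) 0.7 * f2 + rnorm(n, sd = 0.7)))

obs.ev <- eigen(cor(x), only.values = TRUE)$values      # observed eigenvalues

# eigenvalues of uncorrelated normal data with the same n and p
n.rep <- 200
rand.ev <- replicate(n.rep,
  eigen(cor(matrix(rnorm(n * p), n, p)), only.values = TRUE)$values)
threshold <- apply(rand.ev, 1, quantile, probs = 0.95)  # 95% quantile per position

# retain the leading components whose eigenvalue exceeds the random-data threshold
n.comp <- sum(cumprod(obs.ev > threshold))
print(round(rbind(observed = obs.ev, random95 = threshold), 2))
print(n.comp)
```

Components are kept only as long as their eigenvalue is larger than what random data of the same dimensions would produce.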

# correlation matrix and reduced correlation matrix

# starting matrix for the PCA
cor(items)

##            V1       V2      V3      V4     V5      V6      V7       V8
## V1  1.0000000  0.31843 0.25500 0.32260 0.2996 0.12924 0.20662  0.18216
## V2  0.3184283  1.00000 0.41457 0.25992 0.3390 0.03132 0.03207  0.02639
## V3  0.2550011  0.41457 1.00000 0.28250 0.2932 0.12790 0.12630  0.06694
## V4  0.3225983  0.25992 0.28250 1.00000 0.2885 0.14327 0.19492  0.04854
## V5  0.2996313  0.33898 0.29321 0.28851 1.0000 0.14667 0.27445  0.15326
## V6  0.1292391  0.03132 0.12790 0.14327 0.1467 1.00000 0.22218  0.32045
## V7  0.2066230  0.03207 0.12630 0.19492 0.2744 0.22218 1.00000  0.15852
## V8  0.1821564  0.02639 0.06694 0.04854 0.1533 0.32045 0.15852  1.00000
## V9  0.2393177  0.14674 0.25989 0.21092 0.3845 0.28473 0.28394  0.34404
## V10 0.1170624  0.15664 0.11622 0.23747 0.3397 0.35551 0.31351  0.21196
## V11 0.0002381 -0.04229 0.08224 0.12705 0.2169 0.02868 0.18313  0.08516
## V12 0.0116322  0.16923 0.25811 0.31643 0.2882 0.27597 0.01333  0.11642
## V13 0.0194266  0.03072 0.03469 0.22703 0.0993 0.07479 0.02197 -0.01260
## V14 0.0876677  0.10243 0.23506 0.18642 0.1402 0.17200 0.23076  0.13954
## V15 0.1861142  0.15367 0.11661 0.14729 0.2200 0.18085 0.09933  0.07924
##         V9      V10        V11     V12      V13     V14     V15
## V1  0.2393  0.11706  0.0002381 0.01163  0.01943 0.08767 0.18611
## V2  0.1467  0.15664 -0.0422883 0.16923  0.03072 0.10243 0.15367
## V3  0.2599  0.11622  0.0822444 0.25811  0.03469 0.23506 0.11661
## V4  0.2109  0.23747  0.1270486 0.31643  0.22703 0.18642 0.14729
## V5  0.3845  0.33968  0.2169315 0.28825  0.09930 0.14022 0.22000
## V6  0.2847  0.35551  0.0286842 0.27597  0.07479 0.17200 0.18085
## V7  0.2839  0.31351  0.1831269 0.01333  0.02197 0.23076 0.09933
## V8  0.3440  0.21196  0.0851590 0.11642 -0.01260 0.13954 0.07924
## V9  1.0000  0.36259  0.1256850 0.19183  0.12846 0.10054 0.22401
## V10 0.3626  1.00000 -0.0971743 0.19084  0.04566 0.21296 0.12135
## V11 0.1257 -0.09717  1.0000000 0.19733  0.31050 0.17359 0.28269
## V12 0.1918  0.19084  0.1973278 1.00000  0.20028 0.28097 0.37675
## V13 0.1285  0.04566  0.3104995 0.20028  1.00000 0.29482 0.22747
## V14 0.1005  0.21296  0.1735942 0.28097  0.29482 1.00000 0.02152
## V15 0.2240  0.12135  0.2826914 0.37675  0.22747 0.02152 1.00000

# reduced correlation matrix for PF (principal factors)
# The psych package provides the smc() function,
# which returns the squared multiple correlation of each variable with the remaining variables.
# In the reduced correlation matrix these values go into the diagonal
# as estimates of the communalities.

items.cors.reduced <- cor(items)
diag(items.cors.reduced) <- smc(items)
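What a squared multiple correlation is can be reproduced in base R: for variable i it equals 1 - 1/r^ii, where r^ii is the i-th diagonal element of the inverse correlation matrix. A minimal sketch on simulated data (variable names and parameter values are assumptions), checked against an explicit regression:

```r
# Sketch: squared multiple correlations (SMC) from the inverse correlation matrix.
# Simulated one-factor data; names and values are illustrative assumptions.
set.seed(7)
n <- 200
g <- rnorm(n)                                          # simulated common factor
d <- as.data.frame(sapply(1:5, function(i) 0.6 * g + rnorm(n, sd = 0.8)))
names(d) <- paste0("V", 1:5)

R <- cor(d)
smc.by.hand <- 1 - 1 / diag(solve(R))                  # SMC of each variable

# check for V1: R^2 from regressing V1 on the remaining variables
r2.V1 <- summary(lm(V1 ~ ., data = d))$r.squared

# reduced correlation matrix: the SMCs replace the 1s in the diagonal
R.reduced <- R
diag(R.reduced) <- smc.by.hand
print(round(smc.by.hand, 3))
print(round(r2.V1, 3))
```

The first element of smc.by.hand and r2.V1 agree up to numerical precision.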



# Now an FA with 3 factors, squared multiple correlations as communality estimates,
# principal-axis FA, maximum number of iterations raised to 100.
# The promax rotation allows correlations between the extracted factors.

items.pa.promax <- fa(items,
                      nfactors=3,
                      SMC=TRUE,
                      fm="pa",
                      rotate="promax",
                      max.iter=100
                      )

# What is the result?
print(items.pa.promax)

## Factor Analysis using method =  pa
## Call: fa(r = items, nfactors = 3, rotate = "promax", SMC = TRUE, max.iter = 100, 
##     fm = "pa")
## Standardized loadings (pattern matrix) based upon correlation matrix
##       PA3   PA1   PA2   h2   u2 com
## V1   0.17  0.45 -0.13 0.26 0.74 1.5
## V2  -0.17  0.84 -0.13 0.52 0.48 1.1
## V3  -0.01  0.56  0.04 0.33 0.67 1.0
## V4   0.06  0.36  0.21 0.29 0.71 1.7
## V5   0.26  0.36  0.10 0.38 0.62 2.0
## V6   0.62 -0.18  0.03 0.31 0.69 1.2
## V7   0.48 -0.04  0.01 0.21 0.79 1.0
## V8   0.57 -0.16 -0.07 0.23 0.77 1.2
## V9   0.58  0.05  0.01 0.38 0.62 1.0
## V10  0.68  0.01 -0.15 0.38 0.62 1.1
## V11 -0.10 -0.16  0.65 0.32 0.68 1.2
## V12  0.06  0.10  0.47 0.32 0.68 1.1
## V13 -0.13 -0.10  0.64 0.31 0.69 1.1
## V14  0.15  0.00  0.31 0.17 0.83 1.4
## V15  0.04  0.06  0.42 0.22 0.78 1.1
## 
##                        PA3  PA1  PA2
## SS loadings           1.82 1.49 1.33
## Proportion Var        0.12 0.10 0.09
## Cumulative Var        0.12 0.22 0.31
## Proportion Explained  0.39 0.32 0.29
## Cumulative Proportion 0.39 0.71 1.00
## 
##  With factor correlations of 
##      PA3  PA1  PA2
## PA3 1.00 0.55 0.50
## PA1 0.55 1.00 0.46
## PA2 0.50 0.46 1.00
## 
## Mean item complexity =  1.2
## Test of the hypothesis that 3 factors are sufficient.
## 
## The degrees of freedom for the null model are  105  and the objective function was  2.98 with Chi Square of  278.1
## The degrees of freedom for the model are 63  and the objective function was  0.7 
## 
## The root mean square of the residuals (RMSR) is  0.06 
## The df corrected root mean square of the residuals is  0.07 
## 
## The harmonic number of observations is  100 with the empirical chi square  64.2  with prob <  0.43 
## The total number of observations was  100  with MLE Chi Square =  63.95  with prob <  0.44 
## 
## Tucker Lewis Index of factoring reliability =  0.991
## RMSEA index =  0.032  and the 90 % confidence intervals are  NA 0.062
## BIC =  -226.2
## Fit based upon off diagonal values = 0.93
## Measures of factor score adequacy             
##                                                 PA3  PA1  PA2
## Correlation of scores with factors             0.87 0.86 0.83
## Multiple R square of scores with factors       0.76 0.75 0.69
## Minimum correlation of possible factor scores  0.51 0.50 0.39

# first come the factor loadings,
# the communalities (h2) of the variables in a separate column,
# and u2, the share of variance not explained by the factors taken together

# below follows the variance explained by each factor across all variables,
# i.e. the eigenvalues after rotation (SS loadings; sum of squared loadings).
# Besides the absolute values, the proportions of variance (Proportion Var)
# and the cumulative proportions (Cumulative Var) are listed.
# Here the three factors together explain 31% of the variance of all variables.
# Because the solution is already rotated, these values need not be ordered by size
# and do not equal the initial eigenvalues
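The SS loadings in this printout (1.82, 1.49, 1.33) differ from those shown when only the loadings are printed further down (1.935, 1.606, 1.446). In an oblique solution the variance accounted for also depends on the factor correlations. The sketch below re-enters the two-digit pattern loadings and factor correlations from the printout and computes diag(Phi L'L). That psych obtains its SS loadings this way is an assumption based on the numbers agreeing; small deviations stem from rounding.

```r
# Pattern loadings (promax) and factor correlations, copied from the printout above
L <- matrix(c( 0.17,  0.45, -0.13,
              -0.17,  0.84, -0.13,
              -0.01,  0.56,  0.04,
               0.06,  0.36,  0.21,
               0.26,  0.36,  0.10,
               0.62, -0.18,  0.03,
               0.48, -0.04,  0.01,
               0.57, -0.16, -0.07,
               0.58,  0.05,  0.01,
               0.68,  0.01, -0.15,
              -0.10, -0.16,  0.65,
               0.06,  0.10,  0.47,
              -0.13, -0.10,  0.64,
               0.15,  0.00,  0.31,
               0.04,  0.06,  0.42),
            ncol = 3, byrow = TRUE,
            dimnames = list(paste0("V", 1:15), c("PA3", "PA1", "PA2")))
Phi <- matrix(c(1.00, 0.55, 0.50,
                0.55, 1.00, 0.46,
                0.50, 0.46, 1.00),
              ncol = 3, dimnames = list(colnames(L), colnames(L)))

ss.pattern <- colSums(L^2)                 # ignores the factor correlations
ss.oblique <- diag(Phi %*% crossprod(L))   # takes the factor correlations into account
h2         <- diag(L %*% Phi %*% t(L))     # communalities of an oblique solution
prop.var   <- ss.oblique / nrow(L)         # proportion of total variance per factor

print(round(rbind(ss.pattern, ss.oblique, prop.var), 2))
print(round(sum(prop.var), 2))             # cumulative proportion of variance
```

The sum of the communalities equals the sum of ss.oblique, which is why the cumulative proportion of about 31% matches the h2 column.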

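# if the eigenvalues are of interest: $values holds the eigenvalues of the common factor solution
# (reduced correlation matrix); the eigenvalues of the full correlation matrix,
# on which the scree plot is based, are available as $e.values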

print(items.pa.promax$values)

##  [1]  2.92046  0.88174  0.83252  0.39009  0.32567  0.19757  0.15095
##  [8]  0.08026  0.03215 -0.02925 -0.12145 -0.13452 -0.21423 -0.28949
## [15] -0.38870
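Some of the eigenvalues printed above are negative. This happens because the 1s in the diagonal of the correlation matrix have been replaced by communality estimates below 1, so the matrix is no longer a proper correlation matrix. A minimal sketch on simulated data illustrates this with SMCs as communality estimates. This is an assumption for illustration: fa() iterates the communalities, so its $values will in general differ from this first step.

```r
# Sketch: eigenvalues of the full versus the reduced correlation matrix.
# Simulated one-factor data; names and values are illustrative assumptions.
set.seed(3)
n <- 150
g <- rnorm(n)
x <- sapply(1:6, function(i) 0.6 * g + rnorm(n, sd = 0.8))

R <- cor(x)
smc.est <- 1 - 1 / diag(solve(R))      # squared multiple correlations
R.reduced <- R
diag(R.reduced) <- smc.est             # communality estimates in the diagonal

ev.full    <- eigen(R, only.values = TRUE)$values
ev.reduced <- eigen(R.reduced, only.values = TRUE)$values

print(round(ev.full, 3))               # all positive, sum equals the number of variables
print(round(ev.reduced, 3))            # sum equals the sum of the communality estimates
print(c(sum(ev.full), sum(ev.reduced), sum(smc.est)))
```

The eigenvalues of the reduced matrix sum to the total estimated common variance rather than to the number of variables.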

# because the rotation was oblique, the matrix of factor intercorrelations follows (here for 3 factors)
# The factors are clearly correlated with each other

# Finally there is a chi-square test of the discrepancy between model and data.
# If the difference is (still) significant, the factors do not yet explain enough variance
# and one may, for example, add another factor.
# (here this is no longer necessary)
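The test statistics in the printout can be re-derived from a few numbers. The sketch below uses the values reported above (MLE chi-square 63.95, 15 variables, 3 factors, n = 100). The helper name df.model is hypothetical, and the BIC formula chi-square - df * log(n) is inferred from the reported values agreeing, not taken from the psych documentation.

```r
# Sketch: degrees of freedom, p value and BIC of the 3-factor model, base R only.
# df.model is a hypothetical helper: df of an EFA with p variables and k factors
df.model <- function(p, k) ((p - k)^2 - (p + k)) / 2

p <- 15; k <- 3; n <- 100
chisq <- 63.95                                   # MLE chi-square from the printout

dof   <- df.model(p, k)                          # 63, as in the printout
p.val <- pchisq(chisq, df = dof, lower.tail = FALSE)
bic   <- chisq - dof * log(n)                    # assumed formula, see the text above

print(dof)
print(round(p.val, 2))
print(round(bic, 1))
```

A non-significant p value means the 3-factor model is not rejected, so there is no statistical reason to extract a further factor.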


# Summary of the promax solution:

# The promax solution yields a clear, easily interpretable structure.
# To ease interpretation,
# the factor loadings can be printed only above a certain minimum size,
# e.g. all loadings > 0.2
print(items.pa.promax$loadings, cutoff=.2)

## 
## Loadings:
##     PA3    PA1    PA2   
## V1          0.451       
## V2          0.843       
## V3          0.564       
## V4          0.361  0.213
## V5   0.263  0.364       
## V6   0.624              
## V7   0.483              
## V8   0.573              
## V9   0.583              
## V10  0.676              
## V11                0.653
## V12                0.470
## V13                0.639
## V14                0.314
## V15                0.415
## 
##                  PA3   PA1   PA2
## SS loadings    1.935 1.606 1.446
## Proportion Var 0.129 0.107 0.096
## Cumulative Var 0.129 0.236 0.332

# relatively few cross-loadings show up here (loadings of the same variable on different factors).



# What about an orthogonal rotation, 'varimax'?

items.pa.varimax <- fa(items,
                       nfactors=3,
                       SMC=TRUE,
                       fm="pa",
                       rotate="varimax",
                       max.iter=100
                       )

print(items.pa.varimax)

## Factor Analysis using method =  pa
## Call: fa(r = items, nfactors = 3, rotate = "varimax", SMC = TRUE, max.iter = 100, 
##     fm = "pa")
## Standardized loadings (pattern matrix) based upon correlation matrix
##       PA3   PA1  PA2   h2   u2 com
## V1   0.22  0.46 0.01 0.26 0.74 1.4
## V2  -0.02  0.72 0.01 0.52 0.48 1.0
## V3   0.11  0.55 0.15 0.33 0.67 1.2
## V4   0.18  0.42 0.29 0.29 0.71 2.2
## V5   0.34  0.46 0.23 0.38 0.62 2.4
## V6   0.54  0.04 0.14 0.31 0.69 1.1
## V7   0.43  0.12 0.11 0.21 0.79 1.3
## V8   0.48  0.02 0.04 0.23 0.77 1.0
## V9   0.55  0.25 0.16 0.38 0.62 1.6
## V10  0.59  0.20 0.02 0.38 0.62 1.2
## V11  0.03 -0.03 0.56 0.32 0.68 1.0
## V12  0.18  0.23 0.48 0.32 0.68 1.7
## V13  0.00  0.01 0.55 0.31 0.69 1.0
## V14  0.21  0.13 0.33 0.17 0.83 2.0
## V15  0.14  0.17 0.42 0.22 0.78 1.6
## 
##                        PA3  PA1  PA2
## SS loadings           1.65 1.62 1.36
## Proportion Var        0.11 0.11 0.09
## Cumulative Var        0.11 0.22 0.31
## Proportion Explained  0.36 0.35 0.29
## Cumulative Proportion 0.36 0.71 1.00
## 
## Mean item complexity =  1.5
## Test of the hypothesis that 3 factors are sufficient.
## 
## The degrees of freedom for the null model are  105  and the objective function was  2.98 with Chi Square of  278.1
## The degrees of freedom for the model are 63  and the objective function was  0.7 
## 
## The root mean square of the residuals (RMSR) is  0.06 
## The df corrected root mean square of the residuals is  0.07 
## 
## The harmonic number of observations is  100 with the empirical chi square  64.2  with prob <  0.43 
## The total number of observations was  100  with MLE Chi Square =  63.95  with prob <  0.44 
## 
## Tucker Lewis Index of factoring reliability =  0.991
## RMSEA index =  0.032  and the 90 % confidence intervals are  NA 0.062
## BIC =  -226.2
## Fit based upon off diagonal values = 0.93
## Measures of factor score adequacy             
##                                                 PA3  PA1  PA2
## Correlation of scores with factors             0.81 0.83 0.78
## Multiple R square of scores with factors       0.66 0.68 0.62
## Minimum correlation of possible factor scores  0.31 0.37 0.23
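For an orthogonal solution the summary columns and rows follow directly from the loading matrix: h2 is the row sum of squared loadings, u2 = 1 - h2, SS loadings is the column sum of squared loadings, and com is Hofmann's complexity index. The sketch re-enters the two-digit varimax loadings from the printout above, so results agree with the printout only up to rounding.

```r
# Varimax loadings copied from the printout above (two decimals)
L <- matrix(c(0.22,  0.46, 0.01,
             -0.02,  0.72, 0.01,
              0.11,  0.55, 0.15,
              0.18,  0.42, 0.29,
              0.34,  0.46, 0.23,
              0.54,  0.04, 0.14,
              0.43,  0.12, 0.11,
              0.48,  0.02, 0.04,
              0.55,  0.25, 0.16,
              0.59,  0.20, 0.02,
              0.03, -0.03, 0.56,
              0.18,  0.23, 0.48,
              0.00,  0.01, 0.55,
              0.21,  0.13, 0.33,
              0.14,  0.17, 0.42),
            ncol = 3, byrow = TRUE,
            dimnames = list(paste0("V", 1:15), c("PA3", "PA1", "PA2")))

h2  <- rowSums(L^2)                    # communalities
u2  <- 1 - h2                          # uniquenesses
com <- h2^2 / rowSums(L^4)             # Hofmann's item complexity
ss  <- colSums(L^2)                    # SS loadings per factor
prop.var <- ss / nrow(L)               # proportion of total variance per factor

print(round(cbind(h2, u2, com), 2))
print(round(rbind(ss, prop.var, cum.var = cumsum(prop.var)), 2))
```

A complexity near 1 marks an item that loads on a single factor; V4 and V5, with values above 2, spread over several factors.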

# A comparison of the two solutions with a cutoff of 0.1

print(items.pa.promax$loadings, cutoff=.1)

## 
## Loadings:
##     PA3    PA1    PA2   
## V1   0.169  0.451 -0.132
## V2  -0.173  0.843 -0.130
## V3          0.564       
## V4          0.361  0.213
## V5   0.263  0.364  0.101
## V6   0.624 -0.185       
## V7   0.483              
## V8   0.573 -0.157       
## V9   0.583              
## V10  0.676        -0.149
## V11        -0.156  0.653
## V12                0.470
## V13 -0.129         0.639
## V14  0.152         0.314
## V15                0.415
## 
##                  PA3   PA1   PA2
## SS loadings    1.935 1.606 1.446
## Proportion Var 0.129 0.107 0.096
## Cumulative Var 0.129 0.236 0.332

print(items.pa.varimax$loadings, cutoff=.1)

## 
## Loadings:
##     PA3    PA1    PA2   
## V1   0.217  0.457       
## V2          0.719       
## V3   0.111  0.546  0.151
## V4   0.176  0.417  0.289
## V5   0.336  0.460  0.232
## V6   0.540         0.142
## V7   0.434  0.119  0.112
## V8   0.477              
## V9   0.546  0.246  0.162
## V10  0.586  0.196       
## V11                0.564
## V12  0.183  0.227  0.480
## V13                0.554
## V14  0.209  0.129  0.334
## V15  0.144  0.172  0.416
## 
##                  PA3   PA1   PA2
## SS loadings    1.651 1.623 1.360
## Proportion Var 0.110 0.108 0.091
## Cumulative Var 0.110 0.218 0.309

# at a cutoff of +/- .10 there are 13 cross-loadings under promax rotation (oblique)
# and 19 cross-loadings under varimax rotation


# Perhaps somewhat clearer with a cutoff of 0.2, as above
print(items.pa.promax$loadings, cutoff=.2)

## 
## Loadings:
##     PA3    PA1    PA2   
## V1          0.451       
## V2          0.843       
## V3          0.564       
## V4          0.361  0.213
## V5   0.263  0.364       
## V6   0.624              
## V7   0.483              
## V8   0.573              
## V9   0.583              
## V10  0.676              
## V11                0.653
## V12                0.470
## V13                0.639
## V14                0.314
## V15                0.415
## 
##                  PA3   PA1   PA2
## SS loadings    1.935 1.606 1.446
## Proportion Var 0.129 0.107 0.096
## Cumulative Var 0.129 0.236 0.332

print(items.pa.varimax$loadings, cutoff=.2)

## 
## Loadings:
##     PA3    PA1    PA2   
## V1   0.217  0.457       
## V2          0.719       
## V3          0.546       
## V4          0.417  0.289
## V5   0.336  0.460  0.232
## V6   0.540              
## V7   0.434              
## V8   0.477              
## V9   0.546  0.246       
## V10  0.586              
## V11                0.564
## V12         0.227  0.480
## V13                0.554
## V14  0.209         0.334
## V15                0.416
## 
##                  PA3   PA1   PA2
## SS loadings    1.651 1.623 1.360
## Proportion Var 0.110 0.108 0.091
## Cumulative Var 0.110 0.218 0.309

# now the promax rotation yields 2 cross-loadings and
# the varimax rotation 7

# in the promax solution only 2 variables (V4, V5) show cross-loadings,
# in the varimax solution 6 (V1, V4, V5, V9, V12, V14)

# overall the promax-rotated solution shows the clearly simpler and cleaner structure
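Cross-loadings can also be counted programmatically instead of by eye. The sketch below re-enters the two-digit loadings of both solutions from the full printouts and counts, per variable, the loadings beyond the first that exceed the cutoff. The helper name count.cross is hypothetical. Only the cutoff of .2 is used, because at .1 some two-digit values (e.g. 0.10) sit exactly on the boundary and rounding would distort the count.

```r
# count.cross is a hypothetical helper: number of cross-loadings at a given cutoff
count.cross <- function(L, cutoff) {
  salient <- abs(L) > cutoff                     # loadings that would be printed
  per.var <- pmax(rowSums(salient) - 1, 0)       # salient loadings beyond the first one
  list(n.cross = sum(per.var), variables = rownames(L)[per.var > 0])
}

vars <- paste0("V", 1:15); facs <- c("PA3", "PA1", "PA2")
# two-digit loadings copied from the full printouts of both solutions
promax <- matrix(c(0.17, 0.45, -0.13,  -0.17, 0.84, -0.13,  -0.01, 0.56, 0.04,
                   0.06, 0.36, 0.21,    0.26, 0.36, 0.10,    0.62, -0.18, 0.03,
                   0.48, -0.04, 0.01,   0.57, -0.16, -0.07,  0.58, 0.05, 0.01,
                   0.68, 0.01, -0.15,  -0.10, -0.16, 0.65,   0.06, 0.10, 0.47,
                  -0.13, -0.10, 0.64,   0.15, 0.00, 0.31,    0.04, 0.06, 0.42),
                 ncol = 3, byrow = TRUE, dimnames = list(vars, facs))
varimax <- matrix(c(0.22, 0.46, 0.01,  -0.02, 0.72, 0.01,   0.11, 0.55, 0.15,
                    0.18, 0.42, 0.29,   0.34, 0.46, 0.23,   0.54, 0.04, 0.14,
                    0.43, 0.12, 0.11,   0.48, 0.02, 0.04,   0.55, 0.25, 0.16,
                    0.59, 0.20, 0.02,   0.03, -0.03, 0.56,  0.18, 0.23, 0.48,
                    0.00, 0.01, 0.55,   0.21, 0.13, 0.33,   0.14, 0.17, 0.42),
                  ncol = 3, byrow = TRUE, dimnames = list(vars, facs))

cross.promax  <- count.cross(promax,  cutoff = 0.2)
cross.varimax <- count.cross(varimax, cutoff = 0.2)
print(cross.promax)
print(cross.varimax)
```

With the fitted objects at hand, the same helper could be applied to unclass(items.pa.promax$loadings) and unclass(items.pa.varimax$loadings), which avoids the rounding issue.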

# As a way of visualizing the structure graphically,
# the psych package offers the function fa.diagram()
# simple=TRUE: without cross-loadings

fa.diagram(items.pa.promax, simple=TRUE, cut=.2, digits=2)

[Figure: fa.diagram of the promax solution without cross-loadings]

# or simple=FALSE: with cross-loadings

fa.diagram(items.pa.promax, simple=FALSE, cut=.2, digits=2)

[Figure: fa.diagram of the promax solution with cross-loadings]

# As an example (for easier comparison with other packages), an FA in R: a principal-axis analysis with varimax rotation.

items.pa.varimax <- fa(items,
                      nfactors=3,   # extract three factors 
                      SMC=TRUE,     # squared multiple correlations as estimates of communalities
                      fm="pa",      # Principal axes
                      rotate="varimax", # Varimax rotation (orthogonal)
                      max.iter=100
                      )

# and inspect the results
print(items.pa.varimax)

# (the output is identical to the varimax solution printed above)

# A further comparison, this time more about differences between packages: maximum-likelihood FA with the same item data

items.ml.varimax <- fa(items,
                      nfactors=3,   # extract three factors 
                      SMC=TRUE,     # squared multiple correlations as estimates of communalities
                      fm="ml",      # maximum likelihood
                      rotate="varimax", # Varimax rotation (orthogonal)
                      max.iter=100
                      )

# and inspect the results
print(items.ml.varimax)

## Factor Analysis using method =  ml
## Call: fa(r = items, nfactors = 3, rotate = "varimax", SMC = TRUE, max.iter = 100, 
##     fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##       ML1   ML2   ML3   h2   u2 com
## V1   0.20  0.44  0.03 0.24 0.76 1.4
## V2  -0.02  0.74 -0.01 0.55 0.45 1.0
## V3   0.11  0.55  0.15 0.34 0.66 1.2
## V4   0.20  0.40  0.27 0.27 0.73 2.3
## V5   0.34  0.46  0.25 0.38 0.62 2.4
## V6   0.54  0.05  0.12 0.30 0.70 1.1
## V7   0.44  0.10  0.13 0.22 0.78 1.3
## V8   0.46  0.03  0.05 0.21 0.79 1.0
## V9   0.54  0.24  0.17 0.38 0.62 1.6
## V10  0.61  0.21 -0.02 0.42 0.58 1.2
## V11  0.01 -0.04  0.63 0.39 0.61 1.0
## V12  0.20  0.24  0.44 0.29 0.71 2.0
## V13  0.02  0.03  0.52 0.27 0.73 1.0
## V14  0.22  0.14  0.31 0.16 0.84 2.2
## V15  0.15  0.18  0.42 0.23 0.77 1.6
## 
##                        ML1  ML2  ML3
## SS loadings           1.67 1.64 1.34
## Proportion Var        0.11 0.11 0.09
## Cumulative Var        0.11 0.22 0.31
## Proportion Explained  0.36 0.35 0.29
## Cumulative Proportion 0.36 0.71 1.00
## 
## Mean item complexity =  1.5
## Test of the hypothesis that 3 factors are sufficient.
## 
## The degrees of freedom for the null model are  105  and the objective function was  2.98 with Chi Square of  278.1
## The degrees of freedom for the model are 63  and the objective function was  0.7 
## 
## The root mean square of the residuals (RMSR) is  0.06 
## The df corrected root mean square of the residuals is  0.07 
## 
## The harmonic number of observations is  100 with the empirical chi square  65.12  with prob <  0.4 
## The total number of observations was  100  with MLE Chi Square =  63.69  with prob <  0.45 
## 
## Tucker Lewis Index of factoring reliability =  0.993
## RMSEA index =  0.031  and the 90 % confidence intervals are  NA 0.061
## BIC =  -226.4
## Fit based upon off diagonal values = 0.93
## Measures of factor score adequacy             
##                                                 ML1  ML2  ML3
## Correlation of scores with factors             0.81 0.83 0.79
## Multiple R square of scores with factors       0.66 0.70 0.62
## Minimum correlation of possible factor scores  0.33 0.39 0.25

A further comparison, this time more about differences between packages: maximum-likelihood FA with the crime data from Everitt (2010)

# read the data
d.crime <-  read.delim(file="http://r.psych.bio.uni-goettingen.de/mv/data/be/pca_crime.txt")

crime.ml.varimax <- fa(d.crime[2:8],
                      nfactors=3,   # extract three factors 
                      SMC=TRUE,     # squared multiple correlations as estimates of communalities
                      fm="ml",      # maximum likelihood
                      rotate="varimax", # Varimax rotation (orthogonal)
                      max.iter=100
                      )

# and inspect the results
print(crime.ml.varimax)

## Factor Analysis using method =  ml
## Call: fa(r = d.crime[2:8], nfactors = 3, rotate = "varimax", SMC = TRUE, 
##     max.iter = 100, fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##            ML3  ML2  ML1   h2    u2 com
## murder    0.26 0.92 0.23 0.97 0.030 1.3
## rape      0.64 0.37 0.30 0.64 0.360 2.1
## robbery   0.24 0.66 0.56 0.81 0.185 2.2
## assault   0.49 0.63 0.33 0.75 0.253 2.5
## burglary  0.83 0.33 0.26 0.86 0.136 1.5
## theft     0.83 0.13 0.12 0.72 0.279 1.1
## vehicules 0.29 0.32 0.90 1.00 0.005 1.5
## 
##                        ML3  ML2  ML1
## SS loadings           2.24 2.05 1.46
## Proportion Var        0.32 0.29 0.21
## Cumulative Var        0.32 0.61 0.82
## Proportion Explained  0.39 0.36 0.25
## Cumulative Proportion 0.39 0.75 1.00
## 
## Mean item complexity =  1.7
## Test of the hypothesis that 3 factors are sufficient.
## 
## The degrees of freedom for the null model are  21  and the objective function was  5.82 with Chi Square of  272.5
## The degrees of freedom for the model are 3  and the objective function was  0.11 
## 
## The root mean square of the residuals (RMSR) is  0.02 
## The df corrected root mean square of the residuals is  0.04 
## 
## The harmonic number of observations is  51 with the empirical chi square  0.5  with prob <  0.92 
## The total number of observations was  51  with MLE Chi Square =  4.9  with prob <  0.18 
## 
## Tucker Lewis Index of factoring reliability =  0.945
## RMSEA index =  0.128  and the 90 % confidence intervals are  NA 0.282
## BIC =  -6.9
## Fit based upon off diagonal values = 1
## Measures of factor score adequacy             
##                                                 ML3  ML2  ML1
## Correlation of scores with factors             0.93 0.98 0.99
## Multiple R square of scores with factors       0.87 0.95 0.98
## Minimum correlation of possible factor scores  0.74 0.90 0.96

Exercises

Work through the examples in the two recommended textbooks.

Work through the exercises with solutions in the two recommended textbooks.

Student STAI data

[http://r.psych.bio.uni-goettingen.de/mv/data/div/stud_stai_items_utf8.txt]

Examine the concept of state and trait anxiety.

Links and references

lme4 documentation [http://cran.r-project.org/web/packages/lme4/lme4.pdf]

R-Bloggers Tutorial John Quick [http://www.r-bloggers.com/r-tutorial-series-exploratory-factor-analysis/]

Tutorial Wollschläger: [http://www.uni-kiel.de/psychologie/rexrepos/posts/multFA.html]

Field (2012, p. 112), Chapter 3: R Environment