Exploratory Factor Analysis

A comparative overview of the types of factor analysis, along with topics common to all of them, can be found under fa.

$x_1 = l_{11}f_1 + l_{12}f_2 + u_1$

$s_1^2 = l_{11}^2 + l_{12}^2 + var(u_1)$
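As a quick numerical check of the variance decomposition above, the following sketch simulates one observed variable from two uncorrelated, standardized factors. The loadings 0.7 and 0.4, the sample size and all variable names are assumptions chosen for illustration and do not come from any data set used here.

```r
set.seed(1)
n <- 100000

# hypothetical loadings of x1 on the two factors
l11 <- 0.7
l12 <- 0.4
# unique variance chosen so that the total variance of x1 is 1
u_var <- 1 - l11^2 - l12^2

f1 <- rnorm(n)                      # factor 1, variance 1
f2 <- rnorm(n)                      # factor 2, variance 1, independent of f1
u1 <- rnorm(n, sd = sqrt(u_var))    # unique part of x1

x1 <- l11 * f1 + l12 * f2 + u1

var_expected <- l11^2 + l12^2 + u_var   # decomposition from the formula
var_observed <- var(x1)                 # empirical variance of the simulated x1
c(expected = var_expected, observed = var_observed)
```

The two values agree only up to sampling error, and only because the factors and the unique part were simulated as mutually uncorrelated.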

Maximum likelihood (ML) estimation can be viewed as an optimization of principal axis factoring.

A few general remarks

FA has a different substantive goal than PCA: identifying and capturing latent constructs (latent traits) that are not directly observable, by means of manifest, measurable and therefore observable variables. The question is: do the measured variables reflect one quantity that influences all of them?

Everything said about PCA remains valid. PCA can be viewed as a special case of FA.

The key difference from PCA is that residual variances of the variables are allowed (uniquenesses). The total variance of a variable is decomposed into the common variance (attributable to the underlying factor), the specific variance (specific to this item, e.g. because it captures one particular subcomponent of the construct, such as intelligence), and the measurement error variance, which is normal in psychological settings.

In contrast to FA, PCA always tries to explain the total variance, which is usually inadequate in psychological settings. As a result, PCA solutions often turn out less stable than FA solutions, for example in cross-validation.

An extracted structure is rotated to make the factors as interpretable as possible. Depending on the rotation type, correlations between the factors are also allowed.

Total sample size matters far more than rules of thumb about the ratio of participants to variables (10 to 15 per variable). According to Field (2013, p. 684), 300 participants are good, 100 rather poor and 1000 excellent, largely independent of the number of variables being factored. This shows in the stability of the factor structure. The size of the factor loadings and the number of variables with loadings above a given threshold matter as well: four or more loadings > 0.6, or ten loadings > 0.40, per factor yield relatively stable and therefore interpretable factors with as few as about 150 observations. The communalities also play a role: if all communalities exceed .6, about 150 participants can already give fairly reliable results, and with communalities consistently above .50, 100 to 200 participants can already give a relatively stable solution.

The Kaiser-Meyer-Olkin (KMO) criterion ranges from 0 to 1. Factoring is recommended only for values above 0.5 (Field, 2013). See also the remarks on KMO in fa (Rmd, html).
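A minimal base R sketch of how an overall KMO value could be computed from a correlation matrix, assuming the usual definition: the sum of squared correlations relative to the sum of squared correlations plus squared partial correlations. The helper name kmo_overall and the simulated one-factor data are hypothetical; psych::KMO() additionally reports per-item values.

```r
# Overall KMO from a correlation matrix R (illustrative helper, not part of psych)
kmo_overall <- function(R) {
  R_inv <- solve(R)
  # partial correlation of each pair of variables, controlling for all others
  P <- -R_inv / sqrt(outer(diag(R_inv), diag(R_inv)))
  off <- row(R) != col(R)               # use off-diagonal elements only
  sum(R[off]^2) / (sum(R[off]^2) + sum(P[off]^2))
}

# simulated data: six items driven by one common factor (assumed loading 0.7)
set.seed(42)
n <- 500
f <- rnorm(n)
X <- sapply(1:6, function(j) 0.7 * f + rnorm(n, sd = sqrt(1 - 0.7^2)))

kmo_value <- kmo_overall(cor(X))
kmo_value
```

Items that share a strong common factor have large correlations but small partial correlations, which pushes the value towards 1.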

Bartlett's test checks whether the observed correlation matrix differs significantly from the identity matrix (all off-diagonal correlations are zero).
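The test statistic can be sketched in a few lines of base R, assuming the standard chi-square approximation based on the determinant of the correlation matrix. The function name bartlett_sphericity and the equicorrelated example matrix are illustrative assumptions; the last line plugs in the determinant, n and number of items reported for the emotional intelligence example later in this document.

```r
# Bartlett's test of sphericity for a correlation matrix R and sample size n
bartlett_sphericity <- function(R, n) {
  p <- ncol(R)
  chisq <- -(n - 1 - (2 * p + 5) / 6) * log(det(R))
  df <- p * (p - 1) / 2
  list(chisq = chisq, df = df,
       p.value = pchisq(chisq, df, lower.tail = FALSE))
}

# identity matrix: determinant 1, statistic 0, nothing to factor
res_identity <- bartlett_sphericity(diag(12), n = 100)

# hypothetical matrix with all correlations equal to 0.3
R_example <- matrix(0.3, 4, 4)
diag(R_example) <- 1
res_example <- bartlett_sphericity(R_example, n = 100)

# n = 100, p = 12 and det(cor(dd)) = 0.01871332 as reported in this document
chisq_doc <- -(100 - 1 - (2 * 12 + 5) / 6) * log(0.01871332)
chisq_doc
```

With these reported values the formula gives roughly 374.64 on 66 degrees of freedom, in line with the cortest.bartlett() output shown in the analysis section.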

Parallel analysis is one approach to determining the number of factors. See also the remarks on parallel analysis in fa (Rmd, html).
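The idea of parallel analysis can be sketched in base R as follows. This is a simplified, component-based variant (eigenvalues of the full correlation matrix compared with the mean eigenvalues of random data), not what psych::fa.parallel() does in detail. The simulated two-factor data, the loadings of 0.8 and the number of random replications are assumptions.

```r
set.seed(123)
n <- 500
p <- 6

# hypothetical two-factor structure: items 1-3 load on f1, items 4-6 on f2
f1 <- rnorm(n)
f2 <- rnorm(n)
noise <- function() rnorm(n, sd = 0.6)
X <- cbind(0.8 * f1 + noise(), 0.8 * f1 + noise(), 0.8 * f1 + noise(),
           0.8 * f2 + noise(), 0.8 * f2 + noise(), 0.8 * f2 + noise())

# eigenvalues of the observed correlation matrix
obs_eigen <- eigen(cor(X), symmetric = TRUE, only.values = TRUE)$values

# eigenvalues of correlation matrices of random normal data of the same size
n_sim <- 200
rand_eigen <- replicate(n_sim, {
  Z <- matrix(rnorm(n * p), n, p)
  eigen(cor(Z), symmetric = TRUE, only.values = TRUE)$values
})
rand_mean <- rowMeans(rand_eigen)

# keep the leading components whose eigenvalue exceeds the random benchmark
n_suggested <- sum(cumprod(obs_eigen > rand_mean))
n_suggested
```

Using cumprod() stops counting at the first eigenvalue that falls below its random counterpart, so later chance exceedances are ignored.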

Field (2012, p. 755)

With orthogonal rotation, a variable's weight on a factor (coefficient) equals the correlation of that variable with the factor. With oblique rotation, the correlations of the variables with the factors are found in the structure matrix, while the loading weights are found in the pattern matrix.
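The relation between the two matrices can be illustrated with a small sketch: the structure matrix is the pattern matrix multiplied by the factor correlation matrix. All numbers below are hypothetical and serve only to show the mechanics.

```r
# hypothetical pattern matrix (4 variables, 2 factors)
pattern <- matrix(c(0.7, 0.1,
                    0.6, 0.0,
                    0.1, 0.8,
                    0.0, 0.5),
                  ncol = 2, byrow = TRUE,
                  dimnames = list(paste0("v", 1:4), c("F1", "F2")))

# hypothetical factor correlation matrix of an oblique solution
Phi <- matrix(c(1.0, 0.4,
                0.4, 1.0),
              ncol = 2,
              dimnames = list(c("F1", "F2"), c("F1", "F2")))

# oblique case: correlations of the variables with the factors
structure_oblique <- pattern %*% Phi

# orthogonal case: factor correlations are the identity matrix,
# so structure and pattern matrix coincide
structure_orthogonal <- pattern %*% diag(2)

round(structure_oblique, 2)
```

For v1, the correlation with F1 is 0.7 + 0.1 * 0.4 = 0.74, which is larger than its pattern weight because the two factors are correlated.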

Factor scores are the individual values of the factors for one observation (person). If the (unweighted) b-weights (matrix A) are used, the scores depend on the measurement scale and the correlations among the measured variables are ignored. Weighted coefficients can be formed in several ways. Premultiplying matrix A by the inverse of the correlation matrix R, $B = R^{-1} A$, yields adjusted factor score coefficients that reflect the unique relationship between each variable and the factor (unweighted versus weighted factor scores).

Factor scores can be used to describe differences between persons more parsimoniously than the original variables do. Factor scores can also be one way of dealing with collinearity problems (PCA).
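A minimal sketch of weighted (regression-type) factor scores in base R, assuming a one-factor model fitted with factanal(). The simulated data and the loadings in lambda are hypothetical. The manually computed scores, standardized data times $R^{-1} A$, are compared with the regression scores returned by factanal().

```r
set.seed(7)
n <- 300
f <- rnorm(n)
lambda <- c(0.8, 0.7, 0.6, 0.7, 0.5, 0.6)     # assumed true loadings
X <- sapply(lambda, function(l) l * f + rnorm(n, sd = sqrt(1 - l^2)))
colnames(X) <- paste0("x", 1:6)

fit <- factanal(X, factors = 1, scores = "regression")

A <- unclass(fit$loadings)        # estimated loading matrix A (6 x 1)
R <- cor(X)                       # correlation matrix of the items
B <- solve(R) %*% A               # factor score coefficients, B = R^{-1} A

# weighted factor scores: standardized data times coefficients
scores_manual <- scale(X) %*% B

# agreement with the scores computed by factanal()
agreement <- cor(scores_manual[, 1], fit$scores[, 1])
agreement
```

The correlation is used for the comparison so that the check does not depend on details of how the scores are scaled.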

Orthogonal rotation

The factors remain independent (orthogonal). This presupposes that the factors are in fact uncorrelated.

Varimax rotation: the variance of the squared factor loadings within each factor is maximized. The aim is to obtain, for each factor, as many loadings as possible that are either large in absolute value or close to zero. This makes the factor easier to interpret as what the variables loading highly on it have in common.

Quartimax rotation: for each variable, as many loadings as possible across the factors should be either high or near zero, so that variables can be assigned to factors more clearly. It often results in many variables loading highly on one single factor.

Equamax is a hybrid of the two and is in some cases not recommended (Field, 2013).
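A short sketch with stats::varimax() shows what an orthogonal rotation does and does not change. The unrotated loading matrix below is hypothetical.

```r
# hypothetical unrotated loadings (6 variables, 2 factors)
L <- matrix(c(0.7,  0.3,
              0.6,  0.4,
              0.7,  0.2,
              0.5, -0.5,
              0.6, -0.4,
              0.5, -0.6),
            ncol = 2, byrow = TRUE)

rot <- varimax(L)                  # stats::varimax, Kaiser normalization by default
L_rot <- unclass(rot$loadings)     # rotated loadings
T_mat <- rot$rotmat                # rotation matrix

# communalities are not affected by an orthogonal rotation
h2_before <- rowSums(L^2)
h2_after  <- rowSums(L_rot^2)

# the rotation matrix is orthogonal: t(T) %*% T = I
round(crossprod(T_mat), 6)
round(cbind(L, L_rot), 2)
```

Only the distribution of the loadings across the factors changes. The variance explained per variable stays the same.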

Oblique rotation

SPSS and Statistica both offer direct oblimin and promax.

Promax is a so-called target rotation method, i.e. a desired target loading pattern can be specified. If no particular pattern is given, promax rotates so that item groups are separated from one another as clearly as possible (similar to varimax, but with potentially correlated factors).

Direct oblimin tries to minimize the covariances between the squared factor loadings of all pairs of factors.
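An oblique rotation can be tried out in base R with stats::promax(). In this sketch the loading matrix is hypothetical, and the factor correlations are derived from the returned rotation matrix, which is an assumption about how to read that output rather than something promax() reports directly.

```r
# hypothetical unrotated loadings (6 variables, 2 factors)
L <- matrix(c(0.7,  0.3,
              0.6,  0.4,
              0.7,  0.2,
              0.5, -0.5,
              0.6, -0.4,
              0.5, -0.6),
            ncol = 2, byrow = TRUE)

obl <- promax(L, m = 4)                    # stats::promax, power m = 4
pattern <- unclass(obl$loadings)           # pattern matrix (loading weights)

# factor correlations implied by the oblique rotation matrix
Phi <- solve(crossprod(obl$rotmat))

# structure matrix: correlations of variables with the factors
structure <- pattern %*% Phi

# the rotated solution reproduces the same common part of the correlations
reproduced_rotated   <- pattern %*% Phi %*% t(pattern)
reproduced_unrotated <- L %*% t(L)

round(Phi, 2)
```

Unlike in the orthogonal case, pattern and structure matrix differ here because Phi is not the identity matrix.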

Rotation in R

In R, from the help page of psych::fa(): "none", "varimax", "quartimax", "bentlerT", "geominT" and "bifactor" are orthogonal rotations. "promax", "oblimin", "simplimax", "bentlerQ", "geominQ", "biquartimin" and "cluster" are possible rotations or transformations of the solution. The default is to do an oblimin transformation, although versions prior to 2009 defaulted to varimax.

R also has dedicated packages for factor rotation, e.g. library(GPArotation).

Factor loadings

Coefficients for each variable that express how strongly the variable contributes to forming the factor. In a PCA computed from the correlation matrix: the correlation of the variable with the factor scores (the persons' values on that factor). They can also be computed by multiplying the loadings (eigenvector) by the standard deviation of the respective factor.

Factor scores

The participants' values on the respective factor. In R: ergebnisobjekt$scores. Statistica calls them Faktorwerte (factor scores).

Eigenvalues of the principal components

The eigenvalue is the sum of the squared loadings of a factor (factor loadings) across all variables; it is the variance of that factor. Because each standardized variable has variance 1, the eigenvalue divided by the number of variables is the proportion of the total variance of the observed variables that the factor explains. The eigenvalues are the squared standard deviations of the principal components that R reports in the PCA summary (ergebnisobjekt$sdev). In PCA the eigenvalues sum to the number of principal components, which equals the number of variables, so their mean is 1.

Eigenvector and loadings

Statistica results dialog: Variables | Eigenvector corresponds to the R loadings, ergebnisobjekt$loadings. The squared loadings of each eigenvector sum to 1 across all variables. The eigenvectors can be used to compute the predicted values (factor scores).
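The relations between eigenvalues, eigenvectors and loadings in a PCA of the correlation matrix can be verified with a few lines of base R. The simulated data are an assumption made for this sketch.

```r
set.seed(11)
n <- 200
f <- rnorm(n)
# three correlated variables plus one unrelated variable (simulated)
X <- cbind(f + rnorm(n), f + rnorm(n), f + rnorm(n), rnorm(n))
p <- ncol(X)

e <- eigen(cor(X), symmetric = TRUE)
eigenvalues  <- e$values
eigenvectors <- e$vectors

# loading = eigenvector element * standard deviation of the component
loadings_pca <- eigenvectors %*% diag(sqrt(eigenvalues))

sum(eigenvalues)                 # equals the number of variables p
mean(eigenvalues)                # therefore the mean eigenvalue is 1
colSums(eigenvectors^2)          # each eigenvector has squared length 1
colSums(loadings_pca^2)          # squared loadings per component sum to the eigenvalue
prop_var <- eigenvalues / p      # proportion of total variance per component

# prcomp() on standardized data reports the same eigenvalues as sdev^2
pc <- prcomp(X, scale. = TRUE)
round(cbind(eigenvalues, prcomp_var = pc$sdev^2, prop_var), 3)
```

Dividing each eigenvalue by p gives the "Proportion of Variance" row that summary() prints for a PCA on standardized variables.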

Communality

A parameter of the variables. The sum of a variable's squared loadings on all factors is the part of that variable's variance that the factors jointly explain. This quantity is called the communality $h_j^2$ of variable j.
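As a small worked example, the communalities of three items can be recomputed from their loadings. The loadings below are the unrotated ML loadings of i1, i4 and i6 as printed in the example output further down in this document; the object names are chosen for illustration.

```r
# unrotated ML loadings of three items, copied from the printed example output
L <- matrix(c( 0.263, 0.496,
              -0.692, 0.210,
              -0.709, 0.168),
            ncol = 2, byrow = TRUE,
            dimnames = list(c("i1", "i4", "i6"), c("ML1", "ML2")))

h2 <- rowSums(L^2)     # communality: sum of squared loadings per variable
u2 <- 1 - h2           # uniqueness: remaining variance of the standardized variable

round(cbind(h2, u2), 3)
```

The results agree with the h2 and u2 columns of the fa() output up to rounding of the printed loadings.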

Simple structure

The goal of rotation. Each variable should, if possible, load highly on only one factor and very low on all others. This is meant to make interpreting and naming the factors as easy as possible.

Reduced correlation matrix as the starting matrix in FA

The diagonal does not hold 1s, as in PCA, but estimates of the communalities. The usual choice is the SMC (squared multiple correlation). The idea: each item is predicted from all remaining items by multiple regression, and the resulting R^2 is the SMC for that item.

Demonstrated here with the werner-fa example: generated values for 15 items and 100 observations (see the example further below).

items <- read.delim(file="https://md.psych.bio.uni-goettingen.de/mv/data/div/werner-fa.txt")
head(round(items, 2))

##      V1    V2    V3    V4    V5    V6    V7    V8    V9  V10   V11   V12   V13
## 1  2.03 -1.18 -0.29 -1.01 -0.31  0.36  0.42  2.07  0.90 0.56 -0.99 -0.74 -0.68
## 2  0.34  0.51  1.13 -0.73  0.13 -0.25 -0.79  1.02  0.51 0.18 -0.83 -0.50  1.09
## 3  1.16  1.77  1.49  1.20  0.30  1.03  1.18  0.45  0.58 0.20 -1.80  0.33 -0.49
## 4  0.08  0.50  0.78  2.43 -0.34 -1.34 -1.19 -0.96  1.34 0.07 -0.13  0.73  0.31
## 5 -0.26  1.15  1.46 -0.11  0.23 -0.56 -0.53 -0.47  0.06 0.31 -0.85 -0.83 -1.83
## 6 -0.38  0.72 -0.07  0.77  1.09 -0.59  0.99 -2.13 -2.53 0.06  1.54 -0.16  0.69
##     V14   V15
## 1 -1.68 -0.10
## 2  1.14  0.10
## 3  0.22  1.16
## 4 -0.49 -0.74
## 5 -0.41 -0.53
## 6  0.88 -0.48

# take a look at correlation table
head(round(cor(items), 2))

##      V1   V2   V3   V4   V5   V6   V7   V8   V9  V10   V11  V12  V13  V14  V15
## V1 1.00 0.32 0.26 0.32 0.30 0.13 0.21 0.18 0.24 0.12  0.00 0.01 0.02 0.09 0.19
## V2 0.32 1.00 0.41 0.26 0.34 0.03 0.03 0.03 0.15 0.16 -0.04 0.17 0.03 0.10 0.15
## V3 0.26 0.41 1.00 0.28 0.29 0.13 0.13 0.07 0.26 0.12  0.08 0.26 0.03 0.24 0.12
## V4 0.32 0.26 0.28 1.00 0.29 0.14 0.19 0.05 0.21 0.24  0.13 0.32 0.23 0.19 0.15
## V5 0.30 0.34 0.29 0.29 1.00 0.15 0.27 0.15 0.38 0.34  0.22 0.29 0.10 0.14 0.22
## V6 0.13 0.03 0.13 0.14 0.15 1.00 0.22 0.32 0.28 0.36  0.03 0.28 0.07 0.17 0.18

# insert SMC
items.cors.reduced <- cor(items)
# put smc into the diagonal
require(psych)

## Loading required package: psych

diag(items.cors.reduced) <- smc(items)
# take a look at the resulting table
head(round(items.cors.reduced, 2))

##      V1   V2   V3   V4   V5   V6   V7   V8   V9  V10   V11  V12  V13  V14  V15
## V1 0.29 0.32 0.26 0.32 0.30 0.13 0.21 0.18 0.24 0.12  0.00 0.01 0.02 0.09 0.19
## V2 0.32 0.29 0.41 0.26 0.34 0.03 0.03 0.03 0.15 0.16 -0.04 0.17 0.03 0.10 0.15
## V3 0.26 0.41 0.29 0.28 0.29 0.13 0.13 0.07 0.26 0.12  0.08 0.26 0.03 0.24 0.12
## V4 0.32 0.26 0.28 0.29 0.29 0.14 0.19 0.05 0.21 0.24  0.13 0.32 0.23 0.19 0.15
## V5 0.30 0.34 0.29 0.29 0.36 0.15 0.27 0.15 0.38 0.34  0.22 0.29 0.10 0.14 0.22
## V6 0.13 0.03 0.13 0.14 0.15 0.26 0.22 0.32 0.28 0.36  0.03 0.28 0.07 0.17 0.18

# we calculate the first smc value (V1)
smc1 <- lm(V1 ~ V2 + V3 + V4 + V5 + V6  + V7  + V8  + V9  +V10  + V11 + V12 + V13 + V14 + V15, data=items)
# the smc can be found as 'Multiple R-squared: ' in summary
summary(smc1)

## 
## Call:
## lm(formula = V1 ~ V2 + V3 + V4 + V5 + V6 + V7 + V8 + V9 + V10 + 
##     V11 + V12 + V13 + V14 + V15, data = items)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.30151 -0.43463 -0.02142  0.40098  2.19926 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)  0.03890    0.10598   0.367   0.7145  
## V2           0.17053    0.10725   1.590   0.1155  
## V3           0.08054    0.11735   0.686   0.4944  
## V4           0.24740    0.09980   2.479   0.0151 *
## V5           0.16128    0.11066   1.457   0.1487  
## V6           0.05120    0.10484   0.488   0.6265  
## V7           0.08349    0.10792   0.774   0.4413  
## V8           0.12753    0.10310   1.237   0.2195  
## V9           0.06178    0.11482   0.538   0.5920  
## V10         -0.12852    0.11467  -1.121   0.2655  
## V11         -0.11505    0.10714  -1.074   0.2859  
## V12         -0.24437    0.11706  -2.088   0.0398 *
## V13         -0.04302    0.10368  -0.415   0.6793  
## V14          0.04886    0.10178   0.480   0.6324  
## V15          0.20875    0.12269   1.701   0.0925 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9265 on 85 degrees of freedom
## Multiple R-squared:  0.2874, Adjusted R-squared:  0.1701 
## F-statistic: 2.449 on 14 and 85 DF,  p-value: 0.006002
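The same quantity can also be obtained without fitting a regression per item: for a correlation matrix R, the SMC of item j equals 1 minus the reciprocal of the j-th diagonal element of the inverse of R. The sketch below checks this on simulated data, so the data and variable names are assumptions and not the werner-fa file.

```r
set.seed(3)
n <- 200
f <- rnorm(n)
# five simulated items sharing one common factor
X <- as.data.frame(sapply(1:5, function(j) 0.6 * f + rnorm(n, sd = 0.8)))
names(X) <- paste0("V", 1:5)

R <- cor(X)

# SMC of every item from the inverse of the correlation matrix
smc_inverse <- 1 - 1 / diag(solve(R))

# R^2 of V1 regressed on all remaining items
r2_v1 <- summary(lm(V1 ~ ., data = X))$r.squared

# reduced correlation matrix: SMCs replace the 1s in the diagonal
R_reduced <- R
diag(R_reduced) <- smc_inverse

c(from_inverse = unname(smc_inverse[1]), from_regression = r2_v1)
```

Both routes give the same value for V1, which is why the SMCs of all items can be computed in a single step from the inverse.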

$\chi^2$ test of whether the number of factors is sufficient

See fa (Rmd, html).

Factor analysis in base R

Base package: factanal()

For a data matrix dd and two factors to extract: factanal(dd, factors=2). factanal() always uses maximum likelihood extraction, so no method argument is needed.
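A minimal, self-contained sketch of such a call on simulated data. The data frame dd_sim, its column names and the true loadings of 0.8 are hypothetical. Besides the loadings, factanal() also returns the chi-square test of whether the chosen number of factors is sufficient.

```r
set.seed(21)
n <- 500
f1 <- rnorm(n)
f2 <- rnorm(n)
e <- function() rnorm(n, sd = 0.6)

# simulated items: a1-a3 driven by f1, b1-b3 driven by f2
dd_sim <- data.frame(a1 = 0.8 * f1 + e(), a2 = 0.8 * f1 + e(), a3 = 0.8 * f1 + e(),
                     b1 = 0.8 * f2 + e(), b2 = 0.8 * f2 + e(), b3 = 0.8 * f2 + e())

# maximum likelihood factor analysis with two factors and varimax rotation
fit <- factanal(dd_sim, factors = 2, rotation = "varimax")

print(fit$loadings, cutoff = 0.3)   # suppress small loadings in the printout
fit$uniquenesses                    # unique variance per item
fit$dof                             # degrees of freedom of the chi-square test
fit$PVAL                            # p-value: are 2 factors sufficient?
```

A non-significant p-value means that the hypothesis of two sufficient factors is not rejected.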

More closely tailored to the needs of psychology, more flexible, and used throughout here: the psych package, library(psych), with its function fa().

Core R includes a maximum likelihood factor analysis function (factanal) and the psych package includes five alternative factor extraction options within one function, fa().

Here is a brief overview of the most important parameters

require(psych)
fa(
  data,         # data matrix (or a correlation matrix; otherwise one is computed automatically)
  nfactors=3,   # number of factors
  fm="ml",      # extraction method (factoring method), here ml (maximum likelihood)
        # fm="minres" will do a minimum residual (OLS), 
        # fm="wls" will do a weighted least squares (WLS) solution, 
        # fm="gls" does a generalized weighted least squares (GLS), 
        # fm="pa" will do the principal factor solution, 
        # fm="ml" will do a maximum likelihood factor analysis
  rotate="varimax",
        # "none", "varimax", "quartimax", "bentlerT", and "geominT" are orthogonal rotations. 
        # "promax", "oblimin", "simplimax", "bentlerQ", and "geominQ" or "cluster" are possible rotations or transformations of the solution. 
        # The default is to do an oblimin transformation, although prior versions defaulted to varimax.
  SMC=TRUE,     # use squared multiple correlations (SMC=TRUE) or use 1 as initial communality estimate
        # Try using 1 if imaginary eigen values are reported.
  max.iter=100, # maximum number of iterations
  scores=TRUE   # also compute factor scores (the default depends on the psych version)
  )

Example: perceiving one's own and others' feelings (emotional intelligence research)

The example is based on a study by Tanja Lischetzke et al. (2001, Diagnostica, Vol. 47, No. 4, pp. 167-177):

From the abstract (translated from German): "Recognizing one's own feelings and the feelings of other people is an important competence in dealing with emotions and moods. The constructs of emotional self-attention and clarity about one's own feelings, so far studied mainly in the English-speaking world, are introduced, and the conceptual separation of the constructs is applied to the perception of others' feelings for the first time."

The discussion here, however, covers only the part on perceiving one's own feelings.

A made-up data set in the usual tab-delimited text format is available at: fa-ei.txt

Approach: can the two-factor model, EA (emotional attention) and EC (emotional clarity), be found in the given data set?

The two scales 'emotional self-attention' (EA, emotional attention) and 'clarity about one's own feelings' (EC, emotional clarity) are assumed on theoretical grounds and implemented through correspondingly worded items. All odd-numbered items measure emotional self-attention (EA), all even-numbered items clarity about one's own feelings (EC).

The questions/items are:

i1 I think about my feelings.
i2 I can name my feelings.
i3 I pay attention to my feelings.
i4 I am unclear about what I feel.
i5 I concern myself with my feelings.
i6 I have difficulty describing my feelings.
i7 I think about how I feel.
i8 I know what I feel.
i9 I observe my feelings.
i10 I have difficulty giving my feelings a name.
i11 I pay attention to how I feel.
i12 I am unsure what I actually feel.

The items are rated from 1 to 4 (almost never / sometimes / often / almost always).

The two latent constructs (latent traits) in question are:

emotional self-attention eSA
clarity about one's own feelings eC

Reduced correlation matrix as the starting matrix in FA

Demonstrated here with the emotional intelligence data (v_ei.txt): 12 items and 100 observations.

dd <- read.delim("https://md.psych.bio.uni-goettingen.de/mv/data/virt/v_ei.txt")
# take a look at correlation table
head(round(cor(dd), 2))

##       i1    i2    i3    i4    i5    i6    i7    i8    i9   i10   i11   i12
## i1  1.00  0.18  0.35 -0.06  0.17 -0.23  0.18  0.14  0.43 -0.08  0.30 -0.11
## i2  0.18  1.00  0.14 -0.48  0.18 -0.52  0.21  0.57  0.18 -0.61  0.12 -0.60
## i3  0.35  0.14  1.00 -0.07  0.30 -0.11  0.13  0.25  0.34 -0.25  0.33 -0.05
## i4 -0.06 -0.48 -0.07  1.00 -0.12  0.63 -0.12 -0.45 -0.11  0.53 -0.15  0.54
## i5  0.17  0.18  0.30 -0.12  1.00 -0.04  0.18  0.31  0.26 -0.13  0.33 -0.11
## i6 -0.23 -0.52 -0.11  0.63 -0.04  1.00 -0.04 -0.47 -0.18  0.58 -0.04  0.45

# insert SMC
items.cors.reduced <- cor(dd)
# put smc into the diagonal
require(psych)
diag(items.cors.reduced) <- smc(dd)
# take a look at the resulting table
head(round(items.cors.reduced, 2))

##       i1    i2    i3    i4    i5    i6    i7    i8    i9   i10   i11   i12
## i1  0.34  0.18  0.35 -0.06  0.17 -0.23  0.18  0.14  0.43 -0.08  0.30 -0.11
## i2  0.18  0.55  0.14 -0.48  0.18 -0.52  0.21  0.57  0.18 -0.61  0.12 -0.60
## i3  0.35  0.14  0.28 -0.07  0.30 -0.11  0.13  0.25  0.34 -0.25  0.33 -0.05
## i4 -0.06 -0.48 -0.07  0.52 -0.12  0.63 -0.12 -0.45 -0.11  0.53 -0.15  0.54
## i5  0.17  0.18  0.30 -0.12  0.24 -0.04  0.18  0.31  0.26 -0.13  0.33 -0.11
## i6 -0.23 -0.52 -0.11  0.63 -0.04  0.56 -0.04 -0.47 -0.18  0.58 -0.04  0.45

# we calculate the first smc value (i1)
#attach(items)
smc1 <- lm(i1 ~ i2 + i3 + i4 + i5 + i6 + i7 + i8 + i9 + i10 + i11 + i12, data=dd)
# the smc can be found as 'Multiple R-squared: ' in summary
summary(smc1)

## 
## Call:
## lm(formula = i1 ~ i2 + i3 + i4 + i5 + i6 + i7 + i8 + i9 + i10 + 
##     i11 + i12, data = dd)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.0738 -0.4753  0.0778  0.5177  1.7696 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept)  0.30304    0.81405   0.372  0.71059   
## i2           0.11217    0.12367   0.907  0.36688   
## i3           0.22788    0.10779   2.114  0.03734 * 
## i4           0.15480    0.11993   1.291  0.20015   
## i5          -0.01831    0.09961  -0.184  0.85461   
## i6          -0.31743    0.12186  -2.605  0.01079 * 
## i7           0.10940    0.09316   1.174  0.24340   
## i8          -0.04855    0.11034  -0.440  0.66098   
## i9           0.29539    0.09926   2.976  0.00377 **
## i10          0.21944    0.12051   1.821  0.07202 . 
## i11          0.23344    0.12251   1.905  0.05999 . 
## i12         -0.03607    0.12371  -0.292  0.77127   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.8042 on 88 degrees of freedom
## Multiple R-squared:  0.3355, Adjusted R-squared:  0.2524 
## F-statistic: 4.039 on 11 and 88 DF,  p-value: 8.713e-05

Analysis

# get data
dd <- read.delim("https://md.psych.bio.uni-goettingen.de/mv/data/virt/v_ei.txt")
# take a look at the data
head(dd)

##   i1 i2 i3 i4 i5 i6 i7 i8 i9 i10 i11 i12
## 1  4  3  4  2  3  1  4  3  3   1   3   3
## 2  4  3  3  1  4  1  4  4  4   1   3   2
## 3  4  3  4  2  4  2  4  2  4   1   3   3
## 4  3  3  2  2  1  2  4  3  1   2   2   3
## 5  4  4  4  4  4  1  4  4  4   2   3   1
## 6  2  4  3  2  4  2  2  3  3   2   3   2

# we need library(psych)
require(psych)

# some descriptives
psych::describe(dd)

##     vars   n mean   sd median trimmed  mad min max range  skew kurtosis   se
## i1     1 100 3.06 0.93      3    3.15 1.48   1   4     3 -0.56    -0.77 0.09
## i2     2 100 3.08 0.97      3    3.20 1.48   1   4     3 -0.68    -0.68 0.10
## i3     3 100 3.02 0.86      3    3.09 1.48   1   4     3 -0.50    -0.56 0.09
## i4     4 100 1.87 0.96      2    1.71 1.48   1   4     3  0.94    -0.10 0.10
## i5     5 100 3.00 0.93      3    3.09 1.48   1   4     3 -0.52    -0.75 0.09
## i6     6 100 1.91 0.96      2    1.77 1.48   1   4     3  0.78    -0.44 0.10
## i7     7 100 2.91 0.92      3    2.98 1.48   1   4     3 -0.36    -0.86 0.09
## i8     8 100 3.05 1.04      3    3.19 1.48   1   4     3 -0.74    -0.71 0.10
## i9     9 100 3.01 0.90      3    3.08 1.48   1   4     3 -0.42    -0.88 0.09
## i10   10 100 1.94 0.99      2    1.81 1.48   1   4     3  0.67    -0.74 0.10
## i11   11 100 2.90 0.75      3    2.91 0.00   1   4     3 -0.28    -0.27 0.07
## i12   12 100 1.86 0.92      2    1.76 1.48   1   4     3  0.66    -0.71 0.09

# the correlation matrix is the basis of the EFA; the correlation matrix reproduced by the solution is later compared with it
# covariance matrix, wrap with round() for better reading
round(var(dd), 2)

##        i1    i2    i3    i4    i5    i6    i7    i8    i9   i10   i11   i12
## i1   0.87  0.17  0.28 -0.05  0.15 -0.21  0.16  0.14  0.36 -0.08  0.21 -0.09
## i2   0.17  0.94  0.12 -0.44  0.16 -0.49  0.19  0.57  0.16 -0.59  0.09 -0.53
## i3   0.28  0.12  0.75 -0.06  0.24 -0.09  0.10  0.22  0.26 -0.21  0.21 -0.04
## i4  -0.05 -0.44 -0.06  0.92 -0.11  0.58 -0.10 -0.45 -0.10  0.51 -0.10  0.48
## i5   0.15  0.16  0.24 -0.11  0.87 -0.04  0.15  0.30  0.22 -0.12  0.23 -0.09
## i6  -0.21 -0.49 -0.09  0.58 -0.04  0.93 -0.04 -0.47 -0.16  0.56 -0.03  0.40
## i7   0.16  0.19  0.10 -0.10  0.15 -0.04  0.85  0.18  0.14 -0.14  0.09 -0.02
## i8   0.14  0.57  0.22 -0.45  0.30 -0.47  0.18  1.08  0.21 -0.59  0.14 -0.51
## i9   0.36  0.16  0.26 -0.10  0.22 -0.16  0.14  0.21  0.82 -0.17  0.17 -0.11
## i10 -0.08 -0.59 -0.21  0.51 -0.12  0.56 -0.14 -0.59 -0.17  0.99 -0.13  0.44
## i11  0.21  0.09  0.21 -0.10  0.23 -0.03  0.09  0.14  0.17 -0.13  0.56 -0.12
## i12 -0.09 -0.53 -0.04  0.48 -0.09  0.40 -0.02 -0.51 -0.11  0.44 -0.12  0.85

# correlation matrix, wrap with round() for better reading
round(cor(dd), 2)

##        i1    i2    i3    i4    i5    i6    i7    i8    i9   i10   i11   i12
## i1   1.00  0.18  0.35 -0.06  0.17 -0.23  0.18  0.14  0.43 -0.08  0.30 -0.11
## i2   0.18  1.00  0.14 -0.48  0.18 -0.52  0.21  0.57  0.18 -0.61  0.12 -0.60
## i3   0.35  0.14  1.00 -0.07  0.30 -0.11  0.13  0.25  0.34 -0.25  0.33 -0.05
## i4  -0.06 -0.48 -0.07  1.00 -0.12  0.63 -0.12 -0.45 -0.11  0.53 -0.15  0.54
## i5   0.17  0.18  0.30 -0.12  1.00 -0.04  0.18  0.31  0.26 -0.13  0.33 -0.11
## i6  -0.23 -0.52 -0.11  0.63 -0.04  1.00 -0.04 -0.47 -0.18  0.58 -0.04  0.45
## i7   0.18  0.21  0.13 -0.12  0.18 -0.04  1.00  0.18  0.17 -0.15  0.13 -0.03
## i8   0.14  0.57  0.25 -0.45  0.31 -0.47  0.18  1.00  0.23 -0.57  0.18 -0.53
## i9   0.43  0.18  0.34 -0.11  0.26 -0.18  0.17  0.23  1.00 -0.19  0.26 -0.13
## i10 -0.08 -0.61 -0.25  0.53 -0.13  0.58 -0.15 -0.57 -0.19  1.00 -0.17  0.48
## i11  0.30  0.12  0.33 -0.15  0.33 -0.04  0.13  0.18  0.26 -0.17  1.00 -0.17
## i12 -0.11 -0.60 -0.05  0.54 -0.11  0.45 -0.03 -0.53 -0.13  0.48 -0.17  1.00

# Bartlett test: is the correlation matrix an identity matrix (all off-diagonal elements are 0)? The test should be significant
cortest.bartlett(dd)

## R was not square, finding R from data

## $chisq
## [1] 374.6439
## 
## $p.value
## [1] 1.072894e-44
## 
## $df
## [1] 66

# alternatively on the correlation matrix (n has to be added) 
cortest.bartlett(cor(dd), n=nrow(dd))

## $chisq
## [1] 374.6439
## 
## $p.value
## [1] 1.072894e-44
## 
## $df
## [1] 66

# get kmo: are data good for factorization
psych::KMO(dd)

## Kaiser-Meyer-Olkin factor adequacy
## Call: psych::KMO(r = dd)
## Overall MSA =  0.81
## MSA for each item = 
##   i1   i2   i3   i4   i5   i6   i7   i8   i9  i10  i11  i12 
## 0.64 0.86 0.77 0.82 0.74 0.78 0.67 0.88 0.83 0.84 0.74 0.82

# the determinant of the correlation matrix should be greater than 0.00001; smaller values point to multicollinearity or singularity
det(cor(dd))

## [1] 0.01871332

# number of factors suggested by fa.parallel() is 2
items.parallel <- fa.parallel(dd, fa="fa")

## Parallel analysis suggests that the number of factors =  2  and the number of components =  NA

# we go on with 2 factors
# we start with an unrotated solution using ML (maximum likelihood). SMC is inserted as estimates for communality.
m.ml.u <- fa(dd, 
    nfactors=2,
    n.obs = nrow(dd),
  SMC=TRUE,
  fm="ml",
  rotate="none",
  max.iter=100
  )

# get the quality
print(m.ml.u)

## Factor Analysis using method =  ml
## Call: fa(r = dd, nfactors = 2, n.obs = nrow(dd), rotate = "none", SMC = TRUE, 
##     max.iter = 100, fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##       ML1   ML2    h2   u2 com
## i1   0.26  0.50 0.315 0.68 1.5
## i2   0.76 -0.08 0.587 0.41 1.0
## i3   0.28  0.55 0.379 0.62 1.5
## i4  -0.69  0.21 0.522 0.48 1.2
## i5   0.27  0.41 0.241 0.76 1.7
## i6  -0.71  0.17 0.531 0.47 1.1
## i7   0.21  0.22 0.091 0.91 2.0
## i8   0.73  0.04 0.531 0.47 1.0
## i9   0.31  0.49 0.340 0.66 1.7
## i10 -0.77  0.08 0.594 0.41 1.0
## i11  0.26  0.45 0.268 0.73 1.6
## i12 -0.68  0.17 0.497 0.50 1.1
## 
##                        ML1  ML2
## SS loadings           3.57 1.33
## Proportion Var        0.30 0.11
## Cumulative Var        0.30 0.41
## Proportion Explained  0.73 0.27
## Cumulative Proportion 0.73 1.00
## 
## Mean item complexity =  1.4
## Test of the hypothesis that 2 factors are sufficient.
## 
## The degrees of freedom for the null model are  66  and the objective function was  3.98 with Chi Square of  374.64
## The degrees of freedom for the model are 43  and the objective function was  0.55 
## 
## The root mean square of the residuals (RMSR) is  0.05 
## The df corrected root mean square of the residuals is  0.06 
## 
## The harmonic number of observations is  100 with the empirical chi square  32.4  with prob <  0.88 
## The total number of observations was  100  with Likelihood Chi Square =  50.99  with prob <  0.19 
## 
## Tucker Lewis Index of factoring reliability =  0.96
## RMSEA index =  0.042  and the 90 % confidence intervals are  0 0.084
## BIC =  -147.03
## Fit based upon off diagonal values = 0.97
## Measures of factor score adequacy             
##                                                    ML1  ML2
## Correlation of (regression) scores with factors   0.94 0.82
## Multiple R square of scores with factors          0.88 0.67
## Minimum correlation of possible factor scores     0.77 0.34

# take a closer look at factor loadings, which gives us the structure
print(m.ml.u$loadings)

## 
## Loadings:
##     ML1    ML2   
## i1   0.263  0.496
## i2   0.762       
## i3   0.284  0.547
## i4  -0.692  0.210
## i5   0.267  0.412
## i6  -0.709  0.168
## i7   0.206  0.221
## i8   0.728       
## i9   0.312  0.492
## i10 -0.767       
## i11  0.257  0.449
## i12 -0.684  0.172
## 
##                  ML1   ML2
## SS loadings    3.573 1.325
## Proportion Var 0.298 0.110
## Cumulative Var 0.298 0.408

# we might see the structure clearer by suppressing the output of low loadings by modifying cutoff, default is .1
print(m.ml.u$loadings, cutoff=.4)

## 
## Loadings:
##     ML1    ML2   
## i1          0.496
## i2   0.762       
## i3          0.547
## i4  -0.692       
## i5          0.412
## i6  -0.709       
## i7               
## i8   0.728       
## i9          0.492
## i10 -0.767       
## i11         0.449
## i12 -0.684       
## 
##                  ML1   ML2
## SS loadings    3.573 1.325
## Proportion Var 0.298 0.110
## Cumulative Var 0.298 0.408

# at least item i7 remains unclear
# maybe rotation helps
# we fit the same model with the oblique rotation promax (the orthogonal varimax is commented out)
m.ml.r <- fa(dd, 
    nfactors=2,
    n.obs = nrow(dd),
  SMC=TRUE,
  fm="ml",
  #rotate="varimax",
  rotate="promax",
    max.iter=100
  )

## Loading required namespace: GPArotation

print(m.ml.r)

## Factor Analysis using method =  ml
## Call: fa(r = dd, nfactors = 2, n.obs = nrow(dd), rotate = "promax", 
##     SMC = TRUE, max.iter = 100, fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##       ML1   ML2    h2   u2 com
## i1   0.05  0.58 0.315 0.68 1.0
## i2  -0.75  0.05 0.587 0.41 1.0
## i3   0.06  0.64 0.379 0.62 1.0
## i4   0.76  0.10 0.522 0.48 1.0
## i5  -0.01  0.49 0.241 0.76 1.0
## i6   0.75  0.06 0.531 0.47 1.0
## i7  -0.06  0.27 0.091 0.91 1.1
## i8  -0.64  0.18 0.531 0.47 1.1
## i9   0.00  0.58 0.340 0.66 1.0
## i10  0.75 -0.05 0.594 0.41 1.0
## i11  0.02  0.53 0.268 0.73 1.0
## i12  0.73  0.06 0.497 0.50 1.0
## 
##                        ML1  ML2
## SS loadings           3.19 1.71
## Proportion Var        0.27 0.14
## Cumulative Var        0.27 0.41
## Proportion Explained  0.65 0.35
## Cumulative Proportion 0.65 1.00
## 
##  With factor correlations of 
##       ML1   ML2
## ML1  1.00 -0.39
## ML2 -0.39  1.00
## 
## Mean item complexity =  1
## Test of the hypothesis that 2 factors are sufficient.
## 
## The degrees of freedom for the null model are  66  and the objective function was  3.98 with Chi Square of  374.64
## The degrees of freedom for the model are 43  and the objective function was  0.55 
## 
## The root mean square of the residuals (RMSR) is  0.05 
## The df corrected root mean square of the residuals is  0.06 
## 
## The harmonic number of observations is  100 with the empirical chi square  32.4  with prob <  0.88 
## The total number of observations was  100  with Likelihood Chi Square =  50.99  with prob <  0.19 
## 
## Tucker Lewis Index of factoring reliability =  0.96
## RMSEA index =  0.042  and the 90 % confidence intervals are  0 0.084
## BIC =  -147.03
## Fit based upon off diagonal values = 0.97
## Measures of factor score adequacy             
##                                                    ML1  ML2
## Correlation of (regression) scores with factors   0.94 0.85
## Multiple R square of scores with factors          0.88 0.73
## Minimum correlation of possible factor scores     0.76 0.46

# we might see the structure clearer by suppressing the output of low loadings by modifying cutoff, default is .1
print(m.ml.r$loadings, cutoff=.2)

## 
## Loadings:
##     ML1    ML2   
## i1          0.578
## i2  -0.747       
## i3          0.635
## i4   0.757       
## i5          0.489
## i6   0.749       
## i7          0.273
## i8  -0.643       
## i9          0.582
## i10  0.750       
## i11         0.526
## i12  0.727       
## 
##                  ML1   ML2
## SS loadings    3.205 1.720
## Proportion Var 0.267 0.143
## Cumulative Var 0.267 0.410

# even better
print(m.ml.r, cut=0.2, sort=TRUE)

## Factor Analysis using method =  ml
## Call: fa(r = dd, nfactors = 2, n.obs = nrow(dd), rotate = "promax", 
##     SMC = TRUE, max.iter = 100, fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##     item   ML1   ML2    h2   u2 com
## i4     4  0.76       0.522 0.48 1.0
## i10   10  0.75       0.594 0.41 1.0
## i6     6  0.75       0.531 0.47 1.0
## i2     2 -0.75       0.587 0.41 1.0
## i12   12  0.73       0.497 0.50 1.0
## i8     8 -0.64       0.531 0.47 1.1
## i3     3        0.64 0.379 0.62 1.0
## i9     9        0.58 0.340 0.66 1.0
## i1     1        0.58 0.315 0.68 1.0
## i11   11        0.53 0.268 0.73 1.0
## i5     5        0.49 0.241 0.76 1.0
## i7     7        0.27 0.091 0.91 1.1
## 
##                        ML1  ML2
## SS loadings           3.19 1.71
## Proportion Var        0.27 0.14
## Cumulative Var        0.27 0.41
## Proportion Explained  0.65 0.35
## Cumulative Proportion 0.65 1.00
## 
##  With factor correlations of 
##       ML1   ML2
## ML1  1.00 -0.39
## ML2 -0.39  1.00
## 
## Mean item complexity =  1
## Test of the hypothesis that 2 factors are sufficient.
## 
## The degrees of freedom for the null model are  66  and the objective function was  3.98 with Chi Square of  374.64
## The degrees of freedom for the model are 43  and the objective function was  0.55 
## 
## The root mean square of the residuals (RMSR) is  0.05 
## The df corrected root mean square of the residuals is  0.06 
## 
## The harmonic number of observations is  100 with the empirical chi square  32.4  with prob <  0.88 
## The total number of observations was  100  with Likelihood Chi Square =  50.99  with prob <  0.19 
## 
## Tucker Lewis Index of factoring reliability =  0.96
## RMSEA index =  0.042  and the 90 % confidence intervals are  0 0.084
## BIC =  -147.03
## Fit based upon off diagonal values = 0.97
## Measures of factor score adequacy             
##                                                    ML1  ML2
## Correlation of (regression) scores with factors   0.94 0.85
## Multiple R square of scores with factors          0.88 0.73
## Minimum correlation of possible factor scores     0.76 0.46

# take a look at the factor scores of the first few observations
head(round(m.ml.r$scores, 2))

##        ML1   ML2
## [1,] -0.16  0.72
## [2,] -0.73  0.93
## [3,]  0.18  1.07
## [4,]  0.35 -1.38
## [5,] -0.33  1.34
## [6,] -0.11 -0.08

# we might want to keep working with the factor scores, so we give them descriptive names and add them to the data frame
colnames(m.ml.r$scores) <-  c('e.unclear.r', 'e.attention')
dd <- cbind(dd, m.ml.r$scores)

# descriptives of factor scores might be of interest
psych::describe(m.ml.r$scores)

##             vars   n mean   sd median trimmed  mad   min  max range  skew
## e.unclear.r    1 100    0 0.94  -0.28   -0.10 0.89 -1.15 2.38  3.54  0.81
## e.attention    2 100    0 0.85   0.06    0.03 0.95 -2.11 1.55  3.66 -0.25
##             kurtosis   se
## e.unclear.r    -0.25 0.09
## e.attention    -0.73 0.09

# we might be interested in the person with the lowest emotional clearness
dd[m.ml.r$scores[,1] == min(m.ml.r$scores[,1]),]

##    i1 i2 i3 i4 i5 i6 i7 i8 i9 i10 i11 i12 e.unclear.r e.attention
## 10  4  4  4  1  3  1  4  4  4   1   4   1   -1.153623    1.366079

# it's subject 10, who gave extreme answers to almost all questions

# finally we might want to visualize the model
# we can use the usual plot() method
plot(m.ml.r)

# or fa.diagram() from library(psych): cut hides loadings whose absolute value is below the threshold (as in the printed loadings above), simple=TRUE shows only the largest loading per item and thereby suppresses cross-loadings
fa.diagram(m.ml.r, simple=TRUE, cut=.2, digits=2)

All indices indicate that the data set is, in principle, suitable for factor analysis.

The Bartlett test of whether the correlation matrix differs from an identity matrix is significant.
The global KMO value for the suitability of the data for factor analysis (overall MSA) is .81 and thus 'in the good range'.
Each individual MSA value of the variables is above .5, and most of them are also 'in the good range'.
The determinant likewise gives no indication of singularity.
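
The suitability checks listed above can be sketched in base R. This is only a minimal illustration on simulated data (the two-factor structure, sample size, and all variable names are assumptions, not the EI items); in practice the psych functions `cortest.bartlett()` and `KMO()` would be applied to the real data.

```r
# Minimal base-R sketch of Bartlett's test and the KMO/MSA measure.
# The data are simulated (hypothetical), not the EI items of this document.
set.seed(1)
n <- 200
f <- matrix(rnorm(n * 2), n, 2)            # two hypothetical latent factors
lambda <- matrix(0, 6, 2)                  # 6 items, simple structure
lambda[1:3, 1] <- 0.7
lambda[4:6, 2] <- 0.7
x <- f %*% t(lambda) + matrix(rnorm(n * 6, sd = sqrt(1 - 0.49)), n, 6)

R <- cor(x)
p <- ncol(R)

# Bartlett's test of sphericity: H0 states that R is an identity matrix
chi2 <- -(n - 1 - (2 * p + 5) / 6) * log(det(R))
df <- p * (p - 1) / 2
p_value <- pchisq(chi2, df, lower.tail = FALSE)

# KMO / MSA: squared correlations relative to squared partial correlations
Q <- cov2cor(solve(R))   # off-diagonals equal the negative partial correlations
diag(Q) <- 0
R0 <- R
diag(R0) <- 0
msa_overall <- sum(R0^2) / (sum(R0^2) + sum(Q^2))
msa_items <- colSums(R0^2) / (colSums(R0^2) + colSums(Q^2))

# determinant close to 0 would hint at (near) singularity
det_R <- det(R)
```

A significant `p_value`, MSA values above .5, and a determinant clearly above zero correspond to the three criteria named in the list.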

The parallel analysis suggests a solution with 2 factors as optimal. This agrees with the conceptual considerations that went into the item formulations.
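
The idea behind a parallel analysis can be sketched as follows: eigenvalues of the observed correlation matrix are compared with the mean eigenvalues of random data of the same size. The sketch uses simulated data with an assumed two-factor structure and a simple component-based retention rule; it is not the implementation of `fa.parallel()` in psych, which offers more options.

```r
# Sketch of Horn's parallel analysis on simulated data
# (hypothetical two-factor structure, not the EI items).
set.seed(42)
n <- 300
p <- 12
f <- matrix(rnorm(n * 2), n, 2)
lambda <- matrix(0, p, 2)
lambda[seq(2, p, by = 2), 1] <- 0.7   # even-numbered items load on factor 1
lambda[seq(1, p, by = 2), 2] <- 0.7   # odd-numbered items load on factor 2
x <- f %*% t(lambda) + matrix(rnorm(n * p, sd = sqrt(0.51)), n, p)

obs_eigen <- eigen(cor(x), symmetric = TRUE, only.values = TRUE)$values

# reference: eigenvalues of correlation matrices from pure-noise data of equal size
n_sim <- 100
sim_eigen <- replicate(n_sim, {
  noise <- matrix(rnorm(n * p), n, p)
  eigen(cor(noise), symmetric = TRUE, only.values = TRUE)$values
})
ref_eigen <- rowMeans(sim_eigen)

# retain the leading components whose eigenvalue exceeds the random reference
exceeds <- obs_eigen > ref_eigen
n_retain <- if (all(exceeds)) p else which(!exceeds)[1] - 1
n_retain
```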

An exploratory maximum likelihood factor analysis with 2 factors explains 41% of the variance in total. The first factor accounts for 26%, the second for a further 15% of the original variance.
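
These proportions follow directly from the loadings: the sum of squared loadings per factor divided by the number of items. As a small worked example, the sketch below uses the varimax-rotated loadings as printed (rounded to two decimals) in the `fa()` output of the section on oblique rotation; the object names are chosen for illustration only.

```r
# Varimax-rotated loadings, copied (rounded to 2 decimals) from the printed fa() output
loadings_vmx <- matrix(
  c(-0.07,  0.56,
    -0.74,  0.20,
    -0.07,  0.61,
     0.72, -0.05,
    -0.10,  0.48,
     0.72, -0.10,
    -0.11,  0.28,
    -0.66,  0.30,
    -0.11,  0.57,
     0.74, -0.20,
    -0.08,  0.51,
     0.70, -0.08),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("i", 1:12), c("ML1", "ML2"))
)

ss_loadings <- colSums(loadings_vmx^2)            # "SS loadings" row of the output
prop_var <- ss_loadings / nrow(loadings_vmx)      # "Proportion Var" row
communality <- rowSums(loadings_vmx^2)            # h2, valid for orthogonal rotations
uniqueness <- 1 - communality                     # u2

round(prop_var, 2)        # ML1 0.26, ML2 0.15
round(sum(prop_var), 2)   # 0.41
```

Because the input loadings are already rounded, the communalities differ from the printed `h2` column in the third decimal.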

The unrotated EFA with 2 prespecified factors shows no clear assignment, above all for item i7. After rotation, all loading coefficients are higher than in the unrotated solution. However, the loading coefficient of item i7 remains comparatively low. Nevertheless, it tends to fit the intended content.

The rotated factor loading matrix thus shows a clear factor structure. The items alternate between the two factors: the even-numbered items load on one factor, the odd-numbered items on the other. In terms of content, the items can be clearly assigned to the two intended dimensions:

eSA (emotional Self Attention) and
eC (emotional Clearness, clarity about one's own feelings).

The signs also agree with the item wording.

On the com column in the pattern matrix output:

According to the package author, William Revelle: The 'com' column is factor complexity using the index developed by Hofmann (1978). It is a row wise measure of item complexity.

Or, from the help page:

complexity
Hofmann's index of complexity for each item. This is just \[ \frac{(\sum a_i^2)^2}{\sum a_i^4} \] where $a_i$ is the factor loading on the ith factor. From Hofmann (1978), MBR. See also Pettersson and Turkheimer (2010).

And from the Hofmann article:

The complexity index is a positive number indicating on the average how many factors are used to explain each variable in a factor solution.
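
The formula can be checked by hand. The sketch below defines a small helper (the name `hofmann_complexity` is hypothetical, not a psych function) and applies it to three rows of the varimax loadings as printed in the section on oblique rotation.

```r
# Hofmann's complexity index per item: (sum of a^2)^2 / sum of a^4, computed row-wise
hofmann_complexity <- function(loadings) {
  a2 <- loadings^2
  rowSums(a2)^2 / rowSums(a2^2)
}

# three rows copied (rounded to 2 decimals) from the printed varimax loadings
ex <- rbind(
  i4 = c(0.72, -0.05),
  i7 = c(-0.11, 0.28),
  i8 = c(-0.66, 0.30)
)
round(hofmann_complexity(ex), 1)   # i4 1.0, i7 1.3, i8 1.4

# boundary cases: a pure item gives 1, equal loadings on two factors give 2
hofmann_complexity(rbind(pure = c(0.7, 0), split = c(0.5, 0.5)))
```

The values agree with the `com` column of the varimax output: an item that loads on one factor only has complexity 1, and the index grows towards the number of factors as the loadings become more evenly spread.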

Oblique rotation

dd <- read.delim("https://md.psych.bio.uni-goettingen.de/mv/data/virt/v_ei.txt")
library(psych)

# fit orthogonal and oblique model
m.ml.orth <- fa(dd,
  nfactors = 2,
  n.obs = nrow(dd),
  SMC = TRUE,
  fm = "ml",
  rotate = "varimax",
  max.iter = 100
)
m.ml.obl <- fa(dd,
  nfactors = 2,
  n.obs = nrow(dd),
  SMC = TRUE,
  fm = "ml",
  rotate = "oblimin",
  max.iter = 100
)

# inspect loadings and fit of both models
print(m.ml.orth)

## Factor Analysis using method =  ml
## Call: fa(r = dd, nfactors = 2, n.obs = nrow(dd), rotate = "varimax", 
##     SMC = TRUE, max.iter = 100, fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##       ML1   ML2    h2   u2 com
## i1  -0.07  0.56 0.315 0.68 1.0
## i2  -0.74  0.20 0.587 0.41 1.1
## i3  -0.07  0.61 0.379 0.62 1.0
## i4   0.72 -0.05 0.522 0.48 1.0
## i5  -0.10  0.48 0.241 0.76 1.1
## i6   0.72 -0.10 0.531 0.47 1.0
## i7  -0.11  0.28 0.091 0.91 1.3
## i8  -0.66  0.30 0.531 0.47 1.4
## i9  -0.11  0.57 0.340 0.66 1.1
## i10  0.74 -0.20 0.594 0.41 1.1
## i11 -0.08  0.51 0.268 0.73 1.0
## i12  0.70 -0.08 0.497 0.50 1.0
## 
##                        ML1  ML2
## SS loadings           3.12 1.77
## Proportion Var        0.26 0.15
## Cumulative Var        0.26 0.41
## Proportion Explained  0.64 0.36
## Cumulative Proportion 0.64 1.00
## 
## Mean item complexity =  1.1
## Test of the hypothesis that 2 factors are sufficient.
## 
## The degrees of freedom for the null model are  66  and the objective function was  3.98 with Chi Square of  374.64
## The degrees of freedom for the model are 43  and the objective function was  0.55 
## 
## The root mean square of the residuals (RMSR) is  0.05 
## The df corrected root mean square of the residuals is  0.06 
## 
## The harmonic number of observations is  100 with the empirical chi square  32.4  with prob <  0.88 
## The total number of observations was  100  with Likelihood Chi Square =  50.99  with prob <  0.19 
## 
## Tucker Lewis Index of factoring reliability =  0.96
## RMSEA index =  0.042  and the 90 % confidence intervals are  0 0.084
## BIC =  -147.03
## Fit based upon off diagonal values = 0.97
## Measures of factor score adequacy             
##                                                    ML1  ML2
## Correlation of (regression) scores with factors   0.93 0.83
## Multiple R square of scores with factors          0.86 0.70
## Minimum correlation of possible factor scores     0.71 0.39

print(m.ml.obl)

## Factor Analysis using method =  ml
## Call: fa(r = dd, nfactors = 2, n.obs = nrow(dd), rotate = "oblimin", 
##     SMC = TRUE, max.iter = 100, fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##       ML1   ML2    h2   u2 com
## i1   0.01  0.57 0.315 0.68 1.0
## i2  -0.75  0.05 0.587 0.41 1.0
## i3   0.02  0.62 0.379 0.62 1.0
## i4   0.75  0.10 0.522 0.48 1.0
## i5  -0.03  0.48 0.241 0.76 1.0
## i6   0.74  0.05 0.531 0.47 1.0
## i7  -0.08  0.27 0.091 0.91 1.2
## i8  -0.65  0.18 0.531 0.47 1.1
## i9  -0.04  0.57 0.340 0.66 1.0
## i10  0.75 -0.05 0.594 0.41 1.0
## i11 -0.01  0.52 0.268 0.73 1.0
## i12  0.72  0.06 0.497 0.50 1.0
## 
##                        ML1  ML2
## SS loadings           3.22 1.68
## Proportion Var        0.27 0.14
## Cumulative Var        0.27 0.41
## Proportion Explained  0.66 0.34
## Cumulative Proportion 0.66 1.00
## 
##  With factor correlations of 
##       ML1   ML2
## ML1  1.00 -0.33
## ML2 -0.33  1.00
## 
## Mean item complexity =  1
## Test of the hypothesis that 2 factors are sufficient.
## 
## The degrees of freedom for the null model are  66  and the objective function was  3.98 with Chi Square of  374.64
## The degrees of freedom for the model are 43  and the objective function was  0.55 
## 
## The root mean square of the residuals (RMSR) is  0.05 
## The df corrected root mean square of the residuals is  0.06 
## 
## The harmonic number of observations is  100 with the empirical chi square  32.4  with prob <  0.88 
## The total number of observations was  100  with Likelihood Chi Square =  50.99  with prob <  0.19 
## 
## Tucker Lewis Index of factoring reliability =  0.96
## RMSEA index =  0.042  and the 90 % confidence intervals are  0 0.084
## BIC =  -147.03
## Fit based upon off diagonal values = 0.97
## Measures of factor score adequacy             
##                                                    ML1  ML2
## Correlation of (regression) scores with factors   0.94 0.85
## Multiple R square of scores with factors          0.88 0.72
## Minimum correlation of possible factor scores     0.76 0.44
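
Several of the fit statistics in the output above can be reproduced by hand, which helps when reading them. The following sketch uses only numbers printed in the output (12 items, 2 factors, 100 observations, likelihood chi square of 50.99); the object names are illustrative.

```r
# Values taken from the printed fa() output
p <- 12        # number of items
k <- 2         # number of factors
n <- 100       # number of observations
chi2 <- 50.99  # likelihood chi square of the 2-factor model

# degrees of freedom of the null model (no common factors) and of the factor model
df_null <- p * (p - 1) / 2
df_model <- ((p - k)^2 - (p + k)) / 2

# test of the hypothesis that 2 factors are sufficient
p_value <- pchisq(chi2, df_model, lower.tail = FALSE)

# BIC as chi square penalized by df * log(n)
bic <- chi2 - df_model * log(n)

c(df_null = df_null, df_model = df_model)
round(c(p_value = p_value, BIC = bic), 2)
```

A non-significant chi square means the hypothesis that 2 factors are sufficient is not rejected. Rotation does not change these values, which is why they are identical for the varimax and the oblimin solution.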

# we might see the structure clearer by suppressing the output of low loadings by modifying cutoff, default is .1
print(m.ml.orth$loadings, cutoff=.2)

## 
## Loadings:
##     ML1    ML2   
## i1          0.557
## i2  -0.740       
## i3          0.612
## i4   0.721       
## i5          0.481
## i6   0.722       
## i7          0.280
## i8  -0.663  0.302
## i9          0.572
## i10  0.744 -0.201
## i11         0.511
## i12  0.700       
## 
##                  ML1   ML2
## SS loadings    3.125 1.773
## Proportion Var 0.260 0.148
## Cumulative Var 0.260 0.408

print(m.ml.obl$loadings, cutoff=.2)

## 
## Loadings:
##     ML1    ML2   
## i1          0.566
## i2  -0.748       
## i3          0.622
## i4   0.749       
## i5          0.479
## i6   0.744       
## i7          0.268
## i8  -0.652       
## i9          0.570
## i10  0.751       
## i11         0.515
## i12  0.722       
## 
##                  ML1   ML2
## SS loadings    3.194 1.650
## Proportion Var 0.266 0.137
## Cumulative Var 0.266 0.404

The effects of orthogonal versus oblique rotation on the loading matrix, that is, the differences between the two solutions, are marginal.
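
One way to quantify this similarity is to compare the two loading matrices directly, for example via the largest absolute difference and Tucker's congruence coefficient per factor. The sketch below uses the loadings as printed above (rounded to two decimals); it is a rough check only, since pattern loadings of an oblique solution are regression weights rather than correlations. psych offers `factor.congruence()` for the fitted objects.

```r
items <- paste0("i", 1:12)

# loadings copied (rounded to 2 decimals) from the printed varimax and oblimin output
vmx <- cbind(
  ML1 = c(-0.07, -0.74, -0.07, 0.72, -0.10, 0.72, -0.11, -0.66, -0.11, 0.74, -0.08, 0.70),
  ML2 = c(0.56, 0.20, 0.61, -0.05, 0.48, -0.10, 0.28, 0.30, 0.57, -0.20, 0.51, -0.08)
)
obl <- cbind(
  ML1 = c(0.01, -0.75, 0.02, 0.75, -0.03, 0.74, -0.08, -0.65, -0.04, 0.75, -0.01, 0.72),
  ML2 = c(0.57, 0.05, 0.62, 0.10, 0.48, 0.05, 0.27, 0.18, 0.57, -0.05, 0.52, 0.06)
)
rownames(vmx) <- rownames(obl) <- items

# largest absolute difference between corresponding loadings
max_diff <- max(abs(vmx - obl))

# Tucker's congruence coefficient for each pair of corresponding factors
congruence <- colSums(vmx * obl) / sqrt(colSums(vmx^2) * colSums(obl^2))

max_diff
round(congruence, 2)
```

The largest differences occur in the small cross-loadings of the even-numbered items on ML2, while the primary loadings barely change.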

The treatment of the EI data as the basis of a PCA can be found in the examples.

The treatment of the EI data as the basis of a CFA can be found under CFA and also in the examples.

Further examples and exercises

See examples, Werner.

See examples, crime. A further comparison, this time focusing on the differences between the packages: maximum likelihood FA with the crime data from Everitt (2010).

See examples, neo_ct. NEO-FFI and the identifiability of the Big Five factors in a data set.

Work through the examples in the two recommended textbooks.

Work through the exercises with solutions in the two recommended textbooks.

Student STAI data

[https://md.psych.bio.uni-goettingen.de/data/div/stud_stai_items_utf8.txt]

Examine the concept of state and trait anxiety.

Links and references

lme4 documentation [http://cran.r-project.org/web/packages/lme4/lme4.pdf]

R-Bloggers tutorial by John Quick [http://www.r-bloggers.com/r-tutorial-series-exploratory-factor-analysis/]

Tutorial by Wollschläger: [http://www.uni-kiel.de/psychologie/rexrepos/posts/multFA.html]

Field (2012, p. 112), Chapter 3: R Environment

Version: 30 June 2022, 11:26

M.Psy.205

Peter Zezula (pzezula@uni-goettingen.de)
