In general, repeated-measures and mixed ANOVA designs are a subset of multi-level models (MLM). Because MLMs are considerably more flexible and powerful, we generally recommend using them.
A convenient route to "classical" repeated-measures and mixed ANOVA designs is offered by meta-packages such as library(ez), the pipe-friendly library(rstatix), or library(afex) with its lme4 integration and flexible effect sizes, etc.
Assumptions: In addition to homogeneity of variances across the factor levels, homogeneity of the covariances (correlations) is required.
Sphericity (equality of the variances of the differences between the factor levels) is tested with Mauchly's test (for homogeneity of variances it was Levene's test).
If this assumption is violated: correct the degrees of freedom, e.g. via the Greenhouse-Geisser or Huynh-Feldt correction.
The alternative, now preferred, approach is to fit a multi-level model instead.
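To illustrate what sphericity refers to, here is a minimal sketch with simulated (hypothetical) wide-format data, not the STAI data: under sphericity, the variances of the pairwise difference scores between the factor levels are roughly equal.

```r
# minimal sketch with simulated wide data (hypothetical values)
set.seed(1)
n <- 60
subj_eff <- rnorm(n, 0, 10)                       # stable individual subject levels
w <- data.frame(neutr = 40 + subj_eff + rnorm(n, 0, 2),
                exam1 = 46 + subj_eff + rnorm(n, 0, 2),
                exam2 = 46 + subj_eff + rnorm(n, 0, 2))
# sphericity: variances of the differences between factor levels are (about) equal
v <- c(var(w$neutr - w$exam1), var(w$neutr - w$exam2), var(w$exam1 - w$exam2))
round(v, 2)   # here all three hover around 2 * 2^2 = 8
```

With equal residual variance in every condition, as simulated here, all three difference variances estimate the same quantity; real data with, e.g., practice effects would violate this.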
Familiarity with long and wide as different formats for the same data is assumed.
The dataset used in the examples is described under STAI Exam.
ezANOVA() prints an ANOVA table; Mauchly's test for sphericity as well as the Greenhouse-Geisser and the Huynh-Feldt corrections are output along with it. For repeated-measures designs, ezANOVA needs the long format.
# get and prepare data
dd <- read.delim("http://md.psych.bio.uni-goettingen.de/mv/data/virt/v_stait_exam_wide.txt")
dd <- dd[dd$group == 'av',c('subj', 'neutr', 'exam1', 'exam2')]
# convert it to long format using library(reshape2)
# require(reshape2)
# dd.l <- melt(dd, id=c('subj'), variable.name="sit", value.name="anx", measured=c('neutr', 'exam1', 'exam2'))
# head(dd.l)
# rm(dd.l)
# convert it to long format using tidyr
require(tidyr)
## Loading required package: tidyr
#dd.l <- dd %>% tidyr::gather(key=sit, value=anx, neutr, exam1, exam2)
dd.l <- dd %>% tidyr::pivot_longer(names_to="sit", values_to="anx", cols=neutr:exam2)
dd.l$subj <- factor(dd.l$subj)
dd.l$sit <- factor(dd.l$sit)
head(dd.l)
## # A tibble: 6 × 3
## subj sit anx
## <fct> <fct> <int>
## 1 1 neutr 42
## 2 1 exam1 52
## 3 1 exam2 48
## 4 2 neutr 59
## 5 2 exam1 68
## 6 2 exam2 65
#require(Rcmdr)
require(ez)
## Loading required package: ez
# fit model using ezANOVA
m.f <- ezANOVA(data=dd.l, dv=.(anx), wid = .(subj), within = .(sit), detailed=TRUE, type=3)
# with a single variable we can simply use its (unquoted) name
m.f <- ezANOVA(data=dd.l, dv=anx, wid = subj, within = sit, detailed=TRUE, return_aov=TRUE)
# parameter
m.f
## $ANOVA
## Effect DFn DFd SSn SSd F p p<.05
## 1 (Intercept) 1 59 411366.806 27728.1944 875.3055 4.355834e-37 *
## 2 sit 2 118 1682.544 551.4556 180.0147 1.423028e-36 *
## ges
## 1 0.93567638
## 2 0.05615558
##
## $`Mauchly's Test for Sphericity`
## Effect W p p<.05
## 2 sit 0.07058729 4.102687e-34 *
##
## $`Sphericity Corrections`
## Effect GGe p[GG] p[GG]<.05 HFe p[HF] p[HF]<.05
## 2 sit 0.5182924 3.368231e-20 * 0.5192507 3.12436e-20 *
##
## $aov
##
## Call:
## aov(formula = formula(aov_formula), data = data)
##
## Grand Mean: 47.80556
##
## Stratum 1: subj
##
## Terms:
## Residuals
## Sum of Squares 27728.19
## Deg. of Freedom 59
##
## Residual standard error: 21.67878
##
## Stratum 2: subj:sit
##
## Terms:
## sit Residuals
## Sum of Squares 1682.5444 551.4556
## Deg. of Freedom 2 118
##
## Residual standard error: 2.161794
## Estimated effects may be unbalanced
summary(m.f$aov)
##
## Error: subj
## Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 59 27728 470
##
## Error: subj:sit
## Df Sum Sq Mean Sq F value Pr(>F)
## sit 2 1682.5 841.3 180 <2e-16 ***
## Residuals 118 551.5 4.7
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#ezANOVA(data=dd.l, dv=.(anx), wid = .(subj), within = .(sit), detailed=TRUE, type=3)
# ez offers ezPlot()
ezPlot(data=dd.l, dv=.(anx), wid = .(subj), within = .(sit), x=sit)
# we look at descriptives
require(Rmisc)
## Loading required package: Rmisc
## Loading required package: lattice
## Loading required package: plyr
summarySE(data=dd.l, "anx", c("sit"))
## sit N anx sd se ci
## 1 exam1 60 49.86667 13.32192 1.719852 3.441416
## 2 exam2 60 50.06667 12.22649 1.578433 3.158437
## 3 neutr 60 43.48333 12.34324 1.593506 3.188598
detach("package:Rmisc", unload=TRUE)
# create graphs
require(ggplot2)
## Loading required package: ggplot2
ggplot(dd.l, aes(x=sit, y=anx)) +
# define colors
scale_fill_manual(values=c("#FFFFFF", "#CCCCCC", '#AAAAAA')) +
# create white background etc
#theme_bw() +
geom_point() +
# add verbose axis labels
labs(x = "Anxiety-Group", y = "STAI-Trait")
ggplot(dd.l, aes(x=sit, y=anx, fill=sit)) +
# define colors
scale_fill_manual(values=c("#FFFFFF", "#CCCCCC", '#AAAAAA')) +
# create white background etc
#theme_bw() +
#stat_summary(fun=mean, geom="bar", position=position_dodge(width=0.7), colour='black') +
stat_summary(fun=mean, geom="bar", colour='black') +
# add error information
#stat_summary(fun.data = mean_cl_normal, geom = "errorbar", position=position_dodge(width=0.7), width=0.2) +
stat_summary(fun.data = mean_cl_normal, geom = "errorbar", width=0.2) +
# put big points on top of bars
#stat_summary(fun.data = mean_cl_normal, geom = "point", shape=21, size=5, position=position_dodge(width=0.7)) +
stat_summary(fun.data = mean_cl_normal, geom = "point", shape=21, size=5) +
# add verbose axis labels
labs(x = "Anxiety-Group", y = "STAI-Trait")
Via the parameter return_aov of ezANOVA()
we can store an aov object, which contains much more information than
the ANOVA table alone. For example, it holds all the sums of squares,
including the various residual SS.
With the command ezStats() we can request descriptive
statistics.
ezPlot() is a wrapper around
library(ggplot2) and is well suited for easily creating graphs
of ANOVA-type designs.
# get and prepare data
dd <- read.delim("http://md.psych.bio.uni-goettingen.de/mv/data/virt/v_stait_exam_wide.txt")
dd <- dd[dd$group == 'av',c('subj', 'neutr', 'exam1', 'exam2')]
# convert it to long format
#dd.l <- melt(dd, id=c('subj'), variable.name="sit", value.name="anx", measured=c('neutr', 'exam1', 'exam2'))
# convert it to long format using tidyr
require(tidyr)
# dd.l <- dd %>% tidyr::gather(key=sit, value=anx, neutr, exam1, exam2)
dd.l <- dd %>% tidyr::pivot_longer(names_to="sit", values_to="anx", cols=neutr:exam2)
dd.l$subj <- factor(dd.l$subj)
dd.l$sit <- factor(dd.l$sit)
head(dd.l)
## # A tibble: 6 × 3
## subj sit anx
## <fct> <fct> <int>
## 1 1 neutr 42
## 2 1 exam1 52
## 3 1 exam2 48
## 4 2 neutr 59
## 5 2 exam1 68
## 6 2 exam2 65
#require(Rcmdr)
require(ez)
# by default, ezANOVA outputs an ANOVA table
ezANOVA(data=dd.l, dv=.(anx), wid = .(subj), within = .(sit), detailed=TRUE, type=3)
## $ANOVA
## Effect DFn DFd SSn SSd F p p<.05
## 1 (Intercept) 1 59 411366.806 27728.1944 875.3055 4.355834e-37 *
## 2 sit 2 118 1682.544 551.4556 180.0147 1.423028e-36 *
## ges
## 1 0.93567638
## 2 0.05615558
##
## $`Mauchly's Test for Sphericity`
## Effect W p p<.05
## 2 sit 0.07058729 4.102687e-34 *
##
## $`Sphericity Corrections`
## Effect GGe p[GG] p[GG]<.05 HFe p[HF] p[HF]<.05
## 2 sit 0.5182924 3.368231e-20 * 0.5192507 3.12436e-20 *
# ezANOVA can return an aov() object with the flag return_aov = TRUE
m.ez <- ezANOVA(data=dd.l, dv=.(anx), wid = .(subj), within = .(sit), detailed=TRUE, type=3, return_aov = TRUE)
m.ez
## $ANOVA
## Effect DFn DFd SSn SSd F p p<.05
## 1 (Intercept) 1 59 411366.806 27728.1944 875.3055 4.355834e-37 *
## 2 sit 2 118 1682.544 551.4556 180.0147 1.423028e-36 *
## ges
## 1 0.93567638
## 2 0.05615558
##
## $`Mauchly's Test for Sphericity`
## Effect W p p<.05
## 2 sit 0.07058729 4.102687e-34 *
##
## $`Sphericity Corrections`
## Effect GGe p[GG] p[GG]<.05 HFe p[HF] p[HF]<.05
## 2 sit 0.5182924 3.368231e-20 * 0.5192507 3.12436e-20 *
##
## $aov
##
## Call:
## aov(formula = formula(aov_formula), data = data)
##
## Grand Mean: 47.80556
##
## Stratum 1: subj
##
## Terms:
## Residuals
## Sum of Squares 27728.19
## Deg. of Freedom 59
##
## Residual standard error: 21.67878
##
## Stratum 2: subj:sit
##
## Terms:
## sit Residuals
## Sum of Squares 1682.5444 551.4556
## Deg. of Freedom 2 118
##
## Residual standard error: 2.161794
## Estimated effects may be unbalanced
# summary() shows which parts are available
summary(m.ez)
## Length Class Mode
## ANOVA 9 data.frame list
## Mauchly's Test for Sphericity 4 data.frame list
## Sphericity Corrections 7 data.frame list
## aov 3 aovlist list
# we can get an aov summary
summary(m.ez$aov)
##
## Error: subj
## Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 59 27728 470
##
## Error: subj:sit
## Df Sum Sq Mean Sq F value Pr(>F)
## sit 2 1682.5 841.3 180 <2e-16 ***
## Residuals 118 551.5 4.7
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# we can access special parts by number or name
# get anova table
m.ez[1]
## $ANOVA
## Effect DFn DFd SSn SSd F p p<.05
## 1 (Intercept) 1 59 411366.806 27728.1944 875.3055 4.355834e-37 *
## 2 sit 2 118 1682.544 551.4556 180.0147 1.423028e-36 *
## ges
## 1 0.93567638
## 2 0.05615558
# get aov object
m.ez['aov']
## $aov
##
## Call:
## aov(formula = formula(aov_formula), data = data)
##
## Grand Mean: 47.80556
##
## Stratum 1: subj
##
## Terms:
## Residuals
## Sum of Squares 27728.19
## Deg. of Freedom 59
##
## Residual standard error: 21.67878
##
## Stratum 2: subj:sit
##
## Terms:
## sit Residuals
## Sum of Squares 1682.5444 551.4556
## Deg. of Freedom 2 118
##
## Residual standard error: 2.161794
## Estimated effects may be unbalanced
# ... or using m.ez$aov
# now we can take a look at the model matrix contrasts
head(unique(model.matrix(m.ez$aov)))
## (Intercept) sitexam2 sitneutr Error(subj/(sit))2 Error(subj/(sit))3
## 1 1 0 0 0 0
## 2 1 1 0 0 0
## 3 1 0 1 0 0
## 4 1 0 0 1 0
## 5 1 1 0 1 0
## 6 1 0 1 1 0
## Error(subj/(sit))4 Error(subj/(sit))5 Error(subj/(sit))6 Error(subj/(sit))7
## 1 0 0 0 0
## 2 0 0 0 0
## 3 0 0 0 0
## 4 0 0 0 0
## 5 0 0 0 0
## 6 0 0 0 0
## Error(subj/(sit))8 Error(subj/(sit))9 Error(subj/(sit))10 Error(subj/(sit))11
## 1 0 0 0 0
## 2 0 0 0 0
## 3 0 0 0 0
## 4 0 0 0 0
## 5 0 0 0 0
## 6 0 0 0 0
## Error(subj/(sit))12 Error(subj/(sit))13 Error(subj/(sit))14
## 1 0 0 0
## 2 0 0 0
## 3 0 0 0
## 4 0 0 0
## 5 0 0 0
## 6 0 0 0
## Error(subj/(sit))15 Error(subj/(sit))16 Error(subj/(sit))17
## 1 0 0 0
## 2 0 0 0
## 3 0 0 0
## 4 0 0 0
## 5 0 0 0
## 6 0 0 0
## Error(subj/(sit))18 Error(subj/(sit))19 Error(subj/(sit))20
## 1 0 0 0
## 2 0 0 0
## 3 0 0 0
## 4 0 0 0
## 5 0 0 0
## 6 0 0 0
## Error(subj/(sit))21 Error(subj/(sit))22 Error(subj/(sit))23
## 1 0 0 0
## 2 0 0 0
## 3 0 0 0
## 4 0 0 0
## 5 0 0 0
## 6 0 0 0
## Error(subj/(sit))24 Error(subj/(sit))25 Error(subj/(sit))26
## 1 0 0 0
## 2 0 0 0
## 3 0 0 0
## 4 0 0 0
## 5 0 0 0
## 6 0 0 0
## Error(subj/(sit))27 Error(subj/(sit))28 Error(subj/(sit))29
## 1 0 0 0
## 2 0 0 0
## 3 0 0 0
## 4 0 0 0
## 5 0 0 0
## 6 0 0 0
## Error(subj/(sit))30 Error(subj/(sit))31 Error(subj/(sit))32
## 1 0 0 0
## 2 0 0 0
## 3 0 0 0
## 4 0 0 0
## 5 0 0 0
## 6 0 0 0
## Error(subj/(sit))33 Error(subj/(sit))34 Error(subj/(sit))35
## 1 0 0 0
## 2 0 0 0
## 3 0 0 0
## 4 0 0 0
## 5 0 0 0
## 6 0 0 0
## Error(subj/(sit))36 Error(subj/(sit))37 Error(subj/(sit))38
## 1 0 0 0
## 2 0 0 0
## 3 0 0 0
## 4 0 0 0
## 5 0 0 0
## 6 0 0 0
## Error(subj/(sit))39 Error(subj/(sit))40 Error(subj/(sit))41
## 1 0 0 0
## 2 0 0 0
## 3 0 0 0
## 4 0 0 0
## 5 0 0 0
## 6 0 0 0
## Error(subj/(sit))42 Error(subj/(sit))43 Error(subj/(sit))44
## 1 0 0 0
## 2 0 0 0
## 3 0 0 0
## 4 0 0 0
## 5 0 0 0
## 6 0 0 0
## Error(subj/(sit))45 Error(subj/(sit))46 Error(subj/(sit))47
## 1 0 0 0
## 2 0 0 0
## 3 0 0 0
## 4 0 0 0
## 5 0 0 0
## 6 0 0 0
## Error(subj/(sit))48 Error(subj/(sit))49 Error(subj/(sit))50
## 1 0 0 0
## 2 0 0 0
## 3 0 0 0
## 4 0 0 0
## 5 0 0 0
## 6 0 0 0
## Error(subj/(sit))51 Error(subj/(sit))52 Error(subj/(sit))53
## 1 0 0 0
## 2 0 0 0
## 3 0 0 0
## 4 0 0 0
## 5 0 0 0
## 6 0 0 0
## Error(subj/(sit))54 Error(subj/(sit))55 Error(subj/(sit))56
## 1 0 0 0
## 2 0 0 0
## 3 0 0 0
## 4 0 0 0
## 5 0 0 0
## 6 0 0 0
## Error(subj/(sit))57 Error(subj/(sit))58 Error(subj/(sit))59
## 1 0 0 0
## 2 0 0 0
## 3 0 0 0
## 4 0 0 0
## 5 0 0 0
## 6 0 0 0
## Error(subj/(sit))60
## 1 0
## 2 0
## 3 0
## 4 0
## 5 0
## 6 0
# ez also offers ezStats(). We can get the means and descriptives
ezStats(data=dd.l, dv=.(anx), wid = .(subj), within = .(sit), type=3)
## sit N Mean SD FLSD
## 1 exam1 60 49.86667 13.32192 0.7815892
## 2 exam2 60 50.06667 12.22649 0.7815892
## 3 neutr 60 43.48333 12.34324 0.7815892
# ez offers ezPlot() to visualize effects
ezPlot(data=dd.l, dv=.(anx), wid = .(subj), within = .(sit), x=.(sit))
lme4::lmer()
With lmer() from library(lme4) we fit a
multi-level model. package(lmerTest) builds on
package(lme4) but adds significance tests.
More on multi-level models can be found in unit ml.
Alternatives are aov() and car::Anova(), using
special syntax to specify the corresponding error terms (for the
so-called subject factor).
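A minimal sketch of the aov() error-term syntax, using small simulated (hypothetical) data rather than the STAI dataset:

```r
# hypothetical long-format data: 6 subjects, 3 situations
d <- expand.grid(subj = factor(1:6), sit = factor(c("neutr", "exam1", "exam2")))
set.seed(2)
d$anx <- 45 + rnorm(6, 0, 8)[d$subj] +                       # subject levels
  c(neutr = -3, exam1 = 3, exam2 = 3)[as.character(d$sit)] + # situation effect
  rnorm(nrow(d), 0, 2)                                       # noise
# the Error() term declares subj as the subject factor, with sit varying within subj
m <- aov(anx ~ sit + Error(subj/sit), data = d)
class(m)      # an "aovlist": one aov stratum per error layer
summary(m)
```

Note the as.character() indexing of the named effect vector: indexing by the factor itself would use its (alphabetically ordered) integer codes and scramble the mapping.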
# get data
#require(Rcmdr)
require(ggplot2)
require(reshape2)
## Loading required package: reshape2
##
## Attaching package: 'reshape2'
## The following object is masked from 'package:tidyr':
##
## smiths
# require(lme4) # is loaded and used by lmerTest
require(lmerTest)
## Loading required package: lmerTest
## Loading required package: lme4
## Loading required package: Matrix
##
## Attaching package: 'Matrix'
## The following objects are masked from 'package:tidyr':
##
## expand, pack, unpack
##
## Attaching package: 'lmerTest'
## The following object is masked from 'package:lme4':
##
## lmer
## The following object is masked from 'package:stats':
##
## step
dd <- read.delim("http://md.psych.bio.uni-goettingen.de/mv/data/virt/v_stait_exam_wide.txt")
dd <- dd[dd$group == 'av',c('subj', 'neutr', 'exam1', 'exam2')]
head(dd)
## subj neutr exam1 exam2
## 1 1 42 52 48
## 2 2 59 68 65
## 3 3 49 59 55
## 4 4 20 31 27
## 5 5 45 55 51
## 6 6 39 49 45
# convert it to long format using tidyr
require(tidyr)
#dd.r <- dd %>% tidyr::gather(key=sit, value=anx, neutr, exam1, exam2)
dd.r <- dd %>% tidyr::pivot_longer(names_to="sit", values_to="anx", cols=neutr:exam2)
dd.r$subj <- factor(dd.r$subj)
dd.r$sit <- factor(dd.r$sit)
head(dd.r)
## # A tibble: 6 × 3
## subj sit anx
## <fct> <fct> <int>
## 1 1 neutr 42
## 2 1 exam1 52
## 3 1 exam2 48
## 4 2 neutr 59
## 5 2 exam1 68
## 6 2 exam2 65
#psych::describe(dd)
require(Rmisc)
## Loading required package: Rmisc
Rmisc::summarySE(data=dd.r, "anx", groupvars="sit")
## sit N anx sd se ci
## 1 exam1 60 49.86667 13.32192 1.719852 3.441416
## 2 exam2 60 50.06667 12.22649 1.578433 3.158437
## 3 neutr 60 43.48333 12.34324 1.593506 3.188598
# fit model
# random intercept model, no interaction
m.f <- lmerTest::lmer(anx ~ sit + (1|subj), data=dd.r)
# parameter
m.f
## Linear mixed model fit by REML ['lmerModLmerTest']
## Formula: anx ~ sit + (1 | subj)
## Data: dd.r
## REML criterion at convergence: 1059.536
## Random effects:
## Groups Name Std.Dev.
## subj (Intercept) 12.454
## Residual 2.162
## Number of obs: 180, groups: subj, 60
## Fixed Effects:
## (Intercept) sitexam2 sitneutr
## 49.867 0.200 -6.383
summary(m.f)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: anx ~ sit + (1 | subj)
## Data: dd.r
##
## REML criterion at convergence: 1059.5
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.82249 -0.66888 -0.06843 0.63421 1.35064
##
## Random effects:
## Groups Name Variance Std.Dev.
## subj (Intercept) 155.099 12.454
## Residual 4.673 2.162
## Number of obs: 180, groups: subj, 60
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 49.8667 1.6318 61.3580 30.559 <2e-16 ***
## sitexam2 0.2000 0.3947 118.0000 0.507 0.613
## sitneutr -6.3833 0.3947 118.0000 -16.173 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) sitxm2
## sitexam2 -0.121
## sitneutr -0.121 0.500
# test the model
anova(m.f)
## Type III Analysis of Variance Table with Satterthwaite's method
## Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
## sit 1682.5 841.27 2 118 180.01 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
unique(model.matrix(m.f))
## (Intercept) sitexam2 sitneutr
## 1 1 0 1
## 2 1 0 0
## 3 1 1 0
# # descriptives - the Rcmdr way
# tapply(dd.r$anx, list(situation=dd.r$sit), mean,
# na.rm=TRUE) # means
# #tapply(dd.r$anx, list(situation=dd.r$sit), sd, na.rm=TRUE)
# # std. deviations
# tapply(dd.r$anx, list(situation=dd.r$sit), function(x)
# sum(!is.na(x))) # counts
# # the psych package way
# by(dd.r$anx, list(dd.r$sit), psych::describe)
plot(m.f)
ggplot(dd.r, aes(x=sit, y=anx)) +
# define colors
scale_fill_manual(values=c("#FFFFFF", "#CCCCCC", '#AAAAAA')) +
# create white background etc
#theme_bw() +
geom_point() +
# add verbose axis labels
labs(x = "Anxiety-Group", y = "STAI-Trait")
ggplot(dd.r, aes(x=sit, y=anx, fill=sit)) +
# define colors
scale_fill_manual(values=c("#FFFFFF", "#CCCCCC", '#AAAAAA')) +
# create white background etc
#theme_bw() +
stat_summary(fun=mean, geom="bar", position=position_dodge(width=0.7), colour='black') +
# add error information
stat_summary(fun.data = mean_cl_normal, geom = "errorbar", position=position_dodge(width=0.7), width=0.2) +
# put big points on top of bars
#stat_summary(fun.data = mean_cl_normal, geom = "point", shape=21, size=5, position=position_dodge(width=0.7), width=0.2) +
stat_summary(fun.data = mean_cl_normal, geom = "point", shape=21, size=5, position=position_dodge(width=0.7)) +
# add verbose axis labels
labs(x = "Anxiety-Group", y = "STAI-Trait")
ggplot(dd.r, aes(x=sit, y=anx, fill=sit)) +
# define colors
scale_fill_manual(values=c("#FFFFFF", "#CCCCCC", '#AAAAAA')) +
stat_summary(fun=mean, geom="bar", colour='black') +
# add error information
stat_summary(fun.data = mean_cl_normal, geom = "errorbar", width=0.2) +
# put big points on top of bars
stat_summary(fun.data = mean_cl_normal, geom = "point", shape=21, size=5) +
# add verbose axis labels
labs(x = "Anxiety-Group", y = "STAI-Trait")
nlme::lme()
As an alternative to package(lme4), we can also fit a
multi-level model with package(nlme), using
lme() from library(nlme).
More on multi-level models can be found in unit ml.
# get data
dd <- read.delim("http://md.psych.bio.uni-goettingen.de/mv/data/virt/v_stait_exam_wide.txt")
dd <- dd[dd$group == 'av',c('subj', 'neutr', 'exam1', 'exam2')]
head(dd)
## subj neutr exam1 exam2
## 1 1 42 52 48
## 2 2 59 68 65
## 3 3 49 59 55
## 4 4 20 31 27
## 5 5 45 55 51
## 6 6 39 49 45
# convert it to long format using tidyr
require(tidyr)
#dd.r <- dd %>% tidyr::gather(key=sit, value=anx, neutr, exam1, exam2)
dd.r <- dd %>% tidyr::pivot_longer(names_to="sit", values_to="anx", cols=neutr:exam2)
dd.r$subj <- factor(dd.r$subj)
dd.r$sit <- factor(dd.r$sit)
head(dd.r)
## # A tibble: 6 × 3
## subj sit anx
## <fct> <fct> <int>
## 1 1 neutr 42
## 2 1 exam1 52
## 3 1 exam2 48
## 4 2 neutr 59
## 5 2 exam1 68
## 6 2 exam2 65
#psych::describe(dd)
require(Rmisc)
Rmisc::summarySE(data=dd.r, "anx", groupvars="sit")
## sit N anx sd se ci
## 1 exam1 60 49.86667 13.32192 1.719852 3.441416
## 2 exam2 60 50.06667 12.22649 1.578433 3.158437
## 3 neutr 60 43.48333 12.34324 1.593506 3.188598
# fit model
require(nlme)
## Loading required package: nlme
##
## Attaching package: 'nlme'
## The following object is masked from 'package:lme4':
##
## lmList
# random intercept model, no interaction
m.f <- nlme::lme(fixed=anx ~ sit, random = ~1 |subj, data=dd.r)
# parameter
m.f
## Linear mixed-effects model fit by REML
## Data: dd.r
## Log-restricted-likelihood: -529.7681
## Fixed: anx ~ sit
## (Intercept) sitexam2 sitneutr
## 49.866667 0.200000 -6.383333
##
## Random effects:
## Formula: ~1 | subj
## (Intercept) Residual
## StdDev: 12.45386 2.161794
##
## Number of Observations: 180
## Number of Groups: 60
summary(m.f)
## Linear mixed-effects model fit by REML
## Data: dd.r
## AIC BIC logLik
## 1069.536 1085.417 -529.7681
##
## Random effects:
## Formula: ~1 | subj
## (Intercept) Residual
## StdDev: 12.45386 2.161794
##
## Fixed effects: anx ~ sit
## Value Std.Error DF t-value p-value
## (Intercept) 49.86667 1.6318294 118 30.55875 0.0000
## sitexam2 0.20000 0.3946877 118 0.50673 0.6133
## sitneutr -6.38333 0.3946877 118 -16.17312 0.0000
## Correlation:
## (Intr) sitxm2
## sitexam2 -0.121
## sitneutr -0.121 0.500
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -1.82249125 -0.66887831 -0.06842733 0.63420849 1.35064390
##
## Number of Observations: 180
## Number of Groups: 60
# test the model
anova(m.f)
## numDF denDF F-value p-value
## (Intercept) 1 118 875.3055 <.0001
## sit 2 118 180.0147 <.0001
#unique(model.matrix(m.f))
# # descriptives - the Rcmdr way
# tapply(dd.r$anx, list(situation=dd.r$sit), mean,
# na.rm=TRUE) # means
# #tapply(dd.r$anx, list(situation=dd.r$sit), sd, na.rm=TRUE)
# # std. deviations
# tapply(dd.r$anx, list(situation=dd.r$sit), function(x)
# sum(!is.na(x))) # counts
# # the psych package way
# by(dd.r$anx, list(dd.r$sit), psych::describe)
plot(m.f)
ggplot(dd.r, aes(x=sit, y=anx)) +
# define colors
scale_fill_manual(values=c("#FFFFFF", "#CCCCCC", '#AAAAAA')) +
# create white background etc
#theme_bw() +
geom_point() +
# add verbose axis labels
labs(x = "Anxiety-Group", y = "STAI-Trait")
ggplot(dd.r, aes(x=sit, y=anx, fill=sit)) +
# define colors
scale_fill_manual(values=c("#FFFFFF", "#CCCCCC", '#AAAAAA')) +
# create white background etc
#theme_bw() +
stat_summary(fun=mean, geom="bar", position=position_dodge(width=0.7), colour='black') +
# add error information
stat_summary(fun.data = mean_cl_normal, geom = "errorbar", position=position_dodge(width=0.7), width=0.2) +
# put big points on top of bars
#stat_summary(fun.data = mean_cl_normal, geom = "point", shape=21, size=5, position=position_dodge(width=0.7), width=0.2) +
stat_summary(fun.data = mean_cl_normal, geom = "point", shape=21, size=5, position=position_dodge(width=0.7)) +
# add verbose axis labels
labs(x = "Anxiety-Group", y = "STAI-Trait")
ggplot(dd.r, aes(x=sit, y=anx, fill=sit)) +
# define colors
scale_fill_manual(values=c("#FFFFFF", "#CCCCCC", '#AAAAAA')) +
stat_summary(fun=mean, geom="bar", colour='black') +
# add error information
stat_summary(fun.data = mean_cl_normal, geom = "errorbar", width=0.2) +
# put big points on top of bars
stat_summary(fun.data = mean_cl_normal, geom = "point", shape=21, size=5) +
# add verbose axis labels
labs(x = "Anxiety-Group", y = "STAI-Trait")
Here we want to demonstrate that effects show up earlier when a within design is possible. The examples use the same data, but in one case we pretend the persons were measured repeatedly under different conditions (within); in the other case we pretend different persons were measured in the different conditions (between).
Comparison of the same data, treated once as a between-subjects design and once as a within-subjects design.
Each observation gets its own intercept (subject effect). The residual variance therefore decreases, because the individual intercepts bind variance that the group differences do not bind. The ratio of the group sum of squares (variance) to the residual sum of squares changes. Because of the reduced residual variance, the F value is larger in the within design, which yields higher power.
The partition of the residual variance can be seen in the model summary under 'Random effects:'. These components sum to the residual variance in the summary of the between-subjects model. The different resulting F values with their associated p values can be found there as well.
The interpretation of the coefficients (weights) and their size is the same in both models.
Note: here SS Type II is used, not SS Type I as below
(comparison via aov()). For balanced designs the
results are identical to SS Type I.
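The mechanism can also be demonstrated in a few lines with simulated (hypothetical) data: the very same scores yield a much larger F once the subject structure is modelled, because the subject intercepts soak up residual variance.

```r
# simulate 30 subjects, each measured under two conditions
set.seed(3)
n <- 30
subj <- factor(rep(1:n, each = 2))
cond <- factor(rep(c("a", "b"), n))
anx  <- 40 + rnorm(n, 0, 10)[rep(1:n, each = 2)] +  # stable subject levels
        ifelse(cond == "b", 3, 0) +                 # small condition effect
        rnorm(2 * n, 0, 2)                          # measurement noise
# same data: once ignoring, once using the subject factor
f_between <- summary(aov(anx ~ cond))[[1]][1, "F value"]
f_within  <- summary(aov(anx ~ cond + Error(subj)))[["Error: Within"]][[1]][1, "F value"]
c(between = f_between, within = f_within)   # the within F is far larger
```

The numerator (condition) SS is identical in both analyses; only the error term against which it is tested shrinks.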
# get and prepare data
dd <- read.delim("http://md.psych.bio.uni-goettingen.de/mv/data/virt/v_stait_exam_wide.txt")
dd <- dd[dd$group == 'av',c('subj', 'neutr', 'exam1', 'exam2')]
# convert it to long format
#dd.l <- melt(dd, id=c('subj'), variable.name="sit", value.name="anx", measured=c('neutr', 'exam1', 'exam2'))
require(tidyr)
# dd.l <- dd %>% tidyr::gather(key=sit, value=anx, neutr, exam1, exam2)
dd.l <- dd %>% tidyr::pivot_longer(names_to="sit", values_to="anx", cols=exam1:exam2)
dd.l$subj <- factor(dd.l$subj)
dd.l$sit <- factor(dd.l$sit)
dd.l$obs <- factor(1:length(dd.l$subj))
head(dd.l)
## # A tibble: 6 × 5
## subj neutr sit anx obs
## <fct> <int> <fct> <int> <fct>
## 1 1 42 exam1 52 1
## 2 1 42 exam2 48 2
## 3 2 59 exam1 68 3
## 4 2 59 exam2 65 4
## 5 3 49 exam1 59 5
## 6 3 49 exam2 55 6
# treat it as a between-subjects model; subjects are 'multiplied'
ezANOVA(data=dd.l, dv=.(anx), wid = .(obs), between = .(sit), detailed=TRUE, type=2)
## Coefficient covariances computed by hccm()
## $ANOVA
## Effect DFn DFd SSn SSd F p p<.05 ges
## 1 sit 1 118 1.2 19290.67 0.007340337 0.9318694 6.220238e-05
##
## $`Levene's Test for Homogeneity of Variance`
## DFn DFd SSn SSd F p p<.05
## 1 1 118 48.13333 6621.333 0.857793 0.3562474
# repeated measures design
ezANOVA(data=dd.l, dv=.(anx), wid = .(subj), within = .(sit), detailed=TRUE, type=2)
## $ANOVA
## Effect DFn DFd SSn SSd F p p<.05
## 1 (Intercept) 1 59 299600.1 18828.87 938.7929810 6.250347e-38 *
## 2 sit 1 59 1.2 461.80 0.1533131 6.967997e-01
## ges
## 1 9.395070e-01
## 2 6.220238e-05
Comparing the two summaries, we clearly see the change in the sums of squares (SSn, SSd) (numerator and denominator sums of squares). SSn is the same in both models (1682.54). In the between-subjects model the SSd, which also contains the differences between subjects, is much larger (28279.65) than in the within-subjects model (551.45). Consequently the F value becomes much larger and the p value much smaller.
# get and prepare data
dd <- read.delim("http://md.psych.bio.uni-goettingen.de/mv/data/virt/v_stait_exam_wide.txt")
dd <- dd[dd$group == 'av',c('subj', 'neutr', 'exam1', 'exam2')]
# convert it to long format
dd.l <- melt(dd, id=c('subj'), variable.name="sit", value.name="anx", measured=c('neutr', 'exam1', 'exam2'))
dd.l$obs <- 1:length(dd.l$subj)
# between subjects model
m.b <- ezANOVA(data=dd.l, dv=.(anx), wid = .(obs), between = .(sit), detailed=TRUE, type=2, return_aov = TRUE)
## Warning: Converting "obs" to factor for ANOVA.
## Coefficient covariances computed by hccm()
# within subjects model
m.r <- ezANOVA(data=dd.l, dv=.(anx), wid = .(subj), within = .(sit), detailed=TRUE, type=2, return_aov = TRUE)
## Warning: Converting "subj" to factor for ANOVA.
# we compare the effects
m.b$ANOVA
## Effect DFn DFd SSn SSd F p p<.05 ges
## 1 sit 2 177 1682.544 28279.65 5.265454 0.006007399 * 0.05615558
m.r$ANOVA
## Effect DFn DFd SSn SSd F p p<.05
## 1 (Intercept) 1 59 411366.806 27728.1944 875.3055 4.355834e-37 *
## 2 sit 2 118 1682.544 551.4556 180.0147 1.423028e-36 *
## ges
## 1 0.93567638
## 2 0.05615558
# ... denominator SS is way smaller in the within design
# ... p value is way smaller
# ... numerator SS is equal
# we can get the model.matrix to check dummy contrasts
unique(model.matrix(m.b['aov']$aov))
## (Intercept) sitexam1 sitexam2
## 1 1 0 0
## 61 1 1 0
## 121 1 0 1
# we can also get the model.matrix for the repeated-measures design.
# As it has n-1 dummy columns to code the subject effect, it is big
# its columns are:
model.matrix(m.r['aov']$aov)[1,]
## (Intercept) sitexam1 sitexam2 Error(subj/(sit))2
## 1 0 0 0
## Error(subj/(sit))3 Error(subj/(sit))4 Error(subj/(sit))5 Error(subj/(sit))6
## 0 0 0 0
## Error(subj/(sit))7 Error(subj/(sit))8 Error(subj/(sit))9 Error(subj/(sit))10
## 0 0 0 0
## Error(subj/(sit))11 Error(subj/(sit))12 Error(subj/(sit))13 Error(subj/(sit))14
## 0 0 0 0
## Error(subj/(sit))15 Error(subj/(sit))16 Error(subj/(sit))17 Error(subj/(sit))18
## 0 0 0 0
## Error(subj/(sit))19 Error(subj/(sit))20 Error(subj/(sit))21 Error(subj/(sit))22
## 0 0 0 0
## Error(subj/(sit))23 Error(subj/(sit))24 Error(subj/(sit))25 Error(subj/(sit))26
## 0 0 0 0
## Error(subj/(sit))27 Error(subj/(sit))28 Error(subj/(sit))29 Error(subj/(sit))30
## 0 0 0 0
## Error(subj/(sit))31 Error(subj/(sit))32 Error(subj/(sit))33 Error(subj/(sit))34
## 0 0 0 0
## Error(subj/(sit))35 Error(subj/(sit))36 Error(subj/(sit))37 Error(subj/(sit))38
## 0 0 0 0
## Error(subj/(sit))39 Error(subj/(sit))40 Error(subj/(sit))41 Error(subj/(sit))42
## 0 0 0 0
## Error(subj/(sit))43 Error(subj/(sit))44 Error(subj/(sit))45 Error(subj/(sit))46
## 0 0 0 0
## Error(subj/(sit))47 Error(subj/(sit))48 Error(subj/(sit))49 Error(subj/(sit))50
## 0 0 0 0
## Error(subj/(sit))51 Error(subj/(sit))52 Error(subj/(sit))53 Error(subj/(sit))54
## 0 0 0 0
## Error(subj/(sit))55 Error(subj/(sit))56 Error(subj/(sit))57 Error(subj/(sit))58
## 0 0 0 0
## Error(subj/(sit))59 Error(subj/(sit))60
## 0 0
# we look at the relevant parts of the model.matrix to get an impression of the subject dummy coding
model.matrix(m.r['aov']$aov)[c(1:10, 175:180),c(1:5, 61:62)]
## (Intercept) sitexam1 sitexam2 Error(subj/(sit))2 Error(subj/(sit))3
## 1 1 0 0 0 0
## 2 1 1 0 0 0
## 3 1 0 1 0 0
## 4 1 0 0 1 0
## 5 1 1 0 1 0
## 6 1 0 1 1 0
## 7 1 0 0 0 1
## 8 1 1 0 0 1
## 9 1 0 1 0 1
## 10 1 0 0 0 0
## 175 1 0 0 0 0
## 176 1 1 0 0 0
## 177 1 0 1 0 0
## 178 1 0 0 0 0
## 179 1 1 0 0 0
## 180 1 0 1 0 0
## Error(subj/(sit))59 Error(subj/(sit))60
## 1 0 0
## 2 0 0
## 3 0 0
## 4 0 0
## 5 0 0
## 6 0 0
## 7 0 0
## 8 0 0
## 9 0 0
## 10 0 0
## 175 1 0
## 176 1 0
## 177 1 0
## 178 0 1
## 179 0 1
## 180 0 1
# the full model.matrix would be (not printed, because it is huge)
# unique(model.matrix(m.r['aov']$aov))
We can now also see that the residual variance drops sharply, because the 'subject factor' binds a lot of variance.
In addition, a look at model.matrix() helps us
understand that the subject factor is indeed decomposed into n-1
dummy variables. Each observation/person/subj thus gets its own
dummy variable as a factor level. The columns run from
'Error(subj/(sit))2' … 'Error(subj/(sit))60'.
The model.matrix() also shows nicely that, for the
3-level factor 'sit', each subject has a 1 in its column three
times.
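The same decomposition can be reproduced on a toy scale: a factor with n levels enters the model matrix as n-1 treatment-coded dummy columns plus the intercept (hypothetical 4-subject example).

```r
# hypothetical subject factor with 4 levels
subj <- factor(1:4)
mm <- model.matrix(~ subj)
colnames(mm)    # "(Intercept)" "subj2" "subj3" "subj4": n-1 = 3 dummies
unique(mm)      # one row pattern per subject; level 1 is absorbed by the intercept
```

The first level has no column of its own: it is the reference category, represented by the intercept.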
aov()
# get data
dd <- read.delim("http://md.psych.bio.uni-goettingen.de/mv/data/virt/v_stait_exam_wide.txt")
dd <- dd[dd$group == 'av',c('subj', 'neutr', 'exam1', 'exam2')]
#require(Rcmdr)
require(reshape2)
require(psych)
## Loading required package: psych
##
## Attaching package: 'psych'
## The following objects are masked from 'package:ggplot2':
##
## %+%, alpha
# require(lme4) # is loaded and used by lmerTest
#require(lmerTest)
#head(dd)
#psych::describe(dd)
# convert it to long format
dd.l <- melt(dd, id=c('subj'), variable.name="sit", value.name="anx", measured=c('neutr', 'exam1', 'exam2'))
#head(dd.r)
# fit between subject model
m.b <- aov(anx ~ sit, data=dd.l)
# random intercept model, no interaction
#m.f <- lmer(anx ~ sit + (subj/sit), data=dd.r)
# random intercept
m.w <- aov(anx ~ sit + Error(subj), data=dd.l)
# random intercept, random slope
m.f <- aov(anx ~ sit + Error(subj/sit), data=dd.l)
# constant in between subjects design is the same for all observations
coefficients(m.b)
## (Intercept) sitexam1 sitexam2
## 43.483333 6.383333 6.583333
coefficients(m.w)
## (Intercept) :
## (Intercept)
## 47.80556
##
## subj :
## numeric(0)
##
## Within :
## sitexam1 sitexam2
## 6.383333 6.583333
coefficients(m.f)
## (Intercept) :
## (Intercept)
## 47.80556
##
## subj :
## numeric(0)
##
## subj:sit :
## sitexam1 sitexam2
## 4.732787 6.812568
##
## Within :
## sitexam1 sitexam2
## 11.502825 5.872316
# constant in within subjects design varies and depends on subject
head(coefficients(m.f)$subj)
## numeric(0)
summary(m.b)
## Df Sum Sq Mean Sq F value Pr(>F)
## sit 2 1683 841.3 5.265 0.00601 **
## Residuals 177 28280 159.8
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(m.w)
##
## Error: subj
## Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 1 3105 3105
##
## Error: Within
## Df Sum Sq Mean Sq F value Pr(>F)
## sit 2 1683 841.3 5.881 0.00337 **
## Residuals 176 25175 143.0
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(m.f)
##
## Error: subj
## Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 1 3105 3105
##
## Error: subj:sit
## Df Sum Sq Mean Sq
## sit 2 1106 553.1
##
## Error: Within
## Df Sum Sq Mean Sq F value Pr(>F)
## sit 2 968 484.0 3.398 0.0357 *
## Residuals 174 24783 142.4
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# we look at descriptives
require(Rmisc)
summarySE(data=dd.l, "anx", c("sit"))
## sit N anx sd se ci
## 1 neutr 60 43.48333 12.34324 1.593506 3.188598
## 2 exam1 60 49.86667 13.32192 1.719852 3.441416
## 3 exam2 60 50.06667 12.22649 1.578433 3.158437
detach("package:Rmisc", unload=TRUE)
In the between-subjects model m.b there are only the SS of the
effect 'sit' (1683) and the overall residuals (28280), in total 29963.
In the within-subjects model m.f these SS split up into
the between-subjects part (subj: 3105), the subj:sit stratum (1106),
the 'sit' effect (968) and the remaining within residuals (24783), in total
3105 + 1106 + 968 + 24783 = 29962 (off by 1 due to rounding).
The residual term against which the effects are tested thus shrinks
from 28280 (between-subjects model) to 25175 (m.w) and 24783 (m.f), because the subject strata bind part of the error variance.
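That the subject factor 'binds' a fixed share of the error SS can be verified with plain lm() on a small toy data set (hypothetical values, balanced design): in a balanced crossed design, the SS removed from the residuals by the subject factor equals k times the squared deviations of the subject means from the grand mean.

```r
# toy balanced within-subjects layout: 4 subjects x 3 conditions (hypothetical data)
set.seed(1)
d <- data.frame(subj = factor(rep(1:4, times = 3)),
                sit  = factor(rep(c("a", "b", "c"), each = 4)),
                y    = rnorm(12))
ss_resid_between <- deviance(lm(y ~ sit, data = d))        # residual SS without subj
ss_resid_within  <- deviance(lm(y ~ sit + subj, data = d)) # residual SS with subj
ss_subj <- ss_resid_between - ss_resid_within              # SS bound by subjects
# balanced design: SS_subj = k * sum((subject mean - grand mean)^2), k = 3 conditions
subj_means <- tapply(d$y, d$subj, mean)
all.equal(ss_subj, 3 * sum((subj_means - mean(d$y))^2))    # TRUE
```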
Two-way within-subjects ANOVA:
Data, or as a local tab-delimited copy.
30 participants whose interest in school and in work was measured at three different ages: 10, 15 and 20 years. Interest was recorded on a rating scale from 1 to 5; the higher the score, the higher the interest.
# get data
# This dataset contains a hypothetical sample of 30 participants whose interest in school and interest in work was measured at three different ages (10, 15, and 20).
# The interest values are represented on a scale that ranges from 1 to 5 and indicate how interested each participant was in a given topic at each given age.
dd <- read.delim("http://md.psych.bio.uni-goettingen.de/mv/data/div/john_quick_dataset_anova_twoWay_repeatedMeasures.txt")
# first step create age
require(reshape2)
dd.r <- melt(dd, id=c('subject'), variable.name="school_age", value.name="interest", measured=c('schoolAge10', 'schoolAge15', 'schoolAge20', 'workAge10', 'workAge15', 'workAge20'))
#require(Rcmdr)
require(ez)
require(psych)
# require(lme4) # is loaded and used by lmerTest
require(lmerTest)
# add a topic factor
##dd$topic <- c(rep('school', length(dd$schoolAge10) * 3), rep('school', length(dd$workAge10) * 3))
# convert it to long format
# add a topic factor
#dd.r$topic <- factor(c(rep('school', length(dd$schoolAge10) * 3), rep('work', length(dd$workAge10) * 3)), labels=c("school", "work"))
dd.r$topic <- factor(rep(c(1, 2), c(90, 90)), labels=c("school", "work"))
# add a age factor
dd.r$age <- factor(rep(c(1, 2, 3), c(30, 30, 30)), labels=c("a10", "a15", "a20"))
# fit model using ezANOVA
#? random intercept model, no interaction
m.f <- ezANOVA(data=dd.r, dv=.(interest), wid = .(subject), within = .(topic, age), detailed=TRUE, type=3)
## Warning: Converting "subject" to factor for ANOVA.
# parameter
m.f
## $ANOVA
## Effect DFn DFd SSn SSd F p p<.05
## 1 (Intercept) 1 29 1519.60556 48.22778 913.758899 1.775418e-23 *
## 2 topic 1 29 20.67222 91.16111 6.576208 1.577753e-02 *
## 3 age 2 58 12.67778 23.98889 15.326077 4.533383e-06 *
## 4 topic:age 2 58 92.81111 47.85556 56.242628 2.633260e-14 *
## ges
## 1 0.87795899
## 2 0.08914069
## 3 0.05661969
## 4 0.30525508
##
## $`Mauchly's Test for Sphericity`
## Effect W p p<.05
## 3 age 0.8621793 0.1254199
## 4 topic:age 0.8805521 0.1684886
##
## $`Sphericity Corrections`
## Effect GGe p[GG] p[GG]<.05 HFe p[HF] p[HF]<.05
## 3 age 0.8788731 1.408756e-05 * 0.9311342 8.633170e-06 *
## 4 topic:age 0.8932975 5.072347e-13 * 0.9480226 1.111968e-13 *
# ezANOVA can additionally return an aov() object with the flag return_aov = TRUE
m.f.aov <- ezANOVA(data=dd.r, dv=.(interest), wid = .(subject), within = .(topic, age), detailed=TRUE, type=3, return_aov = TRUE)
## Warning: Converting "subject" to factor for ANOVA.
m.f.aov
## $ANOVA
## Effect DFn DFd SSn SSd F p p<.05
## 1 (Intercept) 1 29 1519.60556 48.22778 913.758899 1.775418e-23 *
## 2 topic 1 29 20.67222 91.16111 6.576208 1.577753e-02 *
## 3 age 2 58 12.67778 23.98889 15.326077 4.533383e-06 *
## 4 topic:age 2 58 92.81111 47.85556 56.242628 2.633260e-14 *
## ges
## 1 0.87795899
## 2 0.08914069
## 3 0.05661969
## 4 0.30525508
##
## $`Mauchly's Test for Sphericity`
## Effect W p p<.05
## 3 age 0.8621793 0.1254199
## 4 topic:age 0.8805521 0.1684886
##
## $`Sphericity Corrections`
## Effect GGe p[GG] p[GG]<.05 HFe p[HF] p[HF]<.05
## 3 age 0.8788731 1.408756e-05 * 0.9311342 8.633170e-06 *
## 4 topic:age 0.8932975 5.072347e-13 * 0.9480226 1.111968e-13 *
##
## $aov
##
## Call:
## aov(formula = formula(aov_formula), data = data)
##
## Grand Mean: 2.905556
##
## Stratum 1: subject
##
## Terms:
## Residuals
## Sum of Squares 48.22778
## Deg. of Freedom 29
##
## Residual standard error: 1.289584
##
## Stratum 2: subject:topic
##
## Terms:
## topic Residuals
## Sum of Squares 20.67222 91.16111
## Deg. of Freedom 1 29
##
## Residual standard error: 1.772988
## 2 out of 3 effects not estimable
## Estimated effects are balanced
##
## Stratum 3: subject:age
##
## Terms:
## age Residuals
## Sum of Squares 12.67778 23.98889
## Deg. of Freedom 2 58
##
## Residual standard error: 0.6431186
## 2 out of 4 effects not estimable
## Estimated effects may be unbalanced
##
## Stratum 4: subject:topic:age
##
## Terms:
## topic:age Residuals
## Sum of Squares 92.81111 47.85556
## Deg. of Freedom 2 58
##
## Residual standard error: 0.9083478
## Estimated effects may be unbalanced
# ez offers ezPlot()
ezPlot(data=dd.r, dv=.(interest), wid = .(subject), within = .(topic, age), x=.(topic), split=.(age))
## Warning: Converting "subject" to factor for ANOVA.
ezPlot(data=dd.r, dv=.(interest), wid = .(subject), within = .(topic, age), x=.(age), split=.(topic))
## Warning: Converting "subject" to factor for ANOVA.
# ez also offers ezStats()
ezStats(data=dd.r, dv=.(interest), wid = .(subject), within = .(topic, age), type=3)
## Warning: Converting "subject" to factor for ANOVA.
## topic age N Mean SD FLSD
## 1 school a10 30 3.566667 1.1943353 0.4694716
## 2 school a15 30 3.766667 1.3308886 0.4694716
## 3 school a20 30 2.400000 1.2484473 0.4694716
## 4 work a10 30 1.500000 0.6297235 0.4694716
## 5 work a15 30 2.500000 0.9377155 0.4694716
## 6 work a20 30 3.700000 1.1188048 0.4694716
# order matters for output
ezStats(data=dd.r, dv=.(interest), wid = .(subject), within = .(age, topic), type=3)
## Warning: Converting "subject" to factor for ANOVA.
## age topic N Mean SD FLSD
## 1 a10 school 30 3.566667 1.1943353 0.4694716
## 2 a10 work 30 1.500000 0.6297235 0.4694716
## 3 a15 school 30 3.766667 1.3308886 0.4694716
## 4 a15 work 30 2.500000 0.9377155 0.4694716
## 5 a20 school 30 2.400000 1.2484473 0.4694716
## 6 a20 work 30 3.700000 1.1188048 0.4694716
# ?ezStats tells us
# Fisher's Least Significant Difference is computed as sqrt(2)*qt(.975,DFd)*sqrt(MSd/N), where N is taken as the mean N per group in cases of unbalanced designs.
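The FLSD value from the tables above can be reproduced by hand. A sketch plugging in the interaction stratum of the ANOVA table (SSd = 47.85556, DFd = 58, N = 30 per cell):

```r
# Fisher's LSD as documented in ?ezStats: sqrt(2)*qt(.975, DFd)*sqrt(MSd/N)
SSd <- 47.85556   # residual SS of the topic:age stratum (from the table above)
DFd <- 58
N   <- 30         # observations per cell
MSd <- SSd / DFd
flsd <- sqrt(2) * qt(.975, DFd) * sqrt(MSd / N)
flsd              # ~0.46947, matching the FLSD column above
```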
This approach uses Anova() from
library(car).
Repeated-measures designs are computed here with the multivariate approach.
#//Two-Way Repeated Measures ANOVA Example
#//Created by John M. Quick
#//http://www.johnmquick.com
#//January 22, 2011
#read the dataset into an R variable using the read.csv(file) function
#dataTwoWayRepeatedMeasures <- read.csv("dataset_ANOVA_TwoWayRepeatedMeasures.csv")
dataTwoWayRepeatedMeasures <- read.delim("http://md.psych.bio.uni-goettingen.de/mv/data/div/john_quick_dataset_anova_twoWay_repeatedMeasures.txt")
#display the data
#notice the atypical column arrangement for repeated measures data
head(dataTwoWayRepeatedMeasures)
## subject schoolAge10 schoolAge15 schoolAge20 workAge10 workAge15 workAge20
## 1 1 5 5 3 1 3 5
## 2 2 5 5 3 1 3 5
## 3 3 4 3 1 1 2 4
## 4 4 4 5 4 2 2 4
## 5 5 3 5 4 2 3 3
## 6 6 3 1 2 2 2 3
#read the idata frame into an R variable
#idataTwoWayRepeatedMeasures <- read.csv("idata_ANOVA_TwoWayRepeatedMeasures.csv")
idataTwoWayRepeatedMeasures <- read.delim("http://md.psych.bio.uni-goettingen.de/mv/data/div/john_quick_dataset_anova_twoWay_repeatedMeasures_idata.txt")
idataTwoWayRepeatedMeasures$Interest <- factor(idataTwoWayRepeatedMeasures$Interest)
idataTwoWayRepeatedMeasures$Age <- factor(idataTwoWayRepeatedMeasures$Age)
#display the idata frame
#notice the text values and correspondence between our idata rows and the columns in our dataset
idataTwoWayRepeatedMeasures
## Interest Age
## 1 school Age10
## 2 school Age15
## 3 school Age20
## 4 work Age10
## 5 work Age15
## 6 work Age20
#create the linear model
#step 1: bind the columns
#use cbind() to bind the columns in the original dataset
interestBind <- cbind(dataTwoWayRepeatedMeasures$schoolAge10, dataTwoWayRepeatedMeasures$schoolAge15, dataTwoWayRepeatedMeasures$schoolAge20, dataTwoWayRepeatedMeasures$workAge10, dataTwoWayRepeatedMeasures$workAge15, dataTwoWayRepeatedMeasures$workAge20)
#step 2: define the model
#use lm() to generate a linear model using the bound columns from step 1
interestModel <- lm(interestBind ~ 1)
#load the car package (install first, if necessary)
library(car)
## Lade nötiges Paket: carData
##
## Attache Paket: 'car'
## Das folgende Objekt ist maskiert 'package:psych':
##
## logit
#compose the Anova(mod, idata, idesign) function
analysis <- car::Anova(interestModel, idata = idataTwoWayRepeatedMeasures, idesign = ~Interest * Age)
## Note: model has only an intercept; equivalent type-III tests substituted.
#use summary(object) to visualize the results of the repeated measures ANOVA
summary(analysis)
##
## Type III Repeated Measures MANOVA Tests:
##
## ------------------------------------------
##
## Term: (Intercept)
##
## Response transformation matrix:
## (Intercept)
## [1,] 1
## [2,] 1
## [3,] 1
## [4,] 1
## [5,] 1
## [6,] 1
##
## Sum of squares and products for the hypothesis:
## (Intercept)
## (Intercept) 9117.633
##
## Multivariate Tests: (Intercept)
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.969239 913.7589 1 29 < 2.22e-16 ***
## Wilks 1 0.030761 913.7589 1 29 < 2.22e-16 ***
## Hotelling-Lawley 1 31.508928 913.7589 1 29 < 2.22e-16 ***
## Roy 1 31.508928 913.7589 1 29 < 2.22e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ------------------------------------------
##
## Term: Interest
##
## Response transformation matrix:
## Interest1
## [1,] 1
## [2,] 1
## [3,] 1
## [4,] -1
## [5,] -1
## [6,] -1
##
## Sum of squares and products for the hypothesis:
## Interest1
## Interest1 124.0333
##
## Multivariate Tests: Interest
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.1848485 6.576208 1 29 0.015778 *
## Wilks 1 0.8151515 6.576208 1 29 0.015778 *
## Hotelling-Lawley 1 0.2267658 6.576208 1 29 0.015778 *
## Roy 1 0.2267658 6.576208 1 29 0.015778 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ------------------------------------------
##
## Term: Age
##
## Response transformation matrix:
## Age1 Age2
## [1,] 1 0
## [2,] 0 1
## [3,] -1 -1
## [4,] 1 0
## [5,] 0 1
## [6,] -1 -1
##
## Sum of squares and products for the hypothesis:
## Age1 Age2
## Age1 32.033333 -5.1666667
## Age2 -5.166667 0.8333333
##
## Multivariate Tests: Age
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.4402156 11.00963 2 28 0.00029668 ***
## Wilks 1 0.5597844 11.00963 2 28 0.00029668 ***
## Hotelling-Lawley 1 0.7864021 11.00963 2 28 0.00029668 ***
## Roy 1 0.7864021 11.00963 2 28 0.00029668 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ------------------------------------------
##
## Term: Interest:Age
##
## Response transformation matrix:
## Interest1:Age1 Interest1:Age2
## [1,] 1 0
## [2,] 0 1
## [3,] -1 -1
## [4,] -1 0
## [5,] 0 -1
## [6,] 1 1
##
## Sum of squares and products for the hypothesis:
## Interest1:Age1 Interest1:Age2
## Interest1:Age1 340.0333 259.2333
## Interest1:Age2 259.2333 197.6333
##
## Multivariate Tests: Interest:Age
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.771144 47.17392 2 28 1.0811e-09 ***
## Wilks 1 0.228856 47.17392 2 28 1.0811e-09 ***
## Hotelling-Lawley 1 3.369566 47.17392 2 28 1.0811e-09 ***
## Roy 1 3.369566 47.17392 2 28 1.0811e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Univariate Type III Repeated-Measures ANOVA Assuming Sphericity
##
## Sum Sq num Df Error SS den Df F value Pr(>F)
## (Intercept) 1519.61 1 48.228 29 913.7589 < 2.2e-16 ***
## Interest 20.67 1 91.161 29 6.5762 0.01578 *
## Age 12.68 2 23.989 58 15.3261 4.533e-06 ***
## Interest:Age 92.81 2 47.856 58 56.2426 2.633e-14 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Mauchly Tests for Sphericity
##
## Test statistic p-value
## Age 0.86218 0.12542
## Interest:Age 0.88055 0.16849
##
##
## Greenhouse-Geisser and Huynh-Feldt Corrections
## for Departure from Sphericity
##
## GG eps Pr(>F[GG])
## Age 0.87887 1.409e-05 ***
## Interest:Age 0.89330 5.072e-13 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## HF eps Pr(>F[HF])
## Age 0.9311342 8.633170e-06
## Interest:Age 0.9480226 1.111968e-13
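The Greenhouse-Geisser epsilon reported above can be computed from the covariance matrix of the repeated measures: transform to an orthonormal set of contrasts and compare the trace of the resulting matrix to its squared elements. A minimal sketch (the function name gg_epsilon and the toy matrices are illustrative, not part of car):

```r
# Greenhouse-Geisser epsilon from a k x k covariance matrix S (sketch)
gg_epsilon <- function(S) {
  k <- nrow(S)
  C <- t(contr.helmert(k))        # (k-1) x k contrast matrix, rows orthogonal
  C <- C / sqrt(rowSums(C^2))     # normalize rows -> orthonormal contrasts
  E <- C %*% S %*% t(C)           # covariance matrix of the contrasts
  sum(diag(E))^2 / ((k - 1) * sum(E^2))
}
# under compound symmetry (equal variances and equal covariances) epsilon is 1
S_cs <- matrix(0.5, 3, 3); diag(S_cs) <- 1
gg_epsilon(S_cs)                  # 1
# a clearly non-spherical matrix gives epsilon between 1/(k-1) and 1
gg_epsilon(diag(c(1, 2, 10)))
```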
Source: [http://personality-project.org/r/r.anova.html]
Data: [http://personality-project.org/r/datasets/R.appendix4.data] or as a local copy
Five subjects were asked to memorize a list of words. There were three types of words on the list, i.e. a three-level repeated-measures factor word type 'valence' with the levels 'neg', 'neu' and 'pos'. The retrieval mode 'task' has two levels ('free', 'cued') and is also a repeated-measures factor.
This only serves to illustrate the model specification. A 2*3 design with a total of 5 subjects makes little sense.
# get data
#require(Rcmdr)
require(reshape2)
require(psych)
# require(lme4) # is loaded and used by lmerTest
require(lmerTest)
dd <- read.delim("http://md.psych.bio.uni-goettingen.de/mv/data/div/pers_proj_4.txt")
# data are in long format, no conversion needed
# fit between subject model
m.b <- lm(recall ~ task * valence, data=dd)
# random intercept model, no interaction
m.u1 <- lmer(recall ~ task + valence + (1|subj), data=dd)
print(m.u1)
## Linear mixed model fit by REML ['lmerModLmerTest']
## Formula: recall ~ task + valence + (1 | subj)
## Data: dd
## REML criterion at convergence: 124.0307
## Random effects:
## Groups Name Std.Dev.
## subj (Intercept) 3.750
## Residual 1.704
## Number of obs: 30, groups: subj, 5
## Fixed Effects:
## (Intercept) taskfree valenceneu valencepos
## 12.0 -2.0 1.1 1.3
anova(m.u1)
## Type III Analysis of Variance Table with Satterthwaite's method
## Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
## task 30.0 30.0 1 22 10.3340 0.003992 **
## valence 9.8 4.9 2 22 1.6879 0.207987
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# random intercept, interaction
m.u2 <- lmer(recall ~ task + valence + task:valence + (1|subj), data=dd)
print(m.u2)
## Linear mixed model fit by REML ['lmerModLmerTest']
## Formula: recall ~ task + valence + task:valence + (1 | subj)
## Data: dd
## REML criterion at convergence: 118.4203
## Random effects:
## Groups Name Std.Dev.
## subj (Intercept) 3.745
## Residual 1.767
## Number of obs: 30, groups: subj, 5
## Fixed Effects:
## (Intercept) taskfree valenceneu
## 11.8 -1.6 1.2
## valencepos taskfree:valenceneu taskfree:valencepos
## 1.8 -0.2 -1.0
anova(m.u2)
## Type III Analysis of Variance Table with Satterthwaite's method
## Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
## task 30.0 30.0 1 20 9.6051 0.005655 **
## valence 9.8 4.9 2 20 1.5688 0.232864
## task:valence 1.4 0.7 2 20 0.2241 0.801200
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# random slope and intercept, no interaction
m.u3 <- lmer(recall ~ task + valence + (valence|subj) + (task|subj), data=dd)
## boundary (singular) fit: see help('isSingular')
print(m.u3)
## Linear mixed model fit by REML ['lmerModLmerTest']
## Formula: recall ~ task + valence + (valence | subj) + (task | subj)
## Data: dd
## REML criterion at convergence: 116.2208
## Random effects:
## Groups Name Std.Dev. Corr
## subj (Intercept) 2.648
## valenceneu 1.208 -0.99
## valencepos 2.004 -0.76 0.84
## subj.1 (Intercept) 3.811
## taskfree 1.459 -1.00
## Residual 1.218
## Number of obs: 30, groups: subj, 5
## Fixed Effects:
## (Intercept) taskfree valenceneu valencepos
## 12.0 -2.0 1.1 1.3
## optimizer (nloptwrap) convergence code: 0 (OK) ; 0 optimizer warnings; 1 lme4 warnings
anova(m.u3)
## Type III Analysis of Variance Table with Satterthwaite's method
## Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
## task 9.5120 9.5120 1 4.2689 6.4137 0.06064 .
## valence 3.2248 1.6124 2 4.7173 1.0872 0.40897
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# random slope and intercept, interaction
m.u4<- lmer(recall ~ task + valence + task:valence + (valence|subj) + (task|subj), data=dd)
## boundary (singular) fit: see help('isSingular')
## Warning: Model failed to converge with 1 negative eigenvalue: -1.1e-01
print(m.u4)
## Linear mixed model fit by REML ['lmerModLmerTest']
## Formula: recall ~ task + valence + task:valence + (valence | subj) + (task |
## subj)
## Data: dd
## REML criterion at convergence: 111.5547
## Random effects:
## Groups Name Std.Dev. Corr
## subj (Intercept) 2.566
## valenceneu 1.178 -1.00
## valencepos 1.950 -0.82 0.82
## subj.1 (Intercept) 3.902
## taskfree 1.448 -1.00
## Residual 1.270
## Number of obs: 30, groups: subj, 5
## Fixed Effects:
## (Intercept) taskfree valenceneu
## 11.8 -1.6 1.2
## valencepos taskfree:valenceneu taskfree:valencepos
## 1.8 -0.2 -1.0
## optimizer (nloptwrap) convergence code: 0 (OK) ; 0 optimizer warnings; 1 lme4 warnings
anova(m.u4)
## Type III Analysis of Variance Table with Satterthwaite's method
## Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
## task 10.1673 10.1673 1 4.3168 6.3064 0.06145 .
## valence 3.5184 1.7592 2 4.9515 1.0912 0.40493
## task:valence 1.4000 0.7000 2 13.1665 0.4342 0.65674
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# random intercepts for subjects and for the subject-by-factor cells;
# this reproduces the F tests of the classical univariate repeated-measures
# ANOVA (compare aov() with Error(Subject/(Task*Valence)) below)
m.u5<- lmer(recall ~ task * valence + (1|subj) + (1|valence:subj) + (1|task:subj), data=dd)
print(m.u5)
## Linear mixed model fit by REML ['lmerModLmerTest']
## Formula: recall ~ task * valence + (1 | subj) + (1 | valence:subj) + (1 |
## task:subj)
## Data: dd
## REML criterion at convergence: 117.993
## Random effects:
## Groups Name Std.Dev.
## valence:subj (Intercept) 0.6892
## task:subj (Intercept) 0.7472
## subj (Intercept) 3.7025
## Residual 1.5519
## Number of obs: 30, groups: valence:subj, 15; task:subj, 10; subj, 5
## Fixed Effects:
## (Intercept) taskfree valenceneu
## 11.8 -1.6 1.2
## valencepos taskfree:valenceneu taskfree:valencepos
## 1.8 -0.2 -1.0
anova(m.u5)
## Type III Analysis of Variance Table with Satterthwaite's method
## Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
## task 17.6939 17.6939 1 4 7.3470 0.05351 .
## valence 7.0278 3.5139 2 8 1.4591 0.28825
## task:valence 1.4000 0.7000 2 8 0.2907 0.75534
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# model random slope and intercept for all fixed factors
# full model: random slope and intercept, interaction
m.u6<- lmer(recall ~ task * valence + (1 + valence|subj) + (1 + task|subj), data=dd)
## boundary (singular) fit: see help('isSingular')
## Warning: Model failed to converge with 1 negative eigenvalue: -1.1e-01
print(m.u6)
## Linear mixed model fit by REML ['lmerModLmerTest']
## Formula: recall ~ task * valence + (1 + valence | subj) + (1 + task |
## subj)
## Data: dd
## REML criterion at convergence: 111.5547
## Random effects:
## Groups Name Std.Dev. Corr
## subj (Intercept) 2.566
## valenceneu 1.178 -1.00
## valencepos 1.950 -0.82 0.82
## subj.1 (Intercept) 3.902
## taskfree 1.448 -1.00
## Residual 1.270
## Number of obs: 30, groups: subj, 5
## Fixed Effects:
## (Intercept) taskfree valenceneu
## 11.8 -1.6 1.2
## valencepos taskfree:valenceneu taskfree:valencepos
## 1.8 -0.2 -1.0
## optimizer (nloptwrap) convergence code: 0 (OK) ; 0 optimizer warnings; 1 lme4 warnings
anova(m.u6)
## Type III Analysis of Variance Table with Satterthwaite's method
## Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
## task 10.1673 10.1673 1 4.3168 6.3064 0.06145 .
## valence 3.5184 1.7592 2 4.9515 1.0912 0.40493
## task:valence 1.4000 0.7000 2 13.1665 0.4342 0.65674
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# compare non interaction model with full interaction model
anova(m.u1, m.u6)
## refitting model(s) with ML (instead of REML)
## Data: dd
## Models:
## m.u1: recall ~ task + valence + (1 | subj)
## m.u6: recall ~ task * valence + (1 + valence | subj) + (1 + task | subj)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## m.u1 6 141.81 150.22 -64.907 129.81
## m.u6 16 152.68 175.10 -60.339 120.68 9.1362 10 0.5192
# descriptives - the Rcmdr way
tapply(dd$recall, list(situation=dd$task, dd$valence), mean, na.rm=TRUE) # means
##
## situation neg neu pos
## cued 11.8 13.0 13.6
## free 10.2 11.2 11.0
tapply(dd$recall, list(situation=dd$task, dd$valence), function(x) sum(!is.na(x))) # counts
##
## situation neg neu pos
## cued 5 5 5
## free 5 5 5
# the psych package way
by(dd$recall, list(dd$task, dd$valence), psych::describe)
## : cued
## : neg
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 5 11.8 5.89 15 11.8 2.97 4 17 13 -0.33 -2.11 2.63
## ------------------------------------------------------------
## : free
## : neg
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 5 10.2 3.03 12 10.2 1.48 6 13 7 -0.37 -2.01 1.36
## ------------------------------------------------------------
## : cued
## : neu
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 5 13 4.06 13 13 5.93 9 18 9 0.07 -2.11 1.82
## ------------------------------------------------------------
## : free
## : neu
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 5 11.2 3.03 13 11.2 1.48 7 14 7 -0.37 -2.01 1.36
## ------------------------------------------------------------
## : cued
## : pos
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 5 13.6 4.1 14 13.6 5.93 10 20 10 0.49 -1.57 1.83
## ------------------------------------------------------------
## : free
## : pos
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 5 11 4.06 12 11 4.45 5 15 10 -0.39 -1.79 1.82
The same with ezANOVA()
dd <- read.delim("http://md.psych.bio.uni-goettingen.de/mv/data/div/pers_proj_4.txt")
require(ez)
ezANOVA(data=dd, dv=recall, wid=subj, within=.(task, valence), detailed=T, type=1 )
## Warning: Converting "subj" to factor for ANOVA.
## Warning: Converting "task" to factor for ANOVA.
## Warning: Converting "valence" to factor for ANOVA.
## $ANOVA
## Effect DFn DFd SSn SSd F p p<.05 ges
## 1 task 1 4 30.0 16.33333 7.3469388 0.0535083 0.32444124
## 2 valence 2 8 9.8 26.86667 1.4590571 0.2882501 0.13560886
## 3 task:valence 2 8 1.4 19.26667 0.2906574 0.7553437 0.02192067
ezPlot(data=dd, dv=recall, x=task, wid=subj, within=.(task, valence), split=valence)
## Warning: Converting "subj" to factor for ANOVA.
## Warning: Converting "task" to factor for ANOVA.
## Warning: Converting "valence" to factor for ANOVA.
ezPlot(data=dd, dv=recall, x=valence, wid=subj, within=.(task, valence), split=task)
## Warning: Converting "subj" to factor for ANOVA.
## Warning: Converting "task" to factor for ANOVA.
## Warning: Converting "valence" to factor for ANOVA.
nlme()
See: Mehrebenenmodelle (multilevel models). Here aov() is used, not
nlme().
Example 4. Two-Way Within-Subjects ANOVA
Five subjects are asked to memorize a list of words. The words on this list are of three types: positive words, negative words and neutral words. Their recall data by word type is displayed in Appendix III. Note that there is a single factor (Valence) with three levels (negative, neutral and positive). In addition, there is also a random factor, Subject. Create a data file ex3 that contains this data. Again it is important that each observation appears on an individual line! Note that this is not the standard way of thinking about data. Example 6 will show how to transform data from the standard data table into this form.
In this example, Subject is crossed with both Task and Valence, so you must specify three different error terms: one for Task, one for Valence and one for the interaction between the two. Fortunately, R is smart enough to divide up the within-subjects error term properly as long as you specify it in your command. The commands are:
# the original
datafilename="http://personality-project.org/r/datasets/R.appendix4.data"
data.ex4=read.table(datafilename,header=T) #read the data into a table
data.ex4 #show the data
## Observation Subject Task Valence Recall
## 1 1 Jim Free Neg 8
## 2 2 Jim Free Neu 9
## 3 3 Jim Free Pos 5
## 4 4 Jim Cued Neg 7
## 5 5 Jim Cued Neu 9
## 6 6 Jim Cued Pos 10
## 7 7 Victor Free Neg 12
## 8 8 Victor Free Neu 13
## 9 9 Victor Free Pos 14
## 10 10 Victor Cued Neg 16
## 11 11 Victor Cued Neu 13
## 12 12 Victor Cued Pos 14
## 13 13 Faye Free Neg 13
## 14 14 Faye Free Neu 13
## 15 15 Faye Free Pos 12
## 16 16 Faye Cued Neg 15
## 17 17 Faye Cued Neu 16
## 18 18 Faye Cued Pos 14
## 19 19 Ron Free Neg 12
## 20 20 Ron Free Neu 14
## 21 21 Ron Free Pos 15
## 22 22 Ron Cued Neg 17
## 23 23 Ron Cued Neu 18
## 24 24 Ron Cued Pos 20
## 25 25 Jason Free Neg 6
## 26 26 Jason Free Neu 7
## 27 27 Jason Free Pos 9
## 28 28 Jason Cued Neg 4
## 29 29 Jason Cued Neu 9
## 30 30 Jason Cued Pos 10
aov.ex4=aov(Recall~(Task*Valence)+Error(Subject/(Task*Valence)),data.ex4 )
summary(aov.ex4)
##
## Error: Subject
## Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 4 349.1 87.28
##
## Error: Subject:Task
## Df Sum Sq Mean Sq F value Pr(>F)
## Task 1 30.00 30.000 7.347 0.0535 .
## Residuals 4 16.33 4.083
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Error: Subject:Valence
## Df Sum Sq Mean Sq F value Pr(>F)
## Valence 2 9.80 4.900 1.459 0.288
## Residuals 8 26.87 3.358
##
## Error: Subject:Task:Valence
## Df Sum Sq Mean Sq F value Pr(>F)
## Task:Valence 2 1.40 0.700 0.291 0.755
## Residuals 8 19.27 2.408
print(model.tables(aov.ex4,"means"),digits=3) #report the means and the number of subjects/cell
## Tables of means
## Grand mean
##
## 11.8
##
## Task
## Task
## Cued Free
## 12.8 10.8
##
## Valence
## Valence
## Neg Neu Pos
## 11.0 12.1 12.3
##
## Task:Valence
## Valence
## Task Neg Neu Pos
## Cued 11.8 13.0 13.6
## Free 10.2 11.2 11.0
boxplot(Recall~Task*Valence,data=data.ex4) #graphical summary of means of the 6 cells
Fewer subjects, increased power: via the subject 'factor', the error term is split once more into subject variance and residual variance.
Example AB/AC paired-associate learning:
- subjects learn a list of word pairs A-B
- afterwards they learn a list of word pairs A-C
- recalling B after the cue A is then impaired
- retroactive interference
Nice example: the effect on the F-ratio when values are swapped within an observation, thereby increasing the unsystematic variance.
Counterbalancing:
- blocked design: random assignment to one of the possible presentation orders
- already with 3 factor levels there are 3! = 6, with 4 factor levels 4! = 24 possible orders
- latin squares: every condition appears at least once at every possible position
- 3 factor levels: A1, A2, A3 / A2, A3, A1 / A3, A1, A2
- number of sequences = number of factor levels
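A cyclic Latin square of the kind listed above can be generated in a few lines (a sketch in base R; the function name latin_square is illustrative):

```r
# cyclic latin square: the condition at position j in sequence i is
# ((i + j) mod k) + 1, so every condition occurs exactly once in every
# row (sequence) and every column (position)
latin_square <- function(k) {
  outer(0:(k - 1), 0:(k - 1), function(i, j) ((i + j) %% k) + 1)
}
latin_square(3)
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    2    3    1
# [3,]    3    1    2
```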
Missing data:
- relative number of missings: rule of thumb: less than 10% in a variable is acceptable, more is not
- patterns in the missings: t-test of the persons with missings in a variable against those without missings; significant -> pattern
- handling: remove all observations with missings in the relevant variable from the analysis (costly)
- estimating the missings: replacing by the mean is conservative and systematically reduces the variance; regression-based estimation is better, but which variables should be used as predictors?
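Why mean substitution is 'conservative' can be seen directly; a sketch with hypothetical values showing that imputing the mean shrinks the variance:

```r
# mean imputation: missing values are replaced by the mean of the observed values
x <- c(1, 2, 3, 4, NA, 6, NA, 8)
x_imp <- ifelse(is.na(x), mean(x, na.rm = TRUE), x)
var(x, na.rm = TRUE)  # 6.8
var(x_imp)            # ~4.86, systematically smaller
```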
Imaging methods and their costs. Subjects in reaction-time experiments: there are slow and fast ones, independent of the particular task (factor level).
aov()
# wide format
subject storeA storeB storeC storeD
lettuce 1.17 1.78 1.29 1.29
potatoes 1.77 1.98 1.99 1.99
milk 1.49 1.69 1.79 1.59
eggs 0.65 0.99 0.69 1.09
bread 1.58 1.70 1.89 1.89
cereal 3.13 3.15 2.99 3.09
ground.beef 2.09 1.88 2.09 2.49
tomato.soup 0.62 0.65 0.65 0.69
laundry.detergent 5.89 5.99 5.99 6.99
aspirin 4.46 4.84 4.99 5.15
# long format
groceries2 # take a look
price store subject
1 1.17 storeA lettuce
2 1.77 storeA potatoes
3 1.49 storeA milk
4 0.65 storeA eggs
5 1.58 storeA bread
6 3.13 storeA cereal
7 2.09 storeA ground.beef
8 0.62 storeA tomato.soup
9 5.89 storeA laundry.detergent
10 4.46 storeA aspirin
11 1.78 storeB lettuce
12 1.98 storeB potatoes
13 1.69 storeB milk
14 0.99 storeB eggs
15 1.70 storeB bread
16 3.15 storeB cereal
17 1.88 storeB ground.beef
18 0.65 storeB tomato.soup
19 5.99 storeB laundry.detergent
20 4.84 storeB aspirin
21 1.29 storeC lettuce
22 1.99 storeC potatoes
23 1.79 storeC milk
24 0.69 storeC eggs
25 1.89 storeC bread
26 2.99 storeC cereal
27 2.09 storeC ground.beef
28 0.65 storeC tomato.soup
29 5.99 storeC laundry.detergent
30 4.99 storeC aspirin
31 1.29 storeD lettuce
32 1.99 storeD potatoes
33 1.59 storeD milk
34 1.09 storeD eggs
35 1.89 storeD bread
36 3.09 storeD cereal
37 2.49 storeD ground.beef
38 0.69 storeD tomato.soup
39 6.99 storeD laundry.detergent
40 5.15 storeD aspirin
aov.out = aov(price ~ store + Error(subject/store), data=groceries2)
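The example above references a data frame groceries2 that is not built in this document; a self-contained sketch (base R only, data typed in from the wide table above) that creates it with reshape() and fits the same model:

```r
# wide data as printed above
groceries <- data.frame(
  subject = c("lettuce", "potatoes", "milk", "eggs", "bread", "cereal",
              "ground.beef", "tomato.soup", "laundry.detergent", "aspirin"),
  storeA = c(1.17, 1.77, 1.49, 0.65, 1.58, 3.13, 2.09, 0.62, 5.89, 4.46),
  storeB = c(1.78, 1.98, 1.69, 0.99, 1.70, 3.15, 1.88, 0.65, 5.99, 4.84),
  storeC = c(1.29, 1.99, 1.79, 0.69, 1.89, 2.99, 2.09, 0.65, 5.99, 4.99),
  storeD = c(1.29, 1.99, 1.59, 1.09, 1.89, 3.09, 2.49, 0.69, 6.99, 5.15)
)
# wide -> long with base reshape()
groceries2 <- reshape(groceries, direction = "long",
                      varying = c("storeA", "storeB", "storeC", "storeD"),
                      v.names = "price", timevar = "store",
                      times = c("storeA", "storeB", "storeC", "storeD"),
                      idvar = "subject")
groceries2$subject <- factor(groceries2$subject)
groceries2$store   <- factor(groceries2$store)
# one-way repeated-measures model with the subject error stratum
aov.out <- aov(price ~ store + Error(subject/store), data = groceries2)
summary(aov.out)
```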
Two factor design with repeated measures on both factors:
DV ~ IV1 * IV2 + Error(subject/(IV1*IV2))
Three factor mixed design with repeated measures on IV2 and IV3:
DV ~ IV1 * IV2 * IV3 + Error(subject/(IV2*IV3))
The following example comes from the book by Rutherford. 8 persons; factor A (learning technique, two levels) as a within-subjects factor; factor B (study time, three levels) as a within-subjects factor.
Note: only one order is sensible (memorize -> mnemonics). This leads to uncontrolled order effects.
Data set in long format with effect coding. Effect coding expresses effects as deviations from the grand mean and not as deviations from a reference group. The coefficients therefore have to be interpreted differently; the ANOVA table stays the same.
Computation of the effects using the dummy variables stored in the data set.
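The difference between dummy and effect coding can be illustrated with contr.sum (a sketch with hypothetical data; note that the ANOVA table is identical under both codings):

```r
# dummy coding (default contr.treatment): coefficients are deviations
# from a reference group; effect coding (contr.sum): deviations from
# the grand mean
d <- data.frame(g = factor(rep(c("a", "b", "c"), each = 4)),
                y = c(1, 2, 3, 4,  3, 4, 5, 6,  2, 3, 4, 5))
m_dummy  <- lm(y ~ g, data = d)
m_effect <- lm(y ~ g, data = d, contrasts = list(g = "contr.sum"))
coef(m_dummy)[1]   # 2.5 = mean of reference group "a"
coef(m_effect)[1]  # 3.5 = grand mean (balanced design)
all.equal(anova(m_dummy)$F[1], anova(m_effect)$F[1])  # TRUE: same ANOVA table
```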
dd <- read.delim("http://md.psych.bio.uni-goettingen.de/mv/data/div/rutherford_2x3_rep_long_effectcoded.txt")
head(dd)
## vp y Lernbed Lernzeit dS1 dS2 dS3 dS4 dS5 dS6 dS7 dA dB1 dB2 dSxA1 dSxA2
## 1 s1 7 1 30 1 0 0 0 0 0 0 1 1 0 1 0
## 2 s2 3 1 30 0 1 0 0 0 0 0 1 1 0 0 1
## 3 s3 6 1 30 0 0 1 0 0 0 0 1 1 0 0 0
## 4 s4 6 1 30 0 0 0 1 0 0 0 1 1 0 0 0
## 5 s5 5 1 30 0 0 0 0 1 0 0 1 1 0 0 0
## 6 s6 8 1 30 0 0 0 0 0 1 0 1 1 0 0 0
## dSxA3 dSxA4 dSxA5 dSxA6 dSxA7 dSxB1 dSxB2 dSxB3 dSxB4 dSxB5 dSxB6 dSxB7 dSxB8
## 1 0 0 0 0 0 1 0 0 0 0 0 0 0
## 2 0 0 0 0 0 0 1 0 0 0 0 0 0
## 3 1 0 0 0 0 0 0 1 0 0 0 0 0
## 4 0 1 0 0 0 0 0 0 1 0 0 0 0
## 5 0 0 1 0 0 0 0 0 0 1 0 0 0
## 6 0 0 0 1 0 0 0 0 0 0 1 0 0
## dSxB9 dSxB10 dSxB11 dSxB12 dSxB13 dSxB14 dAxB1 dAxB2
## 1 0 0 0 0 0 0 1 0
## 2 0 0 0 0 0 0 1 0
## 3 0 0 0 0 0 0 1 0
## 4 0 0 0 0 0 0 1 0
## 5 0 0 0 0 0 0 1 0
## 6 0 0 0 0 0 0 1 0
unique(dd[,3:length(names(dd))])
## Lernbed Lernzeit dS1 dS2 dS3 dS4 dS5 dS6 dS7 dA dB1 dB2 dSxA1 dSxA2 dSxA3
## 1 1 30 1 0 0 0 0 0 0 1 1 0 1 0 0
## 2 1 30 0 1 0 0 0 0 0 1 1 0 0 1 0
## 3 1 30 0 0 1 0 0 0 0 1 1 0 0 0 1
## 4 1 30 0 0 0 1 0 0 0 1 1 0 0 0 0
## 5 1 30 0 0 0 0 1 0 0 1 1 0 0 0 0
## 6 1 30 0 0 0 0 0 1 0 1 1 0 0 0 0
## 7 1 30 0 0 0 0 0 0 1 1 1 0 0 0 0
## 8 1 30 -1 -1 -1 -1 -1 -1 -1 1 1 0 -1 -1 -1
## 9 1 60 1 0 0 0 0 0 0 1 0 1 1 0 0
## 10 1 60 0 1 0 0 0 0 0 1 0 1 0 1 0
## 11 1 60 0 0 1 0 0 0 0 1 0 1 0 0 1
## 12 1 60 0 0 0 1 0 0 0 1 0 1 0 0 0
## 13 1 60 0 0 0 0 1 0 0 1 0 1 0 0 0
## 14 1 60 0 0 0 0 0 1 0 1 0 1 0 0 0
## 15 1 60 0 0 0 0 0 0 1 1 0 1 0 0 0
## 16 1 60 -1 -1 -1 -1 -1 -1 -1 1 0 1 -1 -1 -1
## 17 1 180 1 0 0 0 0 0 0 1 -1 -1 1 0 0
## 18 1 180 0 1 0 0 0 0 0 1 -1 -1 0 1 0
## 19 1 180 0 0 1 0 0 0 0 1 -1 -1 0 0 1
## 20 1 180 0 0 0 1 0 0 0 1 -1 -1 0 0 0
## 21 1 180 0 0 0 0 1 0 0 1 -1 -1 0 0 0
## 22 1 180 0 0 0 0 0 1 0 1 -1 -1 0 0 0
## 23 1 180 0 0 0 0 0 0 1 1 -1 -1 0 0 0
## 24 1 180 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1
## 25 2 30 1 0 0 0 0 0 0 -1 1 0 -1 0 0
## 26 2 30 0 1 0 0 0 0 0 -1 1 0 0 -1 0
## 27 2 30 0 0 1 0 0 0 0 -1 1 0 0 0 -1
## 28 2 30 0 0 0 1 0 0 0 -1 1 0 0 0 0
## 29 2 30 0 0 0 0 1 0 0 -1 1 0 0 0 0
## 30 2 30 0 0 0 0 0 1 0 -1 1 0 0 0 0
## 31 2 30 0 0 0 0 0 0 1 -1 1 0 0 0 0
## 32 2 30 -1 -1 -1 -1 -1 -1 -1 -1 1 0 1 1 1
## 33 2 60 1 0 0 0 0 0 0 -1 0 1 -1 0 0
## 34 2 60 0 1 0 0 0 0 0 -1 0 1 0 -1 0
## 35 2 60 0 0 1 0 0 0 0 -1 0 1 0 0 -1
## 36 2 60 0 0 0 1 0 0 0 -1 0 1 0 0 0
## 37 2 60 0 0 0 0 1 0 0 -1 0 1 0 0 0
## 38 2 60 0 0 0 0 0 1 0 -1 0 1 0 0 0
## 39 2 60 0 0 0 0 0 0 1 -1 0 1 0 0 0
## 40 2 60 -1 -1 -1 -1 -1 -1 -1 -1 0 1 1 1 1
## 41 2 180 1 0 0 0 0 0 0 -1 -1 -1 -1 0 0
## 42 2 180 0 1 0 0 0 0 0 -1 -1 -1 0 -1 0
## 43 2 180 0 0 1 0 0 0 0 -1 -1 -1 0 0 -1
## 44 2 180 0 0 0 1 0 0 0 -1 -1 -1 0 0 0
## 45 2 180 0 0 0 0 1 0 0 -1 -1 -1 0 0 0
## 46 2 180 0 0 0 0 0 1 0 -1 -1 -1 0 0 0
## 47 2 180 0 0 0 0 0 0 1 -1 -1 -1 0 0 0
## 48 2 180 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 1 1
## dSxA4 dSxA5 dSxA6 dSxA7 dSxB1 dSxB2 dSxB3 dSxB4 dSxB5 dSxB6 dSxB7 dSxB8
## 1 0 0 0 0 1 0 0 0 0 0 0 0
## 2 0 0 0 0 0 1 0 0 0 0 0 0
## 3 0 0 0 0 0 0 1 0 0 0 0 0
## 4 1 0 0 0 0 0 0 1 0 0 0 0
## 5 0 1 0 0 0 0 0 0 1 0 0 0
## 6 0 0 1 0 0 0 0 0 0 1 0 0
## 7 0 0 0 1 0 0 0 0 0 0 1 0
## 8 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 0
## 9 0 0 0 0 0 0 0 0 0 0 0 1
## 10 0 0 0 0 0 0 0 0 0 0 0 0
## 11 0 0 0 0 0 0 0 0 0 0 0 0
## 12 1 0 0 0 0 0 0 0 0 0 0 0
## 13 0 1 0 0 0 0 0 0 0 0 0 0
## 14 0 0 1 0 0 0 0 0 0 0 0 0
## 15 0 0 0 1 0 0 0 0 0 0 0 0
## 16 -1 -1 -1 -1 0 0 0 0 0 0 0 -1
## 17 0 0 0 0 -1 0 0 0 0 0 0 -1
## 18 0 0 0 0 0 -1 0 0 0 0 0 0
## 19 0 0 0 0 0 0 -1 0 0 0 0 0
## 20 1 0 0 0 0 0 0 -1 0 0 0 0
## 21 0 1 0 0 0 0 0 0 -1 0 0 0
## 22 0 0 1 0 0 0 0 0 0 -1 0 0
## 23 0 0 0 1 0 0 0 0 0 0 -1 0
## 24 -1 -1 -1 -1 1 1 1 1 1 1 1 1
## 25 0 0 0 0 1 0 0 0 0 0 0 0
## 26 0 0 0 0 0 1 0 0 0 0 0 0
## 27 0 0 0 0 0 0 1 0 0 0 0 0
## 28 -1 0 0 0 0 0 0 1 0 0 0 0
## 29 0 -1 0 0 0 0 0 0 1 0 0 0
## 30 0 0 -1 0 0 0 0 0 0 1 0 0
## 31 0 0 0 -1 0 0 0 0 0 0 1 0
## 32 1 1 1 1 -1 -1 -1 -1 -1 -1 -1 0
## 33 0 0 0 0 0 0 0 0 0 0 0 1
## 34 0 0 0 0 0 0 0 0 0 0 0 0
## 35 0 0 0 0 0 0 0 0 0 0 0 0
## 36 -1 0 0 0 0 0 0 0 0 0 0 0
## 37 0 -1 0 0 0 0 0 0 0 0 0 0
## 38 0 0 -1 0 0 0 0 0 0 0 0 0
## 39 0 0 0 -1 0 0 0 0 0 0 0 0
## 40 1 1 1 1 0 0 0 0 0 0 0 -1
## 41 0 0 0 0 -1 0 0 0 0 0 0 -1
## 42 0 0 0 0 0 -1 0 0 0 0 0 0
## 43 0 0 0 0 0 0 -1 0 0 0 0 0
## 44 -1 0 0 0 0 0 0 -1 0 0 0 0
## 45 0 -1 0 0 0 0 0 0 -1 0 0 0
## 46 0 0 -1 0 0 0 0 0 0 -1 0 0
## 47 0 0 0 -1 0 0 0 0 0 0 -1 0
## 48 1 1 1 1 1 1 1 1 1 1 1 1
## dSxB9 dSxB10 dSxB11 dSxB12 dSxB13 dSxB14 dAxB1 dAxB2
## 1 0 0 0 0 0 0 1 0
## 2 0 0 0 0 0 0 1 0
## 3 0 0 0 0 0 0 1 0
## 4 0 0 0 0 0 0 1 0
## 5 0 0 0 0 0 0 1 0
## 6 0 0 0 0 0 0 1 0
## 7 0 0 0 0 0 0 1 0
## 8 0 0 0 0 0 0 1 0
## 9 0 0 0 0 0 0 0 1
## 10 1 0 0 0 0 0 0 1
## 11 0 1 0 0 0 0 0 1
## 12 0 0 1 0 0 0 0 1
## 13 0 0 0 1 0 0 0 1
## 14 0 0 0 0 1 0 0 1
## 15 0 0 0 0 0 1 0 1
## 16 -1 -1 -1 -1 -1 -1 0 1
## 17 0 0 0 0 0 0 -1 -1
## 18 -1 0 0 0 0 0 -1 -1
## 19 0 -1 0 0 0 0 -1 -1
## 20 0 0 -1 0 0 0 -1 -1
## 21 0 0 0 -1 0 0 -1 -1
## 22 0 0 0 0 -1 0 -1 -1
## 23 0 0 0 0 0 -1 -1 -1
## 24 1 1 1 1 1 1 -1 -1
## 25 0 0 0 0 0 0 -1 0
## 26 0 0 0 0 0 0 -1 0
## 27 0 0 0 0 0 0 -1 0
## 28 0 0 0 0 0 0 -1 0
## 29 0 0 0 0 0 0 -1 0
## 30 0 0 0 0 0 0 -1 0
## 31 0 0 0 0 0 0 -1 0
## 32 0 0 0 0 0 0 -1 0
## 33 0 0 0 0 0 0 0 -1
## 34 1 0 0 0 0 0 0 -1
## 35 0 1 0 0 0 0 0 -1
## 36 0 0 1 0 0 0 0 -1
## 37 0 0 0 1 0 0 0 -1
## 38 0 0 0 0 1 0 0 -1
## 39 0 0 0 0 0 1 0 -1
## 40 -1 -1 -1 -1 -1 -1 0 -1
## 41 0 0 0 0 0 0 1 1
## 42 -1 0 0 0 0 0 1 1
## 43 0 -1 0 0 0 0 1 1
## 44 0 0 -1 0 0 0 1 1
## 45 0 0 0 -1 0 0 1 1
## 46 0 0 0 0 -1 0 1 1
## 47 0 0 0 0 0 -1 1 1
## 48 1 1 1 1 1 1 1 1
#mm <- aov(anx ~ sit + Error(subj/sit), data=dd.l)
mm <- aov(y ~ Lernbed + Error(vp/Lernbed), data=dd)
mm
##
## Call:
## aov(formula = y ~ Lernbed + Error(vp/Lernbed), data = dd)
##
## Grand Mean: 12
##
## Stratum 1: vp
##
## Terms:
## Residuals
## Sum of Squares 52
## Deg. of Freedom 7
##
## Residual standard error: 2.725541
##
## Stratum 2: vp:Lernbed
##
## Terms:
## Lernbed Residuals
## Sum of Squares 432.0000 75.3333
## Deg. of Freedom 1 7
##
## Residual standard error: 3.280534
## Estimated effects are balanced
##
## Stratum 3: Within
##
## Terms:
## Residuals
## Sum of Squares 1148.667
## Deg. of Freedom 32
##
## Residual standard error: 5.991313
summary(mm)
##
## Error: vp
## Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 7 52 7.429
##
## Error: vp:Lernbed
## Df Sum Sq Mean Sq F value Pr(>F)
## Lernbed 1 432.0 432.0 40.14 0.000391 ***
## Residuals 7 75.3 10.8
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Error: Within
## Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 32 1149 35.9
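The F value in the vp:Lernbed stratum can be reproduced by hand from the sums of squares reported above:

```r
# F = MS_effect / MS_error, both taken from the vp:Lernbed stratum
ms_effect <- 432.0   / 1   # SS / df for Lernbed
ms_error  <- 75.3333 / 7   # SS / df for the stratum residuals
ms_effect / ms_error       # ~ 40.14, matching the summary(mm) output
```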
# todo: to be completed
Bushtucker example from Andy Field.
Comparison of how different packages handle a one-factor
repeated-measures model:
- aov() with an error term
- nlme::lme()
- lme4::lmer()
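The approaches can be put side by side on simulated data (a sketch; data and variable names are invented, and the lme4 variant is only indicated):

```r
# Hypothetical one-factor repeated-measures layout: 20 subjects x 3 conditions
set.seed(42)
d <- expand.grid(subj = factor(1:20), cond = factor(c("c1", "c2", "c3")))
d$dv <- rnorm(nrow(d)) + as.numeric(d$cond)  # invented condition effect

# classic univariate ANOVA with an error stratum per subject
m_aov <- aov(dv ~ cond + Error(subj/cond), data = d)

# the same fixed effect as a mixed model with a random intercept per subject
library(nlme)
m_lme <- lme(dv ~ cond, random = ~1 | subj, data = d)

# lme4 equivalent (not run here):
# m_lmer <- lme4::lmer(dv ~ cond + (1 | subj), data = d)
```

For balanced data and a compound-symmetric covariance structure all three yield the same test of the condition effect; the mixed models generalize more gracefully to unbalanced or incomplete data.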
The rat example from Everitt (2010) link
Example Moore link
todo: posthoc test http://stats.stackexchange.com/questions/49108/post-hoc-test-after-2-factor-repeated-measures-anova-in-r
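Until the todo above is resolved, one simple post-hoc option after a repeated-measures ANOVA is paired pairwise t-tests with a p-value correction. A generic sketch with invented data:

```r
# Hypothetical long-format data: 15 subjects measured under 3 conditions,
# rows ordered identically within each condition so that pairing by position works
set.seed(7)
d <- data.frame(subj = factor(rep(1:15, 3)),
                cond = factor(rep(c("c1", "c2", "c3"), each = 15)),
                dv   = rnorm(45) + rep(c(0, 1, 2), each = 15))

# all pairwise paired comparisons, Bonferroni-adjusted
ph <- pairwise.t.test(d$dv, d$cond, paired = TRUE, p.adjust.method = "bonferroni")
ph
```

`p.adjust.method = "holm"` is a less conservative alternative; multivariate approaches (or emmeans on a mixed model) are also common.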
require(psych)
require(effects)
## Loading required package: effects
## Use the command
## lattice::trellis.par.set(effectsTheme())
## to customize lattice options for effects plots.
## See ?effectsTheme for details.
# across subjects then conditions
data1<-c(
49,47,46,47,48,47,41,46,43,47,46,45,
48,46,47,45,49,44,44,45,42,45,45,40,
49,46,47,45,49,45,41,43,44,46,45,40,
45,43,44,45,48,46,40,45,40,45,47,40)
# create data frame and add factors
matrix(data1,
ncol= 4,
dimnames = list(paste("subj", 1:12), c("Shape1.Color1", "Shape2.Color1", "Shape1.Color2", "Shape2.Color2")))
## Shape1.Color1 Shape2.Color1 Shape1.Color2 Shape2.Color2
## subj 1 49 48 49 45
## subj 2 47 46 46 43
## subj 3 46 47 47 44
## subj 4 47 45 45 45
## subj 5 48 49 49 48
## subj 6 47 44 45 46
## subj 7 41 44 41 40
## subj 8 46 45 43 45
## subj 9 43 42 44 40
## subj 10 47 45 46 45
## subj 11 46 45 45 47
## subj 12 45 40 40 40
Hays.df <- data.frame(rt = data1,
subj = factor(rep(paste("subj", 1:12, sep=""), 4)),
shape = factor(rep(rep(c("shape1", "shape2"), c(12, 12)), 2)),
color = factor(rep(c("color1", "color2"), c(24, 24))))
head(Hays.df)
## rt subj shape color
## 1 49 subj1 shape1 color1
## 2 47 subj2 shape1 color1
## 3 46 subj3 shape1 color1
## 4 47 subj4 shape1 color1
## 5 48 subj5 shape1 color1
## 6 47 subj6 shape1 color1
m.r <- aov(rt ~ shape * color + Error(subj/(shape * color)), data=Hays.df)
m.r
##
## Call:
## aov(formula = rt ~ shape * color + Error(subj/(shape * color)),
## data = Hays.df)
##
## Grand Mean: 45
##
## Stratum 1: subj
##
## Terms:
## Residuals
## Sum of Squares 226.5
## Deg. of Freedom 11
##
## Residual standard error: 4.537721
##
## Stratum 2: subj:shape
##
## Terms:
## shape Residuals
## Sum of Squares 12.0 17.5
## Deg. of Freedom 1 11
##
## Residual standard error: 1.261312
## 1 out of 2 effects not estimable
## Estimated effects are balanced
##
## Stratum 3: subj:color
##
## Terms:
## color Residuals
## Sum of Squares 12.0 9.5
## Deg. of Freedom 1 11
##
## Residual standard error: 0.9293204
## 1 out of 2 effects not estimable
## Estimated effects are balanced
##
## Stratum 4: subj:shape:color
##
## Terms:
## shape:color Residuals
## Sum of Squares 0.0 30.5
## Deg. of Freedom 1 11
##
## Residual standard error: 1.665151
## Estimated effects are balanced
summary(m.r)
##
## Error: subj
## Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 11 226.5 20.59
##
## Error: subj:shape
## Df Sum Sq Mean Sq F value Pr(>F)
## shape 1 12.0 12.000 7.543 0.019 *
## Residuals 11 17.5 1.591
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Error: subj:color
## Df Sum Sq Mean Sq F value Pr(>F)
## color 1 12.0 12.000 13.89 0.00334 **
## Residuals 11 9.5 0.864
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Error: subj:shape:color
## Df Sum Sq Mean Sq F value Pr(>F)
## shape:color 1 0.0 0.000 0 1
## Residuals 11 30.5 2.773
# ez way
require(ez)
ezANOVA(data=Hays.df, dv=rt, wid=subj, within=.(shape, color), detailed=T, type=1 )
## $ANOVA
## Effect DFn DFd SSn SSd F p p<.05
## 1 shape 1 11 1.200000e+01 17.5 7.542857e+00 0.019011750 *
## 2 color 1 11 1.200000e+01 9.5 1.389474e+01 0.003337568 *
## 3 shape:color 1 11 5.750796e-28 30.5 2.074058e-28 1.000000000
## ges
## 1 1.726619e-01
## 2 1.726619e-01
## 3 1.000138e-29
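The ges values reported above can be reproduced from the SS columns; here they equal SSn divided by SSn plus the sum of all distinct error SS:

```r
# generalized eta squared for shape, from the ezANOVA table above
ssn     <- 12                  # SSn for shape
ssd_all <- 17.5 + 9.5 + 30.5   # the distinct SSd values across all effects
ssn / (ssn + ssd_all)          # ~ 0.1727, matching ges for shape (and for color)
```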
ezPlot(data=Hays.df, dv=rt, x=shape, wid=subj, within=.(shape, color), split=color)
#ezANOVA(data=Hays.df, dv=rt, wid=subj, within=.(shape, color), detailed=T, type=3 )
#ezANOVA(data=dd.l, dv=.(anx), wid = .(subj), within = .(sit), detailed=TRUE, type=1)
rm(Hays.df)
Example with the data from Rutherford (2002).
Site with links to relevant sites on repeated-measures designs with R
Bates' talk using the sleepstudy data [http://lme4.r-forge.r-project.org/slides/2011-03-16-Amsterdam/2Longitudinal.pdf]; Field (2012, p. 245), chapter 7: Regression
Suppose we are interested in the explanatory power of the interval-scaled variable \(year\) for the likewise interval-scaled outcome measure \(gdp\) in a hierarchical dataset grouped by the variable \(country\).
We have in mind a model with random intercept and random slope:
\[ \widehat{gdp}_{ij} = fixed_{00} + random_{0i} + (fixed_{10} + random_{1i}) \cdot year_{ij} + \varepsilon_{ij} \]
or
\[ \widehat{gdp}_{ij} = fixed\_intercept + fixed\_slope\_year \cdot year_{ij} + random\_intercept\_country_{i} + random\_slope\_country_{i} \cdot year_{ij} + \varepsilon_{ij} \]
where \(i\) indexes country and \(j\) indexes year.
It is basically the same example that you can find in our sheet_repeated_measure.
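The random-intercept-and-slope model of the equations above, written in nlme syntax on invented toy data (the gdp analysis itself follows below; the lme4 equivalent is indicated as a comment):

```r
# Hypothetical hierarchical data: 5 groups, each with its own intercept and slope
set.seed(3)
toy  <- expand.grid(grp = factor(1:5), x = 1:10)
ints <- rnorm(5)                 # invented group intercepts
slps <- 1 + rnorm(5, sd = 0.3)   # invented group slopes around a fixed slope of 1
toy$y <- ints[toy$grp] + slps[toy$grp] * toy$x + rnorm(nrow(toy), sd = 0.2)

library(nlme)
m <- lme(y ~ x, random = ~ 1 + x | grp, data = toy)  # random intercept + slope
fixef(m)  # fixed part: overall intercept and slope
ranef(m)  # per-group deviations from the fixed part
# lme4 equivalent: lme4::lmer(y ~ x + (1 + x | grp), data = toy)
```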
require(tidyverse)
## Loading required package: tidyverse
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✔ tibble 3.1.6 ✔ dplyr 1.0.8
## ✔ readr 2.1.2 ✔ stringr 1.4.0
## ✔ purrr 0.3.4 ✔ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ psych::%+%() masks ggplot2::%+%()
## ✖ psych::alpha() masks ggplot2::alpha()
## ✖ dplyr::arrange() masks plyr::arrange()
## ✖ dplyr::collapse() masks nlme::collapse()
## ✖ purrr::compact() masks plyr::compact()
## ✖ dplyr::count() masks plyr::count()
## ✖ Matrix::expand() masks tidyr::expand()
## ✖ dplyr::failwith() masks plyr::failwith()
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::id() masks plyr::id()
## ✖ dplyr::lag() masks stats::lag()
## ✖ dplyr::mutate() masks plyr::mutate()
## ✖ Matrix::pack() masks tidyr::pack()
## ✖ dplyr::recode() masks car::recode()
## ✖ dplyr::rename() masks plyr::rename()
## ✖ purrr::some() masks car::some()
## ✖ dplyr::summarise() masks plyr::summarise()
## ✖ dplyr::summarize() masks plyr::summarize()
## ✖ Matrix::unpack() masks tidyr::unpack()
gdp_wide <- read_delim("https://md.psych.bio.uni-goettingen.de/mv/data/div/gdp.dat", "\t")
## Rows: 34 Columns: 10
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: "\t"
## dbl (10): year, austria, canada, france, germany, greece, italy, sweden, uk,...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# gdp <- gather(gdp_wide, country, gdp, austria:usa)
gdp <- gdp_wide %>% tidyr::pivot_longer(names_to="country", values_to="gdp", cols=austria:usa)
gdp <- gdp %>% mutate(country_fac = factor(country)) %>%
mutate(year_fac = factor(year)) %>%
mutate(year_nr = year - 1949)
require(nlme)
m1a <- lm(gdp ~ year, data = gdp)
# better to use the interval-scaled year_nr, so we get one slope instead of many dummy-variable effects
m1 <- gls(gdp ~ year_nr, data = gdp, method = 'ML')
m2 <- lme(gdp ~ year_nr, data = gdp,
random = ~1 | country_fac,
method = "ML")
m3 <- lme(gdp ~ year_nr, data = gdp,
random = ~1 + year_nr | country_fac,
method ="ML")
head(gdp[,c("year", "year_nr")])
## # A tibble: 6 × 2
## year year_nr
## <dbl> <dbl>
## 1 1950 1
## 2 1950 1
## 3 1950 1
## 4 1950 1
## 5 1950 1
## 6 1950 1
summary(m1)
## Generalized least squares fit by maximum likelihood
## Model: gdp ~ year_nr
## Data: gdp
## AIC BIC logLik
## 2619.455 2630.625 -1306.727
##
## Coefficients:
## Value Std.Error t-value p-value
## (Intercept) 5.229306 2.0304781 2.575406 0.0105
## year_nr 0.519281 0.1012081 5.130823 0.0000
##
## Correlation:
## (Intr)
## year_nr -0.872
##
## Standardized residuals:
## Min Q1 Med Q3 Max
## -1.3156590 -0.6688336 -0.2860071 0.5265292 3.7327691
##
## Residual standard error: 17.31221
## Degrees of freedom: 306 total; 304 residual
summary(m2)
## Linear mixed-effects model fit by maximum likelihood
## Data: gdp
## AIC BIC logLik
## 2136.28 2151.175 -1064.14
##
## Random effects:
## Formula: ~1 | country_fac
## (Intercept) Residual
## StdDev: 15.71083 7.272041
##
## Fixed effects: gdp ~ year_nr
## Value Std.Error DF t-value p-value
## (Intercept) 5.229306 5.322917 296 0.982414 0.3267
## year_nr 0.519281 0.042513 296 12.214712 0.0000
## Correlation:
## (Intr)
## year_nr -0.14
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -3.260696450 -0.361469341 -0.008045481 0.375195313 3.882753955
##
## Number of Observations: 306
## Number of Groups: 9
summary(m3)
## Linear mixed-effects model fit by maximum likelihood
## Data: gdp
## AIC BIC logLik
## 1234.862 1257.203 -611.431
##
## Random effects:
## Formula: ~1 + year_nr | country_fac
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## (Intercept) 4.6909664 (Intr)
## year_nr 0.7146444 0.596
## Residual 1.4955014
##
## Fixed effects: gdp ~ year_nr
## Value Std.Error DF t-value p-value
## (Intercept) 5.229306 1.578566 296 3.312694 0.0010
## year_nr 0.519281 0.239157 296 2.171298 0.0307
## Correlation:
## (Intr)
## year_nr 0.588
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -4.368512622 -0.123322614 -0.008279835 0.134552897 4.812089353
##
## Number of Observations: 306
## Number of Groups: 9
# we can see that the models improve: AIC and BIC go down
cat(AIC(m1), AIC(m2), AIC(m3))
## 2619.455 2136.28 1234.862
cat(BIC(m1), BIC(m2), BIC(m3))
## 2630.625 2151.175 1257.203
# the residuals also become smaller
cat(sd(m1$residuals), sd(m2$residuals), sd(m3$residuals))
## 17.34057 13.25941 12.29458
# do random slopes matter?
anova(m2, m3) # ... yes
## Model df AIC BIC logLik Test L.Ratio p-value
## m2 1 4 2136.280 2151.175 -1064.140
## m3 2 6 1234.862 1257.204 -611.431 1 vs 2 905.4185 <.0001
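The likelihood-ratio test reported by anova() can be reproduced from the log-likelihoods shown above:

```r
# L.Ratio = 2 * (logLik(m3) - logLik(m2)); df difference = 6 - 4 = 2
lr <- 2 * (-611.431 - (-1064.140))
lr                                      # ~ 905.418, as in the anova() table
pchisq(lr, df = 2, lower.tail = FALSE)  # vanishingly small p: random slopes matter
```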
head(unique(model.matrix(m3, data=gdp)))
## (Intercept) year_nr
## 1 1 1
## 10 1 2
## 19 1 3
## 28 1 4
## 37 1 5
## 46 1 6
# we take a closer look at what the model offers
coef(m3)
## (Intercept) year_nr
## austria 0.07874101 7.149804e-05
## canada 3.04815056 1.584289e-01
## france 8.07967323 7.353456e-01
## germany 5.56936211 4.489284e-01
## greece 9.16849713 2.387432e+00
## italy 0.67333435 6.269555e-02
## sweden 15.28436854 7.283411e-01
## uk 1.04723659 3.083183e-02
## usa 4.11438870 1.214543e-01
fixef(m3)
## (Intercept) year_nr
## 5.229306 0.519281
ranef(m3)
## (Intercept) year_nr
## austria -5.1505648 -0.51920949
## canada -2.1811552 -0.36085213
## france 2.8503674 0.21606459
## germany 0.3400563 -0.07035258
## greece 3.9391913 1.86815075
## italy -4.5559715 -0.45658543
## sweden 10.0550627 0.20906015
## uk -4.1820692 -0.48844916
## usa -1.1149171 -0.39782669
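The three extractors fit together: coef() is the sum of fixef() and ranef(). Checking this for austria with the values from the output above:

```r
# coef = fixed effect + random deviation (numbers copied from the output above)
5.229306 + (-5.1505648)   # intercept for austria, matching coef(m3)
0.519281 + (-0.51920949)  # slope for austria, matching coef(m3)
```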
gdp$pred_m1 <- predict(m1)
gdp$pred_m2 <- predict(m2)
gdp$pred_m3 <- predict(m3)
ncountries <- length(levels(gdp$country_fac))
nyears <- 34
# we calculate the predicted value for Austria's first year manually from the coefficients
fixef(m3)[1] + ranef(m3)[[1]][1] + (fixef(m3)[2] + ranef(m3)[[2]][1]) * gdp$year_nr[1]
## (Intercept)
## 0.07881251
gdp_plot_1 <- ggplot(data = gdp, aes(year_nr, gdp,
group = country_fac,
#shape= country_fac,
color = country_fac
)) +
# we scale x back to years
scale_x_continuous(breaks = c(1, 11, 21, 31), label = c("1950", "1960", "1970", "1980")) +
geom_line(show.legend = FALSE) +
geom_point(show.legend = FALSE) +
labs(x = "Year", y = "GDP per capita", color = "Country")
( gdp_plot_m1 <- gdp_plot_1 + geom_line(aes(y = pred_m1), size=2) )
( gdp_plot_m2 <- gdp_plot_1 + geom_line(aes(y = pred_m2)) )
( gdp_plot_m3 <- gdp_plot_1 + geom_line(aes(y = pred_m3)) )
( gdp_plot_m <- gdp_plot_1 + geom_line(aes(y = pred_m3)) + geom_line(aes(y = pred_m1), group=1, size=2, color="black", linetype="dashed") )
# details on variances in random effects
VarCorr(m3)
## country_fac = pdLogChol(1 + year_nr)
## Variance StdDev Corr
## (Intercept) 22.0051653 4.6909664 (Intr)
## year_nr 0.5107166 0.7146444 0.596
## Residual 2.2365244 1.4955014
# importance of level variable, here country
#require(ICC) # also returns confidence intervall
#ICCest(x=country_fac, y=gdp, data=gdp)
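As a stand-in for the commented-out ICCest() call, the intraclass correlation of the random-intercept model m2 can be computed from the standard deviations in its summary above:

```r
# ICC = between-country variance / total variance (values from summary(m2))
sd_between <- 15.71083   # StdDev of the random intercept
sd_within  <- 7.272041   # residual StdDev
sd_between^2 / (sd_between^2 + sd_within^2)   # ~ 0.82: country explains most variance
```

ICCest() additionally returns a confidence interval for this quantity.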
A screencast can be found in lm_cat_mixed. It explains not only pure repeated-measures ANOVA models but also their mixed-model counterparts.
Version: June 16, 2022, 13:42