Aus Andy Fields, Exploring statistics using R (2012) chapter 13.4.7.2
Vergleich der Anwendung verschiedener Pakete: - aov() mit Errorterm - nlme::lme() - lme4::lmer()
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.3 ✓ purrr 0.3.4
## ✓ tibble 3.1.0 ✓ dplyr 1.0.5
## ✓ tidyr 1.1.3 ✓ stringr 1.4.0
## ✓ readr 1.4.0 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
dd.u <- read_tsv("http://md.psych.bio.uni-goettingen.de/mv/data/af/af-bushtucker-universal.txt")[1:8,1:4]
##
## ── Column specification ────────────────────────────────────────────────────────
## cols(
## stick = col_double(),
## testicle = col_double(),
## eye = col_double(),
## witchetty = col_double(),
## food = col_double(),
## seconds = col_double()
## )
dd.u$subj <- 1:nrow(dd.u)
dd.l <- dd.u %>% gather(key = "animal", value="seconds", stick:witchetty)
dd.l$f.animal <- factor(dd.l$animal)
dd.l$f.subj <- factor(dd.l$subj)
dd.l <- dd.l %>% group_by(f.subj) %>% mutate(subj.ms = mean(seconds, na.rm=T))
dd <- dd.l[c("f.subj", "f.animal", "seconds", "subj.ms")]
# independent group ANOVA
# lm(seconds ~ f.animal, data=dd) # would also show dummy effects in summary
ma.0 <- aov( seconds ~ f.animal, data=dd)
# repeated measures ANOVA
ma.r <- aov( seconds ~ f.animal + Error(f.subj/f.animal), data=dd)
summary(ma.0)
## Df Sum Sq Mean Sq F value Pr(>F)
## f.animal 3 83.12 27.708 4.544 0.0102 *
## Residuals 28 170.75 6.098
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(ma.r)
##
## Error: f.subj
## Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 7 17.38 2.482
##
## Error: f.subj:f.animal
## Df Sum Sq Mean Sq F value Pr(>F)
## f.animal 3 83.13 27.708 3.794 0.0256 *
## Residuals 21 153.37 7.304
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# ... residuals go down and sig goes up
# using lme()
library(nlme)
##
## Attaching package: 'nlme'
## The following object is masked from 'package:dplyr':
##
## collapse
mn.0 <- lme(seconds ~ 1, random = ~1|f.subj, data = dd, method = "ML") # random intercept of subjects already included
mn.a <- lme(seconds ~ f.animal, random = ~1|f.subj, data = dd, method = "ML") #
anova(mn.a)
## numDF denDF F-value p-value
## (Intercept) 1 21 162.36310 <.0001
## f.animal 3 21 4.54368 0.0132
anova(mn.0, mn.a)
## Model df AIC BIC logLik Test L.Ratio p-value
## mn.0 1 3 163.0875 167.4847 -78.54373
## mn.a 2 6 156.3949 165.1893 -72.19747 1 vs 2 12.69253 0.0054
# using lme4() or better lmerTest
library(lmerTest)
## Loading required package: lme4
## Loading required package: Matrix
##
## Attaching package: 'Matrix'
## The following objects are masked from 'package:tidyr':
##
## expand, pack, unpack
##
## Attaching package: 'lme4'
## The following object is masked from 'package:nlme':
##
## lmList
##
## Attaching package: 'lmerTest'
## The following object is masked from 'package:lme4':
##
## lmer
## The following object is masked from 'package:stats':
##
## step
m4.0 <- lmer(seconds ~ (1|f.subj), data=dd, REML=F) # random intercept of subjects already included
## boundary (singular) fit: see ?isSingular
m4.1 <- lmer(seconds ~ f.animal + (1|f.subj), data=dd, REML=F) #
## boundary (singular) fit: see ?isSingular
anova(m4.1)
## Type III Analysis of Variance Table with Satterthwaite's method
## Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
## f.animal 83.125 27.708 3 32 5.1928 0.004899 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(m4.0, m4.1)
## Data: dd
## Models:
## m4.0: seconds ~ (1 | f.subj)
## m4.1: seconds ~ f.animal + (1 | f.subj)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## m4.0 3 163.09 167.49 -78.544 157.09
## m4.1 6 156.40 165.19 -72.197 144.40 12.693 3 0.005351 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# lme() model proposed in Field (2012) is not adequate
mf.n <- lme(seconds ~ f.animal, random = ~1|f.subj/f.animal, data = dd, method = "ML")
summary(mf.n)
## Linear mixed-effects model fit by maximum likelihood
## Data: dd
## AIC BIC logLik
## 158.3949 168.6551 -72.19747
##
## Random effects:
## Formula: ~1 | f.subj
## (Intercept)
## StdDev: 7.756262e-05
##
## Formula: ~1 | f.animal %in% f.subj
## (Intercept) Residual
## StdDev: 2.309935 0.01175985
##
## Fixed effects: seconds ~ f.animal
## Value Std.Error DF t-value p-value
## (Intercept) 4.125 0.8730846 21 4.724628 0.0001
## f.animalstick 4.000 1.2347281 21 3.239580 0.0039
## f.animaltesticle 0.125 1.2347281 21 0.101237 0.9203
## f.animalwitchetty 1.625 1.2347281 21 1.316079 0.2023
## Correlation:
## (Intr) f.nmls f.nmlt
## f.animalstick -0.707
## f.animaltesticle -0.707 0.500
## f.animalwitchetty -0.707 0.500 0.500
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -0.0104685054 -0.0046832787 0.0001377435 0.0041323048 0.0085400965
##
## Number of Observations: 32
## Number of Groups:
## f.subj f.animal %in% f.subj
## 8 32
# this can be clearly seen by comparing
anova(mn.a, mf.n)
## Model df AIC BIC logLik Test L.Ratio p-value
## mn.a 1 6 156.3949 165.1893 -72.19747
## mf.n 2 7 158.3949 168.6551 -72.19747 1 vs 2 8.718246e-09 0.9999
# ... the two models are nested by allowing more random effects, in fact too much
# confusion might be a result of doing it previously using aov() like shown above
mf.a <- aov(seconds ~ f.animal + Error(f.subj/f.animal), data = dd)
summary(mf.a)
##
## Error: f.subj
## Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 7 17.38 2.482
##
## Error: f.subj:f.animal
## Df Sum Sq Mean Sq F value Pr(>F)
## f.animal 3 83.13 27.708 3.794 0.0256 *
## Residuals 21 153.37 7.304
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Model mf.n ist so, wie von Andy Field in Kapitel 13.4.7.2 auf Buchseite 574 vorgeschlagen. Es macht aber keinen Sinn, da mit subj/f.animal die gesamte Varianz aufgeklärt ist. Die Residuals sinken auf 0, der Model-Vergleich wird nicht signifikant, weil es viel zu viele Parameter gibt. Die Informationskriterien AIC und BIC sinken nicht, sondern steigen.
A second second level variable
# dd data object above has to be available
# we generate some data and add a drink variable with 4 levels
psych::describe(dd$subj.ms)
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 32 5.56 0.75 5.38 5.55 1.11 4.5 6.75 2.25 0.16 -1.34 0.13
set.seed(17)
# cor(dd$subj.ms, dd$subj.ms + rnorm(nrow(dd)))
dd$x <- dd$subj.ms + rnorm(nrow(dd))
x <- sort(dd$x)
gar <- split(x, ceiling(seq_along(x)/(length(x)/4)))
# dd$drink <- split(x, ceiling(seq_along(x)/8))
dd$drink <- NA
dd$drink[dd$x %in% gar[[1]]] <- "a"
dd$drink[dd$x %in% gar[[2]]] <- "b"
dd$drink[dd$x %in% gar[[3]]] <- "c"
dd$drink[dd$x %in% gar[[4]]] <- "d"
dd$f.drink <- factor(dd$drink)
mn.ad <- nlme::lme(seconds ~ f.animal + f.drink, random = ~1|f.subj, data = dd, method = "ML")
anova(mn.a, mn.ad)
## Model df AIC BIC logLik Test L.Ratio p-value
## mn.a 1 6 156.3949 165.1893 -72.19747
## mn.ad 2 9 158.5644 171.7560 -70.28219 1 vs 2 3.830546 0.2804
summary(mn.ad)
## Linear mixed-effects model fit by maximum likelihood
## Data: dd
## AIC BIC logLik
## 158.5644 171.756 -70.28219
##
## Random effects:
## Formula: ~1 | f.subj
## (Intercept) Residual
## StdDev: 5.21812e-05 2.175764
##
## Fixed effects: seconds ~ f.animal + f.drink
## Value Std.Error DF t-value p-value
## (Intercept) 3.968332 1.351579 18 2.9360706 0.0088
## f.animalstick 4.007468 1.311113 18 3.0565391 0.0068
## f.animaltesticle 0.071182 1.282063 18 0.0555214 0.9563
## f.animalwitchetty 1.807735 1.347709 18 1.3413395 0.1965
## f.drinkb -0.213028 1.382581 18 -0.1540797 0.8793
## f.drinkc -0.698826 1.311113 18 -0.5330021 0.6006
## f.drinkd 1.402141 1.347709 18 1.0403885 0.3119
## Correlation:
## (Intr) f.nmls f.nmlt f.nmlw f.drnkb f.drnkc
## f.animalstick -0.645
## f.animaltesticle -0.414 0.496
## f.animalwitchetty -0.704 0.564 0.467
## f.drinkb -0.720 0.328 0.057 0.391
## f.drinkc -0.537 0.088 -0.185 0.156 0.539
## f.drinkd -0.611 0.156 -0.121 0.277 0.573 0.564
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -1.87393523 -0.69863592 -0.01218907 0.82751025 1.68185133
##
## Number of Observations: 32
## Number of Groups: 8
# we do it using lmerTest
m4.ad <- lmer(seconds ~ f.animal + f.drink + (1|f.subj), data=dd, REML=F) #
## boundary (singular) fit: see ?isSingular
anova(m4.1, m4.ad)
## Data: dd
## Models:
## m4.1: seconds ~ f.animal + (1 | f.subj)
## m4.ad: seconds ~ f.animal + f.drink + (1 | f.subj)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## m4.1 6 156.40 165.19 -72.197 144.40
## m4.ad 9 158.56 171.76 -70.282 140.56 3.8305 3 0.2804
summary(m4.ad)
## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
## method [lmerModLmerTest]
## Formula: seconds ~ f.animal + f.drink + (1 | f.subj)
## Data: dd
##
## AIC BIC logLik deviance df.resid
## 158.6 171.8 -70.3 140.6 23
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.87394 -0.69864 -0.01219 0.82751 1.68185
##
## Random effects:
## Groups Name Variance Std.Dev.
## f.subj (Intercept) 0.000 0.000
## Residual 4.734 2.176
## Number of obs: 32, groups: f.subj, 8
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 3.96833 1.19464 32.00000 3.322 0.00225 **
## f.animalstick 4.00747 1.15887 32.00000 3.458 0.00156 **
## f.animaltesticle 0.07118 1.13319 32.00000 0.063 0.95030
## f.animalwitchetty 1.80774 1.19122 32.00000 1.518 0.13895
## f.drinkb -0.21303 1.22204 32.00000 -0.174 0.86271
## f.drinkc -0.69883 1.15887 32.00000 -0.603 0.55074
## f.drinkd 1.40214 1.19122 32.00000 1.177 0.24785
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) f.nmls f.nmlt f.nmlw f.drnkb f.drnkc
## f.animlstck -0.645
## f.anmltstcl -0.414 0.496
## f.nmlwtchtt -0.704 0.564 0.467
## f.drinkb -0.720 0.328 0.057 0.391
## f.drinkc -0.537 0.088 -0.185 0.156 0.539
## f.drinkd -0.611 0.156 -0.121 0.277 0.573 0.564
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see ?isSingular
Pinheiro & Bates (2000)
Comparison of repeated measures designs using different packages source
Version: 03 Juni, 2021 13:49