Es gibt ein paar Standard-Kontraste, die car-Package definiert sind, darunter auch der oben verwendete car::contr.Sum()
und car::contr.Helmert()
.
Keine Alpha-Inflation - Gesamt-Risiko wird eingehalten. Keine multiplen Vergleiche (wie bei multiplen T-Tests). Daher auch keine Alpha Inflation, die korrigiert werden muss. Allerdings gibt es hierzu unterschiedliche Meinungen (Vortrag Daniel Lakens, Juni 2017).
Aufteilen der erkärten Varianz. Hypothesen vorher. Kontraste -> dummy-Variablen -> stat. Modell Kontraste drücken Hypothesen in Mittelwerten bzw. Mittelwertsunterschieden aus. Die aufgrund der Kontraste generierten Dummy-Variablen realisieren/sind das Modell.
Standard-Kontraste existieren sowohl im Base-System als beispielsweise auch im package(car)
. im package(car)
beginnen die Typen (Namen) der Kontraste mit einem Großbuchstaben. contr.sum()
vs. contr.Sum()
library(car)
Design Matrix: Die Koeffizienten kann man sich für einen bestimmten Faktor ausgeben lassen. Intern wird eine entsprechende Menge an Dummy-Variablen gebildet.
Kontraste können als Eigenschaft(property) an eine Variable vom Typ factor()
angehängt werden. Beispiel: im Data-Frame die vierstufige Variable V1 (Typ factor) mit einem referenzkodierten Kontrast versehen, dabei die Stufe 3 als Referenzkategorie wählen. contrasts(dataframe$V1) <- contr.Treatment(4, base=3)
Danach aov()
aufrufen.
Achtung in obigem Beispiel gibt es Bezüge auf die Reihenfolge der Faktorstufen. Diese Reihenfolge richtet sich nach der alphabetischen Reihenfolge der Faktorstufen-Bezeichnungen (levels). Die Reihenfolge kann man sich über den Befehl levels()
ausgeben lassen. Für obigen Faktor V1 würde der Befehl lauten: levels(dataframe$V1)
. Transparenter wird das Ganze, wenn man beides miteinander verbindet: contrasts(dataframe$V1) <- contr.Treatment(levels(dataframe$V1), base=3)
. Hier wird der Bezug auf den dritten Level des factor V1 unmittelbar klar.
Kontraste werden in Faktorstufen -1 Dummy-Variablen umgesetzt, die man sich ausgeben lassen kann model.matrix()
. Hierüber hat man eine Kontrollmöglichkeit über das Vorgehen der Berechnung (“Welche Faktorstufe ist Referenzgruppe?”).
Eigendefinierte Spezialkontraste können über den Weg der Speicherung als Eigenschaft des Faktors ebenfalls spezifiziert werden. contrasts(dataframe$V1) <- cbind(c(-1,0,-1,+2), c(-1,2,-2,0), c(-1,1,-1,1))
. Kontrolliert werden kann der mit einem Faktor verknüpfte Kontrast über die Ausgabe der Variablen bzw. contrasts(FaktorVariablenName)
Je nach Art des Kontrastes müssen die Koeffizienten/Gewichte (coefficients) unterschiedlich interpretiert werden. Die varianzanalytischen aggregierten Tests sind identisch.
Es werden hier verschiedene Arten von Kontrasten demonstriert, die nicht unbedingt alle inhaltlich Sinn machen.
-Default Einstellung ist contr.Treatment mit der ersten Faktorstufe als Referenzstufe. -Hier ist das die Gruppe mit dem Label ‘av’ (average student), die eher zufällig auch inhaltlich die adäquate Vergleichsgruppe ist.
# get data
# dd <- read.delim("https://md.psych.bio.uni-goettingen.de/mv/data/virt/v_stait_exam_wide.txt")[c('subj','gender','group', 'neutr')]
dd <- readr::read_tsv("https://md.psych.bio.uni-goettingen.de/mv/data/virt/v_stait_exam_wide.txt")[c('subj','gender','group', 'neutr')]
## Rows: 180 Columns: 6
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: "\t"
## chr (2): gender, group
## dbl (4): subj, neutr, exam1, exam2
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
dd$gender <- factor(dd$gender)
dd$group <- factor(dd$group)
# although existing in the base system in car package more transparent
require(car)
## Lade nötiges Paket: car
## Lade nötiges Paket: carData
# levels of factor group
levels(dd$group)
## [1] "av" "hi" "lo"
# show design matrices for some standard contrasts
# standard contrasts
contr.Treatment(levels(dd$group))
## [T.hi] [T.lo]
## av 0 0
## hi 1 0
## lo 0 1
# refernce group is 'lo' group, default is first level (group 1)
contr.Treatment(levels(dd$group), base=3)
## [T.av] [T.hi]
## av 1 0
## hi 0 1
## lo 0 0
# compare each group with last group, like SAS does/did
contr.SAS(levels(dd$group))
## av hi
## av 1 0
## hi 0 1
## lo 0 0
# compare each group with mean effect of all subsequent categories, except last group
contr.Helmert(levels(dd$group))
## [H.1] [H.2]
## av -1 -1
## hi 1 -1
## lo 0 2
# contr.Sum effect coding
contr.Sum(levels(dd$group))
## [S.av] [S.hi]
## av 1 0
## hi 0 1
## lo -1 -1
# polynomial contrasts
contr.poly(levels(dd$group))
## .L .Q
## [1,] -7.071068e-01 0.4082483
## [2,] -7.850462e-17 -0.8164966
## [3,] 7.071068e-01 0.4082483
# do an analysis with a special contrast
unique(model.matrix(aov(neutr ~ group, data=dd, contrasts = list(group = contr.Treatment))))
## (Intercept) group[T.2] group[T.3]
## 1 1 0 0
## 61 1 0 1
## 121 1 1 0
coefficients(aov(neutr ~ group, data=dd, contrasts = list(group = contr.Treatment)))
## (Intercept) group[T.2] group[T.3]
## 43.48333 11.03333 -9.95000
unique(model.matrix(aov(neutr ~ group, data=dd, contrasts = list(group = contr.Sum))))
## (Intercept) group[S.1] group[S.2]
## 1 1 1 0
## 61 1 -1 -1
## 121 1 0 1
coefficients(aov(neutr ~ group, data=dd, contrasts = list(group = contr.Sum)))
## (Intercept) group[S.1] group[S.2]
## 43.8444444 -0.3611111 10.6722222
unique(model.matrix(aov(neutr ~ group, data=dd, contrasts = list(group = contr.poly))))
## (Intercept) group.L group.Q
## 1 1 -7.071068e-01 0.4082483
## 61 1 7.071068e-01 0.4082483
## 121 1 -7.850462e-17 -0.8164966
coefficients(aov(neutr ~ group, data=dd, contrasts = list(group = contr.poly)))
## (Intercept) group.L group.Q
## 43.844444 -7.035712 -13.070749
unique(model.matrix(aov(neutr ~ group, data=dd)))
## (Intercept) grouphi grouplo
## 1 1 0 0
## 61 1 0 1
## 121 1 1 0
coefficients(aov(neutr ~ group, data=dd))
## (Intercept) grouphi grouplo
## 43.48333 11.03333 -9.95000
require(Rmisc)
## Lade nötiges Paket: Rmisc
## Lade nötiges Paket: lattice
## Lade nötiges Paket: plyr
Rmisc::summarySE(data=dd, "neutr")
## .id N neutr sd se ci
## 1 <NA> 180 43.84444 14.96694 1.11557 2.20136
Rmisc::summarySE(data=dd, "neutr", groupvars="group")
## group N neutr sd se ci
## 1 av 60 43.48333 12.34324 1.593506 3.188598
## 2 hi 60 54.51667 13.18897 1.702688 3.407072
## 3 lo 60 33.53333 11.36821 1.467629 2.936720
# you may save a specific contrast for a factor
contrasts(dd$group) <- contr.Treatment(3, base=2)
# and control the current settings
dd$group['contrasts']
## [1] <NA>
## attr(,"contrasts")
## [T.1] [T.3]
## av 1 0
## hi 0 0
## lo 0 1
## Levels: av hi lo
# you may define your own contrasts also
# compare all groups to the lo goup
levels(dd$group)
## [1] "av" "hi" "lo"
c1 <- c(1, 1, -2)
c2 <- c(-1, 1, 0)
contrasts(dd$group) <- cbind(c1, c2)
# check the design matrix
dd$group
## [1] av av av av av av av av av av av av av av av av av av av av av av av av av
## [26] av av av av av av av av av av av av av av av av av av av av av av av av av
## [51] av av av av av av av av av av lo lo lo lo lo lo lo lo lo lo lo lo lo lo lo
## [76] lo lo lo lo lo lo lo lo lo lo lo lo lo lo lo lo lo lo lo lo lo lo lo lo lo
## [101] lo lo lo lo lo lo lo lo lo lo lo lo lo lo lo lo lo lo lo lo hi hi hi hi hi
## [126] hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi
## [151] hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi hi
## [176] hi hi hi hi hi
## attr(,"contrasts")
## c1 c2
## av 1 -1
## hi 1 1
## lo -2 0
## Levels: av hi lo
dd$group['contrasts']
## [1] <NA>
## attr(,"contrasts")
## c1 c2
## av 1 -1
## hi 1 1
## lo -2 0
## Levels: av hi lo
dd$group[2][1]
## [1] av
## attr(,"contrasts")
## c1 c2
## av 1 -1
## hi 1 1
## lo -2 0
## Levels: av hi lo
model.matrix(aov(neutr ~ group, data=dd))
## (Intercept) groupc1 groupc2
## 1 1 1 -1
## 2 1 1 -1
## 3 1 1 -1
## 4 1 1 -1
## 5 1 1 -1
## 6 1 1 -1
## 7 1 1 -1
## 8 1 1 -1
## 9 1 1 -1
## 10 1 1 -1
## 11 1 1 -1
## 12 1 1 -1
## 13 1 1 -1
## 14 1 1 -1
## 15 1 1 -1
## 16 1 1 -1
## 17 1 1 -1
## 18 1 1 -1
## 19 1 1 -1
## 20 1 1 -1
## 21 1 1 -1
## 22 1 1 -1
## 23 1 1 -1
## 24 1 1 -1
## 25 1 1 -1
## 26 1 1 -1
## 27 1 1 -1
## 28 1 1 -1
## 29 1 1 -1
## 30 1 1 -1
## 31 1 1 -1
## 32 1 1 -1
## 33 1 1 -1
## 34 1 1 -1
## 35 1 1 -1
## 36 1 1 -1
## 37 1 1 -1
## 38 1 1 -1
## 39 1 1 -1
## 40 1 1 -1
## 41 1 1 -1
## 42 1 1 -1
## 43 1 1 -1
## 44 1 1 -1
## 45 1 1 -1
## 46 1 1 -1
## 47 1 1 -1
## 48 1 1 -1
## 49 1 1 -1
## 50 1 1 -1
## 51 1 1 -1
## 52 1 1 -1
## 53 1 1 -1
## 54 1 1 -1
## 55 1 1 -1
## 56 1 1 -1
## 57 1 1 -1
## 58 1 1 -1
## 59 1 1 -1
## 60 1 1 -1
## 61 1 -2 0
## 62 1 -2 0
## 63 1 -2 0
## 64 1 -2 0
## 65 1 -2 0
## 66 1 -2 0
## 67 1 -2 0
## 68 1 -2 0
## 69 1 -2 0
## 70 1 -2 0
## 71 1 -2 0
## 72 1 -2 0
## 73 1 -2 0
## 74 1 -2 0
## 75 1 -2 0
## 76 1 -2 0
## 77 1 -2 0
## 78 1 -2 0
## 79 1 -2 0
## 80 1 -2 0
## 81 1 -2 0
## 82 1 -2 0
## 83 1 -2 0
## 84 1 -2 0
## 85 1 -2 0
## 86 1 -2 0
## 87 1 -2 0
## 88 1 -2 0
## 89 1 -2 0
## 90 1 -2 0
## 91 1 -2 0
## 92 1 -2 0
## 93 1 -2 0
## 94 1 -2 0
## 95 1 -2 0
## 96 1 -2 0
## 97 1 -2 0
## 98 1 -2 0
## 99 1 -2 0
## 100 1 -2 0
## 101 1 -2 0
## 102 1 -2 0
## 103 1 -2 0
## 104 1 -2 0
## 105 1 -2 0
## 106 1 -2 0
## 107 1 -2 0
## 108 1 -2 0
## 109 1 -2 0
## 110 1 -2 0
## 111 1 -2 0
## 112 1 -2 0
## 113 1 -2 0
## 114 1 -2 0
## 115 1 -2 0
## 116 1 -2 0
## 117 1 -2 0
## 118 1 -2 0
## 119 1 -2 0
## 120 1 -2 0
## 121 1 1 1
## 122 1 1 1
## 123 1 1 1
## 124 1 1 1
## 125 1 1 1
## 126 1 1 1
## 127 1 1 1
## 128 1 1 1
## 129 1 1 1
## 130 1 1 1
## 131 1 1 1
## 132 1 1 1
## 133 1 1 1
## 134 1 1 1
## 135 1 1 1
## 136 1 1 1
## 137 1 1 1
## 138 1 1 1
## 139 1 1 1
## 140 1 1 1
## 141 1 1 1
## 142 1 1 1
## 143 1 1 1
## 144 1 1 1
## 145 1 1 1
## 146 1 1 1
## 147 1 1 1
## 148 1 1 1
## 149 1 1 1
## 150 1 1 1
## 151 1 1 1
## 152 1 1 1
## 153 1 1 1
## 154 1 1 1
## 155 1 1 1
## 156 1 1 1
## 157 1 1 1
## 158 1 1 1
## 159 1 1 1
## 160 1 1 1
## 161 1 1 1
## 162 1 1 1
## 163 1 1 1
## 164 1 1 1
## 165 1 1 1
## 166 1 1 1
## 167 1 1 1
## 168 1 1 1
## 169 1 1 1
## 170 1 1 1
## 171 1 1 1
## 172 1 1 1
## 173 1 1 1
## 174 1 1 1
## 175 1 1 1
## 176 1 1 1
## 177 1 1 1
## 178 1 1 1
## 179 1 1 1
## 180 1 1 1
## attr(,"assign")
## [1] 0 1 1
## attr(,"contrasts")
## attr(,"contrasts")$group
## c1 c2
## av 1 -1
## hi 1 1
## lo -2 0
coefficients(aov(neutr ~ group, data=dd))
## (Intercept) groupc1 groupc2
## 43.844444 5.155556 5.516667
summary(aov(neutr ~ group, data=dd))
## Df Sum Sq Mean Sq F value Pr(>F)
## group 2 13221 6610 43.53 4.21e-16 ***
## Residuals 177 26877 152
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
In R können die für ein Modell erzeugten Kontrastkoeffizienten bzw. Dummy-Variablen mit dem Befehl model.matrix()
generiert werden. Die Design-Matrix (contr.()) zeigt das Prinzip hinter der Dummy-Codierung. Dadurch hat man Zugriff hierauf und kann sie beispielsweise zu einem Data-Frame hinzufügen oder speichern. Analyse der Dummy-Matrix bzw. die Kontrolle der contrast-settings eines factors sind eine sichere Möglichkeit, das berechnete Modell nachzuvollziehen.
# get data
dd <- read.delim("https://md.psych.bio.uni-goettingen.de/mv/data/virt/v_stait_exam_wide.txt")[c('subj','gender','group', 'neutr')]
m.anx <- aov(neutr ~ group, data=dd)
dd <- cbind(dd, model.matrix(neutr ~ group, data=dd))
head(dd)
## subj gender group neutr (Intercept) grouphi grouplo
## 1 1 f av 42 1 0 0
## 2 2 f av 59 1 0 0
## 3 3 f av 49 1 0 0
## 4 4 f av 20 1 0 0
## 5 5 f av 45 1 0 0
## 6 6 f av 39 1 0 0
# generate effect coded contrast matrix
mm.e <- model.matrix(aov(neutr ~ group, data=dd, contrasts = list(group = contr.Sum)))
R has a “sub-language” to translate formulas into design matrix, and in the spirit of the language you can take advantage of it.
Kontraste können auch als Default gesetzt sein und verändert werden
# get data
#dd <- read.delim("https://md.psych.bio.uni-goettingen.de/mv/data/virt/v_stait_exam_wide.txt")[c('subj','gender','group', 'neutr')]
dd <- readr::read_tsv("https://md.psych.bio.uni-goettingen.de/mv/data/virt/v_stait_exam_wide.txt")[c('subj','gender','group', 'neutr')]
## Rows: 180 Columns: 6
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: "\t"
## chr (2): gender, group
## dbl (4): subj, neutr, exam1, exam2
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
dd$gender <- factor(dd$gender)
dd$group <- factor(dd$group)
default.options <- options('contrasts')
default.options
## $contrasts
## unordered ordered
## "contr.treatment" "contr.poly"
coefficients(aov(neutr ~ group, data=dd))
## (Intercept) grouphi grouplo
## 43.48333 11.03333 -9.95000
# with changed default options the same call to aov() produces different results
options(contrasts=c('contr.sum','contr.poly'))
coefficients(aov(neutr ~ group, data=dd))
## (Intercept) group1 group2
## 43.8444444 -0.3611111 10.6722222
# reset to old default
options(contrasts=default.options[['contrasts']])
Im folgenden Beispiel wird ein eigentlich 2*2 varianzanalytisches Design auch über einen Faktor mit 4 Faktorstufen gerechnet, wobei mit Kontrasten gearbeitet wird. Bei diesem Beispiel werden auch spezifische, nicht orthogonale Kontraste angegeben.
dd <- read.delim(header=F, text="
1 a1 b1 a1b1 10
2 a1 b1 a1b1 11
3 a1 b1 a1b1 9
4 a1 b1 a1b1 12
5 a1 b1 a1b1 10
6 a1 b2 a1b2 15
7 a1 b2 a1b2 17
8 a1 b2 a1b2 13
9 a1 b2 a1b2 14
10 a1 b2 a1b2 16
11 a2 b1 a2b1 9
12 a2 b1 a2b1 8
13 a2 b1 a2b1 10
14 a2 b1 a2b1 11
15 a2 b1 a2b1 12
16 a2 b2 a2b2 5
17 a2 b2 a2b2 7
18 a2 b2 a2b2 4
19 a2 b2 a2b2 6
20 a2 b2 a2b2 4")
colnames(dd) <- c('subj', 'A', 'B', 'AB','y')
head(dd)
## subj A B AB y
## 1 1 a1 b1 a1b1 10
## 2 2 a1 b1 a1b1 11
## 3 3 a1 b1 a1b1 9
## 4 4 a1 b1 a1b1 12
## 5 5 a1 b1 a1b1 10
## 6 6 a1 b2 a1b2 15
# we set type factor for A, B, and AB
dd$A <- factor(dd$A)
dd$B <- factor(dd$B)
dd$AB <- factor(dd$AB)
# graph it
require(ggplot2)
## Lade nötiges Paket: ggplot2
# create a plot object, define dataframe to use, add gender to differ in colour already in base plot
ggplot(dd, x=B, y=y, aes(B, y, fill = A)) +
stat_summary(fun=mean, geom="bar", position=position_dodge()) +
stat_summary(fun.data = mean_cl_normal, geom = "errorbar", position=position_dodge(width=0.9), width=0.2)
# the 'normal' 2*2 way, main effects only
summary(aov(y ~ A + B, data=dd))
## Df Sum Sq Mean Sq F value Pr(>F)
## A 1 130.05 130.05 15.520 0.00106 **
## B 1 0.05 0.05 0.006 0.93933
## Residuals 17 142.45 8.38
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# main effect B doesn't seem to exist
# the 'normal' way 2*2 Design full model including interaction
summary(aov(y ~ A * B, data=dd))
## Df Sum Sq Mean Sq F value Pr(>F)
## A 1 130.05 130.05 65.025 5.00e-07 ***
## B 1 0.05 0.05 0.025 0.876
## A:B 1 110.45 110.45 55.225 1.42e-06 ***
## Residuals 16 32.00 2.00
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# we find a significant interaction
# alternatively put as a single factor design
# create a plot object, define dataframe to use, add gender to differ in colour already in base plot
ggplot(dd, x=AB, y=y, aes(AB, y)) +
stat_summary(fun=mean, geom="bar", position=position_dodge()) +
stat_summary(fun.data = mean_cl_normal, geom = "errorbar", position=position_dodge(width=0.9), width=0.2)
require(ez)
## Lade nötiges Paket: ez
# descriptives - the ez way
ezStats(data=dd, dv=.(y), wid=(subj), between=.(AB))
## Warning: Converting "subj" to factor for ANOVA.
## Coefficient covariances computed by hccm()
## AB N Mean SD FLSD
## 1 a1b1 5 10.4 1.140175 1.896101
## 2 a1b2 5 15.0 1.581139 1.896101
## 3 a2b1 5 10.0 1.581139 1.896101
## 4 a2b2 5 5.2 1.303840 1.896101
# grand mean and more descriptives of y
require(psych)
## Lade nötiges Paket: psych
##
## Attache Paket: 'psych'
## Die folgenden Objekte sind maskiert von 'package:ggplot2':
##
## %+%, alpha
## Das folgende Objekt ist maskiert 'package:car':
##
## logit
psych::describe(dd$y)
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 20 10.15 3.79 10 10.12 3.71 4 17 13 0.01 -1 0.85
require(car)
# standard contrasts may be called directly in aov()
# reference coding, reference group is 1 (a1b1), coefficients are differences to reference group
coefficients(aov(y ~ AB, data=dd, contrasts = list(AB = contr.Treatment)))
## (Intercept) AB[T.2] AB[T.3] AB[T.4]
## 10.4 4.6 -0.4 -5.2
# effect coding, coefficients are differences referred to grand mean
coefficients(aov(y ~ AB, data=dd, contrasts = list(AB = contr.Sum)))
## (Intercept) AB[S.1] AB[S.2] AB[S.3]
## 10.15 0.25 4.85 -0.15
# polynomial contrasts
coefficients(aov(y ~ AB, data=dd, contrasts = list(AB = contr.poly)))
## (Intercept) AB.L AB.Q AB.C
## 10.150000 -4.606300 -4.700000 2.191347
# we may look at the contrast coefficients
contr.treatment(levels(dd$AB), base=3)
## a1b1 a1b2 a2b2
## a1b1 1 0 0
## a1b2 0 1 0
## a2b1 0 0 0
## a2b2 0 0 1
# contrast names in package(car) are a bit more verbose
contr.Treatment(levels(dd$AB), base=3)
## [T.a1b1] [T.a1b2] [T.a2b2]
## a1b1 1 0 0
## a1b2 0 1 0
## a2b1 0 0 0
## a2b2 0 0 1
# another way to specify contrasts is to make them properties of factor variables
# specify a reference contrast, take group number 3 (a2b1) as reference group, coefficients are differences to reference group
contrasts(dd$AB) <- contr.treatment(4, base=3)
# or more verbose
contrasts(dd$AB) <- contr.Treatment(4, base=3)
coefficients(aov(y ~ AB, data=dd))
## (Intercept) AB[T.1] AB[T.2] AB[T.4]
## 10.0 0.4 5.0 -4.8
summary.lm(aov(y ~ AB, data=dd))
##
## Call:
## aov(formula = y ~ AB, data = dd)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.00 -1.05 -0.10 1.00 2.00
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 10.0000 0.6325 15.811 3.46e-11 ***
## AB[T.1] 0.4000 0.8944 0.447 0.661
## AB[T.2] 5.0000 0.8944 5.590 4.06e-05 ***
## AB[T.4] -4.8000 0.8944 -5.367 6.30e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.414 on 16 degrees of freedom
## Multiple R-squared: 0.8826, Adjusted R-squared: 0.8606
## F-statistic: 40.09 on 3 and 16 DF, p-value: 1.141e-07
# special contrasts work in a similar way, defining a contrasts property of a factor
# non orthogonal contrast
# both b1 (a1b1, a2b1) groups vs. group 4 (a2b2)
# both b1 (a1b1, a2b1) groups vs. group 3 (a1b2)
# both b1 (a1b1, a2b1) groups vs. both b2 groups (a1b2, a2b2)
contrasts(dd$AB) <- cbind(c(-1,0,-1,+2), c(-1,2,-2,0), c(-1,1,-1,1))
# check the design matrix
contrasts(dd$AB)
## [,1] [,2] [,3]
## a1b1 -1 -1 -1
## a1b2 0 2 1
## a2b1 -1 -2 -1
## a2b2 2 0 1
# check the dummy variables
cbind(dd, model.matrix(aov(y ~ AB, data=dd)) )
## subj A B AB y (Intercept) AB1 AB2 AB3
## 1 1 a1 b1 a1b1 10 1 -1 -1 -1
## 2 2 a1 b1 a1b1 11 1 -1 -1 -1
## 3 3 a1 b1 a1b1 9 1 -1 -1 -1
## 4 4 a1 b1 a1b1 12 1 -1 -1 -1
## 5 5 a1 b1 a1b1 10 1 -1 -1 -1
## 6 6 a1 b2 a1b2 15 1 0 2 1
## 7 7 a1 b2 a1b2 17 1 0 2 1
## 8 8 a1 b2 a1b2 13 1 0 2 1
## 9 9 a1 b2 a1b2 14 1 0 2 1
## 10 10 a1 b2 a1b2 16 1 0 2 1
## 11 11 a2 b1 a2b1 9 1 -1 -2 -1
## 12 12 a2 b1 a2b1 8 1 -1 -2 -1
## 13 13 a2 b1 a2b1 10 1 -1 -2 -1
## 14 14 a2 b1 a2b1 11 1 -1 -2 -1
## 15 15 a2 b1 a2b1 12 1 -1 -2 -1
## 16 16 a2 b2 a2b2 5 1 2 0 1
## 17 17 a2 b2 a2b2 7 1 2 0 1
## 18 18 a2 b2 a2b2 4 1 2 0 1
## 19 19 a2 b2 a2b2 6 1 2 0 1
## 20 20 a2 b2 a2b2 4 1 2 0 1
# contrasts specified express an effect contrast and coefficients express differences from grand mean
coefficients(aov(y ~ AB, data=dd))
## (Intercept) AB1 AB2 AB3
## 10.25 -4.50 0.40 3.95
summary.lm(aov(y ~ AB, data=dd))
##
## Call:
## aov(formula = y ~ AB, data = dd)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.00 -1.05 -0.10 1.00 2.00
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 10.2500 0.3873 26.465 1.23e-14 ***
## AB1 -4.5000 1.0000 -4.500 0.000364 ***
## AB2 0.4000 0.8944 0.447 0.660716
## AB3 3.9500 2.0857 1.894 0.076466 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.414 on 16 degrees of freedom
## Multiple R-squared: 0.8826, Adjusted R-squared: 0.8606
## F-statistic: 40.09 on 3 and 16 DF, p-value: 1.141e-07
# check reference contrasts and compare first group (a1b1) with the other ones, non orthogonal contrasts
contrasts(dd$AB) <- cbind(c(0,1,0,0), c(0,0,1,0), c(0,0,0,1))
# check the dummy variables
cbind(dd, model.matrix(aov(y ~ AB, data=dd)) )
## subj A B AB y (Intercept) AB1 AB2 AB3
## 1 1 a1 b1 a1b1 10 1 0 0 0
## 2 2 a1 b1 a1b1 11 1 0 0 0
## 3 3 a1 b1 a1b1 9 1 0 0 0
## 4 4 a1 b1 a1b1 12 1 0 0 0
## 5 5 a1 b1 a1b1 10 1 0 0 0
## 6 6 a1 b2 a1b2 15 1 1 0 0
## 7 7 a1 b2 a1b2 17 1 1 0 0
## 8 8 a1 b2 a1b2 13 1 1 0 0
## 9 9 a1 b2 a1b2 14 1 1 0 0
## 10 10 a1 b2 a1b2 16 1 1 0 0
## 11 11 a2 b1 a2b1 9 1 0 1 0
## 12 12 a2 b1 a2b1 8 1 0 1 0
## 13 13 a2 b1 a2b1 10 1 0 1 0
## 14 14 a2 b1 a2b1 11 1 0 1 0
## 15 15 a2 b1 a2b1 12 1 0 1 0
## 16 16 a2 b2 a2b2 5 1 0 0 1
## 17 17 a2 b2 a2b2 7 1 0 0 1
## 18 18 a2 b2 a2b2 4 1 0 0 1
## 19 19 a2 b2 a2b2 6 1 0 0 1
## 20 20 a2 b2 a2b2 4 1 0 0 1
# contrasts specified express an effect contrast and coefficients express differences from reference group (here a1b1)
coefficients(aov(y ~ AB, data=dd))
## (Intercept) AB1 AB2 AB3
## 10.4 4.6 -0.4 -5.2
summary.lm(aov(y ~ AB, data=dd))
##
## Call:
## aov(formula = y ~ AB, data = dd)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.00 -1.05 -0.10 1.00 2.00
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 10.4000 0.6325 16.444 1.91e-11 ***
## AB1 4.6000 0.8944 5.143 9.82e-05 ***
## AB2 -0.4000 0.8944 -0.447 0.661
## AB3 -5.2000 0.8944 -5.814 2.64e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.414 on 16 degrees of freedom
## Multiple R-squared: 0.8826, Adjusted R-squared: 0.8606
## F-statistic: 40.09 on 3 and 16 DF, p-value: 1.141e-07
We extend the sheet_anova a bit to understand.
subgroups defined by a factor variable ar ordered. contrast coefficients apply to the subgroups in this order. one contrast is one dummy variable. the dummy variables can be put into the regression formula to estimate the reaction variable for a specific subgroup (subgroup combination in factorial designs). a contrast vector applied to a factor variable using the command contrasts()
is similar to a column of model.matrix()
but ordered due to the factor levels.
Here we use a factor with 4 levels. ˆyi=b0+b1⋅D1+b2⋅D2+b3⋅D3+b4⋅D4
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✔ tibble 3.1.6 ✔ dplyr 1.0.8
## ✔ tidyr 1.2.0 ✔ stringr 1.4.0
## ✔ readr 2.1.2 ✔ forcats 0.5.1
## ✔ purrr 0.3.4
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ psych::%+%() masks ggplot2::%+%()
## ✖ psych::alpha() masks ggplot2::alpha()
## ✖ dplyr::arrange() masks plyr::arrange()
## ✖ purrr::compact() masks plyr::compact()
## ✖ dplyr::count() masks plyr::count()
## ✖ dplyr::failwith() masks plyr::failwith()
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::id() masks plyr::id()
## ✖ dplyr::lag() masks stats::lag()
## ✖ dplyr::mutate() masks plyr::mutate()
## ✖ dplyr::recode() masks car::recode()
## ✖ dplyr::rename() masks plyr::rename()
## ✖ purrr::some() masks car::some()
## ✖ dplyr::summarise() masks plyr::summarise()
## ✖ dplyr::summarize() masks plyr::summarize()
dd <- read_delim("https://pzezula.pages.gwdg.de/data/Superhero.dat", delim = "\t")
## Rows: 30 Columns: 2
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: "\t"
## dbl (2): hero, injury
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# we define a new variable hero.f with factorized categorial levels and label them also
dd$hero.f <- factor(dd$hero,
levels = c(1:4), labels =
c("Spiderman", "Superman", "Hulk", "Ninja Turtle")
)
# all default
m.0 <- lm(injury ~ hero.f, dd)
summary.lm(m.0)
##
## Call:
## lm(formula = injury ~ hero.f, data = dd)
##
## Residuals:
## Min 1Q Median 3Q Max
## -28.333 -8.250 1.688 7.562 24.667
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 41.625 4.577 9.095 1.47e-09 ***
## hero.fSuperman 18.708 6.991 2.676 0.0127 *
## hero.fHulk -6.250 6.472 -0.966 0.3431
## hero.fNinja Turtle -15.375 6.472 -2.376 0.0252 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 12.94 on 26 degrees of freedom
## Multiple R-squared: 0.4897, Adjusted R-squared: 0.4308
## F-statistic: 8.317 on 3 and 26 DF, p-value: 0.0004828
unique(model.matrix(m.0))
## (Intercept) hero.fSuperman hero.fHulk hero.fNinja Turtle
## 1 1 0 0 0
## 9 1 1 0 0
## 15 1 0 1 0
## 23 1 0 0 1
mm <- m.0
g.l <- "Superman"
# mm["(Intercept)"] +
# the contrast of task 9 of anova sheet, only one contrast defined, the other two are set automatically and orthogonal to the first one
contrasts(dd$hero.f) <- c(1, -2, 1, 0)
m.1 <- lm(injury ~ hero.f, dd)
unique(model.matrix(m.1))
## (Intercept) hero.f1 hero.f2 hero.f3
## 1 1 1 -0.62659863 -0.4367007
## 9 1 -2 0.06329932 -0.2816497
## 15 1 1 0.75319726 -0.1265986
## 23 1 0 -0.18989795 0.8449490
# a contrast
contrasts(dd$hero.f) <- cbind(c(3,-1,-1,-1), c(0, 2,-1,-1), c(0, 0, 1, -1) )
m.2 <- lm(injury ~ hero.f, dd)
unique(model.matrix(m.2))
## (Intercept) hero.f1 hero.f2 hero.f3
## 1 1 3 0 0
## 9 1 -1 2 0
## 15 1 -1 -1 1
## 23 1 -1 -1 -1
# we double the coefficients
contrasts(dd$hero.f) <- cbind(c(6,-2,-2,-2), c(0, 4,-2,-2), c(0, 0, 2, -2) )
m.3 <- lm(injury ~ hero.f, dd)
unique(model.matrix(m.3))
## (Intercept) hero.f1 hero.f2 hero.f3
## 1 1 6 0 0
## 9 1 -2 4 0
## 15 1 -2 -2 2
## 23 1 -2 -2 -2
# simply doubling the contrast-coefficients divides the model coefficients by 2
m.2$coefficients
## (Intercept) hero.f1 hero.f2 hero.f3
## 40.8958333 0.2430556 9.8402778 4.5625000
m.3$coefficients
## (Intercept) hero.f1 hero.f2 hero.f3
## 40.8958333 0.1215278 4.9201389 2.2812500
# the significance tests stay the same
summary(m.2)
##
## Call:
## lm(formula = injury ~ hero.f, data = dd)
##
## Residuals:
## Min 1Q Median 3Q Max
## -28.333 -8.250 1.688 7.562 24.667
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 40.8958 2.3817 17.171 1.05e-15 ***
## hero.f1 0.2431 1.3394 0.181 0.857
## hero.f2 9.8403 2.0656 4.764 6.27e-05 ***
## hero.f3 4.5625 3.2361 1.410 0.170
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 12.94 on 26 degrees of freedom
## Multiple R-squared: 0.4897, Adjusted R-squared: 0.4308
## F-statistic: 8.317 on 3 and 26 DF, p-value: 0.0004828
summary(m.3)
##
## Call:
## lm(formula = injury ~ hero.f, data = dd)
##
## Residuals:
## Min 1Q Median 3Q Max
## -28.333 -8.250 1.688 7.562 24.667
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 40.8958 2.3817 17.171 1.05e-15 ***
## hero.f1 0.1215 0.6697 0.181 0.857
## hero.f2 4.9201 1.0328 4.764 6.27e-05 ***
## hero.f3 2.2812 1.6181 1.410 0.170
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 12.94 on 26 degrees of freedom
## Multiple R-squared: 0.4897, Adjusted R-squared: 0.4308
## F-statistic: 8.317 on 3 and 26 DF, p-value: 0.0004828
# a standard contrast of package car
contrasts(dd$hero.f) <- car::contr.Sum(levels(dd$hero.f))
m.4 <- lm(injury ~ hero.f, dd)
unique(model.matrix(m.4))
## (Intercept) hero.f[S.Spiderman] hero.f[S.Superman] hero.f[S.Hulk]
## 1 1 1 0 0
## 9 1 0 1 0
## 15 1 0 0 1
## 23 1 -1 -1 -1
# orthogonal contrast Helmert
contrasts(dd$hero.f) <- car::contr.Helmert(levels(dd$hero.f))
m.5 <- lm(injury ~ hero.f, dd)
unique(model.matrix(m.5))
## (Intercept) hero.f[H.1] hero.f[H.2] hero.f[H.3]
## 1 1 -1 -1 -1
## 9 1 1 -1 -1
## 15 1 0 2 -1
## 23 1 0 0 3
require(Rmisc)
Rmisc::summarySE(data=dd, "injury", groupvars="hero.f")
## hero.f N injury sd se ci
## 1 Spiderman 8 41.62500 12.211675 4.317479 10.209216
## 2 Superman 6 60.33333 17.851237 7.287737 18.733724
## 3 Hulk 8 35.37500 13.383759 4.731873 11.189102
## 4 Ninja Turtle 8 26.25000 8.154753 2.883141 6.817544
{}todo
Package library(memisc): Convenience Methods for Setting Contrasts
Erklärung und Verstehbeispiel zu contr.sum, contr.helmert, contr.treatment link
Field (2012) Kapitel comparing means
Sammlung R und Anova, Einstiegspunkte [https://www.r-statistics.com/tag/repeated-measures-anova/]
we can apply contrasts to factors
various standard contrasts are predefined
if contrasts are orthogonal, they account for independent parts of variance
we can also define and use our own contrasts
Version: 28 Mai, 2022 16:26