This exercise is about the phenomenon that data sets of very different kind and distribution can produce the same correlation.
You can find the data for Anscombe's quartet at [http://md.psych.bio.uni-goettingen.de/mv/data/div/ascombe_quartet.txt]
Compute the correlations, fit the corresponding models, and create informative plots to demonstrate the phenomenon.
[res_begin:res_aq]
The solution computes the correlation coefficients and draws scatterplots with the fitted regression line via ggplot().
# read Anscombe's quartet data
df.aq <- read.delim("http://md.psych.bio.uni-goettingen.de/mv/data/div/ascombe_quartet.txt")
require("psych")
## Loading required package: psych
# descriptive statistics of the four x variables (describe() is exported, so `::` suffices)
psych::describe(df.aq[grep("x",names(df.aq))])
##    vars  n mean   sd median trimmed  mad min max range skew kurtosis se
## x1    1 11    9 3.32      9       9 4.45   4  14    10 0.00    -1.53  1
## x2    2 11    9 3.32      9       9 4.45   4  14    10 0.00    -1.53  1
## x3    3 11    9 3.32      9       9 4.45   4  14    10 0.00    -1.53  1
## x4    4 11    9 3.32      8       8 0.00   8  19    11 2.47     4.52  1
# descriptive statistics of the four y variables
psych::describe(df.aq[grep("y",names(df.aq))])
##    vars  n mean   sd median trimmed  mad  min   max range  skew kurtosis   se
## y1    1 11  7.5 2.03   7.58    7.49 1.82 4.26 10.84  6.58 -0.05    -1.20 0.61
## y2    2 11  7.5 2.03   8.14    7.79 1.47 3.10  9.26  6.16 -0.98    -0.51 0.61
## y3    3 11  7.5 2.03   7.11    7.15 1.53 5.39 12.74  7.35  1.38     1.24 0.61
## y4    4 11  7.5 2.03   7.04    7.20 1.90 5.25 12.50  7.25  1.12     0.63 0.61
# correlations
message(paste('x1*y1: ', round(cor(df.aq['x1'], df.aq['y1'] ), digits=3)))
## x1*y1: 0.816
message(paste('x2*y2: ', round(cor(df.aq['x2'], df.aq['y2'] ), digits=3)))
## x2*y2: 0.816
message(paste('x3*y3: ', round(cor(df.aq['x3'], df.aq['y3'] ), digits=3)))
## x3*y3: 0.816
message(paste('x4*y4: ', round(cor(df.aq['x4'], df.aq['y4'] ), digits=3)))
## x4*y4: 0.817
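The four correlation calls above can also be collapsed into a single loop. This is a minimal sketch, assuming that R's built-in `anscombe` data frame (package datasets) has the same columns x1..x4 and y1..y4 as the downloaded file; the names `df` and `r.aq` are only illustrative.

```r
# Assumption: the built-in `anscombe` data set has the same columns
# (x1..x4, y1..y4) as the file read into df.aq above.
df <- datasets::anscombe

# Pearson correlation of each x/y pair, collected in one named vector
r.aq <- sapply(1:4, function(i) {
  cor(df[[paste0("x", i)]], df[[paste0("y", i)]])
})
names(r.aq) <- paste0("x", 1:4, "*y", 1:4)

print(round(r.aq, digits = 3))
```

Indexing with `[[` returns plain numeric vectors, so `cor()` yields a single number per pair rather than a 1x1 matrix.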
# compare linear models (note: these regress x on y, the reverse of the y ~ x fit drawn in the plots below)
lm(x1~y1, data=df.aq)$coefficients
## (Intercept) y1
## -0.9975311 1.3328426
lm(x2~y2, data=df.aq)$coefficients
## (Intercept) y2
## -0.9948419 1.3324841
lm(x3~y3, data=df.aq)$coefficients
## (Intercept) y3
## -1.000315 1.333375
lm(x4~y4, data=df.aq)$coefficients
## (Intercept) y4
## -1.003640 1.333657
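The calls above regress x on y, whereas the plots further down fit y on x. To get coefficients that match the plotted lines, the models can be fitted in that direction as well; for Anscombe's data all four fits land close to y = 3 + 0.5x. This is a sketch under the assumption that the built-in `anscombe` data frame (package datasets) matches the downloaded file; `df` and `coefs.yx` are illustrative names.

```r
# Assumption: the built-in `anscombe` data set has the same columns
# (x1..x4, y1..y4) as the file read into df.aq above.
df <- datasets::anscombe

# Fit y_i ~ x_i for each of the four sets and collect the coefficients
coefs.yx <- t(sapply(1:4, function(i) {
  f <- reformulate(paste0("x", i), response = paste0("y", i))  # e.g. y1 ~ x1
  unname(coef(lm(f, data = df)))
}))
colnames(coefs.yx) <- c("intercept", "slope")
rownames(coefs.yx) <- paste0("set", 1:4)

print(round(coefs.yx, digits = 3))
```

Either direction makes the same point: the fitted coefficients are nearly identical across the four sets although the data look nothing alike.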
require("ggplot2")
## Loading required package: ggplot2
##
## Attaching package: 'ggplot2'
## The following objects are masked from 'package:psych':
##
## %+%, alpha
# create base plot object (unused below; each plot is built from scratch)
pplot <- ggplot(df.aq, aes(x=x1, y=y1))
# plot the four data sets, each with its fitted regression line
ggplot(df.aq, aes(x=x1, y=y1)) + geom_point(size=4) + stat_smooth(method = "lm", se=FALSE)
## `geom_smooth()` using formula 'y ~ x'
ggplot(df.aq, aes(x=x2, y=y2)) + geom_point(size=4) + stat_smooth(method = "lm", se=FALSE)
## `geom_smooth()` using formula 'y ~ x'
ggplot(df.aq, aes(x=x3, y=y3)) + geom_point(size=4) + stat_smooth(method = "lm", se=FALSE)
## `geom_smooth()` using formula 'y ~ x'
ggplot(df.aq, aes(x=x4, y=y4)) + geom_point(size=4) + stat_smooth(method = "lm", se=FALSE)
## `geom_smooth()` using formula 'y ~ x'
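The four separate plots are easier to compare when shown side by side. One possible way is to stack the x/y pairs into long format and let ggplot2 facet by data set. This is a sketch, again assuming the built-in `anscombe` data frame matches the downloaded file; `df`, `df.long` and `p.aq` are illustrative names.

```r
library(ggplot2)

# Assumption: the built-in `anscombe` data set has the same columns
# (x1..x4, y1..y4) as the file read into df.aq above.
df <- datasets::anscombe

# Stack the four x/y pairs: one row per point, labelled with its set
df.long <- data.frame(
  set = factor(rep(paste("set", 1:4), each = nrow(df))),
  x   = unlist(df[paste0("x", 1:4)], use.names = FALSE),
  y   = unlist(df[paste0("y", 1:4)], use.names = FALSE)
)

# One panel per set, each with its own least-squares line
p.aq <- ggplot(df.long, aes(x = x, y = y)) +
  geom_point(size = 3) +
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  facet_wrap(~ set, ncol = 2)
print(p.aq)
```

Because `facet_wrap()` uses shared axes by default, the near-identical regression lines and the very different point patterns are directly comparable across panels.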
[res_end]