R anscombe: Anscombe's Quartet of 'Identical' Simple Linear Regressions

The R dataset anscombe is part of the datasets package.

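Description

Four x-y datasets that share the same traditional statistical properties (mean, variance, correlation, regression line, etc.), yet are quite different.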
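The claim above can be checked directly. The following is a minimal sketch, not part of the original help page: it computes the classical summaries for each x-y pair and collects them in one matrix. The name `stats_by_set` is purely illustrative.

```r
# Compare the classical summary statistics across the four x-y pairs.
# anscombe ships with the datasets package, which is attached by default.
stats_by_set <- sapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  fit <- lm(y ~ x)                      # simple linear regression of y on x
  c(mean_x    = mean(x),
    var_x     = var(x),
    mean_y    = mean(y),
    var_y     = var(y),
    cor_xy    = cor(x, y),
    intercept = unname(coef(fit)[1]),
    slope     = unname(coef(fit)[2]))
})
colnames(stats_by_set) <- paste0("set", 1:4)

# One column per dataset; the rows should agree closely across columns.
print(round(stats_by_set, 3))
```

The columns agree only approximately, so any comparison should use a tolerance rather than exact equality.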

Usage

anscombe

Format

A data frame with 11 observations on 8 variables.

x1 == x2 == x3	the integers 4:14, specially arranged
x4	values 8 and 19
y1, y2, y3, y4	numbers in (3, 12.5) with mean 7.5 and standard deviation 2.03
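The layout described in the table can be inspected with a few base R calls. This is an illustrative sketch only; the variable names `same_x`, `x4_counts`, and `y_summary` are not part of the dataset or its documentation.

```r
# Shape of the data frame: observations by variables.
print(dim(anscombe))

# x1, x2 and x3 hold the same values in the same order.
same_x <- identical(anscombe$x1, anscombe$x2) &&
  identical(anscombe$x2, anscombe$x3)
print(same_x)

# Sorted, the shared x values are the integers 4 through 14.
print(sort(anscombe$x1))

# x4 takes only two distinct values; count how often each occurs.
x4_counts <- table(anscombe$x4)
print(x4_counts)

# Mean and standard deviation of each y column.
y_summary <- sapply(anscombe[paste0("y", 1:4)],
                    function(y) c(mean = mean(y), sd = sd(y)))
print(round(y_summary, 2))
```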

Examples

require(stats); require(graphics)
summary(anscombe)

##-- now some "magic" to do the 4 regressions in a loop:
ff <- y ~ x
mods <- setNames(as.list(1:4), paste0("lm", 1:4))
for(i in 1:4) {
  ff[2:3] <- lapply(paste0(c("y","x"), i), as.name)
  ## or   ff[[2]] <- as.name(paste0("y", i))
  ##      ff[[3]] <- as.name(paste0("x", i))
  mods[[i]] <- lmi <- lm(ff, data = anscombe)
  print(anova(lmi))
}

## See how close they are (numerically!)
sapply(mods, coef)
lapply(mods, function(fm) coef(summary(fm)))

## Now, do what you should have done in the first place: PLOTS
op <- par(mfrow = c(2, 2), mar = 0.1+c(4,4,1,1), oma =  c(0, 0, 2, 0))
for(i in 1:4) {
  ff[2:3] <- lapply(paste0(c("y","x"), i), as.name)
  plot(ff, data = anscombe, col = "red", pch = 21, bg = "orange", cex = 1.2,
       xlim = c(3, 19), ylim = c(3, 13))
  abline(mods[[i]], col = "blue")
}
mtext("Anscombe's 4 Regression data sets", outer = TRUE, cex = 1.5)
par(op)

Source

Tufte, Edward R. (1989). The Visual Display of Quantitative Information, 13-14. Graphics Press.

References

Anscombe, Francis J. (1973). Graphs in statistical analysis. The American Statistician, 27, 17-21. doi:10.2307/2682899.
