R linear.functional.terms GAM 中平滑的线性泛函

R语言 linear.functional.terms 位于 mgcv 包(package)。

说明

gam 允许响应变量依赖于平滑项的线性函数。具体形式的依赖关系

允许，其中 x_{ij} 是协变量值，L_{ij} 是固定权重。即响应可以取决于在不同协变量值下评估的相同平滑的加权和。例如，这允许响应取决于平滑的导数或积分(分别通过有限差分或求积来近似)。它还允许依赖于预测函数(有时称为“信号回归”)。

实现此目的的机制是向模型公式中 s 或 te 项指定的模型平滑项提供协变量值矩阵。协变量矩阵的每一列都会从平滑中产生相应的预测列。令评估的平滑值的结果矩阵为 F(F 将具有与协变量矩阵相同的维度)。如果没有 by 变量，则这些列将被简单地求和并添加到线性预测器中。即该项对线性预测器的贡献是 rowSums(F) 。如果存在 by 变量，则它必须是一个矩阵 L，例如，与 F(和协变量矩阵)具有相同维度，并且它包含上面给出的求和中的权重 L_{ij}。因此，在这种情况下，对线性预测器的贡献是 rowSums(L*F) 。

请注意，如果 {\bf L1} (即 rowSums(L) )是常量向量，或者没有 by 变量，则平滑将自动居中以确保可识别性。否则就不会了。另请注意，对于中心平滑，值得用 rowSums(L) 替换模型中的常数项，以确保预测自动按正确的比例进行。

predict.gam 可以接受矩阵预测器以此类项进行预测，在这种情况下，其 newdata 参数需要是一个列表。但是，从模型进行预测时，无需提供矩阵协变量和by 变量值。例如，为了简单地检查底层平滑函数，可以使用协变量值向量和向量 by 变量，其中 by 变量和上面的 L1 变量等价物设置为 1 的向量。

该机制可用于采用因子参数的随机效应平滑，通过使用技巧来创建因子的二维数组。只需创建一个因子向量，其中包含首尾相连的因子矩阵的列(列主序)。然后重置该向量的维度以创建适当的二维数组：第一个维度应该是响应数据的数量，第二个维度应该是所需因子矩阵的列数。您不能使用matrix 或data.matrix 来设置所需的因子水平矩阵。请参阅下面的示例。

例子

### matrix argument `linear operator' smoothing
library(mgcv)
set.seed(0)

###############################
## simple summation example...#
###############################

n<-400
sig<-2
x <- runif(n, 0, .9)
f2 <- function(x) 0.2*x^11*(10*(1-x))^6+10*(10*x)^3*(1-x)^10
x1 <- x + .1

f <- f2(x) + f2(x1)  ## response is sum of f at two adjacent x values 
y <- f + rnorm(n)*sig

X <- matrix(c(x,x1),n,2) ## matrix covariate contains both x values
b <- gam(y~s(X))         

plot(b)  ## reconstruction of f
plot(f,fitted(b))

## example of prediction with summation convention...
predict(b,list(X=X[1:3,]))

## example of prediction that simply evaluates smooth (no summation)...
predict(b,data.frame(X=c(.2,.3,.7))) 

######################################################################
## Simple random effect model example.
## model: y[i] = f(x[i]) + b[k[i]] - b[j[i]] + e[i]
## k[i] and j[i] index levels of i.i.d. random effects, b.
######################################################################

set.seed(7)
n <- 200
x <- runif(n) ## a continuous covariate

## set up a `factor matrix'...
fac <- factor(sample(letters,n*2,replace=TRUE))
dim(fac) <- c(n,2)

## simulate data from such a model...
nb <- length(levels(fac))
b <- rnorm(nb)
y <- 20*(x-.3)^4 + b[fac[,1]] - b[fac[,2]] + rnorm(n)*.5

L <- matrix(-1,n,2);L[,1] <- 1 ## the differencing 'by' variable 

mod <- gam(y ~ s(x) + s(fac,by=L,bs="re"),method="REML")
gam.vcomp(mod)
plot(mod,page=1)

## example of prediction using matrices...
dat <- list(L=L[1:20,],fac=fac[1:20,],x=x[1:20],y=y[1:20])
predict(mod,newdata=dat)


######################################################################
## multivariate integral example. Function `test1' will be integrated# 
## (by midpoint quadrature) over 100 equal area sub-squares covering # 
## the unit square. Noise is added to the resulting simulated data.  #
## `test1' is estimated from the resulting data using two alternative#
## smooths.                                                          #
######################################################################

test1 <- function(x,z,sx=0.3,sz=0.4)
  { (pi**sx*sz)*(1.2*exp(-(x-0.2)^2/sx^2-(z-0.3)^2/sz^2)+
    0.8*exp(-(x-0.7)^2/sx^2-(z-0.8)^2/sz^2))
  }

## create quadrature (integration) grid, in useful order
ig <- 5 ## integration grid within square
mx <- mz <- (1:ig-.5)/ig
ix <- rep(mx,ig);iz <- rep(mz,rep(ig,ig))

og <- 10 ## observarion grid
mx <- mz <- (1:og-1)/og
ox <- rep(mx,og);ox <- rep(ox,rep(ig^2,og^2))
oz <- rep(mz,rep(og,og));oz <- rep(oz,rep(ig^2,og^2))

x <- ox + ix/og;z <- oz + iz/og ## full grid, subsquare by subsquare

## create matrix covariates...
X <- matrix(x,og^2,ig^2,byrow=TRUE)
Z <- matrix(z,og^2,ig^2,byrow=TRUE)

## create simulated test data...
dA <- 1/(og*ig)^2  ## quadrature square area
F <- test1(X,Z)    ## evaluate on grid
f <- rowSums(F)*dA ## integrate by midpoint quadrature
y <- f + rnorm(og^2)*5e-4 ## add noise
## ... so each y is a noisy observation of the integral of `test1'
## over a 0.1 by 0.1 sub-square from the unit square

## Now fit model to simulated data...

L <- X*0 + dA

## ... let F be the matrix of the smooth evaluated at the x,z values
## in matrices X and Z. rowSums(L*F) gives the model predicted
## integrals of `test1' corresponding to the observed `y'

L1 <- rowSums(L) ## smooths are centred --- need to add in L%*%1

## fit models to reconstruct `test1'....

b <- gam(y~s(X,Z,by=L)+L1-1)   ## (L1 and const are confounded here)
b1 <- gam(y~te(X,Z,by=L)+L1-1) ## tensor product alternative

## plot results...

old.par<-par(mfrow=c(2,2))
x<-runif(n);z<-runif(n);
xs<-seq(0,1,length=30);zs<-seq(0,1,length=30)
pr<-data.frame(x=rep(xs,30),z=rep(zs,rep(30,30)))
truth<-matrix(test1(pr$x,pr$z),30,30)
contour(xs,zs,truth)
plot(b)
vis.gam(b,view=c("X","Z"),cond=list(L1=1,L=1),plot.type="contour")
vis.gam(b1,view=c("X","Z"),cond=list(L1=1,L=1),plot.type="contour")

####################################
## A "signal" regression example...#
####################################

rf <- function(x=seq(0,1,length=100)) {
## generates random functions...
  m <- ceiling(runif(1)*5) ## number of components
  f <- x*0;
  mu <- runif(m,min(x),max(x));sig <- (runif(m)+.5)*(max(x)-min(x))/10
  for (i in 1:m) f <- f+ dnorm(x,mu[i],sig[i])
  f
}

x <- seq(0,1,length=100) ## evaluation points

## example functional predictors...
par(mfrow=c(3,3));for (i in 1:9) plot(x,rf(x),type="l",xlab="x")

## simulate 200 functions and store in rows of L...
L <- matrix(NA,200,100) 
for (i in 1:200) L[i,] <- rf()  ## simulate the functional predictors

f2 <- function(x) { ## the coefficient function
  (0.2*x^11*(10*(1-x))^6+10*(10*x)^3*(1-x)^10)/10 
}

f <- f2(x) ## the true coefficient function

y <- L%*%f + rnorm(200)*20 ## simulated response data

## Now fit the model E(y) = L%*%f(x) where f is a smooth function.
## The summation convention is used to evaluate smooth at each value
## in matrix X to get matrix F, say. Then rowSum(L*F) gives E(y).

## create matrix of eval points for each function. Note that
## `smoothCon' is smart and will recognize the duplication...
X <- matrix(x,200,100,byrow=TRUE) 

b <- gam(y~s(X,by=L,k=20)) 
par(mfrow=c(1,1))
plot(b,shade=TRUE);lines(x,f,col=2)

作者

Simon N. Wood simon.wood@r-project.org

相关用法

注：本文由纯净天空筛选整理自R-devel大神的英文原创作品 Linear functionals of a smooth in GAMs。非经特殊声明，原始代码版权归原作者所有，本译文未经允许或授权，请勿转载或复制。