R ginla GAM 集成嵌套拉普拉斯逼近牛頓增強

R語言 ginla 位於 mgcv 包(package)。

說明

使用 Wood (2019) 中說明的 INLA 變體，將集成嵌套拉普拉斯逼近 (INLA, Rue et al. 2009) 應用於可通過 gam 或 bam 估計的模型。為每個係數、選定係數或係數向量的線性變換生成邊際後驗密度。

用法

ginla(G,A=NULL,nk=16,nb=100,J=1,interactive=FALSE,int=0,approx=0)

參數

`G`	預擬合 gam 對象，由 `gam(...,fit=FALSE)` 或 `bam(...,discrete=TRUE,fit=FALSE)` 生成。
`A`	感興趣的係數的變換矩陣，或者感興趣的參數的索引數組。如果`NULL`，則為所有係數生成分布。
`nk`	用於評估其對數邊後驗密度的每個係數的值的數量。然後對這些點進行樣條插值。
`nb`	評估係數後驗密度以作為網格函數返回的點數。
`J`	在對數行列式近似步驟中要采取多少個行列式更新步驟。不建議增加此值。
`interactive`	如果這是 `>0` 或 `TRUE`，則每個近似後驗都以紅色繪製，覆蓋在簡單的高斯近似後驗上。如果`2`則等待用戶在每個繪圖之間按回車鍵。對於判斷使用 INLA 方法是否有所收獲很有用。
`int`	0 跳過積分並僅使用後驗模態平滑參數。 >0 用於使用 Rue 等人提出的 CCD 方法進行積分。 (2009)。
`approx`	0 表示完全近似； 1 更新Hessian，但使用近似模式； 2 作為 1 並假設常數 Hessian。查看具體信息。

細節

讓\beta,\theta和y分別表示模型係數、超參數/平滑參數和響應數據。原則上，INLA 采用拉普拉斯近似\pi(\beta_i|\theta,y)和\pi(\theta|y)然後得到邊後驗分布\pi(\beta_i|y)通過將近似值積分\pi(\beta_i|\theta,y)\pi(\theta|y)相對於\theta(也可以生成超參數的邊際)。在實踐中，拉普拉斯近似為\pi(\beta_i|\theta,y)計算每個的成本太高\beta_i並且本身必須是近似的。為此，必須計算兩個量：後驗模式\beta^*|\beta_i以及聯合對數密度的 Hessian 矩陣的行列式\log \pi(\beta,\theta,y)w.r.t.\beta在模式下。魯等人。 (2009)最初通過對後驗的簡單高斯近似所隱含的條件模式來近似後驗條件模式\pi(\beta|y)。然後，他們將 Hessian 矩陣的對數行列式近似為以下函數：\beta_i使用一階泰勒展開式，對於他們使用的稀疏模型表示來說，計算起來很便宜，但在使用密集的低秩基礎展開式時則不然gam。他們還提供了一種更昂貴的替代近似，基於計算僅與以下元素相關的對數行列式：\beta具有足夠高的相關性\beta_i根據簡單的高斯後驗近似：效率似乎再次取決於稀疏性。 Wood (2018) 建議精確計算所需的後驗模式，並將對數行列式近似基於無條件模型下 Hessian 的 BFGS 更新。無論有或沒有稀疏性，後者都是有效的，而前者是“免費”的改進。這兩個步驟都很有效，因為獲得 Cholesky 因子的成本很低H[-i,-i]從那H- 看choldrop。這是該例程所采用的方法。

approx 參數允許兩個進一步的近似來加速計算。對於approx==1，不使用精確的後驗條件模式，而是使用簡單高斯後驗近似隱含的條件模式。對於approx==2，模態使用相同的近似值，並且假定 Hessian 矩陣為常數。後者非常快，因為不需要對數聯合密度梯度評估。

請注意，對於許多模型，INLA 估計值非常接近通常的後驗高斯近似，interactive 參數對於研究此問題非常有用。

bam 模型僅支持 disrete=TRUE 選項。 discrete=FALSE 方法效率太低。不支持 AR1 模型(相關參數將被忽略)。

值

包含元素 beta 和 density 的列表，這兩個元素都是矩陣。每一行與一個感興趣的係數(或線性係數組合)相關。兩個矩陣都有nb 列。如果int!=0，則另一個元素reml給出CCD積分中使用的積分權重，首先給出中心點權重。

警告

此例程仍處於實驗階段，因此細節可能會發生變化。此外，目前並非所有步驟都具有最佳效率。

該例程是為相對專業的用戶編寫的。

ginla 並非旨在處理排名不足的模型。

例子

  require(mgcv); require(MASS)

  ## example using a scale location model for the motorcycle data. A simple plotting
  ## routine is produced first...

  plot.inla <- function(x,inla,k=1,levels=c(.025,.1,.5,.9,.975),
               lcol = c(2,4,4,4,2),lwd = c(1,1,2,1,1),lty=c(1,1,1,1,1),
	       xlab="x",ylab="y",cex.lab=1.5) {
    ## a simple effect plotter, when distributions of function values of
    ## 1D smooths have been computed
    require(splines)
    p <- length(x) 
    betaq <- matrix(0,length(levels),p) ## storage for beta quantiles 
    for (i in 1:p) { ## work through x and betas
      j <- i + k - 1
      p <- cumsum(inla$density[j,])*(inla$beta[j,2]-inla$beta[j,1])
      ## getting quantiles of function values...
      betaq[,i] <- approx(p,y=inla$beta[j,],levels)$y
    }
    xg <- seq(min(x),max(x),length=200)
    ylim <- range(betaq)
    ylim <- 1.1*(ylim-mean(ylim))+mean(ylim)
    for (j in 1:length(levels)) { ## plot the quantiles
      din <- interpSpline(x,betaq[j,])
      if (j==1) {
        plot(xg,predict(din,xg)$y,ylim=ylim,type="l",col=lcol[j],
             xlab=xlab,ylab=ylab,lwd=lwd[j],cex.lab=1.5,lty=lty[j])
      } else lines(xg,predict(din,xg)$y,col=lcol[j],lwd=lwd[j],lty=lty[j])
    }
  } ## plot.inla

  ## set up the model with a `gam' call...

  G <- gam(list(accel~s(times,k=20,bs="ad"),~s(times)),
            data=mcycle,family=gaulss(),fit=FALSE)
  b <- gam(G=G,method="REML") ## regular GAM fit for comparison

  ## Now use ginla to get posteriors of estimated effect values
  ## at evenly spaced times. Create A matrix for this...
  
  rat <- range(mcycle$times)
  pd0 <- data.frame(times=seq(rat[1],rat[2],length=20))
  X0 <- predict(b,newdata=pd0,type="lpmatrix")
  X0[,21:30] <- 0
  pd1 <- data.frame(times=seq(rat[1],rat[2],length=10))
  X1 <- predict(b,newdata=pd1,type="lpmatrix")
  X1[,1:20] <- 0
  A <- rbind(X0,X1) ## A maps coefs to required function values

  ## call ginla. Set int to 1 for integrated version.
  ## Set interactive = 1 or 2 to plot marginal posterior distributions
  ## (red) and simple Gaussian approximation (black).
 
  inla <- ginla(G,A,int=0)

  par(mfrow=c(1,2),mar=c(5,5,1,1))
  fv <- predict(b,se=TRUE) ## usual Gaussian approximation, for comparison

  ## plot inla mean smooth effect...
  plot.inla(pd0$times,inla,k=1,xlab="time",ylab=expression(f[1](time))) 

  ## overlay simple Gaussian equivalent (in grey) ...
  points(mcycle$times,mcycle$accel,col="grey")
  lines(mcycle$times,fv$fit[,1],col="grey",lwd=2)
  lines(mcycle$times,fv$fit[,1]+2*fv$se.fit[,1],lty=2,col="grey",lwd=2)
  lines(mcycle$times,fv$fit[,1]-2*fv$se.fit[,1],lty=2,col="grey",lwd=2)

  ## same for log sd smooth...
  plot.inla(pd1$times,inla,k=21,xlab="time",ylab=expression(f[2](time)))
  lines(mcycle$times,fv$fit[,2],col="grey",lwd=2)
  lines(mcycle$times,fv$fit[,2]+2*fv$se.fit[,2],col="grey",lty=2,lwd=2)
  lines(mcycle$times,fv$fit[,2]-2*fv$se.fit[,2],col="grey",lty=2,lwd=2)

  ## ... notice some real differences for the log sd smooth, especially
  ## at the lower and upper ends of the time interval.

作者

Simon N. Wood simon.wood@r-project.org

參考

Rue, H, Martino, S. & Chopin, N. (2009) Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations (with discussion). Journal of the Royal Statistical Society, Series B. 71: 319-392.

Wood (2019) Simplified Integrated Laplace Approximation. In press Biometrika.

相關用法

注：本文由純淨天空篩選整理自R-devel大神的英文原創作品 GAM Integrated Nested Laplace Approximation Newton Enhanced。非經特殊聲明，原始代碼版權歸原作者所有，本譯文未經允許或授權，請勿轉載或複製。