R pcls: Penalized Constrained Least Squares Fitting

The R function pcls is in the mgcv package.

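Description

Solves least squares problems with quadratic penalties, subject to linear equality and inequality constraints, using quadratic programming.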

Usage

pcls(M)

Arguments

M

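is the single list argument to pcls. It should have the following elements:

y: The response data vector.
w: A vector of weights for the data (often proportional to the reciprocal of the variance).
X: The design matrix for the problem. Note that ncol(M$X) must give the number of model parameters, while nrow(M$X) should give the number of data.
C: Matrix containing any linear equality constraints on the problem (e.g. \bf C in {\bf Cp}={\bf c} ). If there are no equality constraints, initialize this to a zero by zero matrix. Note that there is no need to supply the vector \bf c ; it is defined implicitly by the initial parameter estimates \bf p .
S: A list of penalty matrices. S[[i]] is the smallest contiguous matrix including all the non-zero elements of the ith penalty matrix. The first parameter it penalizes is given by off[i]+1 (starting counting at 1).
off: Offset values locating the elements of M$S in the correct location within each penalty coefficient matrix. (Zero offset implies starting in the first location.)
sp: An array of smoothing parameter estimates.
p: An array of feasible initial parameter estimates - these must satisfy the constraints, but should avoid satisfying the inequality constraints as equality constraints.
Ain: Matrix for the inequality constraints {\bf A}_{in} {\bf p} > {\bf b}_{in} .
bin: Vector in the inequality constraints.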
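
How S and off place each penalty within the full coefficient vector can be illustrated with a small base R sketch. The helper embed_penalty below is hypothetical (it is not part of mgcv), and q is an assumed name for the number of model coefficients, i.e. ncol(M$X). It simply pads a compact block S[[i]] with zeros so that its first row and column line up with parameter off[i]+1.

```r
# Hypothetical helper, not an mgcv function: expand a compact penalty block
# S_i into the full q by q penalty coefficient matrix implied by offset off_i.
embed_penalty <- function(S_i, off_i, q) {
  full <- matrix(0, q, q)
  idx <- off_i + seq_len(nrow(S_i))  # first penalized parameter is off_i + 1
  full[idx, idx] <- S_i
  full
}

S1 <- matrix(c(1, -1, -1, 1), 2, 2)  # toy 2 by 2 penalty block
full <- embed_penalty(S1, off_i = 2, q = 5)
print(full)                          # non-zero entries sit in rows/columns 3:4
```

With off_i = 0 the block would start at the first parameter, which matches the "zero offset" remark in the description of off.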

Details

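This solves the problem:

\min \| {\bf W}^{1/2} ({\bf Xp} - {\bf y}) \|^2 + \sum_{i=1}^m \lambda_i {\bf p}^\prime {\bf S}_i {\bf p}

subject to the constraints {\bf Cp}={\bf c} and {\bf A}_{in}{\bf p}>{\bf b}_{in} , with respect to \bf p , given the smoothing parameters \lambda_i . {\bf X} is a design matrix, \bf p a parameter vector, \bf y a data vector, \bf W a diagonal weight matrix, {\bf S}_i a positive semi-definite matrix of coefficients defining the ith penalty, and \bf C a matrix of coefficients defining the linear equality constraints on the problem. Note that {\bf X} must be of full column rank, at least when projected into the null space of any equality constraints. {\bf A}_{in} is a matrix of coefficients defining the inequality constraints, while {\bf b}_{in} is the vector involved in defining the inequality constraints.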
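
The objective being minimized, a weighted residual sum of squares plus the penalties \lambda_i {\bf p}^\prime {\bf S}_i {\bf p} , can be evaluated directly for any candidate \bf p . The sketch below is illustrative only: pcls_objective is a hypothetical helper (not an mgcv function), and it assumes that \bf W is the diagonal matrix with M$w on its diagonal and that each S[[i]] acts on the parameters starting at off[i]+1.

```r
# Hypothetical helper, not part of mgcv: evaluate the penalized least squares
# objective for parameter vector p and a pcls-style list M.
# Assumption: W = diag(M$w).
pcls_objective <- function(p, M) {
  r <- sqrt(M$w) * (M$X %*% p - M$y)          # weighted residuals
  pen <- 0
  for (i in seq_along(M$S)) {
    idx <- M$off[i] + seq_len(nrow(M$S[[i]])) # parameters penalized by S[[i]]
    pen <- pen + M$sp[i] * drop(t(p[idx]) %*% M$S[[i]] %*% p[idx])
  }
  sum(r^2) + pen
}

# Toy problem: identity design, one penalty acting on the second parameter.
M_toy <- list(X = diag(2), y = c(1, 2), w = c(1, 1),
              S = list(matrix(1, 1, 1)), off = 1, sp = 2)
obj <- pcls_objective(c(0, 1), M_toy)
print(obj)  # residual part 2 plus penalty part 2, so 4
```

A function like this can be handy for comparing the value at the initial estimate M$p with the value at the vector returned by pcls.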

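Quadratic programming is used to perform the solution. The method used is designed for maximum stability with least squares problems: i.e. {\bf X}^\prime {\bf X} is not formed explicitly. See Gill et al. (1981).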
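
To see what avoiding an explicit {\bf X}^\prime {\bf X} means in practice, the following base R sketch solves an unconstrained weighted least squares problem in two ways: through a QR decomposition of the weighted design matrix, and through the normal equations. This is not the algorithm used inside pcls, only an illustration of the general idea; all variable names are made up for the example.

```r
set.seed(2)
n <- 30
X <- cbind(1, runif(n), runif(n))            # toy design matrix
y <- drop(X %*% c(1, -2, 0.5)) + rnorm(n) * 0.05
w <- rep(1, n)                               # data weights

Xw <- sqrt(w) * X                            # rows scaled by sqrt(weights)
yw <- sqrt(w) * y

# QR route: works on Xw directly, never forms t(Xw) %*% Xw
p_qr <- qr.coef(qr(Xw), yw)

# Normal equations route: forms t(Xw) %*% Xw, which squares the
# condition number of the problem
p_ne <- drop(solve(crossprod(Xw), crossprod(Xw, yw)))

print(cbind(p_qr, p_ne))
```

On a well-conditioned problem like this one the two answers agree closely; the QR route is generally preferred because it loses less accuracy when the columns of the design matrix are nearly collinear.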

Value

The function returns an array containing the estimated parameter vector.
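
A quick sanity check on the returned vector is to confirm that it has one entry per column of M$X and that it respects the inequality constraints. The sketch below reuses the setup of the first example in this page (a straight line fit with the intercept constrained to be positive); is_feasible is a hypothetical helper, and the tolerance tol is an assumption made to allow for rounding when a constraint is active.

```r
library(mgcv)

set.seed(0)
n <- 100
x <- runif(n)
y <- x - 0.2 + rnorm(n) * 0.1   # the true intercept is negative

M <- list(
  X   = cbind(1, x),            # intercept and slope columns
  y   = y,
  w   = rep(1, n),
  p   = c(0.1, 0.5),            # strictly feasible starting values
  C   = matrix(0, 0, 0),        # no equality constraints
  S   = list(),                 # no penalties
  off = array(0, 0),
  sp  = array(0, 0),
  Ain = matrix(c(1, 0), 1, 2),  # constraint: intercept > 0
  bin = 0
)
p_hat <- pcls(M)

# Hypothetical helper: check Ain %*% p >= bin up to a small tolerance
is_feasible <- function(p, M, tol = 1e-8) all(M$Ain %*% p >= M$bin - tol)
print(p_hat)
print(is_feasible(p_hat, M))
```

Because the data were generated with a negative intercept, the constrained estimate of the intercept would be expected to end up on or very close to the boundary of the constraint.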

Examples

require(mgcv)
# first an un-penalized example - fit E(y)=a+bx subject to a>0
set.seed(0)
n <- 100
x <- runif(n); y <- x - 0.2 + rnorm(n)*0.1
M <- list(X=matrix(0,n,2),p=c(0.1,0.5),off=array(0,0),S=list(),
Ain=matrix(0,1,2),bin=0,C=matrix(0,0,0),sp=array(0,0),y=y,w=y*0+1)
M$X[,1] <- 1; M$X[,2] <- x; M$Ain[1,] <- c(1,0)
pcls(M) -> M$p
plot(x,y); abline(M$p,col=2); abline(coef(lm(y~x)),col=3)

# Penalized example: monotonic penalized regression spline .....

# Generate data from a monotonic truth.
x <- runif(100)*4-1;x <- sort(x);
f <- exp(4*x)/(1+exp(4*x)); y <- f+rnorm(100)*0.1; plot(x,y)
dat <- data.frame(x=x,y=y)
# Show regular spline fit (and save fitted object)
f.ug <- gam(y~s(x,k=10,bs="cr")); lines(x,fitted(f.ug))
# Create Design matrix, constraints etc. for monotonic spline....
sm <- smoothCon(s(x,k=10,bs="cr"),dat,knots=NULL)[[1]]
F <- mono.con(sm$xp);   # get constraints
G <- list(X=sm$X,C=matrix(0,0,0),sp=f.ug$sp,p=sm$xp,y=y,w=y*0+1)
G$Ain <- F$A;G$bin <- F$b;G$S <- sm$S;G$off <- 0

p <- pcls(G);  # fit spline (using s.p. from unconstrained fit)

fv<-Predict.matrix(sm,data.frame(x=x))%*%p
lines(x,fv,col=2)

# now a tprs example of the same thing....

f.ug <- gam(y~s(x,k=10)); lines(x,fitted(f.ug))
# Create Design matrix, constraints etc. for monotonic spline....
sm <- smoothCon(s(x,k=10,bs="tp"),dat,knots=NULL)[[1]]
xc <- 0:39/39 # points on [0,1]  
nc <- length(xc)  # number of constraints
xc <- xc*4-1  # points at which to impose constraints
A0 <- Predict.matrix(sm,data.frame(x=xc)) 
# ... A0%*%p evaluates spline at xc points
A1 <- Predict.matrix(sm,data.frame(x=xc+1e-6)) 
A <- (A1-A0)/1e-6    
##  ... approx. constraint matrix (A%*%p approximates the
## spline gradient at points xc, so A%*%p > 0 forces an increasing fit)
G <- list(X=sm$X,C=matrix(0,0,0),sp=f.ug$sp,y=y,w=y*0+1,S=sm$S,off=0)
G$Ain <- A;    # constraint matrix
G$bin <- rep(0,nc);  # constraint vector
G$p <- rep(0,10); G$p[10] <- 0.1  
# ... monotonic start params, got by setting coefs of polynomial part
p <- pcls(G);  # fit spline (using s.p. from unconstrained fit)

fv2 <- Predict.matrix(sm,data.frame(x=x))%*%p
lines(x,fv2,col=3)

######################################
## monotonic additive model example...
######################################

## First simulate data...

set.seed(10)
f1 <- function(x) 5*exp(4*x)/(1+exp(4*x));
f2 <- function(x) {
  ind <- x > .5
  f <- x*0
  f[ind] <- (x[ind] - .5)^2*10
  f 
}
f3 <- function(x) 0.2 * x^11 * (10 * (1 - x))^6 + 
      10 * (10 * x)^3 * (1 - x)^10
n <- 200
x <- runif(n); z <- runif(n); v <- runif(n)
mu <- f1(x) + f2(z) + f3(v)
y <- mu + rnorm(n)

## Preliminary unconstrained gam fit...
G <- gam(y~s(x)+s(z)+s(v,k=20),fit=FALSE)
b <- gam(G=G)

## generate constraints, by finite differencing
## using predict.gam ....
eps <- 1e-7
pd0 <- data.frame(x=seq(0,1,length=100),z=rep(.5,100),
                  v=rep(.5,100))
pd1 <- data.frame(x=seq(0,1,length=100)+eps,z=rep(.5,100),
                  v=rep(.5,100))
X0 <- predict(b,newdata=pd0,type="lpmatrix")
X1 <- predict(b,newdata=pd1,type="lpmatrix")
Xx <- (X1 - X0)/eps ## Xx %*% coef(b) must be positive 
pd0 <- data.frame(z=seq(0,1,length=100),x=rep(.5,100),
                  v=rep(.5,100))
pd1 <- data.frame(z=seq(0,1,length=100)+eps,x=rep(.5,100),
                  v=rep(.5,100))
X0 <- predict(b,newdata=pd0,type="lpmatrix")
X1 <- predict(b,newdata=pd1,type="lpmatrix")
Xz <- (X1-X0)/eps
G$Ain <- rbind(Xx,Xz) ## inequality constraint matrix
G$bin <- rep(0,nrow(G$Ain))
G$C = matrix(0,0,ncol(G$X))
G$sp <- b$sp
G$p <- coef(b)
G$off <- G$off-1 ## to match what pcls is expecting
## force initial parameters to meet constraint
G$p[11:18] <- G$p[2:9]<- 0
p <- pcls(G) ## constrained fit
par(mfrow=c(2,3))
plot(b) ## original fit
b$coefficients <- p
plot(b) ## constrained fit
## note that standard errors in preceding plot are obtained from
## unconstrained fit

Author

Simon N. Wood simon.wood@r-project.org

References

Gill, P.E., Murray, W. and Wright, M.H. (1981) Practical Optimization. Academic Press, London.

Wood, S.N. (1994) Monotonic smoothing splines fitted by cross validation. SIAM Journal on Scientific Computing 15(5):1126-1133.

https://www.maths.ed.ac.uk/~swood34/

See Also

magic, mono.con


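Note: This article was selected and compiled by 純淨天空 from the original English work "Penalized Constrained Least Squares Fitting" by R-devel. Unless otherwise stated, copyright of the original code belongs to the original author. Do not reproduce or copy this translation without permission or authorization.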