R pcls 惩罚约束最小二乘拟合

R语言 pcls 位于 mgcv 包(package)。

说明

使用二次规划求解受线性等式和不等式约束的二次惩罚的最小二乘问题。

用法

pcls(M)

参数

M

是 pcls 的单个列表参数。它应该具有以下要素：

y: 响应数据向量。
w: 数据权重向量(通常与方差的倒数成正比)。
X: 问题的设计矩阵，注意ncol(M$X)必须给出模型参数的数量，而nrow(M$X)应该给出数据的数量。
C: 包含问题的任何线性等式约束的矩阵(例如 {\bf Cp}={\bf c} 中的 \bf C )。如果没有等式约束，则将其初始化为零乘零矩阵。请注意，无需提供向量 \bf c ，它是由初始参数估计 \bf p 隐式定义的。
S: 惩罚矩阵列表。 S[[i]] 是包含第 i 个惩罚矩阵的所有非零元素的最小连续矩阵。它惩罚的第一个参数由 off[i]+1 给出(从 1 开始计数)。
离开: 将 M$S 的元素定位在每个惩罚系数矩阵内的正确位置的偏移值。 (零偏移意味着从第一个位置开始)
sp: 平滑参数估计值的数组。
p: 一系列可行的初始参数估计 - 这些必须满足约束，但应避免将不等式约束满足为等式约束。
艾因: 不等式约束的矩阵 {\bf A}_{in} {\bf p} > {\bf b}_{in} 。
垃圾桶: 不等式约束中的向量。

细节

这解决了这个问题：

受约束 {\bf Cp}={\bf c} 和 {\bf A}_{in}{\bf p}>{\bf b}_{in} ，w.r.t. \bf p 给定平滑参数 \lambda_i 。 {\bf X} 是设计矩阵，\bf p 参数向量，\bf y 数据向量，\bf W 对角权重矩阵， {\bf S}_i 定义第 i 个惩罚的正半定系数矩阵，\bf C a定义问题的线性等式约束的系数矩阵。平滑参数是 \lambda_i 。请注意， {\bf X} 必须具有完整的列等级，至少在投影到任何等式约束的空空间时是如此。 {\bf A}_{in} 是定义不等式约束的系数矩阵，而{\bf b}_{in} 是定义不等式约束所涉及的向量。

二次规划用于执行求解。所使用的方法旨在实现最小二乘问题的最大稳定性：即 {\bf X}^\prime {\bf X} 未明确形成。参见吉尔等人。 1981年。

值

该函数返回一个包含估计参数向量的数组。

例子

require(mgcv)
# first an un-penalized example - fit E(y)=a+bx subject to a>0
set.seed(0)
n <- 100
x <- runif(n); y <- x - 0.2 + rnorm(n)*0.1
M <- list(X=matrix(0,n,2),p=c(0.1,0.5),off=array(0,0),S=list(),
Ain=matrix(0,1,2),bin=0,C=matrix(0,0,0),sp=array(0,0),y=y,w=y*0+1)
M$X[,1] <- 1; M$X[,2] <- x; M$Ain[1,] <- c(1,0)
pcls(M) -> M$p
plot(x,y); abline(M$p,col=2); abline(coef(lm(y~x)),col=3)

# Penalized example: monotonic penalized regression spline .....

# Generate data from a monotonic truth.
x <- runif(100)*4-1;x <- sort(x);
f <- exp(4*x)/(1+exp(4*x)); y <- f+rnorm(100)*0.1; plot(x,y)
dat <- data.frame(x=x,y=y)
# Show regular spline fit (and save fitted object)
f.ug <- gam(y~s(x,k=10,bs="cr")); lines(x,fitted(f.ug))
# Create Design matrix, constraints etc. for monotonic spline....
sm <- smoothCon(s(x,k=10,bs="cr"),dat,knots=NULL)[[1]]
F <- mono.con(sm$xp);   # get constraints
G <- list(X=sm$X,C=matrix(0,0,0),sp=f.ug$sp,p=sm$xp,y=y,w=y*0+1)
G$Ain <- F$A;G$bin <- F$b;G$S <- sm$S;G$off <- 0

p <- pcls(G);  # fit spline (using s.p. from unconstrained fit)

fv<-Predict.matrix(sm,data.frame(x=x))%*%p
lines(x,fv,col=2)

# now a tprs example of the same thing....

f.ug <- gam(y~s(x,k=10)); lines(x,fitted(f.ug))
# Create Design matrix, constriants etc. for monotonic spline....
sm <- smoothCon(s(x,k=10,bs="tp"),dat,knots=NULL)[[1]]
xc <- 0:39/39 # points on [0,1]  
nc <- length(xc)  # number of constraints
xc <- xc*4-1  # points at which to impose constraints
A0 <- Predict.matrix(sm,data.frame(x=xc)) 
# ... A0%*%p evaluates spline at xc points
A1 <- Predict.matrix(sm,data.frame(x=xc+1e-6)) 
A <- (A1-A0)/1e-6    
##  ... approx. constraint matrix (A%*%p is -ve 
## spline gradient at points xc)
G <- list(X=sm$X,C=matrix(0,0,0),sp=f.ug$sp,y=y,w=y*0+1,S=sm$S,off=0)
G$Ain <- A;    # constraint matrix
G$bin <- rep(0,nc);  # constraint vector
G$p <- rep(0,10); G$p[10] <- 0.1  
# ... monotonic start params, got by setting coefs of polynomial part
p <- pcls(G);  # fit spline (using s.p. from unconstrained fit)

fv2 <- Predict.matrix(sm,data.frame(x=x))%*%p
lines(x,fv2,col=3)

######################################
## monotonic additive model example...
######################################

## First simulate data...

set.seed(10)
f1 <- function(x) 5*exp(4*x)/(1+exp(4*x));
f2 <- function(x) {
  ind <- x > .5
  f <- x*0
  f[ind] <- (x[ind] - .5)^2*10
  f 
}
f3 <- function(x) 0.2 * x^11 * (10 * (1 - x))^6 + 
      10 * (10 * x)^3 * (1 - x)^10
n <- 200
x <- runif(n); z <- runif(n); v <- runif(n)
mu <- f1(x) + f2(z) + f3(v)
y <- mu + rnorm(n)

## Preliminary unconstrained gam fit...
G <- gam(y~s(x)+s(z)+s(v,k=20),fit=FALSE)
b <- gam(G=G)

## generate constraints, by finite differencing
## using predict.gam ....
eps <- 1e-7
pd0 <- data.frame(x=seq(0,1,length=100),z=rep(.5,100),
                  v=rep(.5,100))
pd1 <- data.frame(x=seq(0,1,length=100)+eps,z=rep(.5,100),
                  v=rep(.5,100))
X0 <- predict(b,newdata=pd0,type="lpmatrix")
X1 <- predict(b,newdata=pd1,type="lpmatrix")
Xx <- (X1 - X0)/eps ## Xx %*% coef(b) must be positive 
pd0 <- data.frame(z=seq(0,1,length=100),x=rep(.5,100),
                  v=rep(.5,100))
pd1 <- data.frame(z=seq(0,1,length=100)+eps,x=rep(.5,100),
                  v=rep(.5,100))
X0 <- predict(b,newdata=pd0,type="lpmatrix")
X1 <- predict(b,newdata=pd1,type="lpmatrix")
Xz <- (X1-X0)/eps
G$Ain <- rbind(Xx,Xz) ## inequality constraint matrix
G$bin <- rep(0,nrow(G$Ain))
G$C = matrix(0,0,ncol(G$X))
G$sp <- b$sp
G$p <- coef(b)
G$off <- G$off-1 ## to match what pcls is expecting
## force inital parameters to meet constraint
G$p[11:18] <- G$p[2:9]<- 0
p <- pcls(G) ## constrained fit
par(mfrow=c(2,3))
plot(b) ## original fit
b$coefficients <- p
plot(b) ## constrained fit
## note that standard errors in preceding plot are obtained from
## unconstrained fit

作者

Simon N. Wood simon.wood@r-project.org

参考

Gill, P.E., Murray, W. and Wright, M.H. (1981) Practical Optimization. Academic Press, London.

Wood, S.N. (1994) Monotonic smoothing splines fitted by cross validation SIAM Journal on Scientific Computing 15(5):1126-1133

https://www.maths.ed.ac.uk/~swood34/

也可以看看

magic , mono.con

相关用法

注：本文由纯净天空筛选整理自R-devel大神的英文原创作品 Penalized Constrained Least Squares Fitting。非经特殊声明，原始代码版权归原作者所有，本译文未经允许或授权，请勿转载或复制。