

R nls Nonlinear Least Squares


The R function nls belongs to the stats package.

Description

Determine the nonlinear (weighted) least-squares estimates of the parameters of a nonlinear model.

Usage

nls(formula, data, start, control, algorithm,
    trace, subset, weights, na.action, model,
    lower, upper, ...)

Arguments

formula

a nonlinear model formula including variables and parameters. Will be coerced to a formula if necessary.

data

an optional data frame in which to evaluate the variables in formula and weights. Can also be a list or an environment, but not a matrix.

start

a named list or named numeric vector of starting estimates. When start is missing (and formula is not a self-starting model, see selfStart), a very cheap guess for start is tried (if algorithm != "plinear").

control

an optional list of control settings. See nls.control for the names of the settable control values and their effect.

algorithm

character string specifying the algorithm to use. The default algorithm is a Gauss-Newton algorithm. Other possible values are "plinear" for the Golub-Pereyra algorithm for partially linear least-squares models and "port" for the 'nl2sol' algorithm from the Port library -- see the references. Can be abbreviated.

trace

logical value indicating if a trace of the iteration progress should be printed. Default is FALSE. If TRUE, the residual (weighted) sum-of-squares, the convergence criterion and the parameter values are printed at the conclusion of each iteration. Note that format() is used, so these mostly depend on getOption("digits"). When the "plinear" algorithm is used, the conditional estimates of the linear parameters are printed after the nonlinear parameters. When the "port" algorithm is used, the objective function value printed is half the residual (weighted) sum-of-squares.

subset

an optional vector specifying a subset of observations to be used in the fitting process.

weights

an optional numeric vector of (fixed) weights. When present, the objective function is weighted least squares. (See the sketch after this argument list.)

na.action

a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The 'factory-fresh' default is na.omit. Value na.exclude can be useful.

model

logical. If true, the model frame is returned as part of the object. Default is FALSE.

lower, upper

vectors of lower and upper bounds, replicated to be as long as start. If unspecified, all parameters are assumed to be unconstrained. Bounds can only be used with the "port" algorithm. They are ignored, with a warning, if given for other algorithms. (See the sketch after this argument list.)

...

Additional optional arguments. None are used at present.
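
The following sketch is not part of the official help page: the exponential-decay model, the simulated data and the weight choice 1/x are invented purely for illustration. It shows the weights, lower/upper, algorithm and trace arguments in use, as referenced in the entries above.

## simulated data, for illustration only
set.seed(1)
x <- 1:25
y <- 10 * exp(-x / 5) + 2 + rnorm(25, sd = 0.05)

## weighted least squares: the objective becomes sum(w * residuals^2)
fitW <- nls(y ~ A * exp(-x / tau) + c0,
            start = list(A = 8, tau = 4, c0 = 1),
            weights = 1 / x)
summary(fitW)

## bounds are honoured only by algorithm = "port"; with that algorithm
## trace = TRUE prints half the residual sum-of-squares at each iteration
fitB <- nls(y ~ A * exp(-x / tau) + c0,
            start = list(A = 8, tau = 4, c0 = 1),
            lower = c(A = 0, tau = 0.1, c0 = 0),
            upper = c(A = 100, tau = 50, c0 = 10),
            algorithm = "port", trace = TRUE)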

Details

An nls object is a type of fitted model object. It has methods for the generic functions anova, coef, confint, deviance, df.residual, fitted, formula, logLik, predict, print, profile, residuals, summary, vcov and weights.

Variables in formula (and weights if not missing) are looked for first in data, then in the environment of formula and finally along the search path. Functions in formula are searched for first in the environment of formula and then along the search path.

Arguments subset and na.action are supported only when all the variables in the formula taken from data are of the same length: other cases give a warning.

Note that the anova method does not check that the models are nested: this cannot easily be done automatically, so use with care.
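
As a quick sketch of the extractor methods listed above (this deliberately overlaps with the first example in the Examples section; the logistic fit to DNase run 1 is used only because it is small and self-starting):

DNase1 <- subset(DNase, Run == 1)
fm <- nls(density ~ SSlogis(log(conc), Asym, xmid, scal), DNase1)
coef(fm)      # parameter estimates
vcov(fm)      # estimated covariance matrix of the estimates
deviance(fm)  # residual sum-of-squares
confint(fm)   # profile-based confidence intervals
predict(fm, newdata = data.frame(conc = c(1, 5, 10)))  # predictions at new concentrations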

Value

A list of:

m

an nlsModel object incorporating the model.

data

the expression that was passed to nls as the data argument. The actual data values are present in the environment of the m components, e.g., environment(m$conv).

call

the matched call with several components, notably algorithm.

na.action

the "na.action" attribute (if any) of the model frame.

dataClasses

the "dataClasses" attribute (if any) of the "terms" attribute of the model frame.

model

if model = TRUE, the model frame.

weights

if weights is supplied, the weights.

convInfo

a list with convergence information; see the sketch after this list.

control

the control list used, see the control argument.

convergence, message

for an algorithm = "port" fit only, a convergence code (0 for convergence) and message.

Using these is deprecated, as they are now available from convInfo.
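
A short sketch of inspecting these components on a fitted object (again using the DNase run 1 fit from the Examples; component names as documented above):

fm <- nls(density ~ SSlogis(log(conc), Asym, xmid, scal), subset(DNase, Run == 1))
fm$call                     # the matched call, including the algorithm used
fm$convInfo                 # convergence information (whether and how the iterations stopped)
fm$control                  # the control settings in effect
ls(environment(fm$m$conv))  # the actual data values live in the environment of the 'm' components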

Warning

The default settings of nls generally fail on artificial "zero-residual" data problems.

The nls function uses a relative-offset convergence criterion that compares the numerical imprecision at the current parameter estimates to the residual sum-of-squares. This performs well on data of the form

    y = f(x, θ) + ε

(with var(ε) > 0). It fails to indicate convergence on data of the form

    y = f(x, θ)

because the criterion amounts to comparing two components of the round-off error. To avoid a zero-divide in computing the convergence testing value, a positive constant scaleOffset should be added to the denominator sum-of-squares; it is set in control, as in the example below; this does not yet apply to algorithm = "port".

The algorithm = "port" code appears unfinished, and does not even check that the starting value is within the bounds. Use with caution, especially where bounds are supplied.

Note

Setting warnOnly = TRUE in the control argument (see nls.control) returns a non-converged object (since R version 2.5.0), which might be useful for further convergence analysis, but not for inference.
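
A minimal sketch (reusing the artificial zero-residual data from the Warning section and the Examples below) of how warnOnly = TRUE returns a non-converged object for inspection:

x <- 1:10
y <- 2*x + 3                            # perfect fit, no noise
## with default settings this stops with an error (see the Examples);
## warnOnly = TRUE downgrades it to a warning and returns the object
fitNC <- nls(y ~ a + b*x, start = list(a = 0.12345, b = 0.54321),
             control = nls.control(warnOnly = TRUE))
fitNC$convInfo                          # records that convergence was not confirmed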

Examples


require(graphics)

DNase1 <- subset(DNase, Run == 1)

## using a selfStart model
fm1DNase1 <- nls(density ~ SSlogis(log(conc), Asym, xmid, scal), DNase1)
summary(fm1DNase1)
## the coefficients only:
coef(fm1DNase1)
## including their SE, etc:
coef(summary(fm1DNase1))

## using conditional linearity
fm2DNase1 <- nls(density ~ 1/(1 + exp((xmid - log(conc))/scal)),
                 data = DNase1,
                 start = list(xmid = 0, scal = 1),
                 algorithm = "plinear")
summary(fm2DNase1)

## without conditional linearity
fm3DNase1 <- nls(density ~ Asym/(1 + exp((xmid - log(conc))/scal)),
                 data = DNase1,
                 start = list(Asym = 3, xmid = 0, scal = 1))
summary(fm3DNase1)

## using Port's nl2sol algorithm
fm4DNase1 <- nls(density ~ Asym/(1 + exp((xmid - log(conc))/scal)),
                 data = DNase1,
                 start = list(Asym = 3, xmid = 0, scal = 1),
                 algorithm = "port")
summary(fm4DNase1)

## weighted nonlinear regression
Treated <- Puromycin[Puromycin$state == "treated", ]
weighted.MM <- function(resp, conc, Vm, K)
{
    ## Purpose: exactly as white book p. 451 -- RHS for nls()
    ##  Weighted version of Michaelis-Menten model
    ## ----------------------------------------------------------
    ## Arguments: 'y', 'x' and the two parameters (see book)
    ## ----------------------------------------------------------
    ## Author: Martin Maechler, Date: 23 Mar 2001

    pred <- (Vm * conc)/(K + conc)
    (resp - pred) / sqrt(pred)
}

Pur.wt <- nls( ~ weighted.MM(rate, conc, Vm, K), data = Treated,
              start = list(Vm = 200, K = 0.1))
summary(Pur.wt)

## Passing arguments using a list that can not be coerced to a data.frame
lisTreat <- with(Treated,
                 list(conc1 = conc[1], conc.1 = conc[-1], rate = rate))

weighted.MM1 <- function(resp, conc1, conc.1, Vm, K)
{
     conc <- c(conc1, conc.1)
     pred <- (Vm * conc)/(K + conc)
    (resp - pred) / sqrt(pred)
}
Pur.wt1 <- nls( ~ weighted.MM1(rate, conc1, conc.1, Vm, K),
               data = lisTreat, start = list(Vm = 200, K = 0.1))
stopifnot(all.equal(coef(Pur.wt), coef(Pur.wt1)))

## Chambers and Hastie (1992) Statistical Models in S  (p. 537):
## If the value of the right side [of formula] has an attribute called
## 'gradient' this should be a matrix with the number of rows equal
## to the length of the response and one column for each parameter.

weighted.MM.grad <- function(resp, conc1, conc.1, Vm, K)
{
  conc <- c(conc1, conc.1)

  K.conc <- K+conc
  dy.dV <- conc/K.conc
  dy.dK <- -Vm*dy.dV/K.conc
  pred <- Vm*dy.dV
  pred.5 <- sqrt(pred)
  dev <- (resp - pred) / pred.5
  Ddev <- -0.5*(resp+pred)/(pred.5*pred)
  attr(dev, "gradient") <- Ddev * cbind(Vm = dy.dV, K = dy.dK)
  dev
}

Pur.wt.grad <- nls( ~ weighted.MM.grad(rate, conc1, conc.1, Vm, K),
                   data = lisTreat, start = list(Vm = 200, K = 0.1))

rbind(coef(Pur.wt), coef(Pur.wt1), coef(Pur.wt.grad))

## In this example, there seems no advantage to providing the gradient.
## In other cases, there might be.


## The two examples below show that you can fit a model to
## artificial data with noise but not to artificial data
## without noise.
x <- 1:10
y <- 2*x + 3                            # perfect fit
## terminates in an error, because convergence cannot be confirmed:
try(nls(y ~ a + b*x, start = list(a = 0.12345, b = 0.54321)))
## adjusting the convergence test by adding 'scaleOffset' to its denominator RSS:
nls(y ~ a + b*x, start = list(a = 0.12345, b = 0.54321),
    control = list(scaleOffset = 1, printEval=TRUE))
## Alternatively jittering the "too exact" values, slightly:
set.seed(27)
yeps <- y + rnorm(length(y), sd = 0.01) # added noise
nls(yeps ~ a + b*x, start = list(a = 0.12345, b = 0.54321))


## the nls() internal cheap guess for starting values can be sufficient:
x <- -(1:100)/10
y <- 100 + 10 * exp(x / 2) + rnorm(x)/10
nlmod <- nls(y ~  Const + A * exp(B * x))

plot(x,y, main = "nls(*), data, true function and fit, n=100")
curve(100 + 10 * exp(x / 2), col = 4, add = TRUE)
lines(x, predict(nlmod), col = 2)

## Here, requiring close convergence, must use more accurate numerical differentiation,
## as this typically gives Error: "step factor .. reduced below 'minFactor' .."
## IGNORE_RDIFF_BEGIN
try(nlm1 <- update(nlmod, control = list(tol = 1e-7)))
o2 <- options(digits = 10) # more accuracy for 'trace'
## central differencing works here typically (PR#18165: not converging on *some*):
ctr2 <- nls.control(nDcentral=TRUE, tol = 8e-8, # <- even smaller than above
   warnOnly =
        TRUE || # << work around; e.g. needed on some ATLAS-Lapack setups
        (grepl("^aarch64.*linux", R.version$platform) && grepl("^NixOS", osVersion)
              ))
(nlm2 <- update(nlmod, control = ctr2, trace = TRUE)); options(o2)
## --> convergence tolerance  4.997e-8 (in 11 iter.)
## IGNORE_RDIFF_END

## The muscle dataset in MASS is from an experiment on muscle
## contraction on 21 animals.  The observed variables are Strip
## (identifier of muscle), Conc (Cacl concentration) and Length
## (resulting length of muscle section).
## IGNORE_RDIFF_BEGIN
if(requireNamespace("MASS", quietly = TRUE)) withAutoprint({
## The non linear model considered is
##       Length = alpha + beta*exp(-Conc/theta) + error
## where theta is constant but alpha and beta may vary with Strip.

with(MASS::muscle, table(Strip)) # 2, 3 or 4 obs per strip

## We first use the plinear algorithm to fit an overall model,
## ignoring that alpha and beta might vary with Strip.
musc.1 <- nls(Length ~ cbind(1, exp(-Conc/th)), MASS::muscle,
              start = list(th = 1), algorithm = "plinear")
summary(musc.1)

## Then we use nls' indexing feature for parameters in non-linear
## models to use the conventional algorithm to fit a model in which
## alpha and beta vary with Strip.  The starting values are provided
## by the previously fitted model.
## Note that with indexed parameters, the starting values must be
## given in a list (with names):
b <- coef(musc.1)
musc.2 <- nls(Length ~ a[Strip] + b[Strip]*exp(-Conc/th), MASS::muscle,
              start = list(a = rep(b[2], 21), b = rep(b[3], 21), th = b[1]))
summary(musc.2)
})
## IGNORE_RDIFF_END

Author(s)

Douglas M. Bates and Saikat DebRoy: David M. Gay for the Fortran code used by algorithm = "port".

References

Bates, D. M. and Watts, D. G. (1988) Nonlinear Regression Analysis and Its Applications, Wiley

Bates, D. M. and Chambers, J. M. (1992) Nonlinear models. Chapter 10 of Statistical Models in S eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.

https://netlib.org/port/ for the Port library documentation.

See Also

summary.nls, predict.nls, profile.nls

Self-starting models (with 'automatic initial values'): selfStart.


