

R nls Nonlinear Least Squares


The R function nls is part of the stats package.

Description

Determine the nonlinear (weighted) least-squares estimates of the parameters of a nonlinear model.

Usage

nls(formula, data, start, control, algorithm,
    trace, subset, weights, na.action, model,
    lower, upper, ...)

Arguments

formula

a nonlinear model formula including variables and parameters. Will be coerced to a formula if necessary.

data

an optional data frame in which to evaluate the variables in formula and weights. Can also be a list or an environment, but not a matrix.

start

a named list or named numeric vector of starting estimates. When start is missing (and formula is not a self-starting model, see selfStart), a very cheap guess for start is tried (if algorithm != "plinear").

control

an optional list of control settings. See nls.control for the names of the settable control values and their effect.

algorithm

a character string specifying the algorithm to use. The default algorithm is a Gauss-Newton algorithm. Other possible values are "plinear" (the Golub-Pereyra algorithm for partially linear least-squares models) and "port" (the 'nl2sol' algorithm from the Port library) -- see the references. Can be abbreviated.

trace

a logical value indicating whether a trace of the iteration progress should be printed. Default is FALSE. If TRUE, the residual (weighted) sum-of-squares, the convergence criterion and the parameter values are printed at the conclusion of each iteration. Note that format() is used, so these mostly depend on getOption("digits"). When the "plinear" algorithm is used, the conditional estimates of the linear parameters are printed after the nonlinear parameters. When the "port" algorithm is used, the objective function value printed is half the residual (weighted) sum-of-squares.

subset

an optional vector specifying a subset of observations to be used in the fitting process.

weights

an optional numeric vector of (fixed) weights. When present, the objective function is weighted least squares.

na.action

a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The 'factory-fresh' default is na.omit. Value na.exclude can be useful.

model

logical. If true, the model frame is returned as part of the object. Default is FALSE.

lower , upper

vectors of lower and upper bounds, replicated to be as long as start. If unspecified, all parameters are assumed to be unconstrained. Bounds can only be used with the "port" algorithm. They are ignored, with a warning, if given for other algorithms. (A short sketch combining bounds with the "port" algorithm follows this argument list.)

...

Additional optional arguments. None are used at present.
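
As a brief illustration of how several of these arguments fit together, here is a minimal sketch (not part of the official examples; the object name fit_port and the starting values are chosen purely for illustration). Bounds require algorithm = "port", and trace = TRUE prints the progress of each iteration:

Treated <- Puromycin[Puromycin$state == "treated", ]
fit_port <- nls(rate ~ Vm * conc/(K + conc),       # Michaelis-Menten form
                data      = Treated,
                start     = list(Vm = 200, K = 0.05),
                lower     = c(Vm = 0,   K = 0),    # bounds only work with "port"
                upper     = c(Vm = 500, K = 1),
                algorithm = "port",
                trace     = TRUE)                  # print iteration progress
coef(fit_port)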

Details

An nls object is a type of fitted model object. It has methods for the generic functions anova, coef, confint, deviance, df.residual, fitted, formula, logLik, predict, print, profile, residuals, summary, vcov and weights.
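
For instance, the generic functions listed above can be applied directly to a fitted object. A minimal sketch, reusing the self-starting DNase1 fit that also appears in the Examples below:

fm <- nls(density ~ SSlogis(log(conc), Asym, xmid, scal),
          data = subset(DNase, Run == 1))
coef(fm)          # parameter estimates
summary(fm)       # estimates, standard errors and t statistics
confint(fm)       # profile-based confidence intervals
head(predict(fm)) # fitted values for the original data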

Variables in formula (and weights if not missing) are looked for first in data, then in the environment of formula and finally along the search path. Functions in formula are searched for first in the environment of formula and then along the search path.

Arguments subset and na.action are supported only when all the variables in the formula taken from data are of the same length: other cases give a warning.

Do note that the anova method does not check that the models are nested: this cannot easily be done automatically, so use with care.

Value

A list of

m

an nlsModel object incorporating the model.

data

the expression that was passed to nls as the data argument. The actual data values are present in the environment of the m component, e.g., environment(m$conv).

call

the matched call with several components, notably algorithm.

na.action

the "na.action" attribute (if any) of the model frame.

dataClasses

the "dataClasses" attribute (if any) of the "terms" attribute of the model frame.

model

if model = TRUE, the model frame.

weights

if weights is supplied, the weights.

convInfo

a list with convergence information.

control

the control list used, see the control argument.

convergence , message

for an algorithm = "port" fit only, a convergence code (0 for convergence) and message.

Using these is deprecated, as they are now available from convInfo.
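
A short sketch of inspecting some of these components on a fitted object (the object fm below is hypothetical and is fitted as in the Examples):

fm <- nls(density ~ SSlogis(log(conc), Asym, xmid, scal),
          data = subset(DNase, Run == 1))
class(fm$m)   # "nlsModel"
fm$convInfo   # convergence information (isConv, finIter, finTol, ...)
fm$call       # the matched call, including the algorithm used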

Warning

The default settings of nls generally fail on artificial "zero-residual" data problems.

The nls function uses a relative-offset convergence criterion that compares the numerical imprecision at the current parameter estimates to the residual sum-of-squares. This performs well on data of the form

  y = f(x, θ) + ε   (with var(ε) > 0).

It fails to indicate convergence on data of the form

  y = f(x, θ),

because the criterion amounts to comparing two components of the round-off error. To avoid a zero-divide in computing the convergence testing value, a positive constant scaleOffset should be added to the denominator sum-of-squares; it is set in control, as in the example below; this does not yet apply to algorithm = "port".

The algorithm = "port" code appears unfinished, and does not even check that the starting value is within the bounds. Use with caution, especially where bounds are supplied.

Note

Setting warnOnly = TRUE in the control argument (see nls.control) returns a non-converged object (since R version 2.5.0), which might be useful for further convergence analysis, but not for inference.
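
A minimal sketch of this (assumed here, and closely related to the "zero-residual" example in the Examples section): with warnOnly = TRUE the non-converged fit is returned with a warning rather than an error, so its convergence diagnostics can still be inspected.

x <- 1:10
y <- 2*x + 3                           # artificial "zero-residual" data
fm_nc <- nls(y ~ a + b*x,
             start   = list(a = 0.1, b = 0.5),
             control = nls.control(warnOnly = TRUE))
fm_nc$convInfo$isConv                  # FALSE: convergence was not confirmed
fm_nc$convInfo$stopMessage             # why the iterations stopped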

Examples


require(graphics)

DNase1 <- subset(DNase, Run == 1)

## using a selfStart model
fm1DNase1 <- nls(density ~ SSlogis(log(conc), Asym, xmid, scal), DNase1)
summary(fm1DNase1)
## the coefficients only:
coef(fm1DNase1)
## including their SE, etc:
coef(summary(fm1DNase1))

## using conditional linearity
fm2DNase1 <- nls(density ~ 1/(1 + exp((xmid - log(conc))/scal)),
                 data = DNase1,
                 start = list(xmid = 0, scal = 1),
                 algorithm = "plinear")
summary(fm2DNase1)

## without conditional linearity
fm3DNase1 <- nls(density ~ Asym/(1 + exp((xmid - log(conc))/scal)),
                 data = DNase1,
                 start = list(Asym = 3, xmid = 0, scal = 1))
summary(fm3DNase1)

## using Port's nl2sol algorithm
fm4DNase1 <- nls(density ~ Asym/(1 + exp((xmid - log(conc))/scal)),
                 data = DNase1,
                 start = list(Asym = 3, xmid = 0, scal = 1),
                 algorithm = "port")
summary(fm4DNase1)

## weighted nonlinear regression
Treated <- Puromycin[Puromycin$state == "treated", ]
weighted.MM <- function(resp, conc, Vm, K)
{
    ## Purpose: exactly as white book p. 451 -- RHS for nls()
    ##  Weighted version of Michaelis-Menten model
    ## ----------------------------------------------------------
    ## Arguments: 'y', 'x' and the two parameters (see book)
    ## ----------------------------------------------------------
    ## Author: Martin Maechler, Date: 23 Mar 2001

    pred <- (Vm * conc)/(K + conc)
    (resp - pred) / sqrt(pred)
}

Pur.wt <- nls( ~ weighted.MM(rate, conc, Vm, K), data = Treated,
              start = list(Vm = 200, K = 0.1))
summary(Pur.wt)

## Passing arguments using a list that can not be coerced to a data.frame
lisTreat <- with(Treated,
                 list(conc1 = conc[1], conc.1 = conc[-1], rate = rate))

weighted.MM1 <- function(resp, conc1, conc.1, Vm, K)
{
     conc <- c(conc1, conc.1)
     pred <- (Vm * conc)/(K + conc)
    (resp - pred) / sqrt(pred)
}
Pur.wt1 <- nls( ~ weighted.MM1(rate, conc1, conc.1, Vm, K),
               data = lisTreat, start = list(Vm = 200, K = 0.1))
stopifnot(all.equal(coef(Pur.wt), coef(Pur.wt1)))

## Chambers and Hastie (1992) Statistical Models in S  (p. 537):
## If the value of the right side [of formula] has an attribute called
## 'gradient' this should be a matrix with the number of rows equal
## to the length of the response and one column for each parameter.

weighted.MM.grad <- function(resp, conc1, conc.1, Vm, K)
{
  conc <- c(conc1, conc.1)

  K.conc <- K+conc
  dy.dV <- conc/K.conc
  dy.dK <- -Vm*dy.dV/K.conc
  pred <- Vm*dy.dV
  pred.5 <- sqrt(pred)
  dev <- (resp - pred) / pred.5
  Ddev <- -0.5*(resp+pred)/(pred.5*pred)
  attr(dev, "gradient") <- Ddev * cbind(Vm = dy.dV, K = dy.dK)
  dev
}

Pur.wt.grad <- nls( ~ weighted.MM.grad(rate, conc1, conc.1, Vm, K),
                   data = lisTreat, start = list(Vm = 200, K = 0.1))

rbind(coef(Pur.wt), coef(Pur.wt1), coef(Pur.wt.grad))

## In this example, there seems no advantage to providing the gradient.
## In other cases, there might be.


## The two examples below show that you can fit a model to
## artificial data with noise but not to artificial data
## without noise.
x <- 1:10
y <- 2*x + 3                            # perfect fit
## terminates in an error, because convergence cannot be confirmed:
try(nls(y ~ a + b*x, start = list(a = 0.12345, b = 0.54321)))
## adjusting the convergence test by adding 'scaleOffset' to its denominator RSS:
nls(y ~ a + b*x, start = list(a = 0.12345, b = 0.54321),
    control = list(scaleOffset = 1, printEval=TRUE))
## Alternatively jittering the "too exact" values, slightly:
set.seed(27)
yeps <- y + rnorm(length(y), sd = 0.01) # added noise
nls(yeps ~ a + b*x, start = list(a = 0.12345, b = 0.54321))


## the nls() internal cheap guess for starting values can be sufficient:
x <- -(1:100)/10
y <- 100 + 10 * exp(x / 2) + rnorm(x)/10
nlmod <- nls(y ~  Const + A * exp(B * x))

plot(x,y, main = "nls(*), data, true function and fit, n=100")
curve(100 + 10 * exp(x / 2), col = 4, add = TRUE)
lines(x, predict(nlmod), col = 2)

## Here, requiring close convergence, must use more accurate numerical differentiation,
## as this typically gives Error: "step factor .. reduced below 'minFactor' .."
## IGNORE_RDIFF_BEGIN
try(nlm1 <- update(nlmod, control = list(tol = 1e-7)))
o2 <- options(digits = 10) # more accuracy for 'trace'
## central differencing works here typically (PR#18165: not converging on *some*):
ctr2 <- nls.control(nDcentral=TRUE, tol = 8e-8, # <- even smaller than above
   warnOnly =
        TRUE || # << work around; e.g. needed on some ATLAS-Lapack setups
        (grepl("^aarch64.*linux", R.version$platform) && grepl("^NixOS", osVersion)
              ))
(nlm2 <- update(nlmod, control = ctr2, trace = TRUE)); options(o2)
## --> convergence tolerance  4.997e-8 (in 11 iter.)
## IGNORE_RDIFF_END

## The muscle dataset in MASS is from an experiment on muscle
## contraction on 21 animals.  The observed variables are Strip
## (identifier of muscle), Conc (Cacl concentration) and Length
## (resulting length of muscle section).
## IGNORE_RDIFF_BEGIN
if(requireNamespace("MASS", quietly = TRUE)) withAutoprint({
## The non linear model considered is
##       Length = alpha + beta*exp(-Conc/theta) + error
## where theta is constant but alpha and beta may vary with Strip.

with(MASS::muscle, table(Strip)) # 2, 3 or 4 obs per strip

## We first use the plinear algorithm to fit an overall model,
## ignoring that alpha and beta might vary with Strip.
musc.1 <- nls(Length ~ cbind(1, exp(-Conc/th)), MASS::muscle,
              start = list(th = 1), algorithm = "plinear")
summary(musc.1)

## Then we use nls' indexing feature for parameters in non-linear
## models to use the conventional algorithm to fit a model in which
## alpha and beta vary with Strip.  The starting values are provided
## by the previously fitted model.
## Note that with indexed parameters, the starting values must be
## given in a list (with names):
b <- coef(musc.1)
musc.2 <- nls(Length ~ a[Strip] + b[Strip]*exp(-Conc/th), MASS::muscle,
              start = list(a = rep(b[2], 21), b = rep(b[3], 21), th = b[1]))
summary(musc.2)
})
## IGNORE_RDIFF_END

Author(s)

Douglas M. Bates and Saikat DebRoy: David M. Gay for the Fortran code used by algorithm = "port".

References

Bates, D. M. and Watts, D. G. (1988) Nonlinear Regression Analysis and Its Applications, Wiley.

Bates, D. M. and Chambers, J. M. (1992) Nonlinear models. Chapter 10 of Statistical Models in S eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.

https://netlib.org/port/ for the Port library documentation.

See Also

summary.nls, predict.nls, profile.nls

Self-starting models (with 'automatic initial values'): selfStart.
