Python mxnet.ndarray.sparse.adagrad_update usage and code examples

Usage:
mxnet.ndarray.sparse.adagrad_update(weight=None, grad=None, history=None, lr=_Null, epsilon=_Null, wd=_Null, rescale_grad=_Null, clip_gradient=_Null, out=None, name=None, **kwargs)

Parameters:

weight: (NDArray) - Weight
grad: (NDArray) - Gradient
history: (NDArray) - History
lr: (float, required) - Learning rate
epsilon: (float, optional, default=1.00000001e-07) - Epsilon
wd: (float, optional, default=0) - Weight decay
rescale_grad: (float, optional, default=1) - Rescale the gradient to grad = rescale_grad*grad.
clip_gradient: (float, optional, default=-1) - Clip the gradient to the range [-clip_gradient, clip_gradient]. If clip_gradient <= 0, gradient clipping is turned off. grad = max(min(grad, clip_gradient), -clip_gradient).
out: (NDArray, optional) - The output NDArray to hold the result.
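
The rescale_grad and clip_gradient semantics described above can be illustrated with a minimal pure-Python sketch. The helper name rescale_and_clip is hypothetical and is not part of MXNet; it only mirrors the formulas in the parameter descriptions on plain lists of floats.

```python
def rescale_and_clip(grad, rescale_grad=1.0, clip_gradient=-1.0):
    """Rescale each gradient entry, then clip it if clipping is enabled.

    Hypothetical helper for illustration only; not an MXNet function.
    """
    result = []
    for g in grad:
        g = rescale_grad * g
        # clip_gradient <= 0 means clipping is turned off
        if clip_gradient > 0:
            g = max(min(g, clip_gradient), -clip_gradient)
        result.append(g)
    return result


# Rescaling by 2 gives [1.0, -6.0, 4.0]; clipping to [-2.5, 2.5] follows.
clipped = rescale_and_clip([0.5, -3.0, 2.0], rescale_grad=2.0, clip_gradient=2.5)

# With the defaults (rescale_grad=1, clip_gradient=-1) the input is unchanged.
unchanged = rescale_and_clip([0.5, -3.0, 2.0])
```

Note that rescaling is applied before clipping, so the clipping range bounds the rescaled gradient rather than the raw one.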

Returns:

out: The output of this function.

Return type:

NDArray or list of NDArrays

Update function for the AdaGrad optimizer.

Referenced from Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, available at http://www.jmlr.org/papers/volume12/duchi11a/duchi11a.pdf.

Updates are applied by:

rescaled_grad = clip(grad * rescale_grad, clip_gradient)
history = history + square(rescaled_grad)
w = w - learning_rate * rescaled_grad / sqrt(history + epsilon)

Note that non-zero values for the weight decay option (wd) are not supported.
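
The three update equations above can be traced with a small dense, pure-Python sketch. The function name adagrad_update_sketch is hypothetical; it works on plain lists, ignores sparse storage entirely, and is meant only to show the arithmetic, not how MXNet implements the operator.

```python
import math


def adagrad_update_sketch(weight, grad, history, lr, epsilon=1e-7,
                          rescale_grad=1.0, clip_gradient=-1.0):
    """Apply one AdaGrad step element-wise and return (new_weight, new_history).

    Hypothetical illustration of the documented formulas; wd is omitted
    because non-zero weight decay is not supported.
    """
    new_weight, new_history = [], []
    for w, g, h in zip(weight, grad, history):
        # rescaled_grad = clip(grad * rescale_grad, clip_gradient)
        g = rescale_grad * g
        if clip_gradient > 0:
            g = max(min(g, clip_gradient), -clip_gradient)
        # history = history + square(rescaled_grad)
        h = h + g * g
        # w = w - learning_rate * rescaled_grad / sqrt(history + epsilon)
        w = w - lr * g / math.sqrt(h + epsilon)
        new_weight.append(w)
        new_history.append(h)
    return new_weight, new_history


new_w, new_h = adagrad_update_sketch(
    weight=[1.0, 2.0], grad=[3.0, 0.0], history=[0.0, 0.0], lr=0.1)
# new_h is [9.0, 0.0]; the first weight moves to roughly 0.9,
# and the second weight is unchanged because its gradient is zero.
```

Because history accumulates squared gradients, the effective step size for an entry shrinks as that entry keeps receiving large gradients. The epsilon term keeps the denominator non-zero for entries whose history is still zero.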

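Note: This article was selected and compiled by 純淨天空 from the original English work mxnet.ndarray.sparse.adagrad_update by apache.org. Unless otherwise stated, copyright of the original code belongs to the original authors. Do not reproduce or copy this translation without permission or authorization.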