

Python mxnet.executor.Executor.backward Usage and Code Examples

Usage:

backward(out_grads=None, is_train=True)

Parameters

  • out_grads (NDArray or list of NDArray or dict of str to NDArray, optional) - Gradients with respect to the outputs, to be propagated back. This parameter is only needed when bind is called on outputs that are not a loss function.
  • is_train (bool, default True) - Whether this backward pass is for training or inference. Note that in rare cases you may want to call backward with is_train=False to get gradients during inference.

Run a backward pass to get the gradients of the arguments.
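
In a typical symbolic training loop, backward sits between a training-mode forward call and a parameter update that consumes the gradient arrays passed as args_grad at bind time. The following is a minimal sketch only, reusing the texec, args, and args_grad names set up in the first example below; the learning rate lr is a hypothetical value, not part of the API. Since bind uses the provided NDArrays directly rather than copying them, updating args also updates the parameters the executor reads on its next forward pass.

lr = 0.1                                   # hypothetical learning rate
texec.forward(is_train=True)               # forward pass in training mode
texec.backward()                           # gradients are written into the args_grad arrays
for name in args_grad:                     # plain SGD step on the bound arguments
    args[name][:] = args[name] - lr * args_grad[name]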

Examples

>>> # Example for binding on loss function symbol, which gives the loss value of the model.
>>> # Equivalently it gives the head gradient for backward pass.
>>> # In this example the built-in SoftmaxOutput is used as loss function.
>>> # MakeLoss can be used to define customized loss function symbol.
>>> net = mx.sym.Variable('data')
>>> net = mx.sym.FullyConnected(net, name='fc', num_hidden=6)
>>> net = mx.sym.Activation(net, name='relu', act_type="relu")
>>> net = mx.sym.SoftmaxOutput(net, name='softmax')
>>> args =  {'data': mx.nd.ones((1, 4)), 'fc_weight': mx.nd.ones((6, 4)),
>>>          'fc_bias': mx.nd.array((1, 4, 4, 4, 5, 6)), 'softmax_label': mx.nd.ones((1))}
>>> args_grad = {'fc_weight': mx.nd.zeros((6, 4)), 'fc_bias': mx.nd.zeros((6))}
>>> texec = net.bind(ctx=mx.cpu(), args=args, args_grad=args_grad)
>>> out = texec.forward(is_train=True)[0].copy()
>>> print(out.asnumpy())
[[ 0.00378404  0.07600445  0.07600445  0.07600445  0.20660152  0.5616011 ]]
>>> texec.backward()
>>> print(texec.grad_arrays[1].asnumpy())
[[ 0.00378404  0.00378404  0.00378404  0.00378404]
 [-0.92399555 -0.92399555 -0.92399555 -0.92399555]
 [ 0.07600445  0.07600445  0.07600445  0.07600445]
 [ 0.07600445  0.07600445  0.07600445  0.07600445]
 [ 0.20660152  0.20660152  0.20660152  0.20660152]
 [ 0.5616011   0.5616011   0.5616011   0.5616011 ]]
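>>> # Added sketch, not part of the original example: grad_arrays follows the
>>> # argument order of net.list_arguments() -- data, fc_weight, fc_bias,
>>> # softmax_label -- so index 1 above is the gradient of fc_weight. The same
>>> # array can also be looked up by name through the executor's grad_dict:
>>> print(texec.grad_dict['fc_weight'].asnumpy())   # same 6x4 array as printed above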
>>>
>>> # Example for binding on non-loss function symbol.
>>> # Here the binding symbol is neither built-in loss function
>>> # nor customized loss created by MakeLoss.
>>> # As a result the head gradient is not automatically provided.
>>> a = mx.sym.Variable('a')
>>> b = mx.sym.Variable('b')
>>> # c is not a loss function symbol
>>> c = 2 * a + b
>>> args = {'a': mx.nd.array([1,2]), 'b':mx.nd.array([2,3])}
>>> args_grad = {'a': mx.nd.zeros((2)), 'b': mx.nd.zeros((2))}
>>> texec = c.bind(ctx=mx.cpu(), args=args, args_grad=args_grad)
>>> out = texec.forward(is_train=True)[0].copy()
>>> print(out.asnumpy())
[ 4.  7.]
>>> # out_grads is the head gradient in backward pass.
>>> # Here we define 'c' as loss function.
>>> # Then 'out' is passed as head gradient of backward pass.
>>> texec.backward(out)
>>> print(texec.grad_arrays[0].asnumpy())
[ 8.  14.]
>>> print(texec.grad_arrays[1].asnumpy())
[ 4.  7.]
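
As a check on these numbers: c = 2 * a + b, so the local gradients are dc/da = 2 and dc/db = 1. With out = [4., 7.] passed as the head gradient, the gradient with respect to a is 2 * [4., 7.] = [8., 14.] and the gradient with respect to b is 1 * [4., 7.] = [4., 7.], which matches the two arrays printed above.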

Note: This article is adapted from the original English work mxnet.executor.Executor.backward on apache.org. Unless otherwise stated, the copyright of the original code belongs to its authors; please do not reproduce or copy this translation without permission.