Python tf.gradients: usage and code examples

Constructs symbolic derivatives of the sum of ys w.r.t. x in xs.

Usage

tf.gradients(
    ys, xs, grad_ys=None, name='gradients', gate_gradients=False,
    aggregation_method=None, stop_gradients=None,
    unconnected_gradients=tf.UnconnectedGradients.NONE
)

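Parameters

ys A Tensor or list of tensors to be differentiated.
xs A Tensor or list of tensors to be used for differentiation.
grad_ys Optional. A Tensor or list of tensors the same size as ys, holding the gradients computed for each y in ys.
name Optional name used to group all the gradient ops together. Defaults to 'gradients'.
gate_gradients If True, add a tuple around the gradients returned for an operation. This avoids some race conditions.
aggregation_method Specifies the method used to combine gradient terms. Accepted values are constants defined in the class AggregationMethod.
stop_gradients Optional. A Tensor or list of tensors not to differentiate through.
unconnected_gradients Optional. Specifies the gradient value returned when the given input tensors are unconnected. Accepted values are constants defined in the class tf.UnconnectedGradients; the default is tf.UnconnectedGradients.NONE.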

Returns

A list of Tensor of length len(xs), where each tensor is the sum(dy/dx) for y in ys and for x in xs.

Raises

LookupError If one of the operations between x and y does not have a registered gradient function.
ValueError If the arguments are invalid.
RuntimeError If called in Eager mode.

tf.gradients is only valid in a graph context. In particular, it is valid in the context of a tf.function wrapper, where code is executed as a graph.
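The difference between the two execution modes can be seen with a minimal sketch. It assumes TensorFlow 2.x with eager execution enabled by default; the helper name grad_of_square is hypothetical and chosen only for illustration.

```python
import tensorflow as tf

def grad_of_square(x):
  # Hypothetical helper: builds d(x*x)/dx with tf.gradients.
  y = x * x
  return tf.gradients(y, [x])

# Called directly, the function runs eagerly and tf.gradients raises.
try:
  grad_of_square(tf.constant(3.0))
  eager_error = None
except RuntimeError as err:
  eager_error = err

# Wrapped in tf.function, the same code is traced into a graph and works.
graph_fn = tf.function(grad_of_square)
graph_result = graph_fn(tf.constant(3.0))  # one tensor holding 2 * 3.0 = 6.0
```

In eager code, tf.GradientTape is the usual way to obtain the same derivative.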

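ys and xs are each a Tensor or a list of tensors. grad_ys is a list of Tensor, holding the gradients received by the ys. The list must be the same length as ys.

gradients() adds ops to the graph to output the derivatives of ys with respect to xs. It returns a list of Tensor of length len(xs), where each tensor is the sum(dy/dx) for y in ys and for x in xs.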
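To make the summation over ys concrete, here is a small sketch with two outputs that depend on one input. The function name summed_gradients and the chosen formulas are assumptions made purely for illustration.

```python
import tensorflow as tf

@tf.function
def summed_gradients():
  x = tf.constant(3.0)
  y1 = x * x    # dy1/dx = 2 * x = 6.0
  y2 = 2.0 * x  # dy2/dx = 2.0
  # One entry per x in xs; each entry sums dy/dx over all y in ys.
  return tf.gradients([y1, y2], [x])

summed = summed_gradients()  # a single tensor holding 6.0 + 2.0 = 8.0
```

The result has length 1 because xs contains one tensor, even though ys contains two.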

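grad_ys is a list of tensors of the same length as ys that holds the initial gradients for each y in ys. When grad_ys is None, we fill in a tensor of '1's of the shape of y for each y in ys. A user can provide their own initial grad_ys to compute the derivatives using a different initial gradient for each y (e.g., if one wanted to weight the gradient differently for each value in each y).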
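The effect of a custom grad_ys can be sketched as follows. The function name weighted_gradients and the weight values are illustrative assumptions, not part of the original documentation.

```python
import tensorflow as tf

@tf.function
def weighted_gradients():
  x = tf.constant([1.0, 2.0, 3.0])
  y = 2.0 * x  # dy/dx is 2.0 for every element
  # grad_ys=None: the initial gradient is a tensor of ones shaped like y.
  default = tf.gradients(y, [x])
  # Custom initial gradient: each element of y gets its own weight.
  weights = tf.constant([1.0, 10.0, 100.0])
  weighted = tf.gradients(y, [x], grad_ys=[weights])
  return default[0], weighted[0]

default_grad, weighted_grad = weighted_gradients()
# default_grad  holds [2.0, 2.0, 2.0]
# weighted_grad holds [2.0, 20.0, 200.0]
```

Each weight simply scales the gradient flowing back from the corresponding element of y.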

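stop_gradients is a Tensor or a list of tensors to be considered constant with respect to all xs. These tensors will not be backpropagated through, as though they had been explicitly disconnected using stop_gradient. Among other things, this allows computation of partial derivatives as opposed to total derivatives. For example: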

import tensorflow as tf

@tf.function
def example():
  a = tf.constant(0.)
  b = 2 * a
  return tf.gradients(a + b, [a, b], stop_gradients=[a, b])
example()
[<tf.Tensor:shape=(), dtype=float32, numpy=1.0>,
<tf.Tensor:shape=(), dtype=float32, numpy=1.0>]

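Here the partial derivatives evaluate to [1.0, 1.0], compared to the total derivatives tf.gradients(a + b, [a, b]), which take into account the influence of a on b and evaluate to [3.0, 1.0]. Note that the above is equivalent to: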

@tf.function
def example():
  a = tf.stop_gradient(tf.constant(0.))
  b = tf.stop_gradient(2 * a)
  return tf.gradients(a + b, [a, b])
example()
[<tf.Tensor:shape=(), dtype=float32, numpy=1.0>,
<tf.Tensor:shape=(), dtype=float32, numpy=1.0>]

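stop_gradients provides a way of stopping gradient after the graph has already been constructed, as compared to tf.stop_gradient which is used during graph construction. When the two approaches are combined, backpropagation stops at both tf.stop_gradient nodes and nodes in stop_gradients, whichever is encountered first.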
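As a further sketch of stopping gradients after the graph is built, the example below blocks backpropagation at an intermediate tensor without modifying the forward computation. The names total_vs_partial and h are hypothetical and used only for illustration.

```python
import tensorflow as tf

@tf.function
def total_vs_partial():
  x = tf.constant(3.0)
  h = x * x  # intermediate tensor that depends on x
  y = h + x
  # Total derivative: dy/dx = 2 * x + 1 = 7.0
  total = tf.gradients(y, [x])
  # Treat h as a constant: only the direct path from y to x remains,
  # so the result is 1.0.
  partial = tf.gradients(y, [x], stop_gradients=[h])
  return total[0], partial[0]

total_grad, partial_grad = total_vs_partial()
```

Both calls reuse the same forward graph; only the backward pass differs.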

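All integer tensors are considered constant with respect to all xs, as if they were included in stop_gradients.

unconnected_gradients determines the value returned for each x in xs if it is unconnected in the graph to ys. By default this is None to safeguard against errors. Mathematically these gradients are zero, which can be requested using the 'zero' option. tf.UnconnectedGradients provides the following options and behaviors: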

@tf.function
def example(use_zero):
  a = tf.ones([1, 2])
  b = tf.ones([3, 1])
  if use_zero:
    return tf.gradients([b], [a], unconnected_gradients='zero')
  else:
    return tf.gradients([b], [a], unconnected_gradients='none')
example(False)
[None]
example(True)
[<tf.Tensor:shape=(1, 2), dtype=float32, numpy=array([[0., 0.]], ...)>]

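Let us take one practical example which comes up during the back propagation phase. This function is used to evaluate the derivatives of the cost function with respect to weights Ws and biases bs. The sample implementation below illustrates what it is actually used for: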

@tf.function
def example():
  Ws = tf.constant(0.)
  bs = 2 * Ws
  cost = Ws + bs  # This is just an example. Please ignore the formulas.
  g = tf.gradients(cost, [Ws, bs])
  dCost_dW, dCost_db = g
  return dCost_dW, dCost_db
example()
(<tf.Tensor:shape=(), dtype=float32, numpy=3.0>,
<tf.Tensor:shape=(), dtype=float32, numpy=1.0>)

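Note: This article was selected and compiled by 純淨天空 from the original English work tf.gradients on tensorflow.org. Unless otherwise stated, copyright of the original code belongs to the original author; do not reproduce or copy this translation without permission or authorization.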