Python tf.raw_ops.GRUBlockCellGrad usage and code examples

Computes the GRU cell back-propagation for 1 time step.

Usage

tf.raw_ops.GRUBlockCellGrad(
    x, h_prev, w_ru, w_c, b_ru, b_c, r, u, c, d_h, name=None
)

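Parameters

x A Tensor. Must be one of the following types: float32.
h_prev A Tensor. Must have the same type as x.
w_ru A Tensor. Must have the same type as x.
w_c A Tensor. Must have the same type as x.
b_ru A Tensor. Must have the same type as x.
b_c A Tensor. Must have the same type as x.
r A Tensor. Must have the same type as x.
u A Tensor. Must have the same type as x.
c A Tensor. Must have the same type as x.
d_h A Tensor. Must have the same type as x.
name A name for the operation (optional).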

Returns a tuple of Tensor objects (d_x, d_h_prev, d_c_bar, d_r_bar_u_bar).
d_x A Tensor. Has the same type as x.
d_h_prev A Tensor. Has the same type as x.
d_c_bar A Tensor. Has the same type as x.
d_r_bar_u_bar A Tensor. Has the same type as x.

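Args
x: Input to the GRU cell.
h_prev: State input from the previous GRU cell.
w_ru: Weight matrix for the reset and update gates.
w_c: Weight matrix for the cell connection gate.
b_ru: Bias vector for the reset and update gates.
b_c: Bias vector for the cell connection gate.
r: Output of the reset gate.
u: Output of the update gate.
c: Output of the cell connection gate.
d_h: Gradient of the objective function with respect to h_new.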

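Returns
d_x: Gradient of the objective function with respect to x.
d_h_prev: Gradient of the objective function with respect to h_prev.
d_c_bar: Gradient of the objective function with respect to c_bar.
d_r_bar_u_bar: Gradient of the objective function with respect to r_bar and u_bar.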

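This kernel op implements the following mathematical equations:

Note on the notation of the variables:

Concatenation of a and b is represented by a_b.
Element-wise product of a and b is represented by ab.
Element-wise product is also represented by \circ.
Matrix multiplication is represented by *.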
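To make the notation concrete, here is a minimal NumPy sketch of the three operations. The arrays a, b and w are made-up illustrative values, not part of the op.

```python
import numpy as np

# Hypothetical example values, chosen only to illustrate the notation.
a = np.array([[1.0, 2.0]])          # shape [1, 2]
b = np.array([[3.0, 4.0]])          # shape [1, 2]
w = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [2.0, 0.0]])          # shape [4, 2]

a_b = np.concatenate([a, b], axis=1)  # concatenation, written a_b; shape [1, 4]
ab = a * b                            # element-wise product, written ab or a \circ b
prod = a_b @ w                        # matrix multiplication, written a_b * w
```

Note that the symbol * in the formulas means matrix multiplication, whereas in NumPy the operator * is the element-wise product and @ is the matrix product.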

Additional notes for clarity:

w_ru can be segmented into 4 different matrices.

w_ru = [w_r_x w_u_x
        w_r_h_prev w_u_h_prev]

Similarly, w_c can be segmented into 2 different matrices.

w_c = [w_c_x
       w_c_h_prevr]

The same goes for the biases.

b_ru = [b_ru_x b_ru_h]
b_c = [b_c_x b_c_h]
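The block structure of the weights can be sketched with NumPy slicing. This sketch assumes the usual layout of the block GRU op: x is [batch, input_size], h_prev is [batch, cell_size], w_ru is [input_size + cell_size, 2 * cell_size] with the reset-gate columns first, and w_c is [input_size + cell_size, cell_size]. The sizes and random values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, input_size, cell_size = 2, 3, 4   # hypothetical sizes

x = rng.standard_normal((batch, input_size))
h_prev = rng.standard_normal((batch, cell_size))
w_ru = rng.standard_normal((input_size + cell_size, 2 * cell_size))
w_c = rng.standard_normal((input_size + cell_size, cell_size))

# Rows: the x block on top, the h_prev block below.
# Columns (assumed): reset gate on the left, update gate on the right.
w_r_x = w_ru[:input_size, :cell_size]
w_u_x = w_ru[:input_size, cell_size:]
w_r_h_prev = w_ru[input_size:, :cell_size]
w_u_h_prev = w_ru[input_size:, cell_size:]

# w_c has a single column block, split by rows only.
w_c_x = w_c[:input_size]
w_c_h_prevr = w_c[input_size:]

# One matmul on the concatenated input equals the sum of the block matmuls.
x_h_prev = np.concatenate([x, h_prev], axis=1)
r_bar_u_bar = x_h_prev @ w_ru
r_bar_blockwise = x @ w_r_x + h_prev @ w_r_h_prev
u_bar_blockwise = x @ w_u_x + h_prev @ w_u_h_prev
```

The last three lines show why the op can work with the fused matrices: the concatenated product already contains both gate pre-activations side by side.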

Another note on notation:

d_x = d_x_component_1 + d_x_component_2

where d_x_component_1 = d_r_bar * w_r_x^T + d_u_bar * w_u_x^T
and d_x_component_2 = d_c_bar * w_c_x^T

d_h_prev = d_h_prev_component_1 + d_h_prevr \circ r + d_h \circ u
where d_h_prev_component_1 = d_r_bar * w_r_h_prev^T + d_u_bar * w_u_h_prev^T

Mathematics behind the gradients below:

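d_c_bar = d_h \circ (1-u) \circ (1-c \circ c)
d_u_bar = d_h \circ (h_prev-c) \circ u \circ (1-u)
d_r_bar = d_h_prevr \circ h_prev \circ r \circ (1-r), with d_h_prevr taken from the w_c product further down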

d_r_bar_u_bar = [d_r_bar d_u_bar]

[d_x_component_1 d_h_prev_component_1] = d_r_bar_u_bar * w_ru^T

[d_x_component_2 d_h_prevr] = d_c_bar * w_c^T

d_x = d_x_component_1 + d_x_component_2

d_h_prev = d_h_prev_component_1 + d_h_prevr \circ r + d_h \circ u
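The gradient equations can be checked numerically with a small NumPy reference. This is a sketch, not the TensorFlow kernel. It assumes the forward pass of the companion GRUBlockCell op (sigmoid gates r and u, tanh candidate c, and h_new = u \circ h_prev + (1-u) \circ c), and it obtains d_r_bar by the chain rule through r \circ h_prev. The function names gru_forward, gru_cell_grad and numerical_grad are hypothetical helpers.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_forward(x, h_prev, w_ru, w_c, b_ru, b_c):
    """Assumed forward pass of one GRU step; returns r, u, c, h_new."""
    cell = h_prev.shape[1]
    r_bar_u_bar = np.concatenate([x, h_prev], axis=1) @ w_ru + b_ru
    r = sigmoid(r_bar_u_bar[:, :cell])
    u = sigmoid(r_bar_u_bar[:, cell:])
    c = np.tanh(np.concatenate([x, h_prev * r], axis=1) @ w_c + b_c)
    return r, u, c, u * h_prev + (1.0 - u) * c

def gru_cell_grad(x, h_prev, w_ru, w_c, r, u, c, d_h):
    """NumPy sketch of the gradient equations for one time step."""
    n_in = x.shape[1]
    d_c_bar = d_h * (1.0 - u) * (1.0 - c * c)
    d_u_bar = d_h * (h_prev - c) * u * (1.0 - u)
    # [d_x_component_2 d_h_prevr] = d_c_bar * w_c^T
    comp2 = d_c_bar @ w_c.T
    d_x_component_2, d_h_prevr = comp2[:, :n_in], comp2[:, n_in:]
    # Chain rule through r \circ h_prev and the sigmoid of the reset gate.
    d_r_bar = d_h_prevr * h_prev * r * (1.0 - r)
    d_r_bar_u_bar = np.concatenate([d_r_bar, d_u_bar], axis=1)
    # [d_x_component_1 d_h_prev_component_1] = d_r_bar_u_bar * w_ru^T
    comp1 = d_r_bar_u_bar @ w_ru.T
    d_x = comp1[:, :n_in] + d_x_component_2
    d_h_prev = comp1[:, n_in:] + d_h_prevr * r + d_h * u
    return d_x, d_h_prev, d_c_bar, d_r_bar_u_bar

def numerical_grad(f, arr, eps=1e-6):
    """Central finite differences of scalar f() with respect to arr (perturbed in place)."""
    grad = np.zeros_like(arr)
    for idx in np.ndindex(arr.shape):
        old = arr[idx]
        arr[idx] = old + eps
        plus = f()
        arr[idx] = old - eps
        minus = f()
        arr[idx] = old
        grad[idx] = (plus - minus) / (2.0 * eps)
    return grad

rng = np.random.default_rng(0)
batch, n_in, cell = 2, 3, 4   # hypothetical sizes
x = rng.standard_normal((batch, n_in))
h_prev = rng.standard_normal((batch, cell))
w_ru = rng.standard_normal((n_in + cell, 2 * cell))
w_c = rng.standard_normal((n_in + cell, cell))
b_ru = rng.standard_normal(2 * cell)
b_c = rng.standard_normal(cell)
d_h = rng.standard_normal((batch, cell))

def loss():
    # Scalar whose gradient with respect to h_new is exactly d_h.
    return np.sum(gru_forward(x, h_prev, w_ru, w_c, b_ru, b_c)[3] * d_h)

r, u, c, _ = gru_forward(x, h_prev, w_ru, w_c, b_ru, b_c)
d_x, d_h_prev, d_c_bar, d_r_bar_u_bar = gru_cell_grad(x, h_prev, w_ru, w_c, r, u, c, d_h)
d_x_matches = np.allclose(d_x, numerical_grad(loss, x), atol=1e-6)
d_h_prev_matches = np.allclose(d_h_prev, numerical_grad(loss, h_prev), atol=1e-6)
```

The finite-difference comparison is done in float64 for numerical headroom; the TensorFlow op itself only accepts float32 inputs.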

The calculations below are performed in the Python wrapper for the gradients (not in the gradient kernel).

d_w_ru = x_h_prev^T * d_r_bar_u_bar

d_w_c = x_h_prevr^T * d_c_bar

d_b_ru = sum of d_r_bar_u_bar along axis = 0

d_b_c = sum of d_c_bar along axis = 0
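The weight and bias gradients can be sketched in NumPy as follows. The pairing used here is the one that makes the shapes agree: the gradient for w_ru must have the shape of w_ru, [input_size + cell_size, 2 * cell_size], so it is built from x_h_prev and d_r_bar_u_bar, and the gradient for w_c must have the shape of w_c, so it is built from x_h_prevr and d_c_bar. The random arrays are stand-ins for real kernel outputs, and all sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, n_in, cell = 2, 3, 4   # hypothetical sizes

x = rng.standard_normal((batch, n_in))
h_prev = rng.standard_normal((batch, cell))
r = rng.uniform(size=(batch, cell))                      # stand-in for the reset gate output
d_c_bar = rng.standard_normal((batch, cell))             # stand-in for the kernel output
d_r_bar_u_bar = rng.standard_normal((batch, 2 * cell))   # stand-in for the kernel output

x_h_prev = np.concatenate([x, h_prev], axis=1)       # [batch, n_in + cell]
x_h_prevr = np.concatenate([x, h_prev * r], axis=1)  # [batch, n_in + cell]

d_w_ru = x_h_prev.T @ d_r_bar_u_bar   # same shape as w_ru
d_w_c = x_h_prevr.T @ d_c_bar         # same shape as w_c
d_b_ru = d_r_bar_u_bar.sum(axis=0)    # sum over the batch axis
d_b_c = d_c_bar.sum(axis=0)           # sum over the batch axis
```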


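Note: this article was selected and compiled by 純淨天空 from the original English work tf.raw_ops.GRUBlockCellGrad on tensorflow.org. Unless otherwise stated, copyright of the original code belongs to the original author; do not reproduce or copy this translation without permission or authorization.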