Python tf.sparse.bincount usage and code examples

Counts the number of occurrences of each integer value in a tensor.

Usage

tf.sparse.bincount(
    values, weights=None, axis=0, minlength=None, maxlength=None,
    binary_output=False, name=None
)

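Parameters

values A Tensor, RaggedTensor, or SparseTensor whose values should be counted. If axis=-1, these tensors must have rank 2.
weights If not None, must have the same shape as values. For each value in values, the bin is incremented by the corresponding weight instead of 1.
axis The axis to slice over. Axes at and below axis are flattened before bin counting. Currently, only 0 and -1 are supported. If None, all axes are flattened (identical to passing 0).
minlength If given, ensures the output has length at least minlength, padding with zeros at the end if necessary.
maxlength If given, skips values in values that are equal to or greater than maxlength, ensuring that the output has length at most maxlength.
binary_output If True, this op outputs 1 instead of the number of times a token appears (equivalent to one_hot + reduce_any instead of one_hot + reduce_add). Defaults to False.
name A name for this op.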

Returns

A SparseTensor with output.shape = values.shape[:axis] + [N], where N is
- maxlength (if set);
- minlength (if set, and minlength > reduce_max(values));
- 0 (if values is empty);
- reduce_max(values) + 1 otherwise.
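The rule for N can be written out as a small plain-Python function. This is only a literal transcription of the list above, not TensorFlow's implementation; the name output_length is hypothetical, and the handling of an empty input combined with minlength is an assumption, since the list does not cover that case.

```python
from typing import Optional, Sequence


def output_length(values: Sequence[int],
                  minlength: Optional[int] = None,
                  maxlength: Optional[int] = None) -> int:
    """Hypothetical helper: size N of the last output dimension."""
    if maxlength is not None:
        return maxlength
    if not values:
        # Assumption: the list does not say how minlength interacts with
        # an empty input, so this sketch simply returns 0.
        return 0
    largest = max(values)
    if minlength is not None and minlength > largest:
        return minlength
    return largest + 1


print(output_length([10, 20, 30, 20]))                 # 31
print(output_length([10, 20, 30, 20], minlength=500))  # 500
print(output_length([10, 20, 30, 20], maxlength=25))   # 25
print(output_length([]))                               # 0
```

Behavior in a particular TensorFlow version may differ in edge cases, so treat this as a reading aid rather than a specification.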

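This op takes an N-dimensional Tensor, RaggedTensor, or SparseTensor and returns an N-dimensional int64 SparseTensor in which element [i0...i[axis], j] contains the number of times the value j appears in slice [i0...i[axis], :] of the input tensor. Currently, only axis=0 and axis=-1 are supported.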
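To make the effect of flattening concrete, here is a plain-Python sketch of what counting with axis=0 amounts to: every axis is flattened first, and each distinct value is then counted once over the whole input. It uses only the standard library and illustrates the semantics described above; it is not TensorFlow code.

```python
from collections import Counter
from itertools import chain

data = [[10, 20, 30, 20], [11, 101, 11, 10001]]

# axis=0: flatten everything first, then count each distinct value.
flat = list(chain.from_iterable(data))
counts = sorted(Counter(flat).items())
print(counts)
# [(10, 1), (11, 2), (20, 2), (30, 1), (101, 1), (10001, 1)]
```

With axis=-1, by contrast, the same counting is done separately for each row.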

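Examples:

Bin-counting every item in individual batches

This example takes an input (which can be a Tensor, RaggedTensor, or SparseTensor) and returns a SparseTensor where the value at (i, j) is the number of times value j appears in batch i.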

import numpy as np
import tensorflow as tf

data = np.array([[10, 20, 30, 20], [11, 101, 11, 10001]], dtype=np.int64)
output = tf.sparse.bincount(data, axis=-1)
print(output)
SparseTensor(indices=tf.Tensor(
[[    0    10]
 [    0    20]
 [    0    30]
 [    1    11]
 [    1   101]
 [    1 10001]], shape=(6, 2), dtype=int64),
 values=tf.Tensor([1 2 1 2 1 1], shape=(6,), dtype=int64),
 dense_shape=tf.Tensor([    2 10002], shape=(2,), dtype=int64))
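The three parts of the printed SparseTensor can be reproduced with a small plain-Python reference, which may help when reading the output. The function rowwise_bincount is a hypothetical name used only for illustration; the sketch mirrors the result shown above and makes no claim about how TensorFlow computes it.

```python
from collections import Counter


def rowwise_bincount(rows):
    """Hypothetical reference: per-row counts as (indices, values, dense_shape)."""
    indices, values = [], []
    for i, row in enumerate(rows):
        # One sparse entry per distinct value j in row i, in ascending order of j.
        for j, count in sorted(Counter(row).items()):
            indices.append([i, j])
            values.append(count)
    # Last dimension is the largest value + 1 (0 if every row is empty).
    width = max((max(row) for row in rows if row), default=-1) + 1
    return indices, values, [len(rows), width]


data = [[10, 20, 30, 20], [11, 101, 11, 10001]]
indices, values, dense_shape = rowwise_bincount(data)
print(indices)      # [[0, 10], [0, 20], [0, 30], [1, 11], [1, 101], [1, 10001]]
print(values)       # [1, 2, 1, 2, 1, 1]
print(dense_shape)  # [2, 10002]
```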

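Bin-counting with defined output shape

This example takes an input (which can be a Tensor, RaggedTensor, or SparseTensor) and returns a SparseTensor where the value at (i, j) is the number of times value j appears in batch i. However, all values of j equal to or greater than 'maxlength' are ignored, and the last dimension of the output's dense_shape is set to 'minlength'. Note that, although the input is identical to the example above, the value '10001' in batch item 2 is dropped, and the dense shape is [2, 500] instead of [2, 10002] or [2, 102].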

minlength = maxlength = 500
data = np.array([[10, 20, 30, 20], [11, 101, 11, 10001]], dtype=np.int64)
output = tf.sparse.bincount(
    data, axis=-1, minlength=minlength, maxlength=maxlength)
print(output)
SparseTensor(indices=tf.Tensor(
[[  0  10]
 [  0  20]
 [  0  30]
 [  1  11]
 [  1 101]], shape=(5, 2), dtype=int64),
 values=tf.Tensor([1 2 1 2 1], shape=(5,), dtype=int64),
 dense_shape=tf.Tensor([  2 500], shape=(2,), dtype=int64))
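The filtering step can be sketched in plain Python: values at or above maxlength are dropped before counting, which is why 10001 disappears from the second row. This is an illustrative stdlib-only sketch of the behavior shown above, not TensorFlow's implementation.

```python
from collections import Counter

maxlength = 500
data = [[10, 20, 30, 20], [11, 101, 11, 10001]]

indices, values = [], []
for i, row in enumerate(data):
    kept = [v for v in row if v < maxlength]  # drop values >= maxlength
    for j, count in sorted(Counter(kept).items()):
        indices.append([i, j])
        values.append(count)

print(indices)  # [[0, 10], [0, 20], [0, 30], [1, 11], [1, 101]]
print(values)   # [1, 2, 1, 2, 1]
```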

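Binary bin-counting

This example takes an input (which can be a Tensor, RaggedTensor, or SparseTensor) and returns a SparseTensor where (i, j) is 1 if the value j appears at least once in batch i, and 0 otherwise. Note that even though some values (like 20 in batch 1 and 11 in batch 2) appear more than once, the 'values' tensor is all 1s.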

data = np.array([[10, 20, 30, 20], [11, 101, 11, 10001]], dtype=np.int64)
output = tf.sparse.bincount(data, binary_output=True, axis=-1)
print(output)
SparseTensor(indices=tf.Tensor(
[[    0    10]
 [    0    20]
 [    0    30]
 [    1    11]
 [    1   101]
 [    1 10001]], shape=(6, 2), dtype=int64),
 values=tf.Tensor([1 1 1 1 1 1], shape=(6,), dtype=int64),
 dense_shape=tf.Tensor([    2 10002], shape=(2,), dtype=int64))
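In plain Python, binary counting reduces to set membership per row: each distinct value contributes exactly one entry whose stored value is 1. The following stdlib-only sketch mirrors the output above and is meant purely as an illustration.

```python
data = [[10, 20, 30, 20], [11, 101, 11, 10001]]

# One entry per distinct value in each row; the stored value is always 1.
indices = [[i, j] for i, row in enumerate(data) for j in sorted(set(row))]
values = [1] * len(indices)

print(indices)  # [[0, 10], [0, 20], [0, 30], [1, 11], [1, 101], [1, 10001]]
print(values)   # [1, 1, 1, 1, 1, 1]
```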

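Weighted bin-counting

This example takes two inputs: a values tensor and a weights tensor. These tensors must be identically shaped and, in the case of RaggedTensors or SparseTensors, must have identical row splits or indices. When performing a weighted count, the op outputs a SparseTensor where the value at (i, j) is the sum of the entries in batch i of the weights tensor at the positions where the values tensor has the value j. In this case, the output dtype is the same as the dtype of the weights tensor.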

data = np.array([[10, 20, 30, 20], [11, 101, 11, 10001]], dtype=np.int64)
weights = [[2, 0.25, 15, 0.5], [2, 17, 3, 0.9]]
output = tf.sparse.bincount(data, weights=weights, axis=-1)
print(output)
SparseTensor(indices=tf.Tensor(
[[    0    10]
 [    0    20]
 [    0    30]
 [    1    11]
 [    1   101]
 [    1 10001]], shape=(6, 2), dtype=int64),
 values=tf.Tensor([2. 0.75 15. 5. 17. 0.9], shape=(6,), dtype=float32),
 dense_shape=tf.Tensor([    2 10002], shape=(2,), dtype=int64))
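The weighted sums can be checked by hand with a short plain-Python sketch: for each row, the weights are accumulated per value instead of adding 1 per occurrence. It uses only the standard library and Python floats, so it illustrates the arithmetic rather than TensorFlow's float32 computation.

```python
from collections import defaultdict

data = [[10, 20, 30, 20], [11, 101, 11, 10001]]
weights = [[2, 0.25, 15, 0.5], [2, 17, 3, 0.9]]

indices, values = [], []
for i, (row, row_weights) in enumerate(zip(data, weights)):
    sums = defaultdict(float)
    for j, w in zip(row, row_weights):
        sums[j] += w  # add the weight instead of 1
    for j in sorted(sums):
        indices.append([i, j])
        values.append(sums[j])

print(indices)  # [[0, 10], [0, 20], [0, 30], [1, 11], [1, 101], [1, 10001]]
print(values)   # [2.0, 0.75, 15.0, 5.0, 17.0, 0.9]
```

For example, the value 20 occurs twice in the first row with weights 0.25 and 0.5, giving 0.75, and 11 occurs twice in the second row with weights 2 and 3, giving 5.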

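Note: This article was selected and compiled by 純淨天空 from the original English documentation for tf.sparse.bincount on tensorflow.org. Unless otherwise stated, copyright in the original code belongs to the original author; do not reproduce or copy this translation without permission or authorization.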