Python tf.sparse.bincount usage and code examples

Counts the number of times each integer value occurs in a tensor.

Usage

tf.sparse.bincount(
    values, weights=None, axis=0, minlength=None, maxlength=None,
    binary_output=False, name=None
)

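Parameters

values A Tensor, RaggedTensor, or SparseTensor whose values should be counted. If axis=-1, these tensors must have rank 2.
weights If not None, must have the same shape as values. For each value in values, the bin is incremented by the corresponding weight instead of 1.
axis The axis to slice over. Axes at and below axis are flattened before bin counting. Currently, only 0 and -1 are supported. If None, all axes are flattened (identical to passing 0).
minlength If given, ensures the output has length at least minlength, padding with zeros at the end if necessary.
maxlength If given, skips entries of values that are equal to or greater than maxlength, ensuring the output has length at most maxlength.
binary_output If True, this op outputs 1 instead of the number of times a token appears (equivalent to one_hot + reduce_any instead of one_hot + reduce_add). Defaults to False.
name A name for this op.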
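To make the axis description concrete, here is a minimal pure-Python sketch of what flattening every axis before counting (the documented axis=0 behaviour) amounts to. It does not call TensorFlow; the helper names flatten and bincount_flat are hypothetical and exist only for this illustration.

```python
from collections import Counter

def flatten(nested):
    """Yield every integer from an arbitrarily nested list."""
    for item in nested:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item

def bincount_flat(values, maxlength=None):
    """Count values across all axes (hypothetical stand-in for axis=0)."""
    counts = Counter(
        v for v in flatten(values)
        # Values equal to or greater than maxlength are skipped.
        if maxlength is None or v < maxlength
    )
    # Sorted (bin, count) pairs stand in for a rank-1 sparse result.
    return sorted(counts.items())

data = [[10, 20, 30, 20], [11, 101, 11, 10001]]
flat = bincount_flat(data)
capped = bincount_flat(data, maxlength=500)
print(flat)
print(capped)
```

Because the batch dimension is flattened away, the two occurrences of 20 in the first row and the two occurrences of 11 in the second row each collapse into a single bin with count 2.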

Returns

A SparseTensor with output.shape = values.shape[:axis] + [N], where N is
- maxlength (if set);
- minlength (if set, and minlength > reduce_max(values));
- 0 (if values is empty);
- reduce_max(values) + 1 otherwise.
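The rules for N can be sketched as a small pure-Python function. This is an illustration of the list above, not TensorFlow's implementation; output_last_dim is a hypothetical name, and the handling of an empty input combined with a set minlength (returning minlength) is an assumption the list does not spell out.

```python
def output_last_dim(values, minlength=None, maxlength=None):
    """Return N, the size of the last output dimension (illustrative sketch)."""
    # Rule 1: maxlength, when set, fixes the size outright.
    if maxlength is not None:
        return maxlength
    flat = [v for row in values for v in row]
    # Rules 3 and 4: 0 for empty input, otherwise reduce_max(values) + 1.
    max_plus_one = max(flat) + 1 if flat else 0
    # Rule 2: minlength wins only when it exceeds the largest value.
    # Assumption: an empty input with minlength set yields minlength.
    return max(max_plus_one, minlength or 0)

data = [[10, 20, 30, 20], [11, 101, 11, 10001]]
print(output_last_dim(data))
print(output_last_dim(data, minlength=500, maxlength=500))
print(output_last_dim(data, minlength=20000))
print(output_last_dim([[]]))
```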

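This op takes an N-dimensional Tensor, RaggedTensor, or SparseTensor and returns an N-dimensional int64 SparseTensor in which element [i0...i[axis], j] contains the number of times the value j appears in slice [i0...i[axis], :] of the input tensor. Currently, only axis=0 and axis=-1 are supported.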
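As a rough mental model of the per-slice counting described above (the axis=-1 case on a rank-2 input), the following pure-Python sketch builds the three components of a sparse result by hand. It is not how TensorFlow computes the result; sparse_bincount_rows is a hypothetical helper and the input data is made up for the illustration.

```python
from collections import Counter

def sparse_bincount_rows(rows):
    """Return (indices, values, dense_shape) for per-row bin counts."""
    indices, values = [], []
    for i, row in enumerate(rows):
        # One sparse entry per distinct value j that occurs in row i.
        for j, count in sorted(Counter(row).items()):
            indices.append([i, j])
            values.append(count)
    # The last dimension is the largest value plus one, or 0 when empty.
    width = max((v for row in rows for v in row), default=-1) + 1
    return indices, values, [len(rows), width]

indices, values, dense_shape = sparse_bincount_rows([[1, 3, 3], [0, 1, 1]])
print(indices)
print(values)
print(dense_shape)
```

Only bins that actually occur are stored, which is why a sparse result stays small even when the largest value in the input is very large.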

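Examples:

Bin-counting every item in individual batches

This example takes an input (which could be a Tensor, RaggedTensor, or SparseTensor) and returns a SparseTensor where the value of (i, j) is the number of times value j appears in batch i.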

import numpy as np
import tensorflow as tf
data = np.array([[10, 20, 30, 20], [11, 101, 11, 10001]], dtype=np.int64)
output = tf.sparse.bincount(data, axis=-1)
print(output)
SparseTensor(indices=tf.Tensor(
[[    0    10]
 [    0    20]
 [    0    30]
 [    1    11]
 [    1   101]
 [    1 10001]], shape=(6, 2), dtype=int64),
 values=tf.Tensor([1 2 1 2 1 1], shape=(6,), dtype=int64),
 dense_shape=tf.Tensor([    2 10002], shape=(2,), dtype=int64))

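Bin-counting with defined output shape

This example takes an input (which could be a Tensor, RaggedTensor, or SparseTensor) and returns a SparseTensor where the value of (i, j) is the number of times value j appears in batch i. However, all values of j at or above maxlength are ignored, and the last dimension of the output's dense_shape is set to minlength. Note that, while the input is identical to the example above, the value 10001 in batch item 2 is dropped, and the dense shape is [2, 500] instead of [2, 10002] or [2, 102].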

minlength = maxlength = 500
data = np.array([[10, 20, 30, 20], [11, 101, 11, 10001]], dtype=np.int64)
output = tf.sparse.bincount(
   data, axis=-1, minlength=minlength, maxlength=maxlength)
print(output)
SparseTensor(indices=tf.Tensor(
[[  0  10]
 [  0  20]
 [  0  30]
 [  1  11]
 [  1 101]], shape=(5, 2), dtype=int64),
 values=tf.Tensor([1 2 1 2 1], shape=(5,), dtype=int64),
 dense_shape=tf.Tensor([  2 500], shape=(2,), dtype=int64))
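The effect of minlength and maxlength in the example above can be reproduced with a short pure-Python sketch. It only imitates the documented behaviour and does not call TensorFlow; bincount_bounded is a hypothetical name.

```python
def bincount_bounded(rows, minlength=None, maxlength=None):
    """Per-row counts that skip values >= maxlength (illustrative sketch)."""
    entries = {}
    max_seen = -1
    for i, row in enumerate(rows):
        for j in row:
            if maxlength is not None and j >= maxlength:
                continue  # dropped, like 10001 in the example above
            entries[(i, j)] = entries.get((i, j), 0) + 1
            max_seen = max(max_seen, j)
    if maxlength is not None:
        width = maxlength
    else:
        width = max(max_seen + 1, minlength or 0)
    return sorted(entries.items()), [len(rows), width]

data = [[10, 20, 30, 20], [11, 101, 11, 10001]]
entries, dense_shape = bincount_bounded(data, minlength=500, maxlength=500)
print(entries)
print(dense_shape)
```

Setting minlength and maxlength to the same number is a convenient way to get a fixed-width output regardless of the values present in a given batch.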

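Binary bin-counting

This example takes an input (which could be a Tensor, RaggedTensor, or SparseTensor) and returns a SparseTensor where (i, j) is 1 if the value j appears in batch i at least once and 0 otherwise. Note that, even though some values (like 20 in batch 1 and 11 in batch 2) appear more than once, the values tensor is all 1s.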

data = np.array([[10, 20, 30, 20], [11, 101, 11, 10001]], dtype=np.int64)
output = tf.sparse.bincount(data, binary_output=True, axis=-1)
print(output)
SparseTensor(indices=tf.Tensor(
[[    0    10]
 [    0    20]
 [    0    30]
 [    1    11]
 [    1   101]
 [    1 10001]], shape=(6, 2), dtype=int64),
 values=tf.Tensor([1 1 1 1 1 1], shape=(6,), dtype=int64),
 dense_shape=tf.Tensor([    2 10002], shape=(2,), dtype=int64))
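Binary output is essentially a per-row set membership test. The sketch below shows that idea in plain Python, without TensorFlow; binary_bincount is a hypothetical helper used only for illustration.

```python
def binary_bincount(rows):
    """Mark each (row, value) pair that occurs at least once with a 1."""
    # set(row) discards repeats, so each distinct value appears only once.
    indices = [[i, j] for i, row in enumerate(rows) for j in sorted(set(row))]
    values = [1] * len(indices)
    return indices, values

data = [[10, 20, 30, 20], [11, 101, 11, 10001]]
indices, values = binary_bincount(data)
print(indices)
print(values)
```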

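Weighted bin-counting

This example takes two inputs: a values tensor and a weights tensor. These tensors must be identically shaped and, in the case of RaggedTensors or SparseTensors, must have identical row splits or indices. When performing a weighted count, the op outputs a SparseTensor where the value of (i, j) is the sum of the entries in batch i of the weights tensor at the positions where the values tensor has the value j. In this case, the output dtype is the same as the dtype of the weights tensor.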

data = np.array([[10, 20, 30, 20], [11, 101, 11, 10001]], dtype=np.int64)
weights = [[2, 0.25, 15, 0.5], [2, 17, 3, 0.9]]
output = tf.sparse.bincount(data, weights=weights, axis=-1)
print(output)
SparseTensor(indices=tf.Tensor(
[[    0    10]
 [    0    20]
 [    0    30]
 [    1    11]
 [    1   101]
 [    1 10001]], shape=(6, 2), dtype=int64),
 values=tf.Tensor([2. 0.75 15. 5. 17. 0.9], shape=(6,), dtype=float32),
 dense_shape=tf.Tensor([    2 10002], shape=(2,), dtype=int64))
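The weighted sums in the output above can be checked by hand with a small pure-Python sketch that adds up the weight at each position instead of adding 1. It is an illustration only, not TensorFlow's implementation; weighted_bincount is a hypothetical name.

```python
from collections import defaultdict

def weighted_bincount(rows, weights):
    """Sum the weights per (row, value) pair (illustrative sketch)."""
    sums = defaultdict(float)
    for i, (row, w_row) in enumerate(zip(rows, weights)):
        if len(row) != len(w_row):
            raise ValueError("values and weights must have the same shape")
        for j, w in zip(row, w_row):
            sums[(i, j)] += w  # accumulate the weight rather than counting 1
    return sorted(sums.items())

data = [[10, 20, 30, 20], [11, 101, 11, 10001]]
weights = [[2, 0.25, 15, 0.5], [2, 17, 3, 0.9]]
result = weighted_bincount(data, weights)
print(result)
```

For instance, the value 20 appears twice in the first row with weights 0.25 and 0.5, giving 0.75, and the value 11 appears twice in the second row with weights 2 and 3, giving 5.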

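Note: This article was selected and compiled by 纯净天空 from the original English work tf.sparse.bincount by the authors at tensorflow.org. Unless otherwise stated, copyright of the original code belongs to the original authors; please do not reproduce or copy this translation without permission or authorization.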