Python tf.edit_distance usage and code examples

Computes the Levenshtein distance between sequences.

Usage

tf.edit_distance(
    hypothesis, truth, normalize=True, name='edit_distance'
)

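Args

hypothesis A SparseTensor containing the hypothesis sequences.
truth A SparseTensor containing the truth sequences.
normalize A bool. If True, normalizes the Levenshtein distance by the length of truth.
name A name for the operation (optional).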

Returns

A dense Tensor of rank R - 1, where R is the rank of the SparseTensor inputs hypothesis and truth.

Raises

TypeError If either hypothesis or truth is not a SparseTensor.

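This operation takes variable-length sequences (hypothesis and truth), each provided as a SparseTensor, and computes the Levenshtein distance between them. Setting normalize to True divides each edit distance by the length of the corresponding truth sequence.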
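To make the distance and its normalization concrete, here is a minimal pure-Python sketch for a single pair of sequences. It is not TensorFlow's implementation: the helper names `levenshtein` and `normalized_distance` are hypothetical, and the handling of an empty truth sequence (infinity when the hypothesis is non-empty, 0.0 otherwise) is an assumption chosen to agree with the `inf` entry in the example output on this page.

```python
import math


def levenshtein(hypothesis, truth):
    """Minimum number of insertions, deletions and substitutions
    needed to turn `hypothesis` into `truth` (hypothetical helper)."""
    # prev[j] holds the distance between the hypothesis prefix seen
    # so far and the first j elements of truth.
    prev = list(range(len(truth) + 1))
    for i, h in enumerate(hypothesis, start=1):
        curr = [i]
        for j, t in enumerate(truth, start=1):
            cost = 0 if h == t else 1
            curr.append(min(prev[j] + 1,          # delete h
                            curr[j - 1] + 1,      # insert t
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalized_distance(hypothesis, truth):
    """Edit distance divided by len(truth) (hypothetical helper).

    Assumption: an empty truth yields inf for a non-empty hypothesis
    and 0.0 when both sequences are empty.
    """
    distance = levenshtein(hypothesis, truth)
    if len(truth) == 0:
        return math.inf if distance else 0.0
    return distance / len(truth)


# One deletion/insertion is needed, and truth has length 2.
half = normalized_distance(["b"], ["b", "c"])
```

Here `half` works out to one edit divided by a truth length of two, which is the same ratio that appears at position (1, 0) of the example output.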

For example:

Given the following inputs,

hypothesis is a tf.SparseTensor of shape [2, 1, 1]
truth is a tf.SparseTensor of shape [2, 2, 2]

hypothesis = tf.SparseTensor(
  [[0, 0, 0],
   [1, 0, 0]],
  ["a", "b"],
  (2, 1, 1))
truth = tf.SparseTensor(
  [[0, 1, 0],
   [1, 0, 0],
   [1, 0, 1],
   [1, 1, 0]],
   ["a", "b", "c", "a"],
   (2, 2, 2))
tf.edit_distance(hypothesis, truth, normalize=True)
<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[inf, 1. ],
       [0.5, 1. ]], dtype=float32)>

The operation returns a dense tensor of shape [2, 2] whose edit distances are normalized by the truth lengths.
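The [2, 2] result above can be traced by hand with a small pure-Python sketch. This is an illustration under stated assumptions, not TensorFlow's kernel: `edit_distance_reference` and its helpers are hypothetical names, only rank-3 inputs are handled, the last index of each sparse entry is treated as the position within a sequence, and a group with no entries is treated as an empty sequence.

```python
import math
from collections import defaultdict


def levenshtein(hypothesis, truth):
    """Classic dynamic-programming edit distance (hypothetical helper)."""
    prev = list(range(len(truth) + 1))
    for i, h in enumerate(hypothesis, start=1):
        curr = [i]
        for j, t in enumerate(truth, start=1):
            cost = 0 if h == t else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def group_sequences(indices, values):
    """Collect sparse entries into sequences keyed by all but the last index."""
    groups = defaultdict(list)
    for index, value in sorted(zip(indices, values)):
        groups[tuple(index[:-1])].append(value)
    return groups


def edit_distance_reference(hyp, truth, out_shape, normalize=True):
    """Rank-3 sketch: `hyp` and `truth` are (indices, values) pairs."""
    hyp_groups = group_sequences(*hyp)
    truth_groups = group_sequences(*truth)
    rows, cols = out_shape
    output = []
    for i in range(rows):
        row = []
        for j in range(cols):
            h = hyp_groups.get((i, j), [])    # missing group -> empty sequence
            t = truth_groups.get((i, j), [])
            distance = float(levenshtein(h, t))
            if normalize:
                if t:
                    distance /= len(t)
                elif distance:
                    # Assumption: non-empty hypothesis vs. empty truth gives inf.
                    distance = math.inf
            row.append(distance)
        output.append(row)
    return output


hypothesis = ([(0, 0, 0), (1, 0, 0)], ["a", "b"])
truth = ([(0, 1, 0), (1, 0, 0), (1, 0, 1), (1, 1, 0)], ["a", "b", "c", "a"])
result = edit_distance_reference(hypothesis, truth, out_shape=(2, 2))
print(result)  # [[inf, 1.0], [0.5, 1.0]]
```

Reading the entries one by one: (0, 0) compares ["a"] with an empty truth, giving inf; (0, 1) and (1, 1) compare an empty hypothesis with ["a"], giving 1; and (1, 0) compares ["b"] with ["b", "c"], giving 1 / 2.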

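Note: It is possible to compute the edit distance between two sparse tensors with variable-length values. However, attempting to create them while eager execution is enabled will result in a ValueError.

For the following inputs,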

# 'hypothesis' is a tensor of shape `[2, 1]` with variable-length values:
#   (0,0) = ["a"]
#   (1,0) = ["b"]
hypothesis = tf.sparse.SparseTensor(
    [[0, 0, 0],
     [1, 0, 0]],
    ["a", "b"],
    (2, 1, 1))

# 'truth' is a tensor of shape `[2, 2]` with variable-length values:
#   (0,0) = []
#   (0,1) = ["a"]
#   (1,0) = ["b", "c"]
#   (1,1) = ["a"]
truth = tf.sparse.SparseTensor(
    [[0, 1, 0],
     [1, 0, 0],
     [1, 0, 1],
     [1, 1, 0]],
    ["a", "b", "c", "a"],
    (2, 2, 2))

normalize = True

# The output would be a dense Tensor of shape `(2,)`, with edit distances
# normalized by 'truth' lengths.
# output => array([0., 0.5], dtype=float32)

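Note: This article was compiled by 纯净天空 from the original English work tf.edit_distance on tensorflow.org. Unless otherwise stated, copyright in the original code belongs to the original authors; do not reproduce or copy this translation without permission or authorization.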