Python tf.edit_distance usage and code examples

Computes the Levenshtein distance between sequences.

Usage

tf.edit_distance(
    hypothesis, truth, normalize=True, name='edit_distance'
)

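Args

hypothesis A SparseTensor containing the hypothesis sequences.
truth A SparseTensor containing the truth sequences.
normalize A bool. If True, normalizes the Levenshtein distance by the length of truth.
name A name for the operation (optional).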

Returns

A dense Tensor of rank R - 1, where R is the rank of the SparseTensor inputs hypothesis and truth.

Raises

TypeError If either hypothesis or truth is not a SparseTensor.

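This operation takes variable-length sequences (hypothesis and truth), each provided as a SparseTensor, and computes the Levenshtein distance between them. Setting normalize to True divides each edit distance by the length of the corresponding truth sequence.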
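To make the quantity concrete, here is a minimal pure-Python sketch of the Levenshtein distance between two sequences and of normalization by the truth length. It is an illustration only, not TensorFlow's implementation; the function names levenshtein and normalized_distance are hypothetical, and returning 0.0 when both sequences are empty is an assumption of this sketch.

```python
def levenshtein(hypothesis, truth):
    """Minimum number of insertions, deletions and substitutions
    needed to turn `hypothesis` into `truth`."""
    # prev[j] is the distance between the hypothesis prefix seen so far
    # and truth[:j]; it starts as the distance from an empty prefix.
    prev = list(range(len(truth) + 1))
    for i, h in enumerate(hypothesis, start=1):
        curr = [i]  # distance from hypothesis[:i] to an empty truth
        for j, t in enumerate(truth, start=1):
            cost = 0 if h == t else 1
            curr.append(min(prev[j] + 1,          # delete h
                            curr[j - 1] + 1,      # insert t
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalized_distance(hypothesis, truth):
    """Edit distance divided by len(truth).

    Assumption of this sketch: an empty truth gives inf when the
    distance is nonzero, and 0.0 when both sequences are empty.
    """
    distance = levenshtein(hypothesis, truth)
    if len(truth) == 0:
        return float("inf") if distance > 0 else 0.0
    return distance / len(truth)


print(levenshtein("kitten", "sitting"))
print(normalized_distance(["b"], ["b", "c"]))
```

The sequences can be strings or lists of tokens, since the sketch only compares elements for equality.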

For example:

Given the following input,

hypothesis is a tf.SparseTensor of shape [2, 1, 1]
truth is a tf.SparseTensor of shape [2, 2, 2]

import tensorflow as tf

# Each SparseTensor is built from (indices, values, dense_shape).
hypothesis = tf.SparseTensor(
  [[0, 0, 0],
   [1, 0, 0]],
  ["a", "b"],
  (2, 1, 1))
truth = tf.SparseTensor(
  [[0, 1, 0],
   [1, 0, 0],
   [1, 0, 1],
   [1, 1, 0]],
  ["a", "b", "c", "a"],
  (2, 2, 2))
tf.edit_distance(hypothesis, truth, normalize=True)
# <tf.Tensor: shape=(2, 2), dtype=float32, numpy=
# array([[inf, 1. ],
#        [0.5, 1. ]], dtype=float32)>

The operation returns a dense Tensor of shape [2, 2] containing the edit distances normalized by the truth lengths.
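The [2, 2] result can be traced by hand: the last index of each sparse entry is the position inside a sequence, and the leading indices select the output cell. The following pure-Python sketch mimics that grouping for rank-3 inputs only. It is not how TensorFlow implements the op; the helper names are hypothetical, and taking the output shape as the element-wise maximum of the leading dimensions is an assumption that matches the example above.

```python
from collections import defaultdict


def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (x != y)))
        prev = curr
    return prev[-1]


def group_sequences(indices, values):
    """Collect values into sequences keyed by the leading R - 1 indices."""
    groups = defaultdict(list)
    for idx, val in sorted(zip(indices, values)):
        groups[tuple(idx[:-1])].append(val)
    return groups


def edit_distance_cells(hyp, truth, normalize=True):
    """Per-cell edit distances for rank-3 (indices, values, shape) triples."""
    hyp_idx, hyp_vals, hyp_shape = hyp
    tru_idx, tru_vals, tru_shape = truth
    # Assumed output shape: element-wise max of the leading dimensions.
    rows = max(hyp_shape[0], tru_shape[0])
    cols = max(hyp_shape[1], tru_shape[1])
    hyp_groups = group_sequences(hyp_idx, hyp_vals)
    tru_groups = group_sequences(tru_idx, tru_vals)
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            h = hyp_groups.get((r, c), [])  # missing cell = empty sequence
            t = tru_groups.get((r, c), [])
            d = float(levenshtein(h, t))
            if normalize:
                if t:
                    d /= len(t)
                elif d > 0:
                    d = float("inf")  # nonzero distance, empty truth
            row.append(d)
        out.append(row)
    return out


hypothesis = ([[0, 0, 0], [1, 0, 0]], ["a", "b"], (2, 1, 1))
truth = ([[0, 1, 0], [1, 0, 0], [1, 0, 1], [1, 1, 0]],
         ["a", "b", "c", "a"], (2, 2, 2))
result = edit_distance_cells(hypothesis, truth)
print(result)  # [[inf, 1.0], [0.5, 1.0]]
```

Cell (0, 0) is inf because the hypothesis ["a"] is compared against an empty truth sequence, and cell (1, 0) is 0.5 because one edit turns ["b"] into ["b", "c"], which has length 2.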

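Note: It is possible to compute the edit distance between two sparse tensors with variable-length values. However, attempting to create them while eager execution is enabled will result in a ValueError.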

For the following inputs,

# 'hypothesis' is a tensor of shape `[2, 1]` with variable-length values:
#   (0,0) = ["a"]
#   (1,0) = ["b"]
hypothesis = tf.sparse.SparseTensor(
    [[0, 0, 0],
     [1, 0, 0]],
    ["a", "b"],
    (2, 1, 1))

# 'truth' is a tensor of shape `[2, 2]` with variable-length values:
#   (0,0) = []
#   (0,1) = ["a"]
#   (1,0) = ["b", "c"]
#   (1,1) = ["a"]
truth = tf.sparse.SparseTensor(
    [[0, 1, 0],
     [1, 0, 0],
     [1, 0, 1],
     [1, 1, 0]],
    ["a", "b", "c", "a"],
    (2, 2, 2))

normalize = True

# The output would be a dense Tensor of shape `(2, 2)`, with edit distances
# normalized by 'truth' lengths.
# output => array([[inf, 1. ], [0.5, 1. ]], dtype=float32)

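Note: This article was selected and compiled by 純淨天空 from the original English work tf.edit_distance on tensorflow.org. Unless otherwise stated, copyright of the original code belongs to the original author; do not reproduce or copy this translation without permission or authorization.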