Python tf.data.TextLineDataset.from_tensor_slices: usage and code examples

Usage

@staticmethod
from_tensor_slices(
    tensors, name=None
)

Parameters

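tensors A dataset element whose components all have the same first dimension. Supported values are documented in the tf.data guide.
name (Optional.) A name for the tf.data operation.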

Returns

Dataset A Dataset.

Creates a Dataset whose elements are slices of the given tensors.

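The given tensors are sliced along their first dimension. This operation preserves the structure of the input tensors, removing the first dimension of each tensor and using it as the dataset dimension. All input tensors must have the same size in their first dimension.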
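To make the slicing rule concrete, here is a plain-Python model of it. The helper slice_first_dim is hypothetical and is not part of TensorFlow; it uses Python lists in place of tensors and handles only a single tensor, a flat tuple, or a flat dict. It is a sketch of the behavior described above, not of how TensorFlow implements it.

```python
def slice_first_dim(structure):
    """Model of the slicing rule, for illustration only (not TensorFlow code).

    Leaves are Python lists standing in for tensors. Only one level of
    nesting (a flat tuple or a flat dict) is supported in this sketch.
    """
    if isinstance(structure, dict):
        keys = list(structure)
        columns = [structure[k] for k in keys]
        rebuild = lambda row: dict(zip(keys, row))
    elif isinstance(structure, tuple):
        columns = list(structure)
        rebuild = tuple
    else:
        # A single "tensor": the elements are simply its rows.
        return list(structure)

    # Every component must have the same size along its first dimension.
    sizes = {len(column) for column in columns}
    if len(sizes) > 1:
        raise ValueError(f"first dimensions differ: {sorted(sizes)}")

    # Element i takes row i from every component, keeping the
    # original container type (tuple or dict).
    return [rebuild(row) for row in zip(*columns)]


tuple_elements = slice_first_dim(([1, 2], [3, 4], [5, 6]))
dict_elements = slice_first_dim({"a": [1, 2], "b": [3, 4]})
```

In this model the tuple input yields two elements, each a 3-tuple, and the dict input yields two dicts with the keys "a" and "b". Components with different first-dimension sizes are rejected.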

import tensorflow as tf

# Slicing a 1D tensor produces scalar tensor elements.
dataset = tf.data.Dataset.from_tensor_slices([1, 2, 3])
list(dataset.as_numpy_iterator())
[1, 2, 3]

# Slicing a 2D tensor produces 1D tensor elements.
dataset = tf.data.Dataset.from_tensor_slices([[1, 2], [3, 4]])
list(dataset.as_numpy_iterator())
[array([1, 2], dtype=int32), array([3, 4], dtype=int32)]

# Slicing a tuple of 1D tensors produces tuple elements containing
# scalar tensors.
dataset = tf.data.Dataset.from_tensor_slices(([1, 2], [3, 4], [5, 6]))
list(dataset.as_numpy_iterator())
[(1, 3, 5), (2, 4, 6)]

# Dictionary structure is also preserved.
dataset = tf.data.Dataset.from_tensor_slices({"a":[1, 2], "b":[3, 4]})
list(dataset.as_numpy_iterator()) == [{'a':1, 'b':3},
                                      {'a':2, 'b':4}]
True

# Two tensors can be combined into one Dataset object.
features = tf.constant([[1, 3], [2, 1], [3, 3]]) # ==> 3x2 tensor
labels = tf.constant(['A', 'B', 'A']) # ==> 1D tensor with 3 elements
dataset = tf.data.Dataset.from_tensor_slices((features, labels))
# Both the features and the labels tensors can be converted
# to a Dataset object separately and combined after.
features_dataset = tf.data.Dataset.from_tensor_slices(features)
labels_dataset = tf.data.Dataset.from_tensor_slices(labels)
dataset = tf.data.Dataset.zip((features_dataset, labels_dataset))
# A batched feature and label set can be converted to a Dataset
# in similar fashion.
batched_features = tf.constant([[[1, 3], [2, 3]],
                                [[2, 1], [1, 2]],
                                [[3, 3], [3, 2]]], shape=(3, 2, 2))
batched_labels = tf.constant([['A', 'A'],
                              ['B', 'B'],
                              ['A', 'B']], shape=(3, 2, 1))
dataset = tf.data.Dataset.from_tensor_slices((batched_features, batched_labels))
for element in dataset.as_numpy_iterator():
  print(element)
(array([[1, 3],
       [2, 3]], dtype=int32), array([[b'A'],
       [b'A']], dtype=object))
(array([[2, 1],
       [1, 2]], dtype=int32), array([[b'B'],
       [b'B']], dtype=object))
(array([[3, 3],
       [3, 2]], dtype=int32), array([[b'A'],
       [b'B']], dtype=object))

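Note that if tensors contains a NumPy array and eager execution is not enabled, the values will be embedded in the graph as one or more tf.constant operations. For large datasets (> 1 GB), this can waste memory and run into the byte limits of graph serialization. If tensors contains one or more large NumPy arrays, consider the alternative described in the tf.data guide.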
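One way to keep large arrays out of the graph, not necessarily the alternative the guide describes, is to stream rows with tf.data.Dataset.from_generator. The sketch below assumes TensorFlow 2.4 or later (for the output_signature argument) and uses small placeholder arrays; the names features, labels, and row_generator are illustrative.

```python
import numpy as np
import tensorflow as tf

# Small stand-ins for what would be large in-memory arrays.
features = np.arange(12, dtype=np.float32).reshape(6, 2)
labels = np.arange(6, dtype=np.int64)


def row_generator():
    # Yields one (feature_row, label) pair at a time, so the arrays
    # stay in Python memory and are never stored as graph constants.
    for feature_row, label in zip(features, labels):
        yield feature_row, label


dataset = tf.data.Dataset.from_generator(
    row_generator,
    output_signature=(
        tf.TensorSpec(shape=(2,), dtype=tf.float32),
        tf.TensorSpec(shape=(), dtype=tf.int64),
    ),
)

rows = list(dataset.as_numpy_iterator())
```

The trade-off is that the generator runs in Python, so it is typically slower than slicing tensors directly and the resulting dataset has unknown cardinality.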

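Note: This article was selected and compiled by 纯净天空 from the original English work tf.data.TextLineDataset.from_tensor_slices on tensorflow.org. Unless otherwise stated, copyright of the original code belongs to the original authors. Do not reproduce or copy this translation without permission or authorization.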
