Python tf.data.TFRecordDataset usage and code examples

A Dataset comprising records from one or more TFRecord files.

Inherits from: Dataset

Usage

tf.data.TFRecordDataset(
    filenames, compression_type=None, buffer_size=None, num_parallel_reads=None,
    name=None
)

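Args

filenames A tf.string tensor or tf.data.Dataset containing one or more filenames.
compression_type (Optional.) A tf.string scalar evaluating to one of "" (no compression), "ZLIB", or "GZIP".
buffer_size (Optional.) A tf.int64 scalar representing the number of bytes in the read buffer. If your input pipeline is I/O bottlenecked, consider setting this parameter to a value between 1 and 100 MB. If None, a sensible default for both local and remote file systems is used.
num_parallel_reads (Optional.) A tf.int64 scalar representing the number of files to read in parallel. If greater than one, the records of the files read in parallel are output in an interleaved order. If your input pipeline is I/O bottlenecked, consider setting this parameter to a value greater than one to parallelize the I/O. If None, files are read sequentially.
name (Optional.) A name for the tf.data operation.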
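The arguments above can be combined as in the following minimal sketch. The file names, the record contents, and the 8 MB buffer size are assumptions chosen only for illustration; the example writes two small GZIP-compressed files and reads them back in parallel.

```python
import os
import tempfile

import tensorflow as tf

# Hypothetical file names, used only for this illustration.
tmp_dir = tempfile.mkdtemp()
paths = [os.path.join(tmp_dir, f"part-{i}.tfrecord.gz") for i in range(2)]

# Write three raw byte records per file, compressed with GZIP.
options = tf.io.TFRecordOptions(compression_type="GZIP")
for i, path in enumerate(paths):
    with tf.io.TFRecordWriter(path, options=options) as writer:
        for j in range(3):
            writer.write(f"file{i}-record{j}".encode("utf-8"))

# compression_type has to match the way the files were written.
# With num_parallel_reads=2 both files are read concurrently, so records
# from different files may arrive interleaved.
dataset = tf.data.TFRecordDataset(
    paths,
    compression_type="GZIP",
    buffer_size=8 * 1024 * 1024,  # 8 MB read buffer, an arbitrary choice
    num_parallel_reads=2,
)

# Sort the records so the result does not depend on the interleaving order.
records = sorted(r.numpy().decode("utf-8") for r in dataset)
print(records)
```

Because parallel reads interleave the records of different files, code that relies on a fixed record order across files would probably want to leave num_parallel_reads at None.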

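Raises

TypeError If any argument does not have the expected type.
ValueError If any argument does not have the expected shape.

Attributes

element_spec The type specification of an element of this dataset.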

import tensorflow as tf

dataset = tf.data.Dataset.from_tensor_slices([1, 2, 3])
print(dataset.element_spec)
# TensorSpec(shape=(), dtype=tf.int32, name=None)

For more information, read this guide.
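For a TFRecordDataset specifically, each element is one serialized record, so element_spec describes a scalar string tensor. The short sketch below checks this; the file name and the record payload are assumptions made up for the demonstration.

```python
import os
import tempfile

import tensorflow as tf

# Hypothetical file containing a single raw record.
path = os.path.join(tempfile.mkdtemp(), "spec_demo.tfrecord")
with tf.io.TFRecordWriter(path) as writer:
    writer.write(b"payload")

raw = tf.data.TFRecordDataset(path)
# Each element is a scalar tf.string tensor holding one record's bytes.
print(raw.element_spec)

# Batching adds a leading dimension of unknown size to the spec.
batched = raw.batch(1)
print(batched.element_spec)
```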

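This dataset loads TFRecords from the files as bytes, exactly as they were written. TFRecordDataset does not do any parsing or decoding on its own. Parsing and decoding can be done by applying Dataset.map transformations after the TFRecordDataset.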
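To illustrate the split between reading and decoding, here is a small sketch that stores serialized tensors rather than tf.train.Example messages. That storage format, like the file name, is an assumption chosen for the demonstration; the dataset yields raw bytes, and a Dataset.map step with tf.io.parse_tensor does the decoding.

```python
import os
import tempfile

import tensorflow as tf

# Hypothetical file name for this illustration.
path = os.path.join(tempfile.mkdtemp(), "tensors.tfrecord")

# Each record holds one int32 tensor serialized with tf.io.serialize_tensor.
with tf.io.TFRecordWriter(path) as writer:
    for row in ([1, 2], [3, 4]):
        serialized = tf.io.serialize_tensor(tf.constant(row, dtype=tf.int32))
        writer.write(serialized.numpy())

# Without a map step, the elements are the raw bytes of each record.
raw = tf.data.TFRecordDataset(path)
first_raw = next(iter(raw)).numpy()
print(type(first_raw))

# Decoding is a separate transformation applied after the dataset.
decoded = raw.map(lambda b: tf.io.parse_tensor(b, out_type=tf.int32))
values = [t.numpy().tolist() for t in decoded]
print(values)  # [[1, 2], [3, 4]]
```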

A minimal example is given below:

import os
import tempfile

import numpy as np
import tensorflow as tf

example_path = os.path.join(tempfile.gettempdir(), "example.tfrecords")
np.random.seed(0)

# Write the records to a file.
with tf.io.TFRecordWriter(example_path) as file_writer:
  for _ in range(4):
    x, y = np.random.random(), np.random.random()

    record_bytes = tf.train.Example(features=tf.train.Features(feature={
        "x":tf.train.Feature(float_list=tf.train.FloatList(value=[x])),
        "y":tf.train.Feature(float_list=tf.train.FloatList(value=[y])),
    })).SerializeToString()
    file_writer.write(record_bytes)

# Read the data back out.
def decode_fn(record_bytes):
  return tf.io.parse_single_example(
      # Data
      record_bytes,

      # Schema
      {"x":tf.io.FixedLenFeature([], dtype=tf.float32),
       "y":tf.io.FixedLenFeature([], dtype=tf.float32)}
  )

for batch in tf.data.TFRecordDataset([example_path]).map(decode_fn):
  print("x = {x:.4f},  y = {y:.4f}".format(**batch))
# Output:
# x = 0.5488,  y = 0.7152
# x = 0.6028,  y = 0.5449
# x = 0.4237,  y = 0.6459
# x = 0.4376,  y = 0.8918
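As a variation on the example above, records can be batched first and then parsed a whole batch at a time with tf.io.parse_example. This is only a sketch of one possible arrangement; the file name and the feature values are assumptions chosen to be deterministic.

```python
import os
import tempfile

import tensorflow as tf

# Hypothetical file name for this illustration.
path = os.path.join(tempfile.mkdtemp(), "batched.tfrecords")

# Write four examples with deterministic values: x = i, y = 2 * i.
with tf.io.TFRecordWriter(path) as writer:
    for i in range(4):
        example = tf.train.Example(features=tf.train.Features(feature={
            "x": tf.train.Feature(float_list=tf.train.FloatList(value=[float(i)])),
            "y": tf.train.Feature(float_list=tf.train.FloatList(value=[2.0 * i])),
        }))
        writer.write(example.SerializeToString())

schema = {
    "x": tf.io.FixedLenFeature([], dtype=tf.float32),
    "y": tf.io.FixedLenFeature([], dtype=tf.float32),
}

# Batch the serialized records, then parse each batch in one call.
dataset = (
    tf.data.TFRecordDataset([path])
    .batch(2)
    .map(lambda serialized: tf.io.parse_example(serialized, schema))
)

batches = [{k: v.numpy().tolist() for k, v in b.items()} for b in dataset]
for b in batches:
    print(b["x"], b["y"])
```

Here each parsed feature is a tensor with one entry per record in the batch, whereas parse_single_example in the example above yields one scalar per feature.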


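Note: This article was selected and compiled by 纯净天空 from the original English work tf.data.TFRecordDataset on tensorflow.org. Unless otherwise stated, copyright of the original code belongs to the original author. Do not reproduce or copy this translation without permission or authorization.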