Python tf.data.Dataset.cache: usage and code examples

Usage

cache(
    filename='', name=None
)

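Parameters

filename A tf.string scalar tf.Tensor giving the name of a directory on the file system to use for caching the elements of this dataset. If no filename is provided, the dataset is cached in memory.
name (Optional.) A name for the tf.data operation.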

Returns

Dataset A Dataset.

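Caches the elements in this dataset.

The first time the dataset is iterated over, its elements are cached either in the specified file or in memory. Subsequent iterations use the cached data.

Note: To finalize the cache, the input dataset must be iterated through in its entirety. Otherwise, subsequent iterations may not use the cached data.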

import tensorflow as tf

dataset = tf.data.Dataset.range(5)
dataset = dataset.map(lambda x: x**2)
dataset = dataset.cache()
# The first time reading through the data will generate the data using
# `range` and `map`.
list(dataset.as_numpy_iterator())
# [0, 1, 4, 9, 16]
# Subsequent iterations read from the cache.
list(dataset.as_numpy_iterator())
# [0, 1, 4, 9, 16]
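
One way to observe the cache at work is to count how often the upstream source actually runs. The following is a minimal sketch rather than part of the original reference: it wraps a Python generator with tf.data.Dataset.from_generator and tracks a counter. The names `calls`, `source_runs`, and `squares` are hypothetical, chosen only for this illustration, and the sketch assumes TensorFlow 2.x with eager execution.

```python
import tensorflow as tf

calls = {"source_runs": 0}  # hypothetical counter, for illustration only


def squares():
    # The body runs each time the uncached source is iterated.
    calls["source_runs"] += 1
    for x in range(5):
        yield x ** 2


dataset = tf.data.Dataset.from_generator(
    squares,
    output_signature=tf.TensorSpec(shape=(), dtype=tf.int64),
)
dataset = dataset.cache()  # no filename, so the cache lives in memory

# The first full pass runs the generator and fills the cache.
first_pass = [int(x) for x in dataset.as_numpy_iterator()]
runs_after_first = calls["source_runs"]

# The second pass is expected to be served from the cache,
# so the generator should not run again.
second_pass = [int(x) for x in dataset.as_numpy_iterator()]
runs_after_second = calls["source_runs"]

print(first_pass, second_pass)
print(runs_after_first, runs_after_second)
```

If the counter stays unchanged between the two passes, the second pass did not touch the generator and was read from the cache.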

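When caching to a file, the cached data persists across runs. If a cache file already exists, even the first iteration through the data reads from it. Changes made to the input pipeline before the call to .cache() have no effect until the cache file is deleted or the filename is changed.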

import tensorflow as tf

dataset = tf.data.Dataset.range(5)
dataset = dataset.cache("/path/to/file")
list(dataset.as_numpy_iterator())
# [0, 1, 2, 3, 4]
dataset = tf.data.Dataset.range(10)
dataset = dataset.cache("/path/to/file")  # Same file!
list(dataset.as_numpy_iterator())
# [0, 1, 2, 3, 4]
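
The placeholder path above has to be replaced before the example can run. The sketch below is one possible runnable variant, assuming TensorFlow 2.x and a writable temporary directory. The names `cache_dir`, `cache_path`, and `demo_cache` are hypothetical and used only for this illustration.

```python
import os
import tempfile

import tensorflow as tf

# Hypothetical scratch location; any writable path would do.
cache_dir = tempfile.mkdtemp()
cache_path = os.path.join(cache_dir, "demo_cache")

first = tf.data.Dataset.range(5).cache(cache_path)
# A full pass over the dataset writes and finalizes the cache files.
first_values = [int(x) for x in first.as_numpy_iterator()]

# A different pipeline pointed at the same cache path.
second = tf.data.Dataset.range(10).cache(cache_path)
second_values = [int(x) for x in second.as_numpy_iterator()]

# List whatever files the cache left behind in the scratch directory.
cache_files = sorted(os.listdir(cache_dir))

print(first_values)
print(second_values)
print(cache_files)
```

Because the second pipeline reuses the same path, it is expected to return the five cached elements instead of ten. Deleting the listed files, or passing a different path, lets the new pipeline take effect.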

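Note: cache produces exactly the same elements during each iteration through the dataset. To randomize the iteration order, call shuffle after calling cache, not before.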
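
The ordering described in this note can be sketched as follows. This is an illustrative example rather than part of the original reference. It assumes TensorFlow 2.x and relies on shuffle reshuffling on each iteration by default. The names `epoch_1` and `epoch_2` are hypothetical.

```python
import tensorflow as tf

dataset = tf.data.Dataset.range(10)
dataset = dataset.cache()  # cache the fixed elements first
dataset = dataset.shuffle(buffer_size=10)  # then shuffle on top of the cache

epoch_1 = [int(x) for x in dataset.as_numpy_iterator()]
epoch_2 = [int(x) for x in dataset.as_numpy_iterator()]

# Both epochs contain the same elements; the order will typically differ.
print(epoch_1)
print(epoch_2)
```

Placing shuffle before cache would instead store a single shuffled order in the cache and replay it on every iteration.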

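Note: This article was selected and compiled by 純淨天空 from the original English documentation for tf.data.Dataset.cache on tensorflow.org. Unless otherwise stated, copyright in the original code belongs to the original authors. Please do not reproduce or copy this translation without permission or authorization.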
