Python tf.distribute.experimental.TPUStrategy: usage and code examples

Synchronous training on TPUs and TPU Pods.

Inherits from: Strategy

Usage

tf.distribute.experimental.TPUStrategy(
    tpu_cluster_resolver=None, device_assignment=None
)

Args

tpu_cluster_resolver A tf.distribute.cluster_resolver.TPUClusterResolver, which provides information about the TPU cluster.
device_assignment Optional tf.tpu.experimental.DeviceAssignment that specifies the placement of replicas on the TPU cluster.

Attributes

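cluster_resolver Returns the cluster resolver associated with this strategy.
tf.distribute.experimental.TPUStrategy provides the associated tf.distribute.cluster_resolver.ClusterResolver. If the user provides one in __init__, that instance is returned; if the user does not, a default tf.distribute.cluster_resolver.TPUClusterResolver is provided.
extended tf.distribute.StrategyExtended with additional methods.
num_replicas_in_sync Returns the number of replicas over which gradients are aggregated.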
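
A common use of num_replicas_in_sync is to derive a global batch size from a per-replica batch size. The following is a minimal sketch, not part of the original reference. It assumes no TPU is attached and uses the default single-replica strategy from tf.distribute.get_strategy() as a stand-in; the constant PER_REPLICA_BATCH_SIZE is a hypothetical value chosen for illustration.

```python
import tensorflow as tf

# Assumption: the default (single-replica) strategy stands in for a
# TPUStrategy so that this sketch also runs without TPU hardware.
strategy = tf.distribute.get_strategy()

# Hypothetical per-replica batch size, chosen only for illustration.
PER_REPLICA_BATCH_SIZE = 8

# Every replica processes one per-replica batch per step, so the global
# batch is the per-replica batch multiplied by the number of replicas.
global_batch_size = PER_REPLICA_BATCH_SIZE * strategy.num_replicas_in_sync

print("replicas in sync:", strategy.num_replicas_in_sync)
print("global batch size:", global_batch_size)
```

With a strategy constructed for real TPU hardware, the same expression would scale the global batch with the number of TPU replicas.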

To construct a TPUStrategy object, you need to run initialization code like the following:

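import tensorflow as tf

# Describe the TPU cluster to connect to.
resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu='')
# Connect to the cluster and initialize the TPU system before creating the strategy.
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.experimental.TPUStrategy(resolver)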

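When you use a distribution strategy, variables created within the strategy's scope are replicated across all replicas and can be kept in sync using all-reduce algorithms.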
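
Creating a variable under the strategy's scope can be sketched as follows. This is an illustrative example rather than code from the original reference. It assumes no TPU is available and substitutes the default strategy returned by tf.distribute.get_strategy(); on a TPU you would use the strategy built by the initialization code above. The variable name weights is hypothetical.

```python
import tensorflow as tf

# Assumption: the default strategy stands in for the TPUStrategy built by
# the initialization code, so the sketch runs without TPU hardware.
strategy = tf.distribute.get_strategy()

with strategy.scope():
    # Variables created inside the scope are managed by the strategy.
    # Under TPUStrategy they would be replicated on every replica.
    weights = tf.Variable(tf.zeros([2, 2]), name="weights")

print(weights.shape)
```

Only variable creation needs to happen inside the scope; the variable can be read and used after the with block ends.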

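To run TF2 programs on TPUs, you can either use the .compile and .fit APIs in tf.keras with TPUStrategy, or write your own custom training loop by calling strategy.run directly. Note that TPUStrategy does not support pure eager execution, so make sure that the function passed to strategy.run is a tf.function, or that strategy.run is called inside a tf.function if eager behavior is enabled.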
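
The second option, calling strategy.run from inside a tf.function, might look like the sketch below. It is not taken from the original reference. It assumes the default strategy as a stand-in for TPUStrategy so that it runs without TPU hardware, and the function names step and replica_fn are hypothetical.

```python
import tensorflow as tf

# Assumption: the default strategy stands in for the TPUStrategy built by
# the initialization code, so the sketch runs without TPU hardware.
strategy = tf.distribute.get_strategy()


@tf.function  # strategy.run is called inside a tf.function, not eagerly
def step(x):
    def replica_fn(x):
        # Work executed on each replica: a sum of squares.
        return tf.reduce_sum(x * x)

    per_replica_result = strategy.run(replica_fn, args=(x,))
    # Combine the per-replica results into a single tensor.
    return strategy.reduce(
        tf.distribute.ReduceOp.SUM, per_replica_result, axis=None)


result = step(tf.constant([1.0, 2.0, 3.0]))
print(float(result))
```

Wrapping the whole step in tf.function keeps the per-replica computation out of pure eager mode, which is the requirement described above. With more than one replica, the summed result would scale with the replica count because every replica receives the same input here.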

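Note: This article was selected and compiled by 純淨天空 from the original English work tf.distribute.experimental.TPUStrategy by the authors at tensorflow.org. Unless otherwise stated, copyright of the original code belongs to the original authors. Do not reproduce or copy this translation without permission or authorization.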