Python PyTorch use_deterministic_algorithms: usage and code examples

This article briefly describes how to use torch.use_deterministic_algorithms in Python.

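Usage: torch.use_deterministic_algorithms(mode)

Parameters:

mode (bool) - If True, makes potentially nondeterministic operations switch to a deterministic algorithm or throw a runtime error. If False, allows nondeterministic operations.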

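Sets whether PyTorch operations must use "deterministic" algorithms, that is, algorithms which, given the same input and run on the same software and hardware, always produce the same output. When enabled, operations use deterministic algorithms where available; if only nondeterministic algorithms are available, they throw a RuntimeError when called.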
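Because the flag is global to the process, it can be convenient to enable it only around one block of code and then restore the earlier setting. The sketch below is one way to do that; the helper name deterministic_algorithms is hypothetical (it is not part of PyTorch), and it relies on torch.are_deterministic_algorithms_enabled() to read the current setting.

```python
import contextlib

import torch


@contextlib.contextmanager
def deterministic_algorithms(mode=True):
    """Hypothetical helper: set the global flag for a block, then restore it."""
    previous = torch.are_deterministic_algorithms_enabled()
    torch.use_deterministic_algorithms(mode)
    try:
        yield
    finally:
        # Restore whatever setting was active before the block.
        torch.use_deterministic_algorithms(previous)


before = torch.are_deterministic_algorithms_enabled()
with deterministic_algorithms(True):
    inside = torch.are_deterministic_algorithms_enabled()
after = torch.are_deterministic_algorithms_enabled()
print(before, inside, after)
```

Restoring the previous value in a finally clause keeps the flag from leaking out of the block if the wrapped code raises.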

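The following normally-nondeterministic operations will act deterministically when mode=True:

torch.nn.Conv1d when called on a CUDA tensor
torch.nn.Conv2d when called on a CUDA tensor
torch.nn.Conv3d when called on a CUDA tensor
torch.nn.ConvTranspose1d when called on a CUDA tensor
torch.nn.ConvTranspose2d when called on a CUDA tensor
torch.nn.ConvTranspose3d when called on a CUDA tensor
torch.bmm() when called on sparse-dense CUDA tensors
torch.Tensor.__getitem__() when attempting to differentiate a CPU tensor and the index is a list of tensors
torch.Tensor.index_put() with accumulate=False
torch.Tensor.index_put() with accumulate=True when called on a CPU tensor
torch.Tensor.put_() with accumulate=True when called on a CPU tensor
torch.Tensor.scatter_add_() when input dimension is one and called on a CUDA tensor
torch.gather() when input dimension is one and called on a CUDA tensor that requires grad
torch.index_add() when called on a CUDA tensor
torch.index_select() when attempting to differentiate a CUDA tensor
torch.repeat_interleave() when attempting to differentiate a CUDA tensor
torch.Tensor.index_copy() when called on a CPU or CUDA tensor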

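The following normally-nondeterministic operations will throw a RuntimeError when mode=True:

torch.nn.AvgPool3d when attempting to differentiate a CUDA tensor
torch.nn.AdaptiveAvgPool2d when attempting to differentiate a CUDA tensor
torch.nn.AdaptiveAvgPool3d when attempting to differentiate a CUDA tensor
torch.nn.MaxPool3d when attempting to differentiate a CUDA tensor
torch.nn.AdaptiveMaxPool2d when attempting to differentiate a CUDA tensor
torch.nn.FractionalMaxPool2d when attempting to differentiate a CUDA tensor
torch.nn.FractionalMaxPool3d when attempting to differentiate a CUDA tensor
torch.nn.functional.interpolate() when attempting to differentiate a CUDA tensor and one of the following modes is used: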
- linear
- bilinear
- bicubic
- trilinear
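torch.nn.ReflectionPad1d when attempting to differentiate a CUDA tensor
torch.nn.ReflectionPad2d when attempting to differentiate a CUDA tensor
torch.nn.ReflectionPad3d when attempting to differentiate a CUDA tensor
torch.nn.ReplicationPad1d when attempting to differentiate a CUDA tensor
torch.nn.ReplicationPad2d when attempting to differentiate a CUDA tensor
torch.nn.ReplicationPad3d when attempting to differentiate a CUDA tensor
torch.nn.NLLLoss when called on a CUDA tensor
torch.nn.CTCLoss when attempting to differentiate a CUDA tensor
torch.nn.EmbeddingBag when attempting to differentiate a CUDA tensor with mode='max'
torch.Tensor.scatter_add_() when input dimension is larger than one and called on a CUDA tensor
torch.gather() when input dimension is larger than one and called on a CUDA tensor that requires grad
torch.Tensor.put_() when accumulate=False
torch.Tensor.put_() when accumulate=True and called on a CUDA tensor
torch.histc() when called on a CUDA tensor
torch.bincount() when called on a CUDA tensor
torch.kthvalue() when called on a CUDA tensor
torch.median() with indices output when called on a CUDA tensor
torch.nn.functional.grid_sample() when attempting to differentiate a CUDA tensor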
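Which operations raise can differ between PyTorch versions and devices, so it may be useful to probe a specific call rather than rely on a fixed list. The sketch below is an assumption-laden illustration: raises_under_deterministic_mode is a hypothetical helper, always_fails is a stand-in for an operation without a deterministic implementation, and the probe treats any RuntimeError as a failure even though a RuntimeError can have other causes.

```python
import torch


def raises_under_deterministic_mode(fn):
    """Hypothetical probe: True if fn() raises RuntimeError with the flag on."""
    previous = torch.are_deterministic_algorithms_enabled()
    torch.use_deterministic_algorithms(True)
    try:
        fn()
    except RuntimeError:
        return True
    finally:
        # Always restore the caller's setting.
        torch.use_deterministic_algorithms(previous)
    return False


def elementwise_add():
    # Plain CPU elementwise addition.
    return torch.ones(3) + torch.ones(3)


def always_fails():
    # Stand-in for an op that has no deterministic implementation.
    raise RuntimeError("stand-in: no deterministic implementation")


add_raises = raises_under_deterministic_mode(elementwise_add)
stub_raises = raises_under_deterministic_mode(always_fails)
print(add_raises, stub_raises)
```

In real use, fn would wrap the forward and backward pass of the model or operation being checked, on the device where it will actually run.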

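A handful of CUDA operations are nondeterministic if the CUDA version is 10.2 or greater, unless the environment variable CUBLAS_WORKSPACE_CONFIG=:4096:8 or CUBLAS_WORKSPACE_CONFIG=:16:8 is set. See the CUDA documentation for more details: https://docs.nvidia.com/cuda/cublas/index.html#cublasApi_reproducibility If one of these environment variable configurations is not set, a RuntimeError will be raised from these operations when called with CUDA tensors:

torch.mm()
torch.mv()
torch.bmm()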
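As a minimal sketch, the CUBLAS_WORKSPACE_CONFIG variable can be set from Python itself. This assumes the assignment runs before any CUDA work starts; placing it before the torch import is the cautious choice. The variable name cublas_config is only for illustration.

```python
import os

# Keep any value the user already exported; otherwise use one of the
# two configurations named above.
os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")
cublas_config = os.environ["CUBLAS_WORKSPACE_CONFIG"]
print(cublas_config)

# "import torch" and torch.use_deterministic_algorithms(True) would follow here.
```

The same effect can be had by exporting the variable in the shell that launches the program, which avoids any dependence on import order.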

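Note that deterministic operations tend to have worse performance than nondeterministic operations.

Note

This flag does not detect or prevent nondeterministic behavior caused by calling an in-place operation on a tensor with an internal memory overlap, or by passing such a tensor as the out argument of an operation. In these cases, multiple writes of different data may target a single memory location, and the order of the writes is not guaranteed.

Example: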
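To make "internal memory overlap" concrete, the sketch below builds an expanded tensor whose four elements all share one memory location, then copies it before writing. The stride check is a simple heuristic chosen for this illustration: it catches only overlap introduced by expand-style views (stride 0 along a dimension longer than 1), not every possible overlap.

```python
import torch

base = torch.zeros(1)
view = base.expand(4)  # four logical elements backed by one memory location

# Heuristic: a zero stride on a dimension of size > 1 means elements alias.
has_overlap = any(
    stride == 0 and size > 1
    for size, stride in zip(view.shape, view.stride())
)

# Copy into fresh contiguous storage so each element has its own location,
# then write in place on the copy.
safe = view.clone(memory_format=torch.contiguous_format)
safe.add_(torch.arange(4.0))
print(has_overlap, safe.stride(), safe.tolist())
```

Writing to the copy rather than to the overlapping view means the result does not depend on the order of writes.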

>>> torch.use_deterministic_algorithms(True)

# Forward mode nondeterministic error
>>> torch.randn(10).index_copy(0, torch.tensor([0]), torch.randn(1))
...
RuntimeError: index_copy does not have a deterministic implementation...

# Backward mode nondeterministic error
>>> torch.randn(10, requires_grad=True, device='cuda').index_select(0, torch.tensor([0], device='cuda')).backward()
...
RuntimeError: index_add_cuda_ does not have a deterministic implementation...

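Note: This article was selected and compiled by 純淨天空 from the original English work torch.use_deterministic_algorithms by pytorch.org. Unless otherwise stated, copyright in the original code belongs to the original author; do not reproduce or copy this translation without permission or authorization.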