Python PyTorch SequenceEmbeddingAllToAll: Usage and Code Examples

This article briefly introduces the usage of torchrec.distributed.dist_data.SequenceEmbeddingAllToAll in Python.

Usage: class torchrec.distributed.dist_data.SequenceEmbeddingAllToAll(pg: torch._C._distributed_c10d.ProcessGroup, features_per_rank: List[int], device: Optional[torch.device] = None)

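Parameters:

pg (dist.ProcessGroup) - The process group within which the AlltoAll communication takes place.
features_per_rank (List[int]) - A list with the number of features for each rank.
device (Optional[torch.device]) - The device on which buffers will be allocated.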

Bases: torch.nn.modules.module.Module

Redistributes sequence embeddings to a ProcessGroup according to splits.

Example:

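import torch
import torch.distributed as dist
from torchrec.distributed.dist_data import SequenceEmbeddingAllToAll
from torchrec.distributed.sharding.sequence_sharding import SequenceShardingContext

# init_distributed and rank stand in for the job's own distributed setup; they are not defined here.
init_distributed(rank=rank, size=2, backend="nccl")
pg = dist.new_group(backend="nccl")
features_per_rank = [4, 4]  # number of features on each of the two ranks
m = SequenceEmbeddingAllToAll(pg, features_per_rank)
local_embs = torch.rand((6, 2))  # 6 local embedding rows of dimension 2
# sharding_ctx is assumed to have been populated by the preceding input dist step.
sharding_ctx: SequenceShardingContext
output = m(
    local_embs=local_embs,
    lengths=sharding_ctx.lengths_after_input_dist,
    input_splits=sharding_ctx.input_splits,
    output_splits=sharding_ctx.output_splits,
    unbucketize_permute_tensor=None,
)
# The call returns an awaitable; wait() blocks until the redistributed embeddings are available.
tensor = output.wait()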
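
To make the role of input_splits and output_splits more concrete, the following is a minimal single-process sketch of a variable-split all-to-all exchange. It is not torchrec's implementation and uses no process group. The helper names simulate_all_to_all and output_splits_from, the per-rank row data, and the split values [[4, 2], [1, 5]] are all hypothetical and chosen only for illustration. It assumes that input_splits[r][d] is the number of consecutive rows rank r sends to rank d, so each rank's output splits are the corresponding column of the input splits.

```python
from typing import List

Row = List[float]


def simulate_all_to_all(
    local_rows: List[List[Row]],
    input_splits: List[List[int]],
) -> List[List[Row]]:
    """Single-process stand-in for a variable-split all-to-all (illustration only).

    local_rows[r] holds the embedding rows owned by rank r, and
    input_splits[r][d] is how many consecutive rows rank r sends to rank d.
    Returns the rows each rank holds afterwards, ordered by source rank.
    """
    world_size = len(local_rows)
    for src in range(world_size):
        if sum(input_splits[src]) != len(local_rows[src]):
            raise ValueError(f"splits for rank {src} do not cover its rows")

    received: List[List[Row]] = [[] for _ in range(world_size)]
    for src in range(world_size):
        offset = 0
        for dst in range(world_size):
            count = input_splits[src][dst]
            # Rank src sends its next `count` rows to rank dst.
            received[dst].extend(local_rows[src][offset:offset + count])
            offset += count
    return received


def output_splits_from(input_splits: List[List[int]]) -> List[List[int]]:
    """What rank d receives from rank r equals what rank r sends to rank d."""
    return [list(column) for column in zip(*input_splits)]


# Two ranks, each owning 6 rows of dimension 2 (the same shape as local_embs above).
# The first value in each row tags the source rank, the second the row index.
rank0_rows = [[0.0, float(i)] for i in range(6)]
rank1_rows = [[1.0, float(i)] for i in range(6)]
input_splits = [[4, 2], [1, 5]]  # hypothetical split sizes

received = simulate_all_to_all([rank0_rows, rank1_rows], input_splits)
output_splits = output_splits_from(input_splits)

print(output_splits)                     # [[4, 1], [2, 5]]
print([len(rows) for rows in received])  # [5, 7]
```

In this sketch rank 0 ends up with 5 rows (4 of its own plus 1 from rank 1) and rank 1 with 7 rows, which is why each rank has to know its output splits in advance to size its receive buffer.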

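Note: This article was selected and compiled by 純淨天空 from the original English documentation on pytorch.org for torchrec.distributed.dist_data.SequenceEmbeddingAllToAll. Unless otherwise stated, copyright in the original code belongs to the original authors. Please do not reproduce or copy this translation without permission or authorization.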