

Python PyTorch TwRwSparseFeaturesDist Usage and Code Examples


This article briefly introduces the usage of torchrec.distributed.sharding.twrw_sharding.TwRwSparseFeaturesDist in Python.

Usage:

class torchrec.distributed.sharding.twrw_sharding.TwRwSparseFeaturesDist(pg: torch._C._distributed_c10d.ProcessGroup, intra_pg: torch._C._distributed_c10d.ProcessGroup, id_list_features_per_rank: List[int], id_score_list_features_per_rank: List[int], id_list_feature_hash_sizes: List[int], id_score_list_feature_hash_sizes: List[int], device: Optional[torch.device] = None, has_feature_processor: bool = False)

Bases: torchrec.distributed.embedding_sharding.BaseSparseFeaturesDist[torchrec.distributed.embedding_types.SparseFeatures]

Bucketizes sparse features in a TWRW (table-wise then row-wise) fashion, then redistributes them with an AlltoAll collective operation.

Constructor arguments:

pg (dist.ProcessGroup): ProcessGroup for the AlltoAll communication.

intra_pg (dist.ProcessGroup): ProcessGroup within a single host group, used for the intra-host AlltoAll communication.

id_list_features_per_rank (List[int]): number of id list features to send to each rank.

id_score_list_features_per_rank (List[int]): number of id score list features to send to each rank.

id_list_feature_hash_sizes (List[int]): hash sizes of the id list features.

id_score_list_feature_hash_sizes (List[int]): hash sizes of the id score list features.

device (Optional[torch.device]): device on which buffers will be allocated.

has_feature_processor (bool): existence of a feature processor (i.e. position-weighted features).
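Below is a minimal construction sketch, not a prescribed recipe: it assumes a 2-host, 2-devices-per-host layout (world size 4) that has already been initialized with torch.distributed (for example via torchrun), and the per-rank feature counts and hash sizes are illustrative values chosen to line up with the example that follows.

import torch
import torch.distributed as dist
from torchrec.distributed.sharding.twrw_sharding import TwRwSparseFeaturesDist

# Assumed topology: 2 hosts with 2 devices each, i.e. global ranks 0-3.
LOCAL_SIZE = 2
world_size = dist.get_world_size()
rank = dist.get_rank()

pg = dist.group.WORLD  # global group used for the cross-host AlltoAll

# Every rank must call new_group() for every sub-group; each rank keeps the
# group that covers its own host as the intra-host group.
intra_pg = None
for host in range(world_size // LOCAL_SIZE):
    host_ranks = list(range(host * LOCAL_SIZE, (host + 1) * LOCAL_SIZE))
    group = dist.new_group(host_ranks)
    if rank in host_ranks:
        intra_pg = group

features_dist = TwRwSparseFeaturesDist(
    pg=pg,
    intra_pg=intra_pg,
    # Illustrative values for 3 id-list features: features 0 and 1 owned by
    # host 0, feature 2 by host 1, so each rank on host 0 receives 2 features
    # and each rank on host 1 receives 1.
    id_list_features_per_rank=[2, 2, 1, 1],
    id_score_list_features_per_rank=[0, 0, 0, 0],
    id_list_feature_hash_sizes=[8, 8, 8],      # assumed hash sizes
    id_score_list_feature_hash_sizes=[],
    device=torch.device("cpu"),
    has_feature_processor=False,
)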

Example:

3 features
2 hosts with 2 devices each

Bucketize each feature into 2 buckets
Staggered shuffle with feature splits [2, 1]
AlltoAll operation

NOTE: the result of the staggered shuffle and the AlltoAll operation looks the same
after reordering in AlltoAll

Result:
    host 0 device 0:
        feature 0 bucket 0
        feature 1 bucket 0

    host 0 device 1:
        feature 0 bucket 1
        feature 1 bucket 1

    host 1 device 0:
        feature 2 bucket 0

    host 1 device 1:
        feature 2 bucket 1
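To make the routing in the Result listing concrete, here is a plain-Python sketch, independent of the TorchRec API, that reproduces the same feature/bucket-to-device mapping. It assumes block bucketization (each feature's id range is split into contiguous blocks, one block per device on the owning host) and assumes features 0 and 1 are owned by host 0 and feature 2 by host 1; the hash size of 8 per feature is an arbitrary illustrative value.

import math

# Assumed layout matching the example: 3 features, 2 hosts, 2 devices per host.
feature_to_host = {0: 0, 1: 0, 2: 1}   # assumed table-wise assignment of features
hash_sizes = {0: 8, 1: 8, 2: 8}        # assumed hash size per feature
NUM_BUCKETS = 2                        # devices per host = buckets per feature

def bucket_of(feature, idx):
    # Block bucketization: contiguous id blocks, one block per device.
    block_size = math.ceil(hash_sizes[feature] / NUM_BUCKETS)
    return idx // block_size

def destination(feature, idx):
    # (host, device) that ends up owning this id after bucketize + AlltoAll.
    return feature_to_host[feature], bucket_of(feature, idx)

for feature in range(3):
    for idx in (0, hash_sizes[feature] - 1):   # one id from each end of the range
        host, device = destination(feature, idx)
        print(f"feature {feature} id {idx} -> host {host} device {device}")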

Note: this article was selected and compiled by 純淨天空 from the original English documentation for torchrec.distributed.sharding.twrw_sharding.TwRwSparseFeaturesDist on pytorch.org. Unless otherwise stated, the copyright of the original code belongs to its original authors; please do not reproduce or copy this translation without permission or authorization.