Python PyTorch Embedding: usage and code examples

This article briefly describes how to use torch.nn.Embedding in Python.

Usage: class torch.nn.Embedding(num_embeddings, embedding_dim, padding_idx=None, max_norm=None, norm_type=2.0, scale_grad_by_freq=False, sparse=False, _weight=None, device=None, dtype=None)

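Parameters:

num_embeddings (int) - size of the dictionary of embeddings
embedding_dim (int) - size of each embedding vector
padding_idx (int, optional) - If specified, the entries at padding_idx do not contribute to the gradient; the embedding vector at padding_idx is therefore not updated during training, i.e. it remains a fixed "pad". For a newly constructed Embedding, the embedding vector at padding_idx defaults to all zeros, but it can be updated to another value to be used as the padding vector.
max_norm (float, optional) - If given, each embedding vector with a norm larger than max_norm is renormalized to have norm max_norm.
norm_type (float, optional) - The p of the p-norm to compute for the max_norm option. Default 2.
scale_grad_by_freq (bool, optional) - If True, gradients are scaled by the inverse of the frequency of the words in the mini-batch. Default False.
sparse (bool, optional) - If True, the gradient w.r.t. the weight matrix will be a sparse tensor. See the notes below for more details on sparse gradients.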
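The effect of padding_idx and scale_grad_by_freq on the gradient can be checked directly. The following is a minimal sketch, not part of the original documentation; the variable names (padded, plain, scaled, batch) and the tiny sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# padding_idx: the row at index 0 receives no gradient.
padded = nn.Embedding(5, 3, padding_idx=0)
padded(torch.tensor([0, 2, 0, 4])).sum().backward()
pad_grad = padded.weight.grad
print(pad_grad[0])       # all zeros: the pad row is excluded from the gradient
print(pad_grad[[2, 4]])  # all ones: each of these rows was looked up once

# scale_grad_by_freq: index 2 appears twice in the batch, so its gradient
# is divided by 2 and ends up the same size as that of index 4.
plain = nn.Embedding(5, 3)
scaled = nn.Embedding(5, 3, scale_grad_by_freq=True)
batch = torch.tensor([2, 2, 4])
plain(batch).sum().backward()
scaled(batch).sum().backward()
print(plain.weight.grad[2])   # all twos
print(scaled.weight.grad[2])  # all ones
```

Summing the output is used here only because it makes the gradient of every looked-up element equal to 1, which keeps the scaling easy to read.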

Variables:

Embedding.weight (Tensor) - the learnable weights of the module, of shape (num_embeddings, embedding_dim), initialized from the standard normal distribution N(0, 1)

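A simple lookup table that stores embeddings of a fixed dictionary and size.

This module is often used to store word embeddings and retrieve them using indices. The input to the module is a list of indices, and the output is the corresponding word embeddings.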
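One way to picture the lookup is as row indexing into the weight matrix. The sketch below is an illustration rather than a description of the internal implementation; it assumes a small table of 10 vectors of size 3 and compares the module output with plain indexing and with a one-hot matrix product.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
embedding = nn.Embedding(10, 3)
idx = torch.tensor([1, 2, 4, 5])

looked_up = embedding(idx)        # module call
indexed = embedding.weight[idx]   # plain row indexing into the weight matrix

# Selecting rows is also what multiplying by a one-hot matrix does.
one_hot = F.one_hot(idx, num_classes=10).float()
via_matmul = one_hot @ embedding.weight

print(torch.equal(looked_up, indexed))
print(torch.allclose(looked_up, via_matmul))
```

The indexing form is the cheap one; the one-hot product is shown only to make the "lookup table" interpretation concrete.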

Shape:

Input: (*), an IntTensor or LongTensor of arbitrary shape containing the indices to extract
Output: (*, H), where * is the input shape and H = embedding_dim
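The shape rule above can be sketched with a few index tensors of different rank. The sizes used here (a table of 10 vectors with embedding_dim 3) are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

embedding = nn.Embedding(10, 3)  # H = embedding_dim = 3

shapes = {}
for in_shape in [(4,), (2, 4), (2, 3, 4)]:
    idx = torch.randint(0, 10, in_shape)  # int64 indices in [0, 10)
    shapes[in_shape] = tuple(embedding(idx).shape)
    print(in_shape, "->", shapes[in_shape])
```

In each case the output shape is the input shape with one trailing dimension of size embedding_dim appended.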

Note

Keep in mind that only a limited number of optimizers support sparse gradients: currently optim.SGD (CUDA and CPU), optim.SparseAdam (CUDA and CPU), and optim.Adagrad (CPU).
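As a rough illustration of the note above, the following sketch pairs sparse=True with optim.SparseAdam on the CPU. The table size, learning rate, and toy loss are assumptions chosen only to keep the example short.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
embedding = nn.Embedding(1000, 16, sparse=True)
optimizer = torch.optim.SparseAdam(list(embedding.parameters()), lr=0.01)

idx = torch.tensor([3, 7, 7, 42])
before = embedding.weight.detach().clone()

loss = embedding(idx).pow(2).sum()  # toy loss, for illustration only
loss.backward()
grad_is_sparse = embedding.weight.grad.is_sparse
optimizer.step()

# Only the rows that were looked up should have moved.
changed_rows = (embedding.weight.detach() != before).any(dim=1).nonzero().flatten().tolist()
print(grad_is_sparse)
print(changed_rows)
```

With a sparse gradient the optimizer touches only the rows that appear in the batch, which is the main reason to use this option for large tables.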

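Note

When max_norm is not None, the forward method of Embedding modifies the weight tensor in-place. Since tensors needed for gradient computations cannot be modified in-place, performing a differentiable operation on Embedding.weight before calling the forward method of Embedding requires cloning Embedding.weight when max_norm is not None. For example: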

import torch
import torch.nn as nn

n, d, m = 3, 5, 7
embedding = nn.Embedding(n, d, max_norm=1.0)  # max_norm is a float; 1.0 caps each row's norm at 1
W = torch.randn((m, d), requires_grad=True)
idx = torch.tensor([1, 2])
a = embedding.weight.clone() @ W.t()  # weight must be cloned for this to be differentiable
b = embedding(idx) @ W.t()  # modifies weight in-place
out = (a.unsqueeze(0) + b.unsqueeze(1))
loss = out.sigmoid().prod()
loss.backward()
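The in-place renormalization itself can be observed by comparing row norms before and after a forward call. This is a minimal sketch; scaling the weights by 10 beforehand is an assumption made only so that every row clearly exceeds max_norm.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
embedding = nn.Embedding(4, 5, max_norm=1.0)

with torch.no_grad():
    embedding.weight.mul_(10)  # inflate all rows so their norms exceed max_norm

norms_before = embedding.weight.detach().norm(dim=1).clone()
_ = embedding(torch.tensor([1, 2]))  # forward pass renormalizes rows 1 and 2 in-place
norms_after = embedding.weight.detach().norm(dim=1)

print(norms_before)
print(norms_after)
```

Only the rows that are actually looked up are rescaled; rows 0 and 3 keep their original norms until they appear in an input.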

Examples:

>>> import torch
>>> import torch.nn as nn
>>> # an Embedding module containing 10 tensors of size 3
>>> embedding = nn.Embedding(10, 3)
>>> # a batch of 2 samples of 4 indices each
>>> input = torch.LongTensor([[1,2,4,5],[4,3,2,9]])
>>> embedding(input)
tensor([[[-0.0251, -1.6902,  0.7172],
         [-0.6431,  0.0748,  0.6969],
         [ 1.4970,  1.3448, -0.9685],
         [-0.3677, -2.7265, -0.1685]],

        [[ 1.4970,  1.3448, -0.9685],
         [ 0.4362, -0.4004,  0.9400],
         [-0.6431,  0.0748,  0.6969],
         [ 0.9124, -2.3616,  1.1151]]])


>>> # example with padding_idx
>>> embedding = nn.Embedding(10, 3, padding_idx=0)
>>> input = torch.LongTensor([[0,2,0,5]])
>>> embedding(input)
tensor([[[ 0.0000,  0.0000,  0.0000],
         [ 0.1535, -2.0309,  0.9315],
         [ 0.0000,  0.0000,  0.0000],
         [-0.1655,  0.9897,  0.0635]]])

>>> # example of changing `pad` vector
>>> padding_idx = 0
>>> embedding = nn.Embedding(3, 3, padding_idx=padding_idx)
>>> embedding.weight
Parameter containing:
tensor([[ 0.0000,  0.0000,  0.0000],
        [-0.7895, -0.7089, -0.0364],
        [ 0.6778,  0.5803,  0.2678]], requires_grad=True)
>>> with torch.no_grad():
...     embedding.weight[padding_idx] = torch.ones(3)
>>> embedding.weight
Parameter containing:
tensor([[ 1.0000,  1.0000,  1.0000],
        [-0.7895, -0.7089, -0.0364],
        [ 0.6778,  0.5803,  0.2678]], requires_grad=True)

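Note: This article was selected and compiled by 純淨天空 from the original English documentation for torch.nn.Embedding on pytorch.org. Unless otherwise stated, copyright of the original code belongs to the original author. Please do not reproduce or copy this translation without permission or authorization.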