Python dask.array.map_overlap usage and code examples

Usage:
dask.array.map_overlap(func, *args, depth=None, boundary=None, trim=True, align_arrays=True, **kwargs)

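Map a function over blocks of arrays with some overlap

We share neighboring zones between blocks of the array, map a function over the extended blocks, and then trim away the neighboring strips. If depth is larger than any chunk along a particular axis, the array is rechunked.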
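The three steps described above (share, map, trim) can also be written out by hand. The following is a minimal sketch, assuming that `da.overlap.overlap` and `da.overlap.trim_internal` behave as described in the Dask overlap documentation; the array `x` and the function `derivative` are illustrative choices, not part of the API.

```python
import numpy as np
import dask.array as da

def derivative(block):
    # Backward difference inside one (extended) block.
    return block - np.roll(block, 1)

x = da.from_array(np.array([1, 1, 2, 3, 3, 3, 2, 1, 1]), chunks=5)

# Step 1: extend every block with one element from each neighbor
# (and with reflected values at the outer edges of the array).
extended = da.overlap.overlap(x, depth=1, boundary="reflect")

# Step 2: map the function over the extended blocks.
mapped = extended.map_blocks(derivative)

# Step 3: trim the shared strips away again.
manual = da.overlap.trim_internal(mapped, {0: 1}, boundary="reflect")

# The one-call form of the same computation.
direct = x.map_overlap(derivative, depth=1, boundary="reflect")

print(manual.compute())
print(direct.compute())
```

Both printed arrays should be identical, which is a quick way to check that a custom pipeline matches map_overlap.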

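Note that this function attempts to determine the output array type automatically before computing it. If you expect that the function will not succeed when operating on 0-d arrays, see the meta keyword argument of map_blocks.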

Parameters:

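func: function: The function to apply to each extended block. If multiple arrays are provided, the function should expect to receive the chunks of each array in the same order.
args: dask arrays
depth: int, tuple, dict or list: The number of elements that each block should share with its neighbors. If a tuple or dict, the depth can differ per axis. If a list, each element must be an int, tuple or dict defining the depth for the corresponding array in args. Asymmetric depths may be specified using a dict value of (-/+) tuples. Note that asymmetric depths are currently only supported when boundary is 'none'. The default value is 0.
boundary: str, tuple, dict or list: How to handle the boundaries. Values include 'reflect', 'periodic', 'nearest', 'none', or any constant value such as 0 or np.nan. If a list, each element must be a str, tuple or dict defining the boundary for the corresponding array in args. The default value is 'reflect'.
trim: bool: Whether to trim depth elements from each block after calling the mapped function. Set this to False if your mapping function already does this for you.
align_arrays: bool: Whether to align chunks along equally sized dimensions when multiple arrays are provided. This allows larger chunks in some arrays to be broken into smaller ones that match the chunk sizes in other arrays, so that they are compatible for block function mapping. If this is False, an error is raised if the arrays do not already have the same number of blocks in each dimension.
**kwargs: Other keyword arguments valid in map_blocks.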
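The depth description mentions asymmetric depths. The following is a minimal sketch of that case, assuming the tuple in the dict value is read as (elements taken from the block before, elements taken from the block after) and that boundary is 'none', as required. The helper `backward_diff` and the sample data are hypothetical.

```python
import numpy as np
import dask.array as da

def backward_diff(block):
    # Difference to the previous element; the first element of a block
    # is compared with itself and therefore yields 0.
    return np.diff(block, prepend=block[:1])

data = np.array([1, 1, 2, 3, 3, 3, 2, 1, 1])
x = da.from_array(data, chunks=5)

# Each block borrows one element from the block before it and none from
# the block after it. Asymmetric depth requires boundary='none'.
y = x.map_overlap(
    backward_diff,
    depth={0: (1, 0)},
    boundary="none",
    dtype=data.dtype,
)

print(y.compute())
print(np.diff(data, prepend=data[0]))  # single-array reference
```

A backward difference only looks to the left, so borrowing from the right neighbor would move data between blocks without changing the result.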

Examples:

>>> import numpy as np
>>> import dask.array as da

>>> x = np.array([1, 1, 2, 3, 3, 3, 2, 1, 1])
>>> x = da.from_array(x, chunks=5)
>>> def derivative(x):
...     return x - np.roll(x, 1)

>>> y = x.map_overlap(derivative, depth=1, boundary=0)
>>> y.compute()
array([ 1,  0,  1,  1,  0,  0, -1, -1,  0])
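To see why this output is what a single unchunked computation would give, the chunked result can be compared with plain NumPy. This is a small sketch that assumes boundary=0 corresponds to padding the whole array with the constant 0 on both sides; the names `chunked`, `padded` and `reference` are illustrative.

```python
import numpy as np
import dask.array as da

def derivative(block):
    return block - np.roll(block, 1)

data = np.array([1, 1, 2, 3, 3, 3, 2, 1, 1])

# Chunked computation: two blocks that share one element with each other
# and are padded with the constant 0 at the outer edges.
chunked = (
    da.from_array(data, chunks=5)
    .map_overlap(derivative, depth=1, boundary=0)
    .compute()
)

# Single-array reference: pad with 0, apply the function once, trim the padding.
padded = np.pad(data, 1, mode="constant", constant_values=0)
reference = derivative(padded)[1:-1]

print(chunked)
print(reference)
```

Without the shared element, the first entry of the second block would be computed from the wrong neighbor, because np.roll wraps around inside each block.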

>>> x = np.arange(16).reshape((4, 4))
>>> d = da.from_array(x, chunks=(2, 2))
>>> d.map_overlap(lambda x: x + x.size, depth=1, boundary='reflect').compute()
array([[16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31]])
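The boundary options are easier to compare when the extended blocks themselves are visible. The sketch below maps the identity function with trim=False, so the result is simply the extended blocks concatenated. The helper `show_blocks` and the small input array are assumptions made for illustration.

```python
import dask.array as da

x = da.arange(6, chunks=3)  # blocks: [0 1 2] and [3 4 5]

def show_blocks(boundary):
    # Identity function plus trim=False returns the extended blocks
    # concatenated, so borrowed and padded elements stay visible.
    return x.map_overlap(
        lambda block: block, depth=1, boundary=boundary, trim=False
    ).compute()

for mode in ["reflect", "periodic", "nearest", "none", -1]:
    print(repr(mode), show_blocks(mode))
```

With 'none', the outer edges receive no padding, so each block is extended only on the side that faces a neighbor.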

>>> func = lambda x: x + x.size
>>> depth = {0: 1, 1: 1}
>>> boundary = {0: 'reflect', 1: 'none'}
>>> d.map_overlap(func, depth, boundary).compute()  
array([[12,  13,  14,  15],
       [16,  17,  18,  19],
       [20,  21,  22,  23],
       [24,  25,  26,  27]])
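The two results above differ (offsets of 16 and 12) because x.size is the size of the extended block, which depends on the boundary. A minimal sketch that makes those sizes explicit follows; `block_size` is a hypothetical helper.

```python
import numpy as np
import dask.array as da

d = da.from_array(np.arange(16).reshape((4, 4)), chunks=(2, 2))

def block_size(block):
    # Replace every element with the size of the extended block it lives in.
    return np.full_like(block, block.size)

# depth=1 on both axes with 'reflect': each 2x2 block grows to 4x4.
both = d.map_overlap(block_size, depth=1, boundary="reflect").compute()

# 'none' on axis 1: no padding at the outer edge, so each block is 4x3.
mixed = d.map_overlap(
    block_size,
    depth={0: 1, 1: 1},
    boundary={0: "reflect", 1: "none"},
).compute()

print(np.unique(both))   # sizes seen with 'reflect' on both axes
print(np.unique(mixed))  # sizes seen with 'none' on axis 1
```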

The da.map_overlap function can also accept multiple arrays.

>>> func = lambda x, y: x + y
>>> x = da.arange(8).reshape(2, 4).rechunk((1, 2))
>>> y = da.arange(4).rechunk(2)
>>> da.map_overlap(func, x, y, depth=1, boundary='reflect').compute() 
array([[ 0,  2,  4,  6],
       [ 4,  6,  8,  10]])

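When multiple arrays are given, they do not need to have the same number of dimensions, but they must broadcast together. Arrays are aligned block by block (just as in da.map_blocks), so the blocks must have a common chunk size. This common chunking is determined automatically as long as align_arrays is True.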

>>> x = da.arange(8, chunks=4)
>>> y = da.arange(8, chunks=2)
>>> r = da.map_overlap(func, x, y, depth=1, boundary='reflect', align_arrays=True)
>>> len(r.to_delayed())
4

>>> da.map_overlap(func, x, y, depth=1, boundary='reflect', align_arrays=False).compute()
Traceback (most recent call last):
    ...
ValueError: Shapes do not align {'.0': {2, 4}}
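When align_arrays=False is needed, one way to avoid the error above is to rechunk the inputs to a common chunk size yourself. The following sketch assumes a chunk size of 2 is acceptable for both arrays; `manual` is an illustrative name.

```python
import dask.array as da

func = lambda a, b: a + b
x = da.arange(8, chunks=4)
y = da.arange(8, chunks=2)

# align_arrays=True: the larger chunks of x are split to match y.
r = da.map_overlap(func, x, y, depth=1, boundary="reflect", align_arrays=True)
print(r.chunks)

# With align_arrays=False the caller has to supply matching chunks.
manual = da.map_overlap(
    func, x.rechunk(2), y, depth=1, boundary="reflect", align_arrays=False
)
print(manual.compute())
```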

Note also that this function is equivalent to map_blocks by default. A non-zero depth must be defined for any overlap to appear in the arrays passed to func.

>>> func = lambda x: x.sum()
>>> x = da.ones(10, dtype='int')
>>> block_args = dict(chunks=(), drop_axis=0)
>>> da.map_blocks(func, x, **block_args).compute()
10
>>> da.map_overlap(func, x, **block_args, boundary='reflect').compute()
10
>>> da.map_overlap(func, x, **block_args, depth=1, boundary='reflect').compute()
12
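The difference between zero and non-zero depth shows up most clearly with a function that depends on neighboring elements. The sketch below uses a hypothetical helper, `max_with_right_neighbor`, on an array split into two chunks.

```python
import numpy as np
import dask.array as da

x = da.arange(8, chunks=4)

def max_with_right_neighbor(block):
    # Compare each element with the element to its right. np.roll wraps
    # around inside the block, which is why block edges matter here.
    return np.maximum(block, np.roll(block, -1))

plain = da.map_blocks(max_with_right_neighbor, x).compute()
no_depth = da.map_overlap(
    max_with_right_neighbor, x, depth=0, boundary="reflect"
).compute()
with_depth = da.map_overlap(
    max_with_right_neighbor, x, depth=1, boundary="reflect"
).compute()

print(plain)       # the element at index 3 cannot see the next block
print(no_depth)    # same as map_blocks
print(with_depth)  # the element at index 3 now sees its right neighbor
```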

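For functions that may not handle 0-d arrays, it is also possible to specify meta with an empty array matching the type of the expected result. In the example below, func would raise an IndexError when meta is computed: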

>>> x = np.arange(16).reshape((4, 4))
>>> d = da.from_array(x, chunks=(2, 2))
>>> y = d.map_overlap(lambda x: x + x[2], depth=1, boundary='reflect', meta=np.array(()))
>>> y
dask.array<_trim, shape=(4, 4), dtype=float64, chunksize=(2, 2), chunktype=numpy.ndarray>
>>> y.compute()
array([[ 4,  6,  8, 10],
       [ 8, 10, 12, 14],
       [20, 22, 24, 26],
       [24, 26, 28, 30]])
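The IndexError mentioned above can be reproduced outside Dask. This sketch assumes that type inference calls the function on a zero-size array, which is how the failure is modeled here. Giving meta the input dtype is a choice made for this sketch rather than something the example above requires.

```python
import numpy as np
import dask.array as da

func = lambda block: block + block[2]

# Assumption: the function is probed with a zero-size array to infer the
# output type. On such an array, row index 2 does not exist.
try:
    func(np.empty((0, 0)))
    probe_error = None
except IndexError as exc:
    probe_error = exc
print(type(probe_error).__name__)

# Supplying meta states the output type up front, so it does not have to
# be inferred from the function.
x = np.arange(16).reshape((4, 4))
d = da.from_array(x, chunks=(2, 2))
y = d.map_overlap(
    func, depth=1, boundary="reflect", meta=np.array((), dtype=x.dtype)
)
print(y.compute())
```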

Similarly, it is possible to pass a non-NumPy array as meta:

>>> import cupy  
>>> x = cupy.arange(16).reshape((4, 4))  
>>> d = da.from_array(x, chunks=(2, 2))  
>>> y = d.map_overlap(lambda x: x + x[2], depth=1, boundary='reflect', meta=cupy.array(()))  
>>> y  
dask.array<_trim, shape=(4, 4), dtype=float64, chunksize=(2, 2), chunktype=cupy.ndarray>
>>> y.compute()  
array([[ 4,  6,  8, 10],
       [ 8, 10, 12, 14],
       [20, 22, 24, 26],
       [24, 26, 28, 30]])

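Note: This article was selected and compiled by 純淨天空 from the original English documentation for dask.array.map_overlap by dask.org. Unless otherwise stated, copyright in the original code belongs to the original authors. Do not reproduce or copy this translation without permission or authorization.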