Python dask.array.map_overlap: usage and code examples

Usage:
dask.array.map_overlap(func, *args, depth=None, boundary=None, trim=True, align_arrays=True, **kwargs)

Map a function over blocks of arrays with some overlap.

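We share neighboring zones between blocks of the array, map a function over the extended blocks, and then trim away the shared strips. If depth is larger than any chunk along a particular axis, the array is rechunked.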
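The share, map, trim sequence described above can be sketched with plain NumPy. This is only an illustration, not dask's implementation: `manual_map_overlap` is a hypothetical helper, and it assumes a 1-D array, a constant boundary of 0, and a function that preserves the block length.

```python
import numpy as np

def manual_map_overlap(func, x, chunk, depth):
    """Hypothetical helper: emulate share/map/trim on a 1-D NumPy array.

    Assumes a constant boundary of 0 and a func that keeps the block length.
    """
    # Pad the outer boundary once so every block can borrow `depth` elements.
    padded = np.pad(x, depth, mode="constant", constant_values=0)
    pieces = []
    for start in range(0, len(x), chunk):
        stop = min(start + chunk, len(x))
        # Block [start, stop) sits at [start + depth, stop + depth) in `padded`;
        # extending it by `depth` on both sides gives [start, stop + 2 * depth).
        block = padded[start : stop + 2 * depth]
        out = func(block)
        # Trim the borrowed strips off both ends.
        pieces.append(out[depth : len(out) - depth])
    return np.concatenate(pieces)

def derivative(b):
    return b - np.roll(b, 1)

x = np.array([1, 1, 2, 3, 3, 3, 2, 1, 1])
result = manual_map_overlap(derivative, x, chunk=5, depth=1)
print(result)
```

Because each block sees one element from its neighbor before trimming, the difference across the chunk seam is computed from real data rather than from the value that `np.roll` wraps around.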

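Note that this function tries to determine the output array type automatically before computing. If you expect func to fail when operating on 0-d arrays, see the meta keyword argument in map_blocks.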

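Parameters:

func (function): The function to apply to each extended block. If multiple arrays are provided, the function should expect to receive the chunks of each array in the same order.
args (dask arrays): The arrays to map over.
depth (int, tuple, dict or list): The number of elements that each block should share with its neighbors. If a tuple or dict, this can be different per axis. If a list, each element of the list must be an int, tuple or dict defining the depth for the corresponding array in args. Asymmetric depths may be specified using a dict value of (-/+) tuples. Note that asymmetric depths are currently only supported when boundary is 'none'. The default value is 0.
boundary (str, tuple, dict or list): How to handle the boundaries. Values include 'reflect', 'periodic', 'nearest', 'none', or any constant value such as 0 or np.nan. If a list, each element must be a str, tuple or dict defining the boundary for the corresponding array in args. The default value is 'reflect'.
trim (bool): Whether to trim depth elements from each block after calling the mapped function. Set this to False if your mapped function already does this for you.
align_arrays (bool): Whether to align chunks along equally sized dimensions when multiple arrays are provided. This allows larger chunks in some arrays to be broken into smaller ones that match the chunk sizes in other arrays, so that they are compatible for block function mapping. If this is False, an error is raised when the arrays do not have the same number of blocks in each dimension.
**kwargs: Other keyword arguments valid in map_blocks.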
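To see what the boundary options do, the following sketch maps an identity function with trim=False over a single-chunk array, so that the padded block itself is returned. As far as I understand, 'reflect', 'periodic', 'nearest' and a constant correspond to the `symmetric`, `wrap`, `edge` and `constant` modes of `np.pad`; treat that mapping as an assumption to check against your dask version, not as something the text above states.

```python
import numpy as np
import dask.array as da

x = np.arange(5)
d = da.from_array(x, chunks=5)  # a single chunk, so only the outer boundary is padded

def padded_block(boundary):
    # trim=False keeps the padding, which makes the extended block visible.
    return d.map_overlap(lambda b: b, depth=2, boundary=boundary, trim=False).compute()

reflect = padded_block("reflect")
periodic = padded_block("periodic")
nearest = padded_block("nearest")
constant = padded_block(0)
none = padded_block("none")

for name, arr in [("reflect", reflect), ("periodic", periodic),
                  ("nearest", nearest), ("constant 0", constant), ("none", none)]:
    print(f"{name:>10}: {arr}")
```

With 'none' nothing is added at the outer edges, which is why that setting is the one that allows asymmetric depths.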

Examples:

>>> import numpy as np
>>> import dask.array as da

>>> x = np.array([1, 1, 2, 3, 3, 3, 2, 1, 1])
>>> x = da.from_array(x, chunks=5)
>>> def derivative(x):
...     return x - np.roll(x, 1)

>>> y = x.map_overlap(derivative, depth=1, boundary=0)
>>> y.compute()
array([ 1,  0,  1,  1,  0,  0, -1, -1,  0])
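A common way to check an overlapped computation is to compare it with the same function applied to the whole array in memory. The sketch below does this for a 3-point moving sum; `moving_sum` is a name made up for this illustration, and meta is passed explicitly on the assumption that `np.convolve` cannot be probed with an empty array.

```python
import numpy as np
import dask.array as da

kernel = np.ones(3, dtype=int)

def moving_sum(block):
    # 3-point moving sum; the two end values of a block lack one neighbor.
    return np.convolve(block, kernel, mode="same")

x = np.arange(12)
d = da.from_array(x, chunks=4)
meta = np.array((), dtype=x.dtype)  # explicit meta, so func is not probed

# depth=1 gives every block the one neighbor the kernel needs on each side.
overlapped = d.map_overlap(moving_sum, depth=1, boundary=0, meta=meta).compute()

# Without overlap each block is processed in isolation.
naive = d.map_blocks(moving_sum, meta=meta).compute()

expected = moving_sum(x)
print(np.array_equal(overlapped, expected))
print(np.array_equal(naive, expected))
```

The depth should be at least the reach of the function (here one element on each side); a plain map_blocks call gets the values next to each chunk seam wrong.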

>>> x = np.arange(16).reshape((4, 4))
>>> d = da.from_array(x, chunks=(2, 2))
>>> d.map_overlap(lambda x: x + x.size, depth=1, boundary='reflect').compute()
array([[16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31]])

>>> func = lambda x: x + x.size
>>> depth = {0: 1, 1: 1}
>>> boundary = {0: 'reflect', 1: 'none'}
>>> d.map_overlap(func, depth=depth, boundary=boundary).compute()
array([[12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27]])

The da.map_overlap function can also accept multiple arrays.

>>> func = lambda x, y: x + y
>>> x = da.arange(8).reshape(2, 4).rechunk((1, 2))
>>> y = da.arange(4).rechunk(2)
>>> da.map_overlap(func, x, y, depth=1, boundary='reflect').compute()
array([[ 0,  2,  4,  6],
       [ 4,  6,  8, 10]])

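When multiple arrays are given, they do not need to have the same number of dimensions, but they must broadcast together. Arrays are aligned block by block (just as in da.map_blocks), so the blocks must have a common chunk size. This common chunking is determined automatically as long as align_arrays is True.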

>>> x = da.arange(8, chunks=4)
>>> y = da.arange(8, chunks=2)
>>> r = da.map_overlap(func, x, y, depth=1, boundary='reflect', align_arrays=True)
>>> len(r.to_delayed())
4

>>> da.map_overlap(func, x, y, depth=1, boundary='reflect', align_arrays=False).compute()
Traceback (most recent call last):
    ...
ValueError: Shapes do not align {'.0': {2, 4}}
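If automatic alignment is turned off, one option is to rechunk the arrays to a common chunking yourself before the call. The following is a minimal sketch of that workaround, reusing the arrays from the example above; `x_aligned` is a name introduced here for illustration.

```python
import numpy as np
import dask.array as da

func = lambda x, y: x + y
x = da.arange(8, chunks=4)
y = da.arange(8, chunks=2)

# Give x the same chunking as y so that no automatic alignment is needed.
x_aligned = x.rechunk(y.chunks)

r = da.map_overlap(func, x_aligned, y, depth=1, boundary="reflect",
                   align_arrays=False)
result = r.compute()
print(x_aligned.chunks, result)
```

Rechunking by hand makes the chunk layout explicit, which can be useful when you want control over how large the blocks passed to func are.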

Note also that this function is equivalent to map_blocks by default. A non-zero depth must be defined for any overlap to appear in the arrays passed to func.

>>> func = lambda x: x.sum()
>>> x = da.ones(10, dtype='int')
>>> block_args = dict(chunks=(), drop_axis=0)
>>> da.map_blocks(func, x, **block_args).compute()
10
>>> da.map_overlap(func, x, **block_args, boundary='reflect').compute()
10
>>> da.map_overlap(func, x, **block_args, depth=1, boundary='reflect').compute()
12

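For functions that may not handle 0-d arrays, it is also possible to specify meta with an empty array matching the type of the expected result. In the example below, func would raise an IndexError when computing meta: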

>>> x = np.arange(16).reshape((4, 4))
>>> d = da.from_array(x, chunks=(2, 2))
>>> y = d.map_overlap(lambda x: x + x[2], depth=1, boundary='reflect', meta=np.array(()))
>>> y
dask.array<_trim, shape=(4, 4), dtype=float64, chunksize=(2, 2), chunktype=numpy.ndarray>
>>> y.compute()
array([[ 4,  6,  8, 10],
       [ 8, 10, 12, 14],
       [20, 22, 24, 26],
       [24, 26, 28, 30]])
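The failure that meta avoids can be reproduced with NumPy alone. The sketch below assumes that type inference probes func with a zero-size array like `probe`; that is my understanding of the mechanism, not something stated above.

```python
import numpy as np

func = lambda x: x + x[2]

# Assumption: inference calls func on a zero-size array similar to this one.
probe = np.empty((0, 0))
try:
    func(probe)
    probe_error = None
except IndexError as exc:
    # x[2] does not exist on an array with no rows.
    probe_error = exc

# A real extended block (a 2x2 chunk plus depth=1 on each side is 4x4) is fine.
block = np.arange(16).reshape((4, 4))
block_result = func(block)

print(type(probe_error).__name__, block_result.shape)
```

Passing meta tells dask what kind of array to expect, so func only ever runs on real blocks.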

Similarly, a non-NumPy array can be specified for meta:

>>> import cupy  
>>> x = cupy.arange(16).reshape((4, 4))  
>>> d = da.from_array(x, chunks=(2, 2))  
>>> y = d.map_overlap(lambda x: x + x[2], depth=1, boundary='reflect', meta=cupy.array(()))  
>>> y  
dask.array<_trim, shape=(4, 4), dtype=float64, chunksize=(2, 2), chunktype=cupy.ndarray>
>>> y.compute()  
array([[ 4,  6,  8, 10],
       [ 8, 10, 12, 14],
       [20, 22, 24, 26],
       [24, 26, 28, 30]])


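Note: This article was selected and compiled by 纯净天空 from the original English documentation for dask.array.map_overlap at dask.org. Unless otherwise stated, the copyright of the original code belongs to the original author; please do not reproduce or copy this translation without permission.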