Python cuml.tsa.ARIMA usage and code examples

Usage: class cuml.tsa.ARIMA(endog, *, order: Tuple[int, int, int] = (1, 1, 1), seasonal_order: Tuple[int, int, int, int] = (0, 0, 0, 0), exog=None, fit_intercept=True, simple_differencing=True, handle=None, verbose=False, output_type=None)

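Implements a batched ARIMA model for in-sample and out-of-sample time series prediction, with support for seasonality (SARIMA).

ARIMA stands for Auto-Regressive Integrated Moving Average. See https://en.wikipedia.org/wiki/Autoregressive_integrated_moving_average

This class can fit an ARIMA(p,d,q) or ARIMA(p,d,q)(P,D,Q)_s model to a batch of time series of the same length (or of different lengths, using missing values at the start as padding). The implementation is designed to give the best performance when working with large batches of time series.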
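The padding convention for series of different lengths can be sketched with NumPy alone. This is a minimal illustration, assuming one series per column as the endog parameter requires; pad_batch is a hypothetical helper written for this example and is not part of cuML.

```python
import numpy as np

def pad_batch(series_list):
    """Stack 1-D series of different lengths into an (n_obs, batch_size)
    array, padding shorter series with NaN at the start.

    Hypothetical helper for illustration only; not a cuML function.
    """
    n_obs = max(len(s) for s in series_list)
    batch = np.full((n_obs, len(series_list)), np.nan, dtype=np.float64)
    for j, s in enumerate(series_list):
        s = np.asarray(s, dtype=np.float64)
        # Right-align each series so the missing values sit at the start
        batch[n_obs - len(s):, j] = s
    return batch

batch = pad_batch([[1.0, 2.0, 3.0, 4.0], [5.0, 6.0]])
print(batch)
```

An array built this way has the column layout described for endog, with NaN marking the padded observations.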

Parameters:

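endog: dataframe or array-like (device or host): The endogenous variable, assumed to have each time series in columns. Acceptable formats: cuDF DataFrame, cuDF Series, NumPy ndarray, Numba device ndarray, CUDA array interface compliant array (such as CuPy). Missing values are accepted, represented by NaN.
order: Tuple[int, int, int]: The ARIMA order (p, d, q) of the model.
seasonal_order: Tuple[int, int, int, int]: The seasonal ARIMA order (P, D, Q, s) of the model.
exog: dataframe or array-like (device or host): Exogenous variables, assumed to have each time series in columns, such that the variables associated with the same batch member are adjacent (number of columns: n_exog * batch_size). Acceptable formats: cuDF DataFrame, cuDF Series, NumPy ndarray, Numba device ndarray, CUDA array interface compliant array (such as CuPy). Missing values are not supported.
fit_intercept: bool or int (default = True): Whether to include a constant trend mu in the model.
simple_differencing: bool or int (default = True): If True, the data is differenced before being passed to the Kalman filter. If False, differencing is part of the state-space model. In some cases this setting is overridden: computing forecasts with confidence intervals forces it to False; fitting with the CSS method forces it to True. Note: forecasts are always for the original series, whereas statsmodels computes forecasts for the differenced series when simple_differencing is True.
handle: cuml.Handle: Specifies the cuml.handle that holds the internal CUDA state for computations in this model. Most importantly, this specifies the CUDA stream used for the model's computations, so users can run different models concurrently in different streams by creating handles in several streams. If None, a new one is created.
verbose: int or boolean, default=False: Sets the logging level. It must be one of cuml.common.logger.level_*. See Verbosity Levels for more information.
output_type: {'input', 'cudf', 'cupy', 'numpy', 'numba'}, default=None: Variable that controls the output type of the results and attributes of the estimator. If None, it inherits the output type set at the module level, cuml.global_settings.output_type. See Output Data Type Configuration for more information.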
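The column layout expected for exog can be illustrated with a small NumPy sketch. It assumes the reading given above: each batch member contributes n_exog adjacent columns, for n_exog * batch_size columns in total. The data is synthetic and exog_columns is a hypothetical helper for this example, not a cuML function.

```python
import numpy as np

n_obs, batch_size, n_exog = 6, 3, 2

# One (n_obs, n_exog) block of exogenous variables per series in the batch.
# Values are synthetic: series j, variable k is filled with j + 0.1 * k.
exog_per_series = [
    np.full((n_obs, n_exog), float(j)) + 0.1 * np.arange(n_exog)
    for j in range(batch_size)
]

# Horizontal stacking keeps the variables of each series in adjacent columns:
# [s0_v0, s0_v1, s1_v0, s1_v1, s2_v0, s2_v1]
exog = np.hstack(exog_per_series)

def exog_columns(j, n_exog):
    """Column slice holding the exogenous variables of batch member j
    (hypothetical helper for illustration)."""
    return slice(j * n_exog, (j + 1) * n_exog)

print(exog.shape)  # (6, 6)
print(exog[0])
```

With this layout, exog[:, exog_columns(j, n_exog)] recovers the block belonging to series j.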

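Notes:

Performance: Let \(r=\max(p+s*P, q+s*Q+1)\). The device memory used for most operations is \(O(\mathtt{batch\_size}*\mathtt{n\_obs} + \mathtt{batch\_size}*r^2)\). The execution time is a linear function of n_obs and batch_size (if batch_size is large), but grows very fast with r.

Performance is optimized for very large batch sizes (e.g. thousands of series).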
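To get a feel for how r and the memory term behave, the formulas above can be evaluated directly. The sketch below only computes the quantities stated in the note; the function names are hypothetical, and the second function returns the term inside the big-O expression, not a byte count.

```python
def state_dim(order, seasonal_order):
    """r = max(p + s*P, q + s*Q + 1), as defined in the performance note."""
    p, _d, q = order
    P, _D, Q, s = seasonal_order
    return max(p + s * P, q + s * Q + 1)

def memory_term(batch_size, n_obs, r):
    """Term inside the O(...) memory bound: batch_size*n_obs + batch_size*r^2.

    This is a scaling indicator only, not an estimate in bytes.
    """
    return batch_size * n_obs + batch_size * r ** 2

# Orders used in the example on this page: ARIMA(0,1,1)(0,1,1)_4
r = state_dim((0, 1, 1), (0, 1, 1, 4))
print(r)                        # 6
print(memory_term(2, 100, r))   # 272

# A longer seasonal period increases r, and the r^2 part grows quickly
r12 = state_dim((0, 1, 1), (0, 1, 1, 12))
print(r12)                      # 14
```

Comparing the s = 4 and s = 12 cases shows why the seasonal period is a main driver of cost: the r^2 contribution per series goes from 36 to 196.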

References:

This class is heavily influenced by the Python library statsmodels, particularly statsmodels.tsa.statespace.sarimax.SARIMAX. See https://www.statsmodels.org/stable/statespace.html.

Additionally, the following book is a useful reference: "Time Series Analysis by State Space Methods", J. Durbin, S.J. Koopman, 2nd Edition (2012).

Examples:

import numpy as np
from cuml.tsa.arima import ARIMA

# Create seasonal data with a trend, a seasonal pattern and noise
n_obs = 100
np.random.seed(12)
x = np.linspace(0, 1, n_obs)
pattern = np.array([[0.05, 0.0], [0.07, 0.03],
                    [-0.03, 0.05], [0.02, 0.025]])
noise = np.random.normal(scale=0.01, size=(n_obs, 2))
y = (np.column_stack((0.5*x, -0.25*x)) + noise
    + np.tile(pattern, (25, 1)))

# Fit a seasonal ARIMA model
model = ARIMA(y,
              order=(0,1,1),
              seasonal_order=(0,1,1,4),
              fit_intercept=False)
model.fit()

# Forecast
fc = model.forecast(10)
print(fc)

Output:

[[ 0.55204599 -0.25681163]
 [ 0.57430705 -0.2262438 ]
 [ 0.48120315 -0.20583011]
 [ 0.535594   -0.24060046]
 [ 0.57207541 -0.26695497]
 [ 0.59433647 -0.23638713]
 [ 0.50123257 -0.21597344]
 [ 0.55562342 -0.25074379]
 [ 0.59210483 -0.27709831]
 [ 0.61436589 -0.24653047]]
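One way to read this output is to compare each forecast with the value four steps earlier, since the model was fitted with seasonal period s = 4. The sketch below copies the printed forecast into a NumPy array (so it runs without a GPU) and checks that, in this particular output, the seasonal increment is the same at every step within each series. It is an observation about the numbers shown here, not a general statement about cuML's forecasts.

```python
import numpy as np

# Forecast values copied from the output printed above
fc = np.array([
    [0.55204599, -0.25681163],
    [0.57430705, -0.2262438],
    [0.48120315, -0.20583011],
    [0.535594,   -0.24060046],
    [0.57207541, -0.26695497],
    [0.59433647, -0.23638713],
    [0.50123257, -0.21597344],
    [0.55562342, -0.25074379],
    [0.59210483, -0.27709831],
    [0.61436589, -0.24653047],
])

s = 4  # seasonal period used in seasonal_order=(0, 1, 1, 4)

# Difference between each forecast and the one a full season earlier
seasonal_step = fc[s:] - fc[:-s]
print(seasonal_step)

# Within each column the increments agree up to print rounding
is_constant = np.allclose(seasonal_step, seasonal_step[0], atol=1e-7)
print(is_constant)
```

The increments are positive for the first series and negative for the second, which matches the upward and downward trends (0.5*x and -0.25*x) used to generate the data.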

Attributes:

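order: ARIMAOrder: The ARIMA order of the model (p, d, q, P, D, Q, s, k, n_exog).
d_y: device array: Time series data on the device.
n_obs: int: Number of observations.
batch_size: int: Number of time series in the batch.
dtype: numpy.dtype: Floating-point type of the data and parameters.
niter: numpy.ndarray: After fitting, contains the number of iterations before convergence for each time series.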

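Note: This article was compiled by 纯净天空 from the original English documentation of cuml.tsa.ARIMA by rapids.ai. Unless otherwise stated, copyright in the original code belongs to the original authors; do not reproduce or copy this translation without permission.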