Python pyspark Series.update usage and code examples

This article briefly introduces the usage of pyspark.pandas.Series.update.

Usage: Series.update(other: pyspark.pandas.series.Series) → None

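Modifies the Series in place using the non-NA values from the passed Series, aligning on index. Index labels in other that do not exist in the calling Series are ignored, and the method returns None.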
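
The alignment rule can be illustrated without Spark. The following is a minimal sketch in plain Python, assuming a dict of label to value stands in for a Series; `is_na` and `update_in_place` are hypothetical helpers written for this illustration and are not part of pyspark.

```python
import math


def is_na(value):
    """Treat None and float NaN as missing (an assumption for this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def update_in_place(target, other):
    """Hypothetical helper, not a pyspark API.

    Overwrites values in `target` with the non-NA values from `other`
    that share a label with `target`. Labels found only in `other` are
    skipped. Mutates `target` and returns None.
    """
    for label, value in other.items():
        if label in target and not is_na(value):
            target[label] = value


s = {10: 1, 11: 2, 12: 3}
update_in_place(s, {11: 4, 12: float("nan"), 13: 6})
print(s)  # {10: 1, 11: 4, 12: 3}
```

Label 11 is overwritten, label 12 keeps its value because the incoming value is NaN, and label 13 is dropped because the target has no such label. The sketch does not model dtypes; a real Series may change dtype when other contains NaN.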

Parameters:

other: Series

Examples:

>>> import numpy as np
>>> import pyspark.pandas as ps
>>> from pyspark.pandas.config import set_option, reset_option
>>> set_option("compute.ops_on_diff_frames", True)
>>> s = ps.Series([1, 2, 3])
>>> s.update(ps.Series([4, 5, 6]))
>>> s.sort_index()
0    4
1    5
2    6
dtype: int64

>>> s = ps.Series(['a', 'b', 'c'])
>>> s.update(ps.Series(['d', 'e'], index=[0, 2]))
>>> s.sort_index()
0    d
1    b
2    e
dtype: object

>>> s = ps.Series([1, 2, 3])
>>> s.update(ps.Series([4, 5, 6, 7, 8]))
>>> s.sort_index()
0    4
1    5
2    6
dtype: int64

>>> s = ps.Series([1, 2, 3], index=[10, 11, 12])
>>> s
10    1
11    2
12    3
dtype: int64

>>> s.update(ps.Series([4, 5, 6]))
>>> s.sort_index()
10    1
11    2
12    3
dtype: int64

>>> s.update(ps.Series([4, 5, 6], index=[11, 12, 13]))
>>> s.sort_index()
10    1
11    4
12    5
dtype: int64

If other contains NaNs, the corresponding values in the original Series are not updated.

>>> s = ps.Series([1, 2, 3])
>>> s.update(ps.Series([4, np.nan, 6]))
>>> s.sort_index()
0    4.0
1    2.0
2    6.0
dtype: float64

>>> reset_option("compute.ops_on_diff_frames")

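Note: This article was selected and compiled by 純淨天空 from the original English documentation for pyspark.pandas.Series.update on spark.apache.org. Unless otherwise stated, copyright in the original code belongs to the original authors; please do not reproduce or copy this translation without permission or authorization.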