Python pyspark Series.align: usage and code examples

This article briefly introduces the usage of pyspark.pandas.Series.align.

Usage: Series.align(other: Union[pyspark.pandas.frame.DataFrame, Series], join: str = 'outer', axis: Union[int, str, None] = None, copy: bool = True) → Tuple[pyspark.pandas.series.Series, Union[pyspark.pandas.frame.DataFrame, pyspark.pandas.series.Series]]

Align two objects on their axes with the specified join method.

The join method is specified for each axis index.

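Parameters:

other: DataFrame or Series
join: {'outer', 'inner', 'left', 'right'}, default 'outer'
axis: allowed axis of the other object, default None. Align on index (0), columns (1), or both (None).
copy: bool, default True. Always returns new objects. If copy=False and no reindexing is required, the original objects are returned.

Returns:

(left, right): (Series, type of other). The aligned objects.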
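The way the four join values choose the resulting index labels can be modeled in plain Python. The following is a minimal sketch, not the pyspark implementation: the helper names `aligned_labels` and `align_dicts` are hypothetical, dictionaries stand in for Series, labels are assumed to be unique, and the result is sorted only for readability (the examples below call sort_index() for the same reason).

```python
from typing import Any, Dict, Hashable, Iterable, List, Optional, Tuple


def aligned_labels(left: Iterable[Hashable], right: Iterable[Hashable],
                   join: str = "outer") -> List[Hashable]:
    """Return the index labels kept by the given join type (hypothetical helper)."""
    left_set, right_set = set(left), set(right)
    if join == "outer":      # union of both indexes
        labels = left_set | right_set
    elif join == "inner":    # labels present in both indexes
        labels = left_set & right_set
    elif join == "left":     # labels of the calling object only
        labels = left_set
    elif join == "right":    # labels of the other object only
        labels = right_set
    else:
        raise ValueError(f"unsupported join type: {join!r}")
    return sorted(labels)


def align_dicts(left: Dict[Hashable, Any], right: Dict[Hashable, Any],
                join: str = "outer") -> Tuple[Dict[Hashable, Optional[Any]],
                                              Dict[Hashable, Optional[Any]]]:
    """Reindex both mappings to the joined labels, filling gaps with None."""
    labels = aligned_labels(left, right, join)
    return ({k: left.get(k) for k in labels},
            {k: right.get(k) for k in labels})


s1 = {10: 7, 11: 8, 12: 9}
s2 = {10: "g", 20: "h", 30: "i"}

outer_left, outer_right = align_dicts(s1, s2)                 # default join="outer"
inner_left, inner_right = align_dicts(s1, s2, join="inner")
print(outer_left)
print(inner_left, inner_right)
```

Both returned mappings always share the same labels, which mirrors the (left, right) pair described above.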

Examples:

>>> import pyspark.pandas as ps
>>> ps.set_option("compute.ops_on_diff_frames", True)
>>> s1 = ps.Series([7, 8, 9], index=[10, 11, 12])
>>> s2 = ps.Series(["g", "h", "i"], index=[10, 20, 30])

>>> aligned_l, aligned_r = s1.align(s2)
>>> aligned_l.sort_index()
10    7.0
11    8.0
12    9.0
20    NaN
30    NaN
dtype: float64
>>> aligned_r.sort_index()
10       g
11    None
12    None
20       h
30       i
dtype: object

Align with the join type "inner":

>>> aligned_l, aligned_r = s1.align(s2, join="inner")
>>> aligned_l.sort_index()
10    7
dtype: int64
>>> aligned_r.sort_index()
10    g
dtype: object
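The "left" and "right" join types are not shown in the examples here. The sketch below uses plain pandas as a local stand-in, on the assumption that pyspark.pandas follows the pandas semantics for `Series.align` in this respect; it is not pyspark output, and the way missing values are displayed may differ (the pyspark output above shows None for missing object values).

```python
import pandas as pd

# Same data as in the examples above, but as plain pandas Series (assumption:
# pandas-on-Spark mirrors this behavior).
s1 = pd.Series([7, 8, 9], index=[10, 11, 12])
s2 = pd.Series(["g", "h", "i"], index=[10, 20, 30])

# join="left": both results take the index of the calling object (s1).
left_l, left_r = s1.align(s2, join="left")

# join="right": both results take the index of the other object (s2).
right_l, right_r = s1.align(s2, join="right")

print(list(left_l.index), list(left_r.index))
print(list(right_l.index), list(right_r.index))
```

With join="left" only the other object gains missing entries; with join="right" only the calling object does.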

Align with a DataFrame:

>>> df = ps.DataFrame({"a": [1, 2, 3], "b": ["a", "b", "c"]}, index=[10, 20, 30])
>>> aligned_l, aligned_r = s1.align(df)
>>> aligned_l.sort_index()
10    7.0
11    8.0
12    9.0
20    NaN
30    NaN
dtype: float64
>>> aligned_r.sort_index()
      a     b
10  1.0     a
11  NaN  None
12  NaN  None
20  2.0     b
30  3.0     c

>>> ps.reset_option("compute.ops_on_diff_frames")

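Note: This article was selected and compiled by 純淨天空 from the original English documentation for pyspark.pandas.Series.align at spark.apache.org. Unless otherwise stated, copyright of the original code belongs to the original authors. Do not reproduce or copy this translation without permission or authorization.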
