Python pyspark DataFrame.truncate usage and code examples

This article briefly introduces the usage of pyspark.pandas.DataFrame.truncate.

Usage: DataFrame.truncate(before: Optional[Any] = None, after: Optional[Any] = None, axis: Union[int, str, None] = None, copy: bool = True) → Union[DataFrame, Series]

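Truncate a Series or DataFrame before and after some index value.

This is a useful shorthand for boolean indexing based on index values above or below certain thresholds.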
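The "shorthand for boolean indexing" description can be illustrated with a small sketch. It uses plain pandas as a stand-in so that it runs without a Spark session; this assumes that pyspark.pandas follows the same semantics here, which is consistent with the examples on this page but is not verified by the snippet itself.

```python
import pandas as pd

# Plain pandas is used as a stand-in (assumption: pyspark.pandas
# behaves the same way for a sorted integer index).
df = pd.DataFrame({"A": ["a", "b", "c", "d", "e"]}, index=[1, 2, 3, 4, 5])

# truncate keeps the rows whose index label lies in [before, after].
truncated = df.truncate(before=2, after=4)

# The equivalent explicit boolean mask on the index.
masked = df[(df.index >= 2) & (df.index <= 4)]

print(truncated.equals(masked))
```

Both bounds are inclusive, so the labels 2 and 4 are kept in either form.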

Note

This API depends on Index.is_monotonic_increasing(), which can be expensive.
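To see why the sortedness check can be costly, here is a minimal pure-Python sketch of the idea. It is not the library's implementation: the helper names and the error message are hypothetical, and only the increasing case is handled. The point is that deciding whether an index is sorted requires looking at every label, which on a distributed index likely means a pass over the data.

```python
from typing import Any, List, Optional, Sequence


def is_monotonic_increasing(labels: Sequence[Any]) -> bool:
    # One full pass: each label is compared with its successor.
    return all(a <= b for a, b in zip(labels, labels[1:]))


def truncate_labels(labels: Sequence[Any],
                    before: Optional[Any] = None,
                    after: Optional[Any] = None) -> List[Any]:
    """Hypothetical helper: return the labels a truncate would keep."""
    if not is_monotonic_increasing(labels):
        # Illustrative message, not taken from the library.
        raise ValueError("index must be sorted before truncating")
    # None means "no bound on that side"; both bounds are inclusive.
    return [x for x in labels
            if (before is None or x >= before)
            and (after is None or x <= after)]


kept = truncate_labels([1, 2, 3, 4, 5], before=2, after=4)
print(kept)  # [2, 3, 4]
```

If the index is already known to be sorted, the explicit boolean mask on the index gives the same rows without this extra check.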

Parameters:

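before: date, str, int: Truncate all rows before this index value.
after: date, str, int: Truncate all rows after this index value.
axis: {0 or 'index', 1 or 'columns'}, optional: The axis to truncate. Truncates the index (rows) by default.
copy: bool, default True: Return a copy of the truncated section.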

Returns: type of caller: the truncated Series or DataFrame.

Examples:

>>> import pyspark.pandas as ps
>>> df = ps.DataFrame({'A': ['a', 'b', 'c', 'd', 'e'],
...                    'B': ['f', 'g', 'h', 'i', 'j'],
...                    'C': ['k', 'l', 'm', 'n', 'o']},
...                   index=[1, 2, 3, 4, 5])
>>> df
   A  B  C
1  a  f  k
2  b  g  l
3  c  h  m
4  d  i  n
5  e  j  o

>>> df.truncate(before=2, after=4)
   A  B  C
2  b  g  l
3  c  h  m
4  d  i  n

The columns of a DataFrame can be truncated.

>>> df.truncate(before="A", after="B", axis="columns")
   A  B
1  a  f
2  b  g
3  c  h
4  d  i
5  e  j

For a Series, only rows can be truncated.

>>> df['A'].truncate(before=2, after=4)
2    b
3    c
4    d
Name: A, dtype: object

A Series whose index is sorted integers.

>>> s = ps.Series([10, 20, 30, 40, 50, 60, 70],
...               index=[1, 2, 3, 4, 5, 6, 7])
>>> s
1    10
2    20
3    30
4    40
5    50
6    60
7    70
dtype: int64

>>> s.truncate(2, 5)
2    20
3    30
4    40
5    50
dtype: int64

A Series whose index is sorted strings.

>>> s = ps.Series([10, 20, 30, 40, 50, 60, 70],
...               index=['a', 'b', 'c', 'd', 'e', 'f', 'g'])
>>> s
a    10
b    20
c    30
d    40
e    50
f    60
g    70
dtype: int64

>>> s.truncate('b', 'e')
b    20
c    30
d    40
e    50
dtype: int64


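Note: This article was selected and compiled by 純淨天空 from the original English work pyspark.pandas.DataFrame.truncate at spark.apache.org. Unless otherwise stated, copyright in the original code belongs to the original author; do not reprint or copy this translation without permission or authorization.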
