Python pyspark DataFrame.first_valid_index: usage and code examples

This article briefly introduces the usage of pyspark.pandas.DataFrame.first_valid_index.

Usage: DataFrame.first_valid_index() → Union[int, float, bool, str, bytes, decimal.Decimal, datetime.date, datetime.datetime, None, Tuple[Union[int, float, bool, str, bytes, decimal.Decimal, datetime.date, datetime.datetime, None], …]]

Retrieves the index label of the first valid (non-missing) value.

Returns: a scalar, a tuple, or None.

Examples:

Supports DataFrame.

>>> import pandas as pd
>>> import pyspark.pandas as ps
>>> psdf = ps.DataFrame({'a': [None, 2, 3, 2],
...                     'b': [None, 2.0, 3.0, 1.0],
...                     'c': [None, 200, 400, 200]},
...                     index=['Q', 'W', 'E', 'R'])
>>> psdf
     a    b      c
Q  NaN  NaN    NaN
W  2.0  2.0  200.0
E  3.0  3.0  400.0
R  2.0  1.0  200.0

>>> psdf.first_valid_index()
'W'

Supports MultiIndex columns.

>>> psdf.columns = pd.MultiIndex.from_tuples([('a', 'x'), ('b', 'y'), ('c', 'z')])
>>> psdf
     a    b      c
     x    y      z
Q  NaN  NaN    NaN
W  2.0  2.0  200.0
E  3.0  3.0  400.0
R  2.0  1.0  200.0

>>> psdf.first_valid_index()
'W'

Supports Series.

>>> s = ps.Series([None, None, 3, 4, 5], index=[100, 200, 300, 400, 500])
>>> s
100    NaN
200    NaN
300    3.0
400    4.0
500    5.0
dtype: float64

>>> s.first_valid_index()
300
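
To make the idea of "first valid value" concrete, here is a minimal pure-Python sketch of the same lookup for a single column. It is only an illustration of the concept, not how pyspark.pandas implements it; the helper names `is_missing` and `first_valid_label` are hypothetical, and the sketch assumes that None and float NaN are the only values treated as missing.

```python
import math
from typing import Any, Hashable, Iterable, Optional, Tuple


def is_missing(value: Any) -> bool:
    """Assumption: None and float NaN are missing; everything else is valid."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def first_valid_label(pairs: Iterable[Tuple[Hashable, Any]]) -> Optional[Hashable]:
    """Return the label of the first non-missing value, or None if there is none."""
    for label, value in pairs:
        if not is_missing(value):
            return label
    return None


# Same labels and values as the Series example above.
labels = [100, 200, 300, 400, 500]
values = [None, None, 3, 4, 5]
result = first_valid_label(zip(labels, values))
print(result)  # 300
```

The sketch scans in index order and stops at the first non-missing entry, which is why the example above yields 300 rather than 100.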

Supports MultiIndex.

>>> midx = pd.MultiIndex([['lama', 'cow', 'falcon'],
...                       ['speed', 'weight', 'length']],
...                      [[0, 0, 0, 1, 1, 1, 2, 2, 2],
...                       [0, 1, 2, 0, 1, 2, 0, 1, 2]])
>>> s = ps.Series([None, None, None, None, 250, 1.5, 320, 1, 0.3], index=midx)
>>> s
lama    speed       NaN
        weight      NaN
        length      NaN
cow     speed       NaN
        weight    250.0
        length      1.5
falcon  speed     320.0
        weight      1.0
        length      0.3
dtype: float64

>>> s.first_valid_index()
('cow', 'weight')
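
As the examples show, the result can be a scalar label or a tuple (for a MultiIndex), and the return annotation also allows None, presumably when no valid value is found. Calling code may therefore want to handle all three shapes. The following is a small sketch under that assumption; `label_to_text` is a hypothetical helper, not part of pyspark.

```python
from typing import Any, Optional


def label_to_text(label: Optional[Any]) -> str:
    """Render a first_valid_index() result as text (hypothetical helper)."""
    if label is None:
        # Assumed meaning: no valid value was found.
        return "no valid value"
    if isinstance(label, tuple):
        # MultiIndex labels arrive as one tuple entry per index level.
        return " / ".join(str(part) for part in label)
    return str(label)


print(label_to_text('W'))
print(label_to_text(('cow', 'weight')))
print(label_to_text(None))
```

Checking for None before using the label avoids passing it on to indexing operations that expect a real label.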

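Note: This article was selected and compiled by 纯净天空 from the original English documentation at spark.apache.org, pyspark.pandas.DataFrame.first_valid_index. Unless otherwise stated, copyright of the original code belongs to the original authors; do not reproduce or copy this translation without permission or authorization.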
