Python pandas.DataFrame.info: usage and code examples

Usage: DataFrame.info(verbose=None, buf=None, max_cols=None, memory_usage=None, show_counts=None, null_counts=None)

Print a concise summary of a DataFrame.

This method prints information about a DataFrame, including the index dtype and columns, non-null values, and memory usage.

Parameters:

data: DataFrame

The DataFrame to print information about.

verbose: bool, optional

Whether to print the full summary. By default, the setting in pandas.options.display.max_info_columns is followed.

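buf: writable buffer, defaults to sys.stdout

Where to send the output. By default, the output is printed to sys.stdout. Pass a writable buffer if you need to process the output further.

max_cols: int, optional

When to switch from the verbose to the truncated output. If the DataFrame has more than max_cols columns, the truncated output is used. By default, the setting in pandas.options.display.max_info_columns is used.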
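As a rough illustration of the max_cols threshold, the sketch below captures the output of info for a four-column frame at two settings. The frame contents and the helper name info_text are made up for this example and are not part of pandas.

```python
import io

import pandas as pd

# Hypothetical four-column frame, used only for illustration.
df = pd.DataFrame(
    {"a": [1, 2], "b": [3.0, 4.0], "c": ["x", "y"], "d": [True, False]}
)


def info_text(frame, **kwargs):
    """Return the text DataFrame.info would print (hypothetical helper)."""
    buffer = io.StringIO()
    frame.info(buf=buffer, **kwargs)
    return buffer.getvalue()


# 4 columns do not exceed max_cols=4, so the per-column table is printed.
full = info_text(df, max_cols=4)

# 4 columns exceed max_cols=3, so only the truncated summary is printed.
truncated = info_text(df, max_cols=3)

print(full)
print(truncated)
```

Leaving verbose unset matters here: an explicit verbose=True or verbose=False takes precedence over the column-count check.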

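memory_usage: bool, str, optional

Specifies whether the total memory usage of the DataFrame elements (including the index) should be displayed. By default, this follows the pandas.options.display.memory_usage setting.

True always shows memory usage. False never shows memory usage. A value of 'deep' is equivalent to "True with deep introspection". Memory usage is shown in human-readable units (base-2 representation). Without deep introspection, the memory estimate is based on the column dtypes and the number of rows, assuming that values consume the same amount of memory for a given dtype. With deep memory introspection, the actual memory usage is calculated at the cost of computational resources.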
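The three settings can be compared by capturing the output and looking only at the memory line. This is a minimal sketch: the data and the helper name memory_line are assumptions for illustration, and the text column is forced to object dtype so that the shallow estimate is marked as a lower bound.

```python
import io

import pandas as pd

# Hypothetical frame; the text column is explicitly object dtype.
df = pd.DataFrame(
    {
        "int_col": [1, 2, 3],
        "text_col": pd.Series(["alpha", "beta", "gamma"], dtype=object),
    }
)


def memory_line(frame, memory_usage):
    """Return the 'memory usage' line of info(), or None if it is absent."""
    buffer = io.StringIO()
    frame.info(buf=buffer, memory_usage=memory_usage)
    lines = buffer.getvalue().splitlines()
    matches = [line for line in lines if line.startswith("memory usage")]
    return matches[0] if matches else None


shallow_line = memory_line(df, True)    # estimate, suffixed with "+"
deep_line = memory_line(df, "deep")     # measured, no "+" suffix
hidden_line = memory_line(df, False)    # no memory line at all

print(shallow_line)
print(deep_line)
print(hidden_line)
```

The "+" after the shallow figure signals that object columns may hold more memory than the estimate accounts for.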

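show_counts: bool, optional

Whether to show the non-null counts. By default, they are shown only if the DataFrame is smaller than pandas.options.display.max_info_rows and pandas.options.display.max_info_columns. A value of True always shows the counts, and False never shows them.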
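A small sketch of the two explicit settings, assuming a pandas version that accepts show_counts (1.2 or later). The frame and its missing values are invented for illustration.

```python
import io

import pandas as pd

# Hypothetical frame with one missing value in each column.
df = pd.DataFrame({"a": [1.0, None, 3.0], "b": ["x", "y", None]})

# show_counts=True: the table includes a Non-Null Count column.
with_counts = io.StringIO()
df.info(buf=with_counts, show_counts=True)

# show_counts=False: the table lists only column names and dtypes.
without_counts = io.StringIO()
df.info(buf=without_counts, show_counts=False)

print(with_counts.getvalue())
print(without_counts.getvalue())
```

Skipping the counts avoids a pass over the data to find nulls, which can be worthwhile on very large frames.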

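null_counts: bool, optional

Deprecated since pandas 1.2.0; use show_counts instead.

Returns:

None: This method prints a summary of a DataFrame and returns None.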

Examples:

>>> import numpy as np
>>> import pandas as pd
>>> int_values = [1, 2, 3, 4, 5]
>>> text_values = ['alpha', 'beta', 'gamma', 'delta', 'epsilon']
>>> float_values = [0.0, 0.25, 0.5, 0.75, 1.0]
>>> df = pd.DataFrame({"int_col": int_values, "text_col": text_values,
...                   "float_col": float_values})
>>> df
   int_col text_col  float_col
0        1    alpha       0.00
1        2     beta       0.25
2        3    gamma       0.50
3        4    delta       0.75
4        5  epsilon       1.00

Print information about all columns:

>>> df.info(verbose=True)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 3 columns):
 #   Column     Non-Null Count  Dtype
---  ------     --------------  -----
 0   int_col    5 non-null      int64
 1   text_col   5 non-null      object
 2   float_col  5 non-null      float64
dtypes: float64(1), int64(1), object(1)
memory usage: 248.0+ bytes

Print a summary of the column count and dtypes, but not the per-column information:

>>> df.info(verbose=False)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Columns: 3 entries, int_col to float_col
dtypes: float64(1), int64(1), object(1)
memory usage: 248.0+ bytes

Send the output of DataFrame.info to a buffer instead of sys.stdout, get the buffer content, and write it to a text file:

>>> import io
>>> buffer = io.StringIO()
>>> df.info(buf=buffer)
>>> s = buffer.getvalue()
>>> with open("df_info.txt", "w",
...           encoding="utf-8") as f:
...     f.write(s)
260
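The same buffer technique can route the summary somewhere other than a file. The sketch below sends it to the standard logging module line by line; the logger name df_summary and the sample frame are assumptions made for this example.

```python
import io
import logging

import pandas as pd

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("df_summary")  # hypothetical logger name

# Hypothetical frame, used only for illustration.
df = pd.DataFrame({"int_col": [1, 2, 3], "text_col": ["a", "b", "c"]})

# Capture the summary in memory instead of printing it.
buffer = io.StringIO()
df.info(buf=buffer)

# Emit one log record per summary line.
summary_lines = buffer.getvalue().splitlines()
for line in summary_lines:
    logger.info(line)
```

Logging each line separately keeps the records aligned with the logger's own prefix; joining the lines into one record would also work.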

The memory_usage parameter enables a deep introspection mode, which is especially useful for large DataFrames and for fine-tuning memory optimization:

>>> random_strings_array = np.random.choice(['a', 'b', 'c'], 10 ** 6)
>>> df = pd.DataFrame({
...     'column_1': np.random.choice(['a', 'b', 'c'], 10 ** 6),
...     'column_2': np.random.choice(['a', 'b', 'c'], 10 ** 6),
...     'column_3': np.random.choice(['a', 'b', 'c'], 10 ** 6)
... })
>>> df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000000 entries, 0 to 999999
Data columns (total 3 columns):
 #   Column    Non-Null Count    Dtype
---  ------    --------------    -----
 0   column_1  1000000 non-null  object
 1   column_2  1000000 non-null  object
 2   column_3  1000000 non-null  object
dtypes: object(3)
memory usage: 22.9+ MB

>>> df.info(memory_usage='deep')
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000000 entries, 0 to 999999
Data columns (total 3 columns):
 #   Column    Non-Null Count    Dtype
---  ------    --------------    -----
 0   column_1  1000000 non-null  object
 1   column_2  1000000 non-null  object
 2   column_3  1000000 non-null  object
dtypes: object(3)
memory usage: 165.9 MB
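One way to act on the deep figure is to convert low-cardinality string columns to the category dtype and measure again. The following sketch is not part of the original example: it uses a smaller, hypothetical frame, forces object dtype explicitly, and relies on DataFrame.memory_usage and DataFrame.astype to compare the two layouts.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical column of 100,000 strings drawn from three distinct values,
# stored with object dtype.
values = rng.choice(["a", "b", "c"], 10**5)
df = pd.DataFrame({"column_1": pd.Series(values, dtype=object)})

# Deep measurement of the object-dtype layout, in bytes.
object_bytes = df.memory_usage(deep=True).sum()

# The category dtype stores small integer codes plus the distinct values.
as_category = df.astype("category")
category_bytes = as_category.memory_usage(deep=True).sum()

print(object_bytes, category_bytes)
as_category.info(memory_usage="deep")
```

The saving depends on how few distinct values a column holds; a column of mostly unique strings would gain little from this conversion.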


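Note: This article was selected and compiled by 純淨天空 from the original English work pandas.DataFrame.info at pandas.pydata.org. Unless otherwise stated, copyright of the original code belongs to the original author. Please do not reproduce or copy this translation without permission or authorization.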
