

Python pandas.json_normalize usage and code examples

Usage:

pandas.json_normalize(data, record_path=None, meta=None, meta_prefix=None, record_prefix=None, errors='raise', sep='.', max_level=None)

Normalize semi-structured JSON data into a flat table.

Parameters

data : dict or list of dicts

Unserialized JSON objects.
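"Unserialized" here means already-parsed Python objects (dicts and lists), not a raw JSON string. As a minimal sketch, assuming the JSON arrives as text, it can be parsed with the standard-library json module first; the variable names and sample payload are illustrative only.

```python
import json

import pandas as pd

# Hypothetical raw JSON text, e.g. the body of an HTTP response.
raw = '[{"id": 1, "name": {"first": "Coleen"}}, {"id": 2, "name": {"first": "Mark"}}]'

# Parse the text into Python objects before normalizing.
records = json.loads(raw)

df = pd.json_normalize(records)
print(list(df.columns))  # ['id', 'name.first']
```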

record_path : str or list of str, default None

Path in each object to the list of records. If not passed, data is assumed to be an array of records.

meta : list of paths (str or list of str), default None

Fields to use as metadata for each record in the resulting table.
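Both record_path and the entries of meta accept either a single key or a list of keys describing a nested path. The sketch below uses made-up data (the keys school, info, and classes are illustrative) in which the record list sits one level down.

```python
import pandas as pd

# Illustrative data: the list of records is nested under "info" -> "classes".
data = [
    {"school": "North", "info": {"classes": [{"name": "math"}, {"name": "art"}]}},
    {"school": "South", "info": {"classes": [{"name": "music"}]}},
]

df = pd.json_normalize(
    data,
    record_path=["info", "classes"],  # path to the list of records
    meta=["school"],                  # repeated for every record of that object
)
print(list(df.columns))  # ['name', 'school']
```

Each element of a classes list becomes one row, and the school value of the enclosing object is copied onto it.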

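meta_prefix : str, default None

If given, this string is prepended to the names of the metadata columns. Nested meta paths are joined with sep into dotted names, e.g. ['foo', 'bar'] becomes foo.bar.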

record_prefix : str, default None

If given, this string is prepended to the names of the record columns, e.g. record_prefix='Prefix.' turns column 0 into Prefix.0.
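To make the two prefix parameters concrete, here is a small sketch with illustrative data. The prefix strings "county." and "meta." are arbitrary choices, not defaults.

```python
import pandas as pd

# Illustrative data: each state carries a list of county records.
data = [
    {"state": "Florida", "counties": [{"name": "Dade"}, {"name": "Broward"}]},
    {"state": "Ohio", "counties": [{"name": "Summit"}]},
]

df = pd.json_normalize(
    data,
    record_path="counties",
    meta=["state"],
    record_prefix="county.",  # applied to columns that come from the records
    meta_prefix="meta.",      # applied to columns that come from meta
)
print(list(df.columns))  # ['county.name', 'meta.state']
```

Prefixes are mainly useful when a record field and a metadata field would otherwise end up with the same column name.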

errors : {'raise', 'ignore'}, default 'raise'

Configures error handling.

  • 'ignore': will ignore KeyError if keys listed in meta are not always present.

  • 'raise': will raise KeyError if keys listed in meta are not always present.
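The difference between the two modes can be sketched with illustrative data in which one object lacks a meta key (the keys team, city, and players are made up for this example).

```python
import pandas as pd

# Illustrative data: the second object has no "city" key.
data = [
    {"team": "A", "city": "Oslo", "players": [{"name": "Ann"}, {"name": "Bo"}]},
    {"team": "B", "players": [{"name": "Cy"}]},
]

# errors="ignore": the missing meta value is filled with NaN.
df = pd.json_normalize(
    data, record_path="players", meta=["team", "city"], errors="ignore"
)
print(df)

# Default errors="raise": the missing meta key raises KeyError.
raised = False
try:
    pd.json_normalize(data, record_path="players", meta=["team", "city"])
except KeyError as exc:
    raised = True
    print("KeyError:", exc)
```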

sep : str, default '.'

Nested records will generate names separated by sep. For example, for sep='.', {'foo': {'bar': 0}} -> foo.bar.
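As a quick illustration, the sketch below swaps the default separator for an underscore, which can be convenient when the column names are later used as attribute names. The sample record is illustrative.

```python
import pandas as pd

data = [{"id": 1, "name": {"first": "Coleen", "last": "Volk"}}]

# Nested keys are joined with "_" instead of the default ".".
df = pd.json_normalize(data, sep="_")
print(list(df.columns))  # ['id', 'name_first', 'name_last']
```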

max_level : int, default None

Maximum number of levels (depth of dict) to normalize. If None, normalizes all levels.

Returns

frame : DataFrame
Normalize semi-structured JSON data into a flat table.

Examples

>>> import pandas as pd
>>> data = [
...     {"id": 1, "name": {"first": "Coleen", "last": "Volk"}},
...     {"name": {"given": "Mark", "family": "Regner"}},
...     {"id": 2, "name": "Faye Raker"},
... ]
>>> pd.json_normalize(data)
    id name.first name.last name.given name.family        name
0  1.0     Coleen      Volk        NaN         NaN         NaN
1  NaN        NaN       NaN       Mark      Regner         NaN
2  2.0        NaN       NaN        NaN         NaN  Faye Raker
>>> data = [
...     {
...         "id": 1,
...         "name": "Cole Volk",
...         "fitness": {"height": 130, "weight": 60},
...     },
...     {"name": "Mark Reg", "fitness": {"height": 130, "weight": 60}},
...     {
...         "id": 2,
...         "name": "Faye Raker",
...         "fitness": {"height": 130, "weight": 60},
...     },
... ]
>>> pd.json_normalize(data, max_level=0)
    id        name                        fitness
0  1.0   Cole Volk  {'height': 130, 'weight': 60}
1  NaN    Mark Reg  {'height': 130, 'weight': 60}
2  2.0  Faye Raker  {'height': 130, 'weight': 60}

Normalizes nested data up to level 1.

>>> data = [
...     {
...         "id": 1,
...         "name": "Cole Volk",
...         "fitness": {"height": 130, "weight": 60},
...     },
...     {"name": "Mark Reg", "fitness": {"height": 130, "weight": 60}},
...     {
...         "id": 2,
...         "name": "Faye Raker",
...         "fitness": {"height": 130, "weight": 60},
...     },
... ]
>>> pd.json_normalize(data, max_level=1)
    id        name  fitness.height  fitness.weight
0  1.0   Cole Volk             130              60
1  NaN    Mark Reg             130              60
2  2.0  Faye Raker             130              60
>>> data = [
...     {
...         "state": "Florida",
...         "shortname": "FL",
...         "info": {"governor": "Rick Scott"},
...         "counties": [
...             {"name": "Dade", "population": 12345},
...             {"name": "Broward", "population": 40000},
...             {"name": "Palm Beach", "population": 60000},
...         ],
...     },
...     {
...         "state": "Ohio",
...         "shortname": "OH",
...         "info": {"governor": "John Kasich"},
...         "counties": [
...             {"name": "Summit", "population": 1234},
...             {"name": "Cuyahoga", "population": 1337},
...         ],
...     },
... ]
>>> result = pd.json_normalize(
...     data, "counties", ["state", "shortname", ["info", "governor"]]
... )
>>> result
         name  population    state  shortname  info.governor
0        Dade       12345  Florida         FL     Rick Scott
1     Broward       40000  Florida         FL     Rick Scott
2  Palm Beach       60000  Florida         FL     Rick Scott
3      Summit        1234     Ohio         OH    John Kasich
4    Cuyahoga        1337     Ohio         OH    John Kasich
>>> data = {"A": [1, 2]}
>>> pd.json_normalize(data, "A", record_prefix="Prefix.")
    Prefix.0
0          1
1          2

Returns normalized data with columns prefixed with the given string.



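Note: This article was curated by 純淨天空 from the original English documentation for pandas.json_normalize at pandas.pydata.org. Unless otherwise stated, copyright in the original code belongs to the original authors; do not reproduce or copy this translation without permission.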