njson: efficient JSON-like data transformation tool
What is it?
njson provides fast transformation of JSON-like data between nested
parent-child form and flat label-value form, such as a pandas `Series`
with a `MultiIndex`.
Features
- Transformations are optimized for speed, so deeply nested JSON-like data with millions of data points can be parsed in seconds.
- Transformations are lazily computed properties, calculated only when first accessed. Initializing a `NestedJson` object does not require parsing the source data.
- `njson` is a pure-Python package that runs on JupyterLite, Pyodide, PyScript, Cloudflare Workers in Python, etc.
- Integrates with pandas for quick data transformation to and from pandas `Series` and `DataFrame`.
- Provides easy access to nested data at any level using parent keys.
- Provides an easy way to pipe data manipulation methods for editing nested JSON-like data.
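The lazy evaluation described in the feature list can be illustrated with a minimal sketch. It assumes the standard-library `functools.cached_property`; the class `LazyFlat` and its `calls` counter are hypothetical names made up for this illustration, the flattening is one level deep only, and none of this is njson's actual implementation.

```python
from functools import cached_property


class LazyFlat:
    """Hypothetical stand-in for a lazily evaluated wrapper (not njson code)."""

    def __init__(self, data):
        # Only a reference to the source data is stored; nothing is parsed here.
        self.data = data
        self.calls = 0  # counts how often the transformation actually runs

    @cached_property
    def flat_dict(self):
        # Runs on first access only; the result is then cached on the instance.
        self.calls += 1
        return {(key,): value for key, value in self.data.items()}


obj = LazyFlat({'a': 1, 'b': 2})
calls_after_init = obj.calls   # 0: construction did no transformation work
first = obj.flat_dict          # first access computes the flat dict
second = obj.flat_dict         # second access reuses the cached result
print(calls_after_init, obj.calls, first is second)  # 0 1 True
```

Caching the result on the instance keeps construction cheap and makes repeated access free, at the cost of holding the computed value in memory.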
Example usage
Flatten and unflatten nested json-like or flat-dict-like data.
>>> import njson as nj
>>> d = {'a': 1, 'b': [{}, {'d': 2}]}
>>> njd = nj.NestedJson(d)
>>> print(njd.data) # source data
>>> print(njd.flat_dict) # data as flat-dict with parent key tuples
>>> print(njd.data_series) # with even-length parent key tuples
>>> print(njd.data_series_bfill) # even-length key tuples aligned right
>>> print(njd.get('b')) # get data
>>> print(njd.get('b', 0)) # get nested data
>>> print(njd.get('b', 1)) # get data at any nesting level
>>> njd.parsed.nfd.str # json str of `.data` parsed as nested flat-dict
{'a': 1, 'b': [{}, {'d': 2}]}
{('a',): 1, ('b', 0): {}, ('b', 1, 'd'): 2}
{('a', '', ''): 1, ('b', 0, ''): {}, ('b', 1, 'd'): 2}
{('', '', 'a'): 1, ('', 'b', 0): {}, ('b', 1, 'd'): 2}
[{}, {'d': 2}]
{}
{'d': 2}
'{"a": 1, "b": [{}, {"d": 2}]}'
Note
- Flat-dict keys are parent key tuples of nested json-like data values.
- List value parent key labels in flat-dict are integer values.
- Empty dict and list values are not "flattened", as they have no nested values or parent keys for nested values.
- The `.data_series` property, a flat-dict with even-length key tuples, has a data structure similar to a pandas `Series` with a `MultiIndex`. As such, key tuple length normalization prepares data for efficient creation of pandas `Series` objects from deeply nested JSON object data.
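The rules in these notes can be sketched in plain Python. The functions `flatten` and `pad_keys` below are hypothetical helpers written for illustration, not njson's implementation (which is optimized for speed); they reproduce the flat-dict and the two even-length key layouts shown in the example output above.

```python
def flatten(data, parent=()):
    """Return a flat dict mapping parent-key tuples to leaf values."""
    if isinstance(data, dict) and data:
        items = data.items()
    elif isinstance(data, list) and data:
        items = enumerate(data)  # list positions become integer key labels
    else:
        # Leaf: scalars, empty dicts/lists, and tuples/sets are kept as values.
        return {parent: data}
    flat = {}
    for key, value in items:
        flat.update(flatten(value, parent + (key,)))
    return flat


def pad_keys(flat, fill='', align_right=False):
    """Pad every key tuple with `fill` to the length of the longest one."""
    width = max((len(key) for key in flat), default=0)
    padded = {}
    for key, value in flat.items():
        pad = (fill,) * (width - len(key))
        padded[pad + key if align_right else key + pad] = value
    return padded


flat = flatten({'a': 1, 'b': [{}, {'d': 2}]})
print(flat)                              # {('a',): 1, ('b', 0): {}, ('b', 1, 'd'): 2}
print(pad_keys(flat))                    # keys padded on the right
print(pad_keys(flat, align_right=True))  # keys padded on the left
```

Once all key tuples have the same length, they can serve directly as the rows of a multi-level index.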
Transforming Pandas Series, DataFrame to and from JSON-like data
>>> import pandas as pd
>>> import njson as nj
>>> d = {'a': 1, 'b': [{}, {'d': 2}]}
>>> njd = nj.NestedJson(d)
>>> ds = njd.to_data_series(into = pd.Series)
>>> print(ds) # pandas Series from NestedJson
>>> print(ds.unstack(level = [0])) # to DataFrame from Series
>>> print(ds.to_dict(into = nj.NestedJson).parsed.nds.data)
<class 'pandas.core.series.Series'>
a 1
b 0 {}
1 d 2
dtype: object
d
a 1 NaN
b 0 {} NaN
1 NaN 2
a 1
b 0 {}
1 d 2
dtype: object
{'a': 1, 'b': [{}, {'d': 2}]}
Note
- Pass the pandas `Series` class to `NestedJson`'s `.to_data_series` method to directly derive pandas `Series` data. Then access the `.parsed.nds.data` attribute to derive nested JSON-like data.
- Pass `NestedJson` to pandas `Series.to_dict(into = NestedJson)` to directly derive `NestedJson` from pandas `Series` data.
- Stacking and unstacking a pandas `Series` with `.unstack()` and `.stack()` lets you transform the nested JSON-like data to and from the convenient, table-like pandas `DataFrame` data structure. Note that (un)stacking by default sorts the level(s) in the resulting index or columns and can therefore alter the order of elements.
Manipulate nested json-like data.
>>> import njson as nj
>>> d = {'a': 1, 'b': [{}, {'d': 2}]}
>>> njd = nj.NestedJson(d)
>>> # source data
>>> print(njd.data)
>>> # add new nested data
>>> print(njd.setdefault([{'k1': 'v', 'k2': 'v', 'k3': 'v'}], 'c').data)
>>> # clear single key,value pair from nested dict
>>> print(njd.clear('c', 0, 'k1').data)
>>> # clear all values from nested dict
>>> print(njd.clear('c', 0).data)
>>> # remove existing sub-element
>>> print(njd.remove('b', 1).data)
>>> # add new or update existing sub-element or value
>>> print(njd.update({'d': 2}, 'b', 1).data)
>>> # replace existing sub-element or value
>>> print(njd.replace('replaced', 'b', 1).data)
>>> # if key does not exist, replace doesn't add new data
>>> print(njd.replace('notreplaced', 'b', 2).data)
>>> # add new list value at the start position index
>>> print(njd.setdefault('new list value at start', 'b', -3).data)
{'a': 1, 'b': [{}, {'d': 2}]}
{'a': 1, 'b': [{}, {'d': 2}], 'c': [{'k1': 'v', 'k2': 'v', 'k3': 'v'}]}
{'a': 1, 'b': [{}, {'d': 2}], 'c': [{'k2': 'v', 'k3': 'v'}]}
{'a': 1, 'b': [{}, {'d': 2}], 'c': [{}]}
{'a': 1, 'b': [{}], 'c': [{}]}
{'a': 1, 'b': [{}, {'d': 2}], 'c': [{}]}
{'a': 1, 'b': [{}, 'replaced'], 'c': [{}]}
{'a': 1, 'b': [{}, 'replaced'], 'c': [{}]}
{'a': 1, 'b': ['new list value at start', {}, 'replaced'], 'c': [{}]}
Note
- The data manipulation methods `.clear()`, `.remove()`, `.update()`, `.replace()`, and `.setdefault()` return the `NestedJson` object, so multiple data manipulation operations can be piped together.
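The chaining described in this note relies on each method returning the wrapper object itself. Below is a minimal sketch of that pattern; the class `Chainable` is a hypothetical illustration that handles top-level keys only and is not njson's code. Its value-first argument order simply mirrors the calls in the example above.

```python
class Chainable:
    """Hypothetical wrapper showing the return-self pattern (not njson code)."""

    def __init__(self, data):
        self.data = data

    def setdefault(self, value, key):
        # Add the value only if the key is missing, then return the wrapper.
        self.data.setdefault(key, value)
        return self

    def remove(self, key):
        # Drop the key if it is present, then return the wrapper.
        self.data.pop(key, None)
        return self


# Because every method returns the wrapper, calls can be chained in one expression.
result = Chainable({'a': 1}).setdefault([], 'b').remove('a').data
print(result)  # {'b': []}
```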
Read JSON files
>>> import njson as nj
>>> njd_from_file = nj.read_json('path-to.json')
>>> njd_from_str = nj.read_json_str('{"a": 1, "b": [{}, {"d": 2}]}')
>>> njd_from_str.flat_dict
{('a',): 1, ('b', 0): {}, ('b', 1, 'd'): 2}
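The reverse direction, from a flat-dict like the one above back to nested data, can be sketched as follows. The helpers `unflatten` and `_listify` are hypothetical names for this illustration and are not njson's implementation. The sketch assumes the data came from JSON, so integer keys always denote list positions; a genuine dict with keys `0..n-1` would also be turned into a list.

```python
import json


def _listify(node):
    """Convert dicts whose keys are exactly 0..n-1 into lists, recursively."""
    if not isinstance(node, dict) or not node:
        return node  # leaves, including empty dicts, are returned unchanged
    children = {key: _listify(value) for key, value in node.items()}
    int_keys = all(
        isinstance(key, int) and not isinstance(key, bool) for key in children
    )
    if int_keys and sorted(children) == list(range(len(children))):
        return [children[i] for i in range(len(children))]
    return children


def unflatten(flat):
    """Rebuild nested dicts and lists from a flat dict keyed by key tuples."""
    if list(flat) == [()]:
        return flat[()]  # the source was a single leaf value
    root = {}
    for keys, value in flat.items():
        node = root
        for key in keys[:-1]:
            node = node.setdefault(key, {})  # create parents on demand
        node[keys[-1]] = value
    return _listify(root)


flat = {('a',): 1, ('b', 0): {}, ('b', 1, 'd'): 2}
nested = unflatten(flat)
print(json.dumps(nested))  # {"a": 1, "b": [{}, {"d": 2}]}
```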
Changelog
v1.2
Features
- Add data manipulation methods `.clear()`, `.remove()`, `.update()`, `.replace()`, and `.setdefault()`.
Bug Fixes
- Do not flatten `tuple` and `set` objects, to avoid non-reversible data transformations via implicit data conversions from tuples to lists.
v1.1
Features
- Add methods `.pipe_flat_dict()`, `.pipe_data_series()`, and `.pipe_mapping()` to pipe flat-dict, data-series, and mapping, in addition to `.pipe_data()`.
- Add optimized `NestedJsonKeysView`, `NestedJsonValuesView`, and `NestedJsonItemsView` for the `.keys()`, `.values()`, and `.items()` methods.
v1.0
Features
- Initial implementation of fast JSON-like data transformation to and from nested parent-child and flat label-value data items.
- Flat-dict keys are parent key tuples of nested json-like data values.
- List value parent key labels in flat-dict are integer values.
- Empty dict and list values are not "flattened", as they have no nested values or parent keys for nested values.
- The `.data_series` property, a flat-dict with even-length key tuples, has a data structure similar to a pandas `Series` with a `MultiIndex`. As such, key tuple length normalization prepares data for efficient creation of pandas `Series` objects from deeply nested JSON object data.
- Pass the pandas `Series` class to `NestedJson`'s `.to_data_series` method to directly derive pandas `Series` data. Then access the `.parsed.nds.data` attribute to derive nested JSON-like data.
- Pass `NestedJson` to pandas `Series.to_dict(into = NestedJson)` to directly derive `NestedJson` from pandas `Series` data.
- Stacking and unstacking a pandas `Series` with `.unstack()` and `.stack()` lets you transform the nested JSON-like data to and from the convenient, table-like pandas `DataFrame` data structure. Note that (un)stacking by default sorts the level(s) in the resulting index or columns and can therefore alter the order of elements.