NTV-pandas : A semantic, compact and reversible JSON-pandas converter
For more information, see the user guide or the GitHub repository.
NTV-pandas is referenced in the pandas ecosystem.
Why an NTV-pandas converter?
pandas provides a JSON converter, but it has three limitations:
- the JSON-pandas converter takes into account only a few data types,
- the JSON-pandas converter is not always reversible (a conversion round trip may not return the original data),
- external data types (e.g. Table Schema types) are not included.
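The reversibility limitation can be seen with pandas alone. The sketch below is a minimal illustration, assuming a small hypothetical DataFrame with a single int32 column (the column name value32 is borrowed from the example later in this description). It uses only the standard pandas to_json and read_json functions.

```python
from io import StringIO

import pandas as pd

# Hypothetical one-column DataFrame with an explicit 32-bit integer dtype.
df = pd.DataFrame({'value32': pd.Series([12, 22, 32], dtype='int32')})

# Round trip through the built-in pandas JSON converter.
json_text = df.to_json()
round_trip = pd.read_json(StringIO(json_text))

# The values survive, but the dtype is inferred again on reading,
# so the round trip is not an exact inverse.
print(df['value32'].dtype, round_trip['value32'].dtype)
print(round_trip.equals(df))
```

The int32 dtype is not recorded in the JSON text, so the reader has to infer a dtype from the values, and the comparison with the original DataFrame fails.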
main features
The NTV-pandas converter uses the semantic NTV format to include a large set of data types in a JSON representation.
The converter integrates:
- all the pandas dtypes and the data types associated with a JSON representation,
- an always reversible conversion,
- full compatibility with the Table Schema specification.
NTV-pandas was originally developed in the json-NTV project.
example
In the example below, a DataFrame with multiple data types is converted to JSON (first to NTV format and then to Table Schema format).
The DataFrames resulting from these JSON conversions are identical to the initial DataFrame (reversibility).
With the existing pandas JSON interface, these conversions are not possible.
Data example:
In [1]: from pprint import pprint  # used below to display the JSON output
from datetime import date
from shapely.geometry import Point
import pandas as pd
import ntv_pandas as npd
In [2]: data = {'index': [100, 200, 300, 400, 500],
'dates::date': [date(1964,1,1), date(1985,2,5), date(2022,1,21), date(1964,1,1), date(1985,2,5)],
'value': [10, 10, 20, 20, 30],
'value32': pd.Series([12, 12, 22, 22, 32], dtype='int32'),
'res': [10, 20, 30, 10, 20],
'coord::point': [Point(1,2), Point(3,4), Point(5,6), Point(7,8), Point(3,4)],
'names': pd.Series(['john', 'eric', 'judith', 'mila', 'hector'], dtype='string'),
'unique': True }
In [3]: df = pd.DataFrame(data).set_index('index')
df.index.name = None
In [4]: df
Out[4]:
     dates::date  value  value32  res  coord::point   names  unique
100   1964-01-01     10       12   10   POINT (1 2)    john    True
200   1985-02-05     10       12   20   POINT (3 4)    eric    True
300   2022-01-21     20       22   30   POINT (5 6)  judith    True
400   1964-01-01     20       22   10   POINT (7 8)    mila    True
500   1985-02-05     30       32   20   POINT (3 4)  hector    True
JSON-NTV representation:
In [5]: df_to_json = df.npd.to_json()
pprint(df_to_json, compact=True, width=120, sort_dicts=False)
Out[5]: {':tab': {'index': [100, 200, 300, 400, 500],
'dates::date': ['1964-01-01', '1985-02-05', '2022-01-21', '1964-01-01', '1985-02-05'],
'value': [10, 10, 20, 20, 30],
'value32::int32': [12, 12, 22, 22, 32],
'res': [10, 20, 30, 10, 20],
'coord::point': [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [3.0, 4.0]],
'names::string': ['john', 'eric', 'judith', 'mila', 'hector'],
'unique': True}}
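In this representation the keys carry an optional type after a `::` separator (for example value32::int32), while keys without a suffix, such as value, leave the type implicit. As a rough illustration only, the hypothetical helper below separates the two parts with plain string handling. split_ntv_key is not part of ntv_pandas, and the actual NTV parsing rules may be richer than this.

```python
def split_ntv_key(key):
    """Split a 'name::type' key into (name, type); type is None if absent."""
    name, sep, ntv_type = key.partition('::')
    return name, (ntv_type if sep else None)


# Keys taken from the JSON-NTV output shown above.
keys = ['dates::date', 'value', 'value32::int32', 'coord::point', 'names::string']
for key in keys:
    print(split_ntv_key(key))
```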
Reversibility:
In [6]: print(npd.read_json(df_to_json).equals(df))
Out[6]: True
Table Schema representation:
In [7]: df_to_table = df.npd.to_json(table=True)
pprint(df_to_table['data'][0], sort_dicts=False)
Out[7]: {'index': 100,
'dates': '1964-01-01',
'value': 10,
'value32': 12,
'res': 10,
'coord': [1.0, 2.0],
'names': 'john',
'unique': True}
In [8]: pprint(df_to_table['schema'], sort_dicts=False)
Out[8]: {'fields': [{'name': 'index', 'type': 'integer'},
{'name': 'dates', 'type': 'date'},
{'name': 'value', 'type': 'integer'},
{'name': 'value32', 'type': 'integer', 'format': 'int32'},
{'name': 'res', 'type': 'integer'},
{'name': 'coord', 'type': 'geopoint', 'format': 'array'},
{'name': 'names', 'type': 'string'},
{'name': 'unique', 'type': 'boolean'}],
'primaryKey': ['index'],
'pandas_version': '1.4.0'}
Reversibility:
In [9]: print(npd.read_json(df_to_table).equals(df))
Out[9]: True
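For comparison, the schema that pandas itself builds for an int32 column can be inspected with pandas.io.json.build_table_schema. This is a small sketch on a hypothetical one-column DataFrame. It suggests why the extra 'format': 'int32' entry shown in Out[8] matters when the original dtype has to be recovered.

```python
import pandas as pd
from pandas.io.json import build_table_schema

# Hypothetical one-column DataFrame with an explicit 32-bit integer dtype.
df = pd.DataFrame({'value32': pd.Series([12, 22, 32], dtype='int32')})

# Build the Table Schema description with pandas only (no version key).
schema = build_table_schema(df, version=False)

# Pick the field describing the int32 column.
field = next(f for f in schema['fields'] if f['name'] == 'value32')
print(field)
```

The field produced by pandas only states that the column is an integer, so the 32-bit width cannot be restored from the schema alone.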