NTV-pandas : A semantic, compact and reversible JSON-pandas converter
Why an NTV-pandas converter?
pandas provides a JSON converter, but it has three limitations:
- the JSON-pandas converter takes into account only a few data types,
- the JSON-pandas converter is not always reversible (conversion round trip),
- external data types (e.g. TableSchema types) are not included.
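The reversibility limitation can be observed with pandas alone. The following is a minimal sketch, not taken from the project: it assumes a small illustrative DataFrame with an `int32` column and uses only pandas' built-in `to_json` and `read_json`.

```python
from io import StringIO

import pandas as pd

# A small frame with a non-default integer dtype (illustrative data).
df = pd.DataFrame({"value32": pd.Series([12, 22, 32], dtype="int32")})

# Round trip through pandas' built-in JSON interface.
json_text = df.to_json()
df_back = pd.read_json(StringIO(json_text))

# The JSON text does not record the original dtype,
# so the column comes back with a different one.
print("initial dtype:   ", df["value32"].dtype)
print("round-trip dtype:", df_back["value32"].dtype)
print("identical after round trip?", df_back.equals(df))
```

The values survive the round trip, but the dtype does not, which is why `equals` does not report the two DataFrames as identical.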
main features
The NTV-pandas converter uses the semantic NTV format to include a large set of data types in a JSON representation.
The converter integrates:
- all the pandas dtypes and the data types associated with a JSON representation,
- an always reversible conversion,
- full compatibility with the TableSchema specification.
NTV-pandas was initially developed in the json-NTV project.
example
In the example below, a DataFrame with several data types is converted to JSON.
The DataFrame resulting from this JSON is identical to the initial DataFrame (reversibility).
With the existing JSON interface, this conversion is not possible.
data example
In [1]: from shapely.geometry import Point
        from datetime import date
        from pprint import pprint
        import pandas as pd
        import ntv_pandas as npd
In [2]: data = {'index': [100, 200, 300, 400, 500, 600],
                'dates::date': pd.Series([date(1964,1,1), date(1985,2,5), date(2022,1,21), date(1964,1,1), date(1985,2,5), date(2022,1,21)]),
                'value': [10, 10, 20, 20, 30, 30],
                'value32': pd.Series([12, 12, 22, 22, 32, 32], dtype='int32'),
                'res': [10, 20, 30, 10, 20, 30],
                'coord::point': pd.Series([Point(1,2), Point(3,4), Point(5,6), Point(7,8), Point(3,4), Point(5,6)]),
                'names': pd.Series(['john', 'eric', 'judith', 'mila', 'hector', 'maria'], dtype='string'),
                'unique': True }
In [3]: df = pd.DataFrame(data).set_index('index')
In [4]: df
Out[4]:
       dates::date  value  value32  res  coord::point   names  unique
index
100     1964-01-01     10       12   10   POINT (1 2)    john    True
200     1985-02-05     10       12   20   POINT (3 4)    eric    True
300     2022-01-21     20       22   30   POINT (5 6)  judith    True
400     1964-01-01     20       22   10   POINT (7 8)    mila    True
500     1985-02-05     30       32   20   POINT (3 4)  hector    True
600     2022-01-21     30       32   30   POINT (5 6)   maria    True
JSON representation
In [5]: df_to_json = npd.to_json(df)
        pprint(df_to_json, compact=True, width=120)
Out[5]:
{':tab': {'coord::point': [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [3.0, 4.0], [5.0, 6.0]],
          'dates::date': ['1964-01-01', '1985-02-05', '2022-01-21', '1964-01-01', '1985-02-05', '2022-01-21'],
          'index': [100, 200, 300, 400, 500, 600],
          'names::string': ['john', 'eric', 'judith', 'mila', 'hector', 'maria'],
          'res': [10, 20, 30, 10, 20, 30],
          'unique': [True, True, True, True, True, True],
          'value': [10, 10, 20, 20, 30, 30],
          'value32::int32': [12, 12, 22, 22, 32, 32]}}
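Several keys in this output carry a type after a `::` separator (for example `value32::int32`), while others are plain names. The sketch below shows one way to read such keys. `split_ntv_key` is a hypothetical helper written for illustration, not a function of ntv-pandas, and the actual NTV naming rules may be richer than this simple split.

```python
def split_ntv_key(key):
    """Split 'name::type' into (name, type); type is None when no '::' is present.

    Hypothetical helper for illustration only, not part of ntv-pandas.
    """
    name, sep, ntv_type = key.partition("::")
    return (name, ntv_type if sep else None)


# Column keys copied from the JSON representation above.
keys = ["coord::point", "dates::date", "index", "names::string",
        "res", "unique", "value", "value32::int32"]

for key in keys:
    print(split_ntv_key(key))
```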
Reversibility
In [6]: df_from_json = npd.read_json(df_to_json)
        print('df created from JSON is equal to initial df ? ', df_from_json.equals(df))
Out[6]: df created from JSON is equal to initial df ?  True
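In the session above, the JSON representation is displayed as a Python dict. Assuming it holds only JSON-native values, as the displayed output suggests, it can be turned into text and back with the standard `json` module. The sketch below uses a shortened, hand-copied version of that structure rather than calling ntv-pandas, so it only illustrates the text round trip.

```python
import json

# Shortened copy of the structure shown in the JSON representation above
# (hand-written here; not produced by calling ntv-pandas).
payload = {":tab": {"index": [100, 200, 300],
                    "names::string": ["john", "eric", "judith"],
                    "value32::int32": [12, 12, 22]}}

text = json.dumps(payload)      # JSON text, ready to be stored or sent
restored = json.loads(text)     # back to a Python dict

print("text round trip preserved the structure?", restored == payload)
```

A dict restored this way would be the natural input for `npd.read_json`, although that step is not exercised in this sketch.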
installation
ntv-pandas itself is a pure Python package maintained on the ntv-pandas GitHub repository.
It can be installed with pip:
pip install ntv-pandas
dependencies:
- json-ntv: support for the NTV format,
- shapely: for the location data,
- pandas
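After installation, the presence of the package and its dependencies can be checked from Python. This is a minimal sketch using the standard library's `importlib.metadata`. `installed_version` is a hypothetical helper, and the distribution names are assumed to match the names listed above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for illustration only.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Distribution names assumed from the installation notes above.
for dist in ("ntv-pandas", "json-ntv", "shapely", "pandas"):
    print(dist, installed_version(dist) or "not installed")
```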
roadmap
- type extension : interval dtype and sparse format not yet included
- table schema : option equivalent to orient=table to develop
- null JSON data : strategy to define
- multidimensional : extension of the NTV format for multidimensional data (e.g. Xarray)
- pandas type : support for Series or DataFrame which include pandas data
- data consistency : controls between NTVtype and NTVvalue