NTV-pandas : A semantic, compact and reversible JSON-pandas converter

Why an NTV-pandas converter?

pandas provides a JSON converter, but it has three limitations:

  • it takes only a few data types into account,
  • it is not always reversible (a conversion round trip may not restore the original data),
  • external data types (e.g. TableSchema types) are not included.
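
The reversibility limitation can be observed with pandas alone. The sketch below is a minimal illustration, assuming a recent pandas version and a small hypothetical DataFrame (not the one used in the example further down): an int32 column goes through the pandas JSON round trip and comes back with a different dtype, so the restored DataFrame no longer equals the original.

```python
from io import StringIO

import pandas as pd

# Hypothetical one-column DataFrame with a non-default integer dtype.
df = pd.DataFrame({'value32': pd.Series([12, 22, 32], dtype='int32')})

# Round trip through the built-in pandas JSON interface.
json_text = df.to_json()
restored = pd.read_json(StringIO(json_text))

# The JSON text carries no dtype information, so pandas infers one on reading.
print('original dtype:', df['value32'].dtype)
print('restored dtype:', restored['value32'].dtype)

# DataFrame.equals requires matching dtypes as well as matching values.
print('round trip is lossless ?', restored.equals(df))
```

The values themselves survive the round trip; only the dtype information is lost, which is what a typed column name is meant to preserve.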

main features

The NTV-pandas converter uses the semantic NTV format to include a large set of data types in a JSON representation.

The converter integrates:

  • all pandas dtypes and the data types associated with a JSON representation,
  • an always reversible conversion,
  • full compatibility with the TableSchema specification.

example

In the example below, a DataFrame with several data types is converted to JSON.

The DataFrame resulting from this JSON is identical to the initial DataFrame (reversibility).

With the existing pandas JSON interface, this reversible conversion is not possible.

data example

In [1]: from pprint import pprint
        from shapely.geometry import Point
        from datetime import date
        import pandas as pd
        import ntv_pandas as npd

In [2]: data = {'index':           [100, 200, 300, 400, 500, 600],
                'dates::date':     pd.Series([date(1964,1,1), date(1985,2,5), date(2022,1,21), date(1964,1,1), date(1985,2,5), date(2022,1,21)]),
                'value':           [10, 10, 20, 20, 30, 30],
                'value32':         pd.Series([12, 12, 22, 22, 32, 32], dtype='int32'),
                'res':             [10, 20, 30, 10, 20, 30],
                'coord::point':    pd.Series([Point(1,2), Point(3,4), Point(5,6), Point(7,8), Point(3,4), Point(5,6)]),
                'names':           pd.Series(['john', 'eric', 'judith', 'mila', 'hector', 'maria'], dtype='string'),
                'unique':          True }

In [3]: df = pd.DataFrame(data).set_index('index')

In [4]: df
Out[4]:
              dates::date  value  value32  res coord::point   names  unique
        index
        100    1964-01-01     10       12   10  POINT (1 2)    john    True
        200    1985-02-05     10       12   20  POINT (3 4)    eric    True
        300    2022-01-21     20       22   30  POINT (5 6)  judith    True
        400    1964-01-01     20       22   10  POINT (7 8)    mila    True
        500    1985-02-05     30       32   20  POINT (3 4)  hector    True
        600    2022-01-21     30       32   30  POINT (5 6)   maria    True

JSON representation

In [5]: df_to_json = npd.to_json(df)
        pprint(df_to_json, compact=True, width=120)
Out[5]:
        {':tab': {'coord::point': [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [3.0, 4.0], [5.0, 6.0]],
                  'dates::date': ['1964-01-01', '1985-02-05', '2022-01-21', '1964-01-01', '1985-02-05', '2022-01-21'],
                  'index': [100, 200, 300, 400, 500, 600],
                  'names::string': ['john', 'eric', 'judith', 'mila', 'hector', 'maria'],
                  'res': [10, 20, 30, 10, 20, 30],
                  'unique': [True, True, True, True, True, True],
                  'value': [10, 10, 20, 20, 30, 30],
                  'value32::int32': [12, 12, 22, 22, 32, 32]}}
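
In the output above, some column names carry a type after a '::' separator (for instance 'value32::int32'), while others are bare names. The sketch below shows one way such keys could be split into a name and an optional type. split_ntv_name is a hypothetical helper written for this illustration, not part of ntv-pandas, and it only handles the '::' form seen in this example.

```python
def split_ntv_name(key):
    """Split 'name::type' into (name, type); type is None when no '::' is present.

    Hypothetical helper for illustration only, not the ntv-pandas API.
    """
    name, separator, ntv_type = key.partition('::')
    return name, (ntv_type if separator else None)


# Column keys copied from the JSON representation above.
keys = ['coord::point', 'dates::date', 'index', 'names::string',
        'res', 'unique', 'value', 'value32::int32']

for key in keys:
    name, ntv_type = split_ntv_name(key)
    print(f'{name:10} type: {ntv_type}')
```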

Reversibility

In [6]: df_from_json = npd.read_json(df_to_json)
        print('df created from JSON is equal to initial df ? ', df_from_json.equals(df))

Out[6]: df created from JSON is equal to initial df ?  True

installation

ntv-pandas itself is a pure Python package. It can be installed with pip:

pip install ntv-pandas

dependencies:

  • json-ntv: support for the NTV format,
  • shapely: support for location data.

roadmap

  • type extension : interval dtype and sparse format not yet included
  • table schema : option equivalent to orient=table to develop
  • null JSON data : strategy to define
  • multidimensional : extension of the NTV format for multidimensional data (e.g. Xarray)
  • pandas type : support for Series or DataFrame which include pandas data
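
For reference, the orient=table option mentioned in the roadmap is the existing pandas output that bundles a Table Schema description with the data. The minimal sketch below uses only pandas and a small hypothetical DataFrame to show the shape of that output; it does not use ntv-pandas and says nothing about how the planned option will be implemented.

```python
import json

import pandas as pd

# Hypothetical two-row DataFrame with a named index.
df = pd.DataFrame({'value': [10, 20]},
                  index=pd.Index([100, 200], name='index'))

# orient='table' produces a JSON object with a 'schema' part and a 'data' part.
table = json.loads(df.to_json(orient='table'))

print(sorted(table))                # top-level keys
print(table['schema']['fields'])    # one field description per column, index included
print(table['data'])                # one record per row
```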
