Library to handle JSON-stat data in Python using pandas DataFrames.

LATEST RELEASE 1.0: updating to version 1.0 is highly recommended, since it supports JSON-stat 2.0 and brings other improvements.

pyjstat

pyjstat is a Python library for manipulating JSON-stat formatted data. It reads and writes the JSON-stat [1] format using the DataFrame structures provided by the widely used pandas library [2]. The JSON-stat format is a simple, lightweight JSON format for data dissemination, currently at version 2.0. pyjstat is inspired by rjstat [3], a library by ajschumacher for reading and writing JSON-stat with R. Note that, as in the rjstat project, not all features are supported (i.e. not all metadata are converted). pyjstat is provided under the Apache License 2.0.

This library was first developed to work with Python 2.7. With some fixes (thanks to @andrekittredge), it now works with Python 3.4 too.

Support for JSON-stat 1.3 and 2.0 is provided. The JSON-stat 1.3 methods are now deprecated and should not be used in new code, but backwards compatibility has been preserved.

pyjstat 1.0 aims for simplicity. The JSON-stat classes (Dataset, Collection and Dimension) have been replicated and given simple read() and write() methods. Functionality covers common use cases, such as using a URL or a dataframe as the data source.

Methods for retrieving the value of a particular cube cell are taken from the JSON-stat JavaScript sample code. Thanks to @badosa for this.

Version 1.0 also uses the requests package internally, which should make downloading datasets easier.

Test coverage is 88% and Travis CI is used.

Finally, note that the new classes and methods are inspired by JSON-stat 2.0 and hence will not work with previous versions of JSON-stat. However, the older methods are still available and incorporate bug fixes and performance improvements.

Installation

pyjstat requires the pandas package. To install:

pip install pyjstat

Usage of version 1.0 and newer (with JSON-stat 2.0 support)

Dataset operations: read and write

Typical usage often looks like this:

from pyjstat import pyjstat

EXAMPLE_URL = 'http://json-stat.org/samples/galicia.json'

# read from json-stat
dataset = pyjstat.Dataset.read(EXAMPLE_URL)

# write to dataframe
df = dataset.write('dataframe')
print(df)

# read from dataframe
dataset_from_df = pyjstat.Dataset.read(df)

# write to json-stat
print(dataset_from_df.write())
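
To make the dataset-to-dataframe conversion more concrete, the sketch below flattens a JSON-stat 2.0 dataset into one row per cell using only the standard library. It is an illustration of the format, not pyjstat's implementation. The sample data and the helper name dataset_to_rows are made up for this example, and it assumes the simple case where each category "index" is an array and "value" is a dense array.

```python
from itertools import product

# A tiny hand-written JSON-stat 2.0 dataset (illustrative values only).
DATASET = {
    "version": "2.0",
    "class": "dataset",
    "id": ["sex", "year"],
    "size": [2, 2],
    "dimension": {
        "sex": {"category": {"index": ["M", "F"]}},
        "year": {"category": {"index": ["2010", "2011"]}},
    },
    "value": [10, 11, 20, 21],
}


def dataset_to_rows(dataset):
    """Return one dict per cell, pairing each category combination with its value.

    Hypothetical helper: assumes array-style category indexes and a dense
    "value" array.
    """
    ids = dataset["id"]
    categories = [dataset["dimension"][dim]["category"]["index"] for dim in ids]
    rows = []
    # product() varies the last dimension fastest, which matches the
    # row-major order of the "value" array.
    for combo, value in zip(product(*categories), dataset["value"]):
        row = dict(zip(ids, combo))
        row["value"] = value
        rows.append(row)
    return rows


for row in dataset_to_rows(DATASET):
    print(row)
```

A list of dicts like this can be passed directly to pandas.DataFrame() to obtain a table with one column per dimension plus a value column.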

Dataset operation: get_value

This operation mimics the JavaScript example on the JSON-stat website:

from pyjstat import pyjstat

EXAMPLE_URL = 'http://json-stat.org/samples/oecd.json'
query = [{'concept': 'UNR'}, {'area': 'US'}, {'year': '2010'}]

dataset = pyjstat.Dataset.read(EXAMPLE_URL)
print(dataset.get_value(query))
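
The idea behind a cell lookup of this kind can be sketched without the library: each dimension's category is turned into a position, and the positions are combined into a single row-major index into the flat value array. The code below is a minimal sketch under stated assumptions, not pyjstat's own code. The names cell_index and lookup_value and the sample numbers are hypothetical, and category indexes are assumed to be arrays.

```python
def cell_index(sizes, positions):
    """Convert per-dimension positions into a flat row-major index."""
    index = 0
    for size, position in zip(sizes, positions):
        if not 0 <= position < size:
            raise IndexError("position out of range for dimension size")
        # Shift what we have so far by this dimension's size, then add
        # the position within this dimension.
        index = index * size + position
    return index


def lookup_value(dataset, query):
    """Return the value for a query given as a list of {dimension: category} dicts."""
    wanted = {}
    for item in query:
        wanted.update(item)
    positions = []
    for dim in dataset["id"]:
        categories = dataset["dimension"][dim]["category"]["index"]
        positions.append(categories.index(wanted[dim]))
    return dataset["value"][cell_index(dataset["size"], positions)]


# Made-up sample with the same dimension names as the query above.
SAMPLE = {
    "id": ["concept", "area", "year"],
    "size": [1, 2, 3],
    "dimension": {
        "concept": {"category": {"index": ["UNR"]}},
        "area": {"category": {"index": ["AU", "US"]}},
        "year": {"category": {"index": ["2009", "2010", "2011"]}},
    },
    "value": [100, 101, 102, 200, 201, 202],
}

query = [{"concept": "UNR"}, {"area": "US"}, {"year": "2010"}]
print(lookup_value(SAMPLE, query))
```

Here the positions are (0, 1, 1), so the flat index is (0 * 2 + 1) * 3 + 1 = 4.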

Collection operations: read and write

A collection can be parsed into a list of dataframes:

from pyjstat import pyjstat

EXAMPLE_URL = 'http://json-stat.org/samples/collection.json'

collection = pyjstat.Collection.read(EXAMPLE_URL)
df_list = collection.write('dataframe_list')
print(df_list)
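
For orientation, a JSON-stat 2.0 collection lists its members under link.item, and each member may point to a dataset by href. The sketch below extracts those dataset links from an already-parsed collection. It is an illustration only: the sample collection, its example.org URLs, and the helper name dataset_links are hypothetical, and it does not show how pyjstat itself walks a collection.

```python
# A hand-written collection with hypothetical URLs.
COLLECTION = {
    "version": "2.0",
    "class": "collection",
    "label": "Hypothetical collection",
    "link": {
        "item": [
            {"class": "dataset", "href": "https://example.org/a.json", "label": "Dataset A"},
            {"class": "dataset", "href": "https://example.org/b.json", "label": "Dataset B"},
            {"class": "collection", "href": "https://example.org/more.json", "label": "Nested"},
        ]
    },
}


def dataset_links(collection):
    """Return (label, href) pairs for the items that reference a dataset."""
    items = collection.get("link", {}).get("item", [])
    return [
        (item.get("label"), item["href"])
        for item in items
        if item.get("class") == "dataset" and "href" in item
    ]


for label, href in dataset_links(COLLECTION):
    print(label, href)
```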

Example with UK ONS API

In the following example, the apikey parameter must be replaced with a real API key from ONS. This dataset corresponds to residence type by sex by age in London:

from pyjstat import pyjstat

EXAMPLE_URL = 'http://web.ons.gov.uk/ons/api/data/dataset/DC1104EW.json?'\
              'context=Census&jsontype=json-stat&apikey=yourapikey&'\
              'geog=2011HTWARDH&diff=&totals=false&'\
              'dm/2011HTWARDH=E12000007'
dataset = pyjstat.Dataset.read(EXAMPLE_URL)
df = dataset.write('dataframe')
print(df)
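
As an alternative to concatenating the query string by hand, the same URL can be assembled from a dict of parameters, which makes it easier to inject the API key at run time. This is a minimal sketch: the function name build_ons_url and the ONS_API_KEY environment variable are assumptions introduced for illustration, not part of pyjstat or the ONS API.

```python
import os
from urllib.parse import urlencode

BASE_URL = 'http://web.ons.gov.uk/ons/api/data/dataset/DC1104EW.json'


def build_ons_url(api_key):
    """Build the example ONS request URL for the given API key."""
    params = {
        'context': 'Census',
        'jsontype': 'json-stat',
        'apikey': api_key,
        'geog': '2011HTWARDH',
        'diff': '',
        'totals': 'false',
        'dm/2011HTWARDH': 'E12000007',
    }
    # safe='/' keeps the slash in the "dm/2011HTWARDH" key unescaped,
    # so the result matches the hand-written URL.
    return BASE_URL + '?' + urlencode(params, safe='/')


# ONS_API_KEY is a hypothetical environment variable name; the fallback
# is the same placeholder used in the example above.
url = build_ons_url(os.environ.get('ONS_API_KEY', 'yourapikey'))
print(url)
```

The resulting string can be passed to pyjstat.Dataset.read() in place of EXAMPLE_URL.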

More examples

More examples can be found in the examples directory, both for versions 1.3 and 2.0.

Usage of version 0.3.5 and older (with support for JSON-stat 1.3)

This syntax is deprecated and therefore no longer recommended.

From JSON-stat to pandas DataFrame

Typical usage often looks like this:

from pyjstat import pyjstat
import requests
from collections import OrderedDict

EXAMPLE_URL = 'http://json-stat.org/samples/us-labor.json'

data = requests.get(EXAMPLE_URL)
# parse with OrderedDict so the key order of the JSON document is preserved
results = pyjstat.from_json_stat(data.json(object_pairs_hook=OrderedDict))
print(results)

From pandas DataFrame to JSON-stat

The same data can be converted into JSON-stat, with some unavoidable metadata loss:

from pyjstat import pyjstat
import requests
from collections import OrderedDict
import json

EXAMPLE_URL = 'http://json-stat.org/samples/us-labor.json'

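data = requests.get(EXAMPLE_URL)
# parse with OrderedDict so the key order of the JSON document is preserved
results = pyjstat.from_json_stat(data.json(object_pairs_hook=OrderedDict))
print(results)
# convert back to JSON-stat and print it
print(json.dumps(json.loads(pyjstat.to_json_stat(results))))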

Changes

For a list of changes, fixes, improvements and new features, see CHANGES.txt.
