Generate Pandas data frames, and load and extract data, based on JSON Table Schema descriptors.
Installation
$ pip install datapackage
$ pip install jsontableschema-pandas
Quick start
You can load the resources of a data package as Pandas data frames with the datapackage.push_datapackage function:
>>> import datapackage
>>> data_url = 'http://data.okfn.org/data/core/country-list/datapackage.json'
>>> storage = datapackage.push_datapackage(data_url, 'pandas')
>>> storage.tables
['data___data']
>>> type(storage['data___data'])
<class 'pandas.core.frame.DataFrame'>
>>> storage['data___data'].head()
             Name Code
0     Afghanistan   AF
1   Åland Islands   AX
2         Albania   AL
3         Algeria   DZ
4  American Samoa   AS
It is also possible to pull an existing data frame into a data package:
>>> datapackage.pull_datapackage('/tmp/datapackage.json', 'country_list', 'pandas', tables={
...     'data': storage['data___data'],
... })
Storage
Tabular Storage
The package implements the Tabular Storage interface.
A storage instance can be created this way:
>>> from jsontableschema_pandas import Storage
>>> storage = Storage()
Storage works as a container for Pandas data frames. You can define a new data frame inside the storage using the storage.create method:
>>> storage.create('data', {
...     'primaryKey': 'id',
...     'fields': [
...         {'name': 'id', 'type': 'integer'},
...         {'name': 'comment', 'type': 'string'},
...     ]
... })
>>> storage.tables
['data']
>>> storage['data'].shape
(0, 0)
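To make the role of the descriptor more concrete, here is a minimal sketch in plain Python of how rows could be checked against it before they are written. The helper cast_row and the CASTS mapping are hypothetical and are not part of jsontableschema_pandas; only the integer and string field types used above are covered.

```python
# Hypothetical mapping from JSON Table Schema field types to Python casts.
# Only the two types used in the descriptor above are covered.
CASTS = {'integer': int, 'string': str}

descriptor = {
    'primaryKey': 'id',
    'fields': [
        {'name': 'id', 'type': 'integer'},
        {'name': 'comment', 'type': 'string'},
    ],
}


def cast_row(descriptor, row):
    """Cast one row (a sequence of values) to the types named in the descriptor."""
    fields = descriptor['fields']
    if len(row) != len(fields):
        raise ValueError('expected %d values, got %d' % (len(fields), len(row)))
    return tuple(CASTS[field['type']](value) for field, value in zip(fields, row))


# The string '1' is cast to the integer 1 because the id field is an integer.
rows = [cast_row(descriptor, row) for row in [('1', 'a'), (2, 'b')]]
```

Rows prepared this way have one value per field, in the order the fields are declared, which is the shape of the tuples used in the storage.write examples in this README.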
Use storage.write to populate the data frame with data:
>>> storage.write('data', [(1, 'a'), (2, 'b')])
>>> storage['data']
   comment
id
1        a
2        b
You can also use tabulator to populate the data frame from an external data file:
>>> import tabulator
>>> with tabulator.topen('data/comments.csv', with_headers=True) as data:
...     storage.write('data', data)
>>> storage['data']
   comment
id
1        a
2        b
1     good
As you can see, subsequent writes simply append new rows after the existing ones.
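The repeated id of 1 in the output above shows that appended rows are not checked against the primary key. The following sketch reproduces that behaviour with plain pandas rather than the storage API, and shows one possible way to keep only the latest row per id. The frames first and second are illustrative, and treating a write as a pandas.concat is an assumption, not a description of the package internals.

```python
import pandas as pd

# First batch of rows, indexed by the primary key "id".
first = pd.DataFrame({'id': [1, 2], 'comment': ['a', 'b']}).set_index('id')

# Second batch that reuses id 1, as in the comments.csv example.
second = pd.DataFrame({'id': [1], 'comment': ['good']}).set_index('id')

# Appending keeps every row, so the index is no longer unique.
combined = pd.concat([first, second])
has_duplicates = not combined.index.is_unique

# One way to deduplicate: keep the last row seen for each id.
deduplicated = combined[~combined.index.duplicated(keep='last')]
```

Whether to keep the first or the last row for a repeated key depends on the data; keep='last' is used here only as an example.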
Contributing
Please read the contribution guidelines.
Thanks!