A Python- and pandas-powered client for Statistical Data and Metadata eXchange
Project description
pandaSDMX is an Apache 2.0-licensed Python package that aims to become the most intuitive and versatile tool for retrieving statistical data and metadata disseminated in SDMX format. It should work with all data providers that support SDMX 2.1. So far it has been tested with the European statistics office (Eurostat) and the European Central Bank (ECB), each of which provides hundreds of thousands of time series.
While pandaSDMX can be extended to cater to any output format, it currently supports only pandas, the gold standard of data analysis in Python. From pandas, however, you can export your data to Excel and friends.
Main features
intuitive API inspired by requests
support for many SDMX features including
generic datasets
data structure definitions, code lists and concept schemes
dataflow definitions and content-constraints
categorisations and category schemes
pythonic representation of the SDMX information model
find dataflows by name or description in multiple languages if available
validate column selections against code lists and content-constraints, if available, when requesting datasets
read and write SDMX messages to and from local files
configurable HTTP connections
support for requests-cache, which allows SDMX messages to be cached in memory, MongoDB, Redis or SQLite
writer transforming SDMX generic datasets into multi-indexed pandas DataFrames or Series of observations and attributes
extensible through custom readers and writers for alternative input and output formats of data and metadata
For further details, including extensive code examples, see the documentation.
Recent changes
v0.3.1 (2015-10-04)
This release fixes a few bugs which caused crashes in some situations.
v0.3.0 (2015-09-22)
support for requests-cache, which allows SDMX messages to be cached in memory, MongoDB, Redis or SQLite
pythonic selection of series when requesting a dataset: Request.get allows the key keyword argument in a data request to be a dict mapping dimension names to values. In this case, the dataflow definition, data structure definition and content-constraint are downloaded on the fly, cached in memory and used to validate the keys. The dotted key string needed to construct the URL is generated automatically.
The Response.write method takes a parse_time keyword argument. Set it to False to skip parsing of dates, times and time periods, since exotic formats may cause crashes.
The Request.get method takes a memcache keyword argument. If set to a string, the received Response instance is stored under that string in the dict Request.cache for later use. This is useful when, e.g., a DSD is needed multiple times to validate keys.
fixed base URL for Eurostat
major refactorings to enhance code maintainability
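The dict-to-key translation described above for Request.get can be pictured with a small stand-alone sketch. It is not pandaSDMX's actual implementation: the helper name `build_dotted_key`, the dimension order and the code lists are assumptions made up for illustration. The sketch relies only on the SDMX REST convention that dimension values are joined by dots in the order given by the data structure definition, that several codes for one dimension are joined with `+`, and that an omitted dimension is left empty as a wildcard.

```python
def build_dotted_key(dimension_order, selection, codelists=None):
    """Turn a dict of dimension -> code(s) into a dotted SDMX key string.

    Hypothetical helper for illustration; not part of pandaSDMX.
    """
    unknown = set(selection) - set(dimension_order)
    if unknown:
        raise ValueError("unknown dimensions: %s" % sorted(unknown))

    parts = []
    for dim in dimension_order:
        value = selection.get(dim, "")
        # Accept 'USD', 'USD+JPY' or an iterable such as ['USD', 'JPY'].
        codes = value.split("+") if isinstance(value, str) else list(value)
        codes = [code for code in codes if code]
        # Optional validation against the allowed codes of this dimension.
        if codelists is not None and dim in codelists:
            invalid = [code for code in codes if code not in codelists[dim]]
            if invalid:
                raise ValueError("invalid codes for %s: %s" % (dim, invalid))
        # An empty part means "all values" for that dimension.
        parts.append("+".join(codes))
    return ".".join(parts)


# Illustrative dimension order and code lists (assumed, not downloaded).
dims = ["FREQ", "CURRENCY", "CURRENCY_DENOM", "EXR_TYPE", "EXR_SUFFIX"]
allowed = {"FREQ": {"A", "M", "D"}, "CURRENCY": {"USD", "JPY", "GBP"}}

key = build_dotted_key(dims, {"FREQ": "M", "CURRENCY": ["USD", "JPY"]}, allowed)
print(key)  # M.USD+JPY...
```

Validating before the URL is built means a misspelled code fails locally with a clear message instead of producing an empty or rejected server response.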
v0.2.2 (2015-05-19)
Make HTTP connections configurable by exposing the requests.get API through the pandasdmx.api.Request constructor. Hence, proxy servers, authentication information and other HTTP-related parameters consumed by requests.get can be set for a Request instance and used in subsequent requests. The configuration is exposed as a dict through the Request.client.config attribute.
Responses now have an http_headers attribute containing the headers returned by the SDMX server
v0.2.1 (2015-04-22)
API: add support for zip archives received from an SDMX server. This is common for large datasets from Eurostat.
automatically fetch a remote resource if the footer of a received message specifies a URL. This pattern is common for large datasets from Eurostat.
allow passing a file-like object to api.Request.get()
enhance documentation
make pandas writer parse more time period formats and increase its performance
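As a rough illustration of the zip handling mentioned in this release, the sketch below shows one way a client could unwrap a response body that may or may not be a zip archive. It uses only the standard library; the function name `unwrap_sdmx_payload` and the assumption that the first archive member holds the SDMX message are hypothetical and do not describe how pandaSDMX does it internally.

```python
import io
import zipfile


def unwrap_sdmx_payload(raw):
    """Return the SDMX message bytes, unzipping the body if necessary.

    Hypothetical helper for illustration; not part of pandaSDMX.
    """
    buffer = io.BytesIO(raw)
    if not zipfile.is_zipfile(buffer):
        # Plain (unzipped) message: hand it back unchanged.
        return raw
    buffer.seek(0)
    with zipfile.ZipFile(buffer) as archive:
        names = archive.namelist()
        if not names:
            raise ValueError("zip archive contains no files")
        # Assumption: the first member is the SDMX message.
        return archive.read(names[0])


# Build a tiny in-memory archive to exercise both code paths.
message = b"<message>dummy SDMX content</message>"
zipped = io.BytesIO()
with zipfile.ZipFile(zipped, "w") as archive:
    archive.writestr("data.xml", message)

print(unwrap_sdmx_payload(zipped.getvalue()) == message)  # True
print(unwrap_sdmx_payload(message) == message)            # True
```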
v0.2.0 (2015-04-13)
This version is a quantum leap. The whole project has been redesigned and rewritten from scratch to provide robust support for many SDMX features. The new architecture is centered around a pythonic representation of the SDMX information model. It is extensible through readers and writers for alternative input and output formats. Export to pandas has been dramatically improved. Sphinx documentation has been added.
v0.1 (2014-09)
Initial release
Hashes for pandaSDMX-0.4.0-py2.py3-none-any.whl

Algorithm | Hash digest
---|---
SHA256 | 0da7acd564d04123b8dc5930890a50649d06430c3fbb38faf6cfcc6bc8315953
MD5 | 5e72537c7c153fdd0d12d16ca27e6f7f
BLAKE2b-256 | 6761d3d62ac41e3ee2470c3be2c6e3a7607995d498e83c57c33ccf32810d8605

Hashes for pandaSDMX-0.3.1-py2.py3-none-any.whl

Algorithm | Hash digest
---|---
SHA256 | d6b83d2deb40a345b56ff09f99c4cc566fc36c8fe35a0716f2d0f2d593175e38
MD5 | b18c464ce0c6794a5017195f3ab44696
BLAKE2b-256 | bff2b2af006ac9439977ea1a1f0cdcf58579f3099511b6980cab3ef939148106
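
A downloaded wheel can be checked against the SHA256 digests listed above with a few lines of standard-library Python. This is a minimal sketch; the helper names `sha256_of_file` and `matches_published_digest` are made up for illustration, and the local file path is whatever your download produced.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Compute the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Return True if the file's SHA256 equals the published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, calling `matches_published_digest` with the path to a downloaded pandaSDMX-0.4.0-py2.py3-none-any.whl and the SHA256 value from the first table should return True for an intact download.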