Tabulates structured data into a mergeable CSV format
OpenTabulate is a Python package designed to organize, tabulate, and process structured data. It currently aims to be a data processing framework for the Linkable Open Data Environment, an exploratory project by the Data Exploration and Integration Lab (DEIL) within the Centre for Special Business Projects (CSBP) at Statistics Canada. OpenTabulate offers:
- automated data retrieval,
- a systematic way of organizing and retrieving data using source files (inspired by OpenAddresses),
- tabulation of data into a standardized CSV format that is suitable for merging and linkage,
- various methods to process data, including address parsing, cleaning and reformatting.
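As a rough illustration of the tabulation idea in the list above, the sketch below maps columns of a raw CSV onto a fixed output schema using a small source description. It uses only the Python standard library. The `source` dictionary layout, the column names, and the `tabulate_rows` helper are assumptions made for this example; they are not OpenTabulate's actual source file format or API.

```python
import csv
import io

# Hypothetical source description: output column -> input column.
# This layout is an assumption for illustration, not the real source file format.
source = {
    "name": "Business Name",
    "city": "Municipality",
}


def tabulate_rows(raw_csv_text, mapping):
    """Return CSV text whose columns follow the order of `mapping`."""
    reader = csv.DictReader(io.StringIO(raw_csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(mapping), lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Missing input columns become empty strings; values are trimmed.
        writer.writerow(
            {dst: (row.get(src) or "").strip() for dst, src in mapping.items()}
        )
    return out.getvalue()


raw = "Business Name,Municipality,Phone\n Acme Ltd ,Ottawa,555-0100\n"
result = tabulate_rows(raw, source)
print(result)
```

Because every input is rewritten to the same column names and order, the outputs of several datasets can be concatenated or joined without per-dataset handling.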
OpenTabulate's API defines several classes and methods that, when put together, form a processing pipeline. This reduces the processing procedure to a sequence of class method invocations. Moreover, this design makes it easy to add, modify, and remove code.
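The pipeline-of-method-calls design described above can be sketched in a few lines. The class below is a hypothetical stand-in written for this example; `RecordPipeline` and its method names are assumptions and do not correspond to OpenTabulate's real classes.

```python
class RecordPipeline:
    """Hypothetical processing class; not part of OpenTabulate's API."""

    def __init__(self, records):
        # Copy each record so the caller's data is left untouched.
        self.records = [dict(r) for r in records]

    def clean(self):
        # Collapse runs of whitespace and trim both ends of every value.
        self.records = [
            {key: " ".join(value.split()) for key, value in r.items()}
            for r in self.records
        ]
        return self

    def uppercase(self, field):
        # Reformat one field to upper case.
        for r in self.records:
            r[field] = r[field].upper()
        return self

    def drop_blank(self, field):
        # Remove records whose value for `field` is empty.
        self.records = [r for r in self.records if r[field]]
        return self


records = [
    {"name": "  Acme   Ltd ", "postal_code": "k1a 0t6"},
    {"name": "   ", "postal_code": "k1a 0t7"},
]

# Each step returns the pipeline object, so processing reads as a sequence of calls.
processed = (
    RecordPipeline(records)
    .clean()
    .uppercase("postal_code")
    .drop_blank("name")
    .records
)
print(processed)
```

Since each stage is a separate method, a step can be added, replaced, or dropped from the chain without touching the others.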
A basic setup of the data processing software will at least require
To process sources with the full_addr key, an address parser is required. Below are the currently supported address parsers.
Be sure to have a Python package manager that can access the Python Package Index. For example, if you have
$ pip install opentabulate
After installing the package, initialize the OpenTabulate environment by running
$ opentab --initialize
~/.opentabulate and other subdirectories.
Please see our GitHub wiki.
You can post questions, enhancement requests, and bugs in Issues.