
Tabulates structured data into a mergeable CSV format


OpenTabulate

OpenTabulate is a Python package designed to organize, tabulate, and process structured data. It currently aims to be a data processing framework for the Linkable Open Data Environment, an exploratory project by the Data Exploration and Integration Lab (DEIL) within the Center for Special Business Projects (CSBP) at Statistics Canada. OpenTabulate offers:

  • automated data retrieval,
  • a systematic way of organizing and retrieving data using source files (inspired by OpenAddresses),
  • tabulation of data into a standardized CSV format that is suitable for merging and linkage,
  • various methods to process data, including address parsing, cleaning and reformatting.
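
As a rough illustration of the tabulation step listed above, the sketch below maps the fields of a raw record onto a fixed set of output columns and writes them as CSV. The column names, the FIELD_MAP dictionary, and the tabulate function are hypothetical; they are not OpenTabulate's source file schema or API.

```python
import csv
import io

# Hypothetical standardized columns; not OpenTabulate's real schema.
STANDARD_COLUMNS = ["name", "street", "city"]

# Hypothetical mapping from each standardized column to the raw dataset's field name.
FIELD_MAP = {"name": "BusinessName", "street": "Addr1", "city": "Municipality"}


def tabulate(records, field_map=FIELD_MAP, columns=STANDARD_COLUMNS):
    """Return CSV text with one row per record, using the standardized columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for record in records:
        # Missing raw fields become empty strings so every row has the same shape.
        writer.writerow(
            {col: record.get(field_map[col], "").strip() for col in columns}
        )
    return buffer.getvalue()


raw = [{"BusinessName": " Cafe Uno ", "Addr1": "1 Main St", "Municipality": "Ottawa"}]
print(tabulate(raw))
```

Because every dataset is written with the same header, outputs from different sources can be concatenated or joined without per-source handling.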

OpenTabulate's API defines several classes and methods that, when put together, form a processing pipeline. Processing then reduces to a sequence of class method invocations, and this design makes it easy to add, modify, and remove code.
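
To make the idea of a pipeline built from method invocations concrete, here is a minimal sketch. The Source and Processor classes and their methods are invented for illustration and do not correspond to OpenTabulate's actual classes; see the project documentation for the real API.

```python
class Source:
    """Hypothetical container for raw rows (not an OpenTabulate class)."""

    def __init__(self, rows):
        self.rows = rows


class Processor:
    """Hypothetical pipeline; each method is one stage and returns self."""

    def __init__(self, source):
        # Copy each row so the stages never modify the source data.
        self.rows = [dict(row) for row in source.rows]

    def clean(self):
        # Strip surrounding whitespace from every value.
        self.rows = [{k: v.strip() for k, v in row.items()} for row in self.rows]
        return self

    def uppercase(self, key):
        # Reformat a single column to upper case.
        for row in self.rows:
            row[key] = row[key].upper()
        return self

    def result(self):
        return self.rows


source = Source([{"name": " Cafe Uno ", "city": " Ottawa"}])
# The pipeline is just a sequence of method calls; a stage is added or
# removed by adding or removing one call in the chain.
result = Processor(source).clean().uppercase("city").result()
print(result)
```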

Requirements

A basic setup of the data processing software requires at least

  • Python (version 3.5+)
  • requests, compatible with your version of Python
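
The two requirements above can be checked before installation with a short script. This is only a convenience sketch using the standard library; the missing_requirements helper is a hypothetical name and is not part of OpenTabulate.

```python
import importlib.util
import sys


def missing_requirements(min_python=(3, 5)):
    """Return a list of human-readable problems; an empty list means all is well."""
    problems = []
    if sys.version_info < min_python:
        problems.append("Python {}.{}+ is required".format(*min_python))
    # find_spec returns None when the package cannot be imported.
    if importlib.util.find_spec("requests") is None:
        problems.append("the 'requests' package is not installed")
    return problems


print(missing_requirements() or "basic requirements satisfied")
```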

Optional dependencies

To process sources with the full_addr key, an address parser is required. Below are the currently supported address parsers.

Installation

Be sure to have a Python package manager that can access the Python Package Index. For example, if you have pip, run

$ pip install opentabulate

After installing the package, initialize the OpenTabulate environment by running

$ opentab --initialize

which creates ~/.opentabulate and other subdirectories.
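
To confirm that initialization worked, one option is to look for the directory from Python. The sketch below only assumes the ~/.opentabulate location stated above; it makes no assumption about the names of the subdirectories and simply lists whatever it finds. The opentabulate_root helper is a hypothetical name.

```python
from pathlib import Path


def opentabulate_root(home=None):
    """Return the expected environment directory under the given home directory."""
    base = Path(home) if home is not None else Path.home()
    return base / ".opentabulate"


root = opentabulate_root()
if root.is_dir():
    subdirs = sorted(p.name for p in root.iterdir() if p.is_dir())
    print("found", root, "with subdirectories:", subdirs)
else:
    print(root, "not found; has `opentab --initialize` been run?")
```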

Documentation

Please see our GitHub wiki.

Issues

You can post questions, enhancement requests, and bugs in Issues.

Download files

Source distribution: opentabulate-1.0.0b1.tar.gz (19.3 kB)

Built distribution (Python 3): opentabulate-1.0.0b1-py3-none-any.whl (19.8 kB)
