Tabulates structured data into a mergeable CSV format

Project description

OpenTabulate

OpenTabulate is a Python package designed to organize, tabulate, and process structured data. It currently aims to be a data processing framework for the Linkable Open Data Environment, an exploratory project by the Data Exploration and Integration Lab (DEIL) within the Center for Special Business Projects (CSBP) at Statistics Canada. OpenTabulate offers

  • automated data retrieval,
  • a systematic way of organizing and retrieving data using source files (inspired by OpenAddresses),
  • tabulation of data into a standardized CSV format that is suitable for merging and linkage,
  • various methods to process data, including address parsing, cleaning and reformatting.

OpenTabulate's API defines several classes and methods that, when put together, form a processing pipeline. This reduces processing to a sequence of class method invocations. Moreover, this design makes it easy to add, modify and remove code.
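To illustrate the general idea of a pipeline built from chained method calls, here is a minimal sketch using only the Python standard library. It does not use OpenTabulate's API: the class TabulationSketch, its methods, and the column names are all hypothetical and chosen only for this example.

```python
import csv
import io


class TabulationSketch:
    """Hypothetical stand-in; NOT a class from the opentabulate package."""

    def __init__(self, raw_csv, column_map):
        self.raw_csv = raw_csv        # source data as CSV text
        self.column_map = column_map  # standardized name -> source column name
        self.rows = []

    def parse(self):
        # Read the source rows into dictionaries keyed by the source header.
        self.rows = list(csv.DictReader(io.StringIO(self.raw_csv)))
        return self

    def tabulate(self):
        # Keep only the mapped columns, renamed to the standardized names.
        self.rows = [
            {std: (row.get(src) or "") for std, src in self.column_map.items()}
            for row in self.rows
        ]
        return self

    def clean(self):
        # Collapse runs of whitespace and strip the ends of every value.
        self.rows = [
            {key: " ".join(value.split()) for key, value in row.items()}
            for row in self.rows
        ]
        return self

    def to_csv(self):
        # Write the rows out with the standardized header.
        out = io.StringIO()
        writer = csv.DictWriter(
            out, fieldnames=list(self.column_map), lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(self.rows)
        return out.getvalue()


# Made-up sample input, for illustration only.
raw = "Name,Street Address\n  Main   Library ,1 Example  St\n"
result = (
    TabulationSketch(raw, {"name": "Name", "address": "Street Address"})
    .parse()
    .tabulate()
    .clean()
    .to_csv()
)
```

Because each step returns the object itself, a step can be added, swapped or removed without touching the others, which is the kind of flexibility the design described above aims for.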

Requirements

A basic setup of the data processing software requires at least

  • Python (version 3.5+)
  • requests, compatible with your version of Python
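The two requirements above can be checked before installing. The following is a small sketch using only the standard library; the helper name missing_requirements is an assumption made for this example and is not part of OpenTabulate.

```python
import importlib.util
import sys


def missing_requirements(min_version=(3, 5), modules=("requests",)):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    # Compare the running interpreter against the minimum version.
    if sys.version_info < min_version:
        problems.append("Python {}.{}+ is required".format(*min_version))
    for name in modules:
        # find_spec returns None when a top-level module cannot be found.
        if importlib.util.find_spec(name) is None:
            problems.append("module '{}' is not installed".format(name))
    return problems


if __name__ == "__main__":
    for problem in missing_requirements():
        print(problem)
```

Running the script prints one line per unmet requirement and nothing if both are satisfied.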

Optional dependencies

To process sources with the full_addr key, an address parser is required. Below are the currently supported address parsers.

Installation

Be sure to have a Python package manager that can access the Python Package Index. For example, if you have pip, run

$ pip install opentabulate

After installing the package, initialize the OpenTabulate environment by running

$ opentab --initialize

which creates ~/.opentabulate and its subdirectories.
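To confirm that initialization succeeded, one can check that the directory exists. This sketch relies only on the path stated above (~/.opentabulate); the function name environment_initialized is hypothetical, and the layout of the subdirectories is not assumed.

```python
from pathlib import Path


def environment_initialized(home=None):
    """Return True if the .opentabulate directory exists under the home directory."""
    # Allow an explicit home directory so the check can be tried on any path.
    base = Path(home) if home is not None else Path.home()
    return (base / ".opentabulate").is_dir()


if __name__ == "__main__":
    if environment_initialized():
        print("OpenTabulate environment found")
    else:
        print("Not initialized; run: opentab --initialize")
```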

Documentation

Please see our GitHub wiki.

Issues

You can post questions, enhancement requests, and bugs in Issues.

Download files

Files for opentabulate, version 1.0.0b1

  • opentabulate-1.0.0b1-py3-none-any.whl (19.8 kB), wheel, Python version py3
  • opentabulate-1.0.0b1.tar.gz (19.3 kB), source distribution
