Python framework for transforming tabulated data with visual relationships into tidy data

Tidychef

Tidychef is a Python framework that enables "data extraction for humans" via simple, beginner-friendly Python "recipes". It aims to let users easily and repeatably transform tabulated data sources that rely on visual relationships (data that is only human readable) into simple, machine-readable "tidy data".

That is, it allows you to reliably turn something that looks like this:

[image not shown]

into something that looks like this:

[image not shown]

Note: image cropped for reasons of practicality.
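
To make the idea of "visual relationships" concrete, here is a minimal sketch in plain Python. It does not use tidychef and is not its API; the sample table, the column names and the `to_tidy` helper are all hypothetical, chosen only for illustration. It assumes a layout where years run across the top row and regions run down the first column, so each value gets its meaning from its position.

```python
import csv
import io

# Hypothetical human-readable layout (an assumption for illustration):
# years run across the top row, regions run down the first column.
RAW = """\
,2021,2022
North,10,12
South,7,9
"""


def to_tidy(text):
    """Return one record per observation cell, with its labels made explicit."""
    rows = list(csv.reader(io.StringIO(text)))
    years = rows[0][1:]  # labels positioned above the observations
    records = []
    for row in rows[1:]:
        region = row[0]  # label positioned to the left of the observations
        for year, value in zip(years, row[1:]):
            records.append({"region": region, "year": year, "value": int(value)})
    return records


tidy = to_tidy(RAW)
for record in tidy:
    print(record)
```

Each output record carries its own region and year, so it no longer depends on where the cell sat in the original grid. That property is what makes the result "tidy" and machine readable.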

Currently supported input formats are xls, xlsx, ods and csv, though users can add further formats relatively easily and without any change to the codebase.

Tidychef is designed to let even novice Python users and analysts become productive quickly, but it also has an advanced feature set and is designed to be readily extended. Adding new sources of tabulated data, your own use-case-specific methods and filters, domain-specific validation and so on are all possible and documented in detail.

In-depth training material, examples and technical documentation can be found here.

Installation

pip install tidychef
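
One way to confirm the install worked is to ask the standard library for the installed distribution's version. This is a small sketch using `importlib.metadata` (Python 3.8+); the `installed_version` helper is a hypothetical name introduced here, not part of tidychef.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string for a distribution, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if tidychef is installed, otherwise None.
print(installed_version("tidychef"))
```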

Acknowledgements

Tidychef is directly inspired by the Python package databaker, created by The Sensible Code Company in partnership with the United Kingdom's Office for National Statistics.

While I liked databaker and successfully worked with it on multiple ETL projects over the course of almost a decade, this software should be considered the culmination of that work and the lessons learned from that time.

Download files

Source Distribution

tidychef-0.1.3.tar.gz (53.9 kB)

Built Distribution

tidychef-0.1.3-py3-none-any.whl (85.1 kB)
