
a build tool for data

Project description

make for your data.

An automation tool for data manipulation.

Inspired by Open Refine.

The general principles behind databuild are:

  • Low entry barrier

  • Easy to install

  • Easy to grasp

  • Extensible

Databuild can be useful for scenarios such as:

  • Documenting data transformations in your infoviz project

  • Automating data processing in a declarative way

Installation

Install databuild:

$ pip install databuild

Quickstart

For more details, see the Extended Documentation.

$ data-build.py buildfile.yaml

buildfile.yaml contains a list of operations to be performed on data. Think of it as a script for a spreadsheet.

An example build file:

- operation: sheets.import_data
  description: Importing data from csv file
  params:
    sheet: dataset1
    format: csv
    filename: dataset1.csv
    skip_last_lines: 1
- operation: columns.add_column
  description: Calculate the gender ratio
  params:
    sheet: dataset1
    name: Gender Ratio
    expression:
      language: python
      content: "return float(row['Totale Maschi']) / float(row['Totale Femmine'])"
- operation: sheets.export_data
  description: save the data
  params:
    sheet: dataset1
    format: csv
    filename: dataset2.csv
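
To make the expression step concrete, here is a minimal Python sketch of what the content string in the second operation computes for a single row. It assumes each row behaves like a dictionary mapping column names to string values, as a CSV reader would produce. The function name gender_ratio and the sample values are hypothetical, and this does not describe how databuild evaluates expressions internally.

```python
def gender_ratio(row):
    # Same body as the `content` string in the build file above.
    return float(row['Totale Maschi']) / float(row['Totale Femmine'])


# Hypothetical row; values are strings, as they would be when read from CSV.
sample_row = {'Totale Maschi': '50', 'Totale Femmine': '100'}
ratio = gender_ratio(sample_row)
print(ratio)
```

Note that the expression raises ZeroDivisionError for a row where 'Totale Femmine' is 0, and ValueError where either cell is not numeric, so the input data may need cleaning first.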

JSON build files are also supported. databuild guesses the format from the file extension.
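
As a rough illustration, the YAML example can be expressed as JSON by serializing the same list of operations. The sketch below uses only Python's standard library and assumes the JSON layout mirrors the YAML layout one-to-one, which is not spelled out here. The guess_format helper is hypothetical: it shows one way extension-based detection could work, not databuild's actual code, and treating .yml as YAML is an assumption.

```python
import json
import os

# The same operations as the YAML example, as plain Python data.
operations = [
    {
        "operation": "sheets.import_data",
        "description": "Importing data from csv file",
        "params": {
            "sheet": "dataset1",
            "format": "csv",
            "filename": "dataset1.csv",
            "skip_last_lines": 1,
        },
    },
    {
        "operation": "columns.add_column",
        "description": "Calculate the gender ratio",
        "params": {
            "sheet": "dataset1",
            "name": "Gender Ratio",
            "expression": {
                "language": "python",
                "content": "return float(row['Totale Maschi']) / float(row['Totale Femmine'])",
            },
        },
    },
    {
        "operation": "sheets.export_data",
        "description": "save the data",
        "params": {
            "sheet": "dataset1",
            "format": "csv",
            "filename": "dataset2.csv",
        },
    },
]

# Candidate content for a JSON build file (e.g. buildfile.json).
json_text = json.dumps(operations, indent=2)
print(json_text)


def guess_format(filename):
    """Hypothetical helper: pick a build file format from the extension."""
    extension = os.path.splitext(filename)[1].lower()
    formats = {".json": "json", ".yaml": "yaml", ".yml": "yaml"}
    if extension not in formats:
        raise ValueError("unrecognized build file extension: %r" % extension)
    return formats[extension]
```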

License

Licensed under the BSD 3-Clause license.


