Playbooks for data. Open, process and save table-based data.

Data Playbook

Automate repetitive tasks on table-based data. Includes a range of input and output tasks, and can be extended with custom modules.

Install: pip install dataplaybook

Use: dataplaybook playbook.yaml

Playbook structure

The playbook.yaml file allows you to load additional modules (containing tasks) and specify the tasks to execute in sequence, with all their parameters.

The tasks typically follow a read, process, write structure.

Example YAML (note that YAML is case sensitive):

modules: [list, of, modules]

tasks:
  - task: *name
    tables: # List of tables used by this task
    target: # Target table name of this function
    debug*: True/False # default: False
    # task specific properties, refer to each task
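
To make the read, process, write sequence concrete, here is a minimal Python sketch of how a list of task entries could be run in order against a shared set of named tables. It is not dataplaybook's implementation: the `run_tasks` function, the `TASKS` registry, and the `read_rows` and `keep_if` task names are all hypothetical, and tables are assumed to be lists of dicts.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]
Table = List[Row]


def read_rows(tables: List[Table], rows: List[Row]) -> Table:
    """Hypothetical 'read' task: return literal rows as a new table."""
    return [dict(row) for row in rows]


def keep_if(tables: List[Table], column: str, value: Any) -> Table:
    """Hypothetical 'process' task: keep rows where column equals value."""
    (source,) = tables  # this task expects exactly one input table
    return [row for row in source if row.get(column) == value]


# Hypothetical registry mapping task names to plain Python functions.
TASKS: Dict[str, Callable[..., Table]] = {
    "read_rows": read_rows,
    "keep_if": keep_if,
}


def run_tasks(playbook: List[Dict[str, Any]]) -> Dict[str, Table]:
    """Run each task entry in sequence, storing results under 'target'."""
    tables: Dict[str, Table] = {}
    for entry in playbook:
        params = dict(entry)  # copy so the playbook itself is not mutated
        func = TASKS[params.pop("task")]
        inputs = [tables[name] for name in params.pop("tables", [])]
        target = params.pop("target", None)
        result = func(inputs, **params)  # remaining keys are task-specific
        if target is not None:
            tables[target] = result
    return tables


# The same shape as the YAML above, written as Python data.
playbook = [
    {
        "task": "read_rows",
        "target": "people",
        "rows": [{"name": "Ann", "team": "a"}, {"name": "Bob", "team": "b"}],
    },
    {
        "task": "keep_if",
        "tables": ["people"],
        "target": "team_a",
        "column": "team",
        "value": "a",
    },
]
result = run_tasks(playbook)
```

In this sketch, `tables` names the inputs a task reads, `target` names the table it produces, and every other key is passed through as a task-specific property.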

Tasks

Tasks are implemented as simple Python functions, grouped into modules in the dataplaybook/tasks folder.
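
As an illustration of what "a simple Python function" can mean here, the sketch below de-duplicates a table on a set of key columns. The name `unique_rows`, its signature, and the list-of-dicts table representation are assumptions made for this example; refer to the modules in dataplaybook/tasks for the actual task signatures.

```python
from typing import Any, Dict, List, Sequence

Row = Dict[str, Any]


def unique_rows(table: List[Row], key_columns: Sequence[str]) -> List[Row]:
    """Hypothetical task: keep the first row seen for each key combination.

    Assumes the values in the key columns are hashable.
    """
    seen = set()
    result = []
    for row in table:
        key = tuple(row.get(col) for col in key_columns)
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result


rows = [
    {"id": 1, "name": "Ann"},
    {"id": 2, "name": "Bob"},
    {"id": 1, "name": "Ann (duplicate)"},
]
deduped = unique_rows(rows, ["id"])
```

The function takes a table and its parameters and returns a new table, which is the shape a read, process, write pipeline needs from each step.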

Default tasks

  • build_lookup
  • combine
  • drop
  • extend
  • filter
  • fuzzy_match (pip install fuzzywuzzy)
  • print
  • replace
  • unique
  • vlookup

Module io_xlsx (loaded by default)

  • read_excel
  • write_excel

Module io_misc (loaded by default)

  • read_csv
  • read_tab_delim
  • read_text_regex
  • wget
  • write_csv

Module io_mongo (uses pymongo)

  • read_mongo
  • write_mongo
  • columns_to_list
  • list_to_columns

Module io_pdf (requires pdftotext)

  • read_pdf_pages
  • read_pdf_files

Module io_xml

Module ietf

Module gis

Module fnb

YAML tags

  • !re <expression>: a regular expression
  • !es <search string>: search for a file using Everything

Install development version

  1. Clone the repo
  2. pip install -e <path>
