Playbooks for data. Open, process and save table-based data.
Automate repetitive tasks on table-based data. Includes various input and output tasks, and can be extended with custom modules.
pip install dataplaybook
The playbook.yaml file allows you to load additional modules (containing tasks) and specify the tasks to execute in sequence, with all their parameters.
The tasks to perform typically follow the structure of read, process, write.
Example YAML (note that YAML is case-sensitive):

    modules: [list, of, modules]

    tasks:
      - task: *name
        tables: # List of tables used by this task
        target: # Target table name of this function
        debug*: True/False # default: False
        # task specific properties, refer to each task

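To make the read, process, write sequence more concrete, here is a minimal sketch of how a playbook of this shape could be executed step by step. It is not dataplaybook's actual implementation: `run_playbook`, the `TASKS` registry and the `filter_rows` task are hypothetical names, tables are assumed to be lists of dicts, and the playbook is given as an already-parsed dict rather than loaded from YAML.

```python
from typing import Any, Callable, Dict, List

Table = List[Dict[str, Any]]


def filter_rows(tables: List[Table], column: str, value: Any) -> Table:
    """Hypothetical task: keep rows of the first input table where column == value."""
    return [row for row in tables[0] if row.get(column) == value]


# Hypothetical registry mapping task names to functions.
TASKS: Dict[str, Callable[..., Table]] = {"filter_rows": filter_rows}


def run_playbook(playbook: Dict[str, Any], tables: Dict[str, Table]) -> Dict[str, Table]:
    """Run each task in order, reading from `tables` and writing to `target`."""
    for step in playbook["tasks"]:
        step = dict(step)  # copy so the playbook itself is not mutated
        func = TASKS[step.pop("task")]
        inputs = [tables[name] for name in step.pop("tables", [])]
        target = step.pop("target", None)
        step.pop("debug", None)  # ignored in this sketch
        result = func(inputs, **step)  # remaining keys are task-specific properties
        if target is not None:
            tables[target] = result
    return tables


playbook = {
    "tasks": [
        {
            "task": "filter_rows",
            "tables": ["people"],
            "target": "adults",
            "column": "group",
            "value": "adult",
        },
    ]
}
tables = {
    "people": [
        {"name": "Ann", "group": "adult"},
        {"name": "Bo", "group": "child"},
    ]
}
result = run_playbook(playbook, tables)
```

In this sketch, every key other than `task`, `tables`, `target` and `debug` is passed to the task function as a keyword argument, which mirrors the "task specific properties" comment in the example.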
Tasks are implemented as simple Python functions and the modules can be found in the dataplaybook/tasks folder.
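As an illustration of what "a simple Python function" can look like for table data, here is a small sketch. The name `drop_blank_rows`, its parameters and the list-of-dicts table representation are assumptions made for this example. It does not show how dataplaybook registers or validates tasks, so check the modules in dataplaybook/tasks for the real conventions.

```python
from typing import Any, Dict, Iterable, Iterator, List

Row = Dict[str, Any]


def drop_blank_rows(table: Iterable[Row], columns: List[str]) -> Iterator[Row]:
    """Hypothetical task: yield only rows where every listed column has a value."""
    for row in table:
        # A missing key, None, or an empty string all count as blank here.
        if all(row.get(col) not in (None, "") for col in columns):
            yield row


orders = [
    {"id": 1, "customer": "Ann"},
    {"id": 2, "customer": ""},
    {"id": 3},
]
cleaned = list(drop_blank_rows(orders, columns=["customer"]))
```

Writing the task as a generator keeps it usable on large tables, since rows are processed one at a time instead of being copied into a new list up front.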
- fuzzy_match (pip install fuzzywuzzy)
- io_xlsx (loaded by default)
- io_misc (loaded by default)
- io_mongo (uses pymongo)
- io_pdf (requires pdftotext)
- !re <expression>: Regular expression
- !es <search string>: Search a file using Everything
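The effect of a tag like `!re` can be sketched as follows. In a real YAML loader the tag is handled by the parser itself. This example only mimics the result on a plain string, and `resolve_tag` is a hypothetical helper, not part of dataplaybook. It assumes the tagged value ends up as a compiled regular expression.

```python
import re
from typing import Pattern, Union


def resolve_tag(value: str) -> Union[str, Pattern[str]]:
    """Hypothetical helper: turn '!re <expression>' into a compiled pattern."""
    prefix = "!re "
    if value.startswith(prefix):
        return re.compile(value[len(prefix):])
    return value  # untagged values are returned unchanged


pattern = resolve_tag(r"!re ^INV-\d+$")
plain = resolve_tag("INV-001")

# Use the compiled pattern to select matching values.
candidates = ["INV-001", "inv-2", "INV-17x"]
matches = [s for s in candidates if pattern.match(s)]
```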
Install development version
- Clone the repo
- pip install -e <path>