File data orchestrator
Housekeeper
Store, tag, fetch, and archive files with ease 🗃
Housekeeper is a tool that aims to provide:
- a backend for storing versioned bundles of files
- different interfaces (Python, CLI, REST) for fetching files based on tags
- a way to backup and retrieve bundles from long-term storage
Installation
Housekeeper is written in Python 3.6+ and is available on the Python Package Index (PyPI).
poetry install
If you would like to install the latest development version:
git clone https://github.com/Clinical-Genomics/housekeeper
cd housekeeper
poetry install
Contributing
Housekeeper uses the GitHub flow branching model, as described in our development manual.
Documentation
Command line interface
Config file
Housekeeper supports a basic YAML config. The following options are supported:
---
database: mysql+pymysql://userName:passWord@domain.com/database
root: /path/to/root/dir
The root option sets the directory under which Housekeeper stores the files it manages.
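The database value in the example looks like a SQLAlchemy-style connection URI. As a minimal sketch of what it encodes, the snippet below splits such a URI into its parts using only the Python standard library. The helper name describe_database_uri is hypothetical and is not part of Housekeeper; this is an illustration, not how Housekeeper itself reads its config.

```python
from urllib.parse import urlsplit


def describe_database_uri(uri: str) -> dict:
    """Split a database URI into its components (hypothetical helper)."""
    parts = urlsplit(uri)
    # The scheme has the form "dialect+driver", e.g. "mysql+pymysql".
    dialect, _, driver = parts.scheme.partition("+")
    return {
        "dialect": dialect,
        "driver": driver or None,
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "database": parts.path.lstrip("/"),
    }


info = describe_database_uri("mysql+pymysql://userName:passWord@domain.com/database")
print(info["dialect"], info["driver"], info["host"], info["database"])
```

For a SQLite URI such as sqlite:///hk.sqlite3 there is no host or driver, and the database entry is the file name.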
Command: init
Set up (or reset) the database. This creates all the tables in the database. To reset an existing database, use the --reset option.
housekeeper --database "sqlite:///hk.sqlite3" init
Success! New tables: bundle, file, file_tag_link, tag, version
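If you want to double-check the result for a SQLite database, the tables can be listed directly with Python's built-in sqlite3 module. This is a minimal sketch, assuming the hk.sqlite3 file from the example above; list_tables is a hypothetical helper, not a Housekeeper function.

```python
import sqlite3


def list_tables(db_path: str) -> list:
    """Return the sorted names of all user tables in a SQLite database file."""
    connection = sqlite3.connect(db_path)
    try:
        rows = connection.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        ).fetchall()
    finally:
        connection.close()
    return [name for (name,) in rows]


# Usage (assumes `init` was run against sqlite:///hk.sqlite3 in this directory):
# print(list_tables("hk.sqlite3"))
```

Note that sqlite3.connect creates an empty database file if the path does not exist, in which case the function returns an empty list.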
Command: include
Include (hard-link) all files of an existing bundle version into Housekeeper and the root path.
housekeeper include myBundle
This only works if the bundle has a single version that can be imported. To import a specific version of a bundle, use the --version option.
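To illustrate what hard-linking a file into the root path means, here is a minimal sketch using os.link. The root/bundle/version directory layout and the include_file helper are assumptions made for this example; the layout Housekeeper actually uses may differ.

```python
import os
from pathlib import Path


def include_file(source: Path, root: Path, bundle: str, version: str) -> Path:
    """Hard-link `source` into a hypothetical root/bundle/version directory."""
    target_dir = root / bundle / version
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name
    # os.link adds a second directory entry for the same data on disk, so no
    # bytes are copied. Both paths must live on the same filesystem.
    os.link(source, target)
    return target
```

Because the original and the included path point at the same data, removing one of them leaves the other intact.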
Command: delete files
Delete files that are no longer on disk like this:
housekeeper delete files --tag fastq --notondisk
Remove all bam files before a certain date:
housekeeper delete files --tag bam --before 2017-06-15
Remove fastq files from a flowcell:
housekeeper delete files --tag fastq --tag H0HKKALXX
It'll always ask for confirmation, unless you add --yes:
housekeeper delete files --bundle sillyfish --yes
If you provide neither --tag nor --bundle, which would amount to deleting everything, the command refuses to run.
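The selection and safety rules described above can be sketched in plain Python. This is not Housekeeper's implementation: the record layout (dicts with bundle, tags, and created keys), the function names, and the choice to combine multiple tags with AND (inferred from the flowcell example) are all assumptions made for illustration.

```python
from datetime import datetime


def select_files(files, tags=(), bundle=None, before=None):
    """Pick the records that match every given filter (hypothetical helper)."""
    if not tags and bundle is None:
        # Mirrors the guard above: never select everything by accident.
        raise ValueError("Provide at least one tag or a bundle.")
    wanted = set(tags)
    selected = []
    for record in files:
        if not wanted.issubset(record["tags"]):
            continue  # assumption: a file must carry all requested tags
        if bundle is not None and record["bundle"] != bundle:
            continue
        if before is not None and record["created"] >= before:
            continue
        selected.append(record)
    return selected


def confirm_delete(selected, yes=False, ask=input):
    """Return True if deletion may proceed; skip the prompt when yes=True."""
    if yes:
        return True
    answer = ask(f"Delete {len(selected)} file(s)? [y/N] ")
    return answer.strip().lower() in ("y", "yes")
```

Passing the prompt function in as ask keeps the confirmation step easy to exercise without a terminal.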