File data orchestrator
Housekeeper
Store, tag, fetch, and archive files with ease 🗃
Housekeeper is a tool that aims to provide:
- a backend for storing versioned bundles of files
- different interfaces (Python, CLI, REST) for fetching files based on tags
- a way to backup and retrieve bundles from long-term storage
Installation
Housekeeper is written in Python 3.6+ and is available on the Python Package Index (PyPI).
pip install housekeeper
If you would like to install the latest development version:
git clone https://github.com/Clinical-Genomics/housekeeper
cd housekeeper
poetry install
Contributing
Housekeeper uses the GitHub flow branching model, as described in our development manual.
Documentation
Command line interface
Config file
Housekeeper supports a basic YAML config. The following options are supported:
---
database: mysql+pymysql://userName:passWord@domain.com/database
root: /path/to/root/dir
The root option sets the directory where files are stored within the Housekeeper context.
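The database value follows the usual dialect+driver://user:password@host/database URL layout. As a minimal sketch of how such a string breaks down, the snippet below splits it with Python's standard library. The helper name describe_database_url is hypothetical and is not part of Housekeeper.

```python
from urllib.parse import urlsplit


def describe_database_url(url: str) -> dict:
    """Split a database URL into named parts (hypothetical helper, not Housekeeper API)."""
    parts = urlsplit(url)
    # The scheme may carry a driver suffix, e.g. "mysql+pymysql".
    dialect, _, driver = parts.scheme.partition("+")
    return {
        "dialect": dialect,
        "driver": driver or None,
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "database": parts.path.lstrip("/") or None,
    }


info = describe_database_url("mysql+pymysql://userName:passWord@domain.com/database")
print(info["dialect"], info["driver"], info["host"], info["database"])
```

A check like this can catch a malformed URL before any database connection is attempted.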
Command: init
Set up (or reset) the database. This creates all the tables in the database. You can reset an existing database with the --reset option.
housekeeper --database "sqlite:///hk.sqlite3" init
Success! New tables: bundle, file, file_tag_link, tag, version
Command: include
Include (hard-link) all files of an existing bundle version into Housekeeper and the root path.
housekeeper include myBundle
This only works if the bundle has a single version that can be "imported". To import a specific version of a bundle, use the --version option.
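To illustrate what hard-linking a file into the root path means, here is a minimal Python sketch. It is not Housekeeper's implementation: the include_file function and the root/bundle/version directory layout are assumptions made for the example.

```python
import os
from pathlib import Path


def include_file(source: Path, root: Path, bundle: str, version: str) -> Path:
    """Hard-link `source` into root/bundle/version (assumed layout) and return the new path."""
    target_dir = root / bundle / version
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name
    # os.link adds a second directory entry for the same data on disk.
    # It raises OSError if source and target are on different filesystems.
    os.link(source, target)
    return target
```

Because a hard link shares its data with the original file, including a file this way takes no extra disk space, and the data remains available under the root path even if the original path is later removed.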
Command: delete files
Delete files that are no longer on disk like this:
housekeeper delete files --tag fastq --notondisk
Remove all bam files before a certain date:
housekeeper delete files --tag bam --before 2017-06-15
Remove fastq files from a flowcell:
housekeeper delete files --tag fastq --tag H0HKKALXX
It'll always ask for confirmation, unless you add --yes:
housekeeper delete files --bundle sillyfish --yes
If you provide neither --tag nor --bundle, which would essentially delete everything, the command refuses to run.
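The filtering rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not Housekeeper's code: FileRecord and select_for_deletion are hypothetical names, and the sketch assumes that repeated --tag options must all match and that --before compares against a creation date.

```python
from dataclasses import dataclass, field
from datetime import date
from pathlib import Path


@dataclass
class FileRecord:
    """Hypothetical stand-in for a file tracked in the database."""
    path: Path
    bundle: str
    created: date
    tags: set = field(default_factory=set)


def select_for_deletion(records, tags=(), bundle=None, before=None, notondisk=False):
    """Return the records that match every given filter."""
    # Guard: without a tag or bundle, every file would match.
    if not tags and bundle is None:
        raise ValueError("refusing to delete everything: give a tag or a bundle")
    selected = []
    for record in records:
        if not set(tags) <= record.tags:
            continue  # assumed: all requested tags must be present
        if bundle is not None and record.bundle != bundle:
            continue
        if before is not None and record.created >= before:
            continue
        if notondisk and record.path.exists():
            continue  # keep records whose file is still on disk
        selected.append(record)
    return selected
```

A caller would then ask for confirmation before removing the selected records, unless the equivalent of --yes was given.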