
File data orchestrator


Housekeeper


Store, tag, fetch, and archive files with ease 🗃

Housekeeper is a tool that aims to provide:

  • a backend for storing versioned bundles of files
  • different interfaces (Python, CLI, REST) for fetching files based on tags
  • a way to backup and retrieve bundles from long-term storage

Installation

Housekeeper is written in Python 3.6+ and is available on the Python Package Index (PyPI).

pip install housekeeper

If you would like to install the latest development version:

git clone https://github.com/Clinical-Genomics/housekeeper
cd housekeeper
poetry install

Contributing

Housekeeper uses the GitHub flow branching model, as described in our development manual.

Documentation

Command line interface

Config file

Housekeeper supports a basic YAML config. The following options are supported:

---
database: mysql+pymysql://userName:passWord@domain.com/database
root: /path/to/root/dir

The root option is used to store files within the Housekeeper context.
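The database value above looks like a SQLAlchemy-style database URL. As a minimal sketch of how its parts break down, the helper below splits such a URL with the standard library. The function name describe_database_url is hypothetical and is not part of Housekeeper.

```python
from urllib.parse import urlsplit


def describe_database_url(url: str) -> dict:
    """Split a dialect+driver://user:password@host/database URL into its parts.

    Illustrative helper only; Housekeeper itself does not expose this function.
    """
    parts = urlsplit(url)
    # The scheme may carry an optional driver after a plus sign, e.g. mysql+pymysql.
    dialect, _, driver = parts.scheme.partition("+")
    return {
        "dialect": dialect,
        "driver": driver or None,
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "database": parts.path.lstrip("/"),
    }


mysql_info = describe_database_url(
    "mysql+pymysql://userName:passWord@domain.com/database"
)
sqlite_info = describe_database_url("sqlite:///hk.sqlite3")
print(mysql_info)
print(sqlite_info)
```

For the SQLite URL used in the init example, the host and credentials are empty and the database part is the file name.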

Command: init

Set up (or reset) the database. This creates all the tables in the database. You can reset an existing database with the --reset option.

housekeeper --database "sqlite:///hk.sqlite3" init
Success! New tables: bundle, file, file_tag_link, tag, version

Command: include

Include (hard-link) all files of an existing bundle version into Housekeeper and the root path.

housekeeper include myBundle

This only works if the bundle has a single version that can be included. To include a specific version of a bundle, use the --version option.
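To illustrate what hard-linking a file into the root path means, here is a rough sketch using os.link. The root/bundle/version directory layout and the include_file helper are assumptions made for the example; they are not taken from the Housekeeper source.

```python
import os
import tempfile
from pathlib import Path


def include_file(source: Path, root: Path, bundle: str, version: str) -> Path:
    """Hard-link source into root/bundle/version (hypothetical layout)."""
    target_dir = root / bundle / version
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name
    # A hard link adds a second directory entry for the same data on disk,
    # so no bytes are copied and both paths stay valid independently.
    os.link(source, target)
    return target


with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    original = tmp_path / "sample.txt"
    original.write_text("data")
    linked = include_file(original, tmp_path / "root", "myBundle", "2017-06-15")
    # Both paths refer to the same underlying file.
    same_file = os.path.samefile(original, linked)
    linked_content = linked.read_text()
    print(same_file, linked_content)
```

Hard links only work within a single file system, so in a setup like this the root directory would need to live on the same volume as the files being included.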

Command: delete files

Delete files that are no longer on disk like this: housekeeper delete files --tag fastq --notondisk

Remove all bam files before a certain date: housekeeper delete files --tag bam --before 2017-06-15

Remove fastq files from a flowcell: housekeeper delete files --tag fastq --tag H0HKKALXX

It'll always ask for confirmation, unless you add --yes: housekeeper delete files --bundle sillyfish --yes

If you provide neither --tag nor --bundle, which would amount to deleting everything, the command refuses to run.
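The selection rules described above can be sketched roughly as follows. This is not Housekeeper's implementation: FileRecord and select_files_to_delete are hypothetical names, and the sketch assumes that repeated --tag values must all match, as the flowcell example suggests. The --notondisk check and the confirmation prompt are left out.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Iterable, List, Optional, Set


@dataclass
class FileRecord:
    """Hypothetical stand-in for a file tracked in the database."""
    path: str
    bundle: str
    created: date
    tags: Set[str] = field(default_factory=set)


def select_files_to_delete(
    files: Iterable[FileRecord],
    tags: Iterable[str] = (),
    bundle: Optional[str] = None,
    before: Optional[date] = None,
) -> List[FileRecord]:
    """Return the files matching every given filter."""
    wanted = set(tags)
    # Guard: without a tag or a bundle the filter would match everything.
    if not wanted and bundle is None:
        raise ValueError("provide at least one tag or a bundle")
    selected = []
    for record in files:
        if not wanted.issubset(record.tags):
            continue  # assumption: a file must carry all requested tags
        if bundle is not None and record.bundle != bundle:
            continue
        if before is not None and record.created >= before:
            continue
        selected.append(record)
    return selected


files = [
    FileRecord("a.bam", "sillyfish", date(2017, 5, 1), {"bam"}),
    FileRecord("b.bam", "sillyfish", date(2017, 7, 1), {"bam"}),
    FileRecord("c.fastq.gz", "otherbundle", date(2017, 5, 1), {"fastq", "H0HKKALXX"}),
]
old_bams = select_files_to_delete(files, tags=["bam"], before=date(2017, 6, 15))
flowcell_fastq = select_files_to_delete(files, tags=["fastq", "H0HKKALXX"])
print([f.path for f in old_bams], [f.path for f in flowcell_fastq])
```

Raising an error for an empty filter, rather than silently matching nothing, mirrors the behaviour described above where the command will not run without --tag or --bundle.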
