Genome reference manager.
RefChef is a reference management tool used to: (1) document the exact steps undertaken in the retrieval of genomic references; (2) maintain the associated metadata; (3) provide a mechanism for automatically reproducing retrieval and creation of an exact copy of genomic references.
To install from PyPI using pip:
pip install refchef
To install using Anaconda Python:
conda install -c compbiocore refchef
To install a development version from the current directory:
git clone https://github.com/compbiocore/refchef.git
cd refchef
pip install -e .
Run unit tests as:
python setup.py test
.env file with GitHub Access Token
Sensitive environment variables are stored in the .env file. This file is included in .gitignore intentionally, so that it is never committed.
- Create a
.env file and copy into it the contents of
- Get your GitHub Access Token and add to the
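As a rough illustration of how a token kept in a .env file can be read at runtime, here is a minimal sketch that parses KEY=VALUE lines. The variable name GITHUB_TOKEN is a hypothetical placeholder, and this is not necessarily how RefChef itself loads the file.

```python
import os


def parse_env(text):
    """Parse KEY=VALUE lines from .env-style text into a dict."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=VALUE line
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def export_env(values):
    """Copy parsed values into os.environ without overwriting existing ones."""
    for key, value in values.items():
        os.environ.setdefault(key, value)


# GITHUB_TOKEN is a hypothetical variable name used only for this example.
example = parse_env("# secrets\nGITHUB_TOKEN=abc123\n")
```

Because the file is listed in .gitignore, the token stays on the local machine and is only read when needed.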
RefChef comes with two main commands (refchef-cook and refchef-menu).
When using either of the commands, you'll be prompted to create a
.refchef-config file. Alternatively,
you can create the config file in your home directory.
Here's an example of a config file:
config-yaml:
  path-settings:
    reference-directory: ~/data/references_dir # directory where references will be downloaded and processed.
    github-directory: ~/data/git_local # local git repository where `master.yaml` is located.
    remote-repository: user/repo # remote user and repository for version control of `master.yaml`
  log-settings:
    log: 'yes'
  runtime-settings:
    break-on-error: 'yes'
    verbose: 'yes'
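To show how the settings in the example config might be consumed, here is a minimal sketch that works on the config after it has been parsed into a Python dict. The function names are hypothetical and are not part of RefChef's API; treating 'yes' as the only enabling value is also an assumption.

```python
import os

# Keys taken from the path-settings block of the example config.
REQUIRED_PATH_KEYS = ("reference-directory", "github-directory", "remote-repository")

EXAMPLE_CONFIG = {
    "config-yaml": {
        "path-settings": {
            "reference-directory": "~/data/references_dir",
            "github-directory": "~/data/git_local",
            "remote-repository": "user/repo",
        },
        "log-settings": {"log": "yes"},
        "runtime-settings": {"break-on-error": "yes", "verbose": "yes"},
    }
}


def resolve_paths(config):
    """Check that the path settings are present and expand '~' in local paths."""
    settings = config["config-yaml"]["path-settings"]
    missing = [key for key in REQUIRED_PATH_KEYS if key not in settings]
    if missing:
        raise KeyError(f"missing path-settings keys: {missing}")
    return {
        "reference-directory": os.path.expanduser(settings["reference-directory"]),
        "github-directory": os.path.expanduser(settings["github-directory"]),
        # The remote is a user/repo identifier, not a local path, so it is kept as-is.
        "remote-repository": settings["remote-repository"],
    }


def is_enabled(value):
    """Interpret the quoted 'yes'/'no' strings used in the example config."""
    return str(value).strip().lower() == "yes"
```

The flags are quoted in the example, so they arrive as strings rather than booleans, which is why the sketch compares text instead of relying on truthiness.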
The refchef-cook command reads the
master.yaml located in the
github-directory path from the config file. The
master.yaml file contains a list of references, as well as metadata, and commands necessary to download them (see example below).
--execute, -e: executes all commands listed in the
master.yaml for each reference that does not already exist in the location provided in the config file.
--new, -n: path to a new yaml file containing other references to be downloaded and appended to the master.yaml.
--update, -u: whether to update the remote git repository with the new master.yaml.
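The "only if the reference doesn't exist" rule for the execute option can be sketched as follows. This assumes each reference is stored in a subdirectory of the reference directory named after its key in master.yaml; that layout and the function name are assumptions, not something the options above confirm.

```python
import os


def references_to_process(reference_names, reference_directory, exists=os.path.isdir):
    """Return the references that are not yet present under reference_directory.

    `exists` is injectable so the decision logic can be checked without
    touching the file system.
    """
    pending = []
    for name in reference_names:
        target = os.path.join(reference_directory, name)
        if not exists(target):
            pending.append(name)
    return pending
```

With this check in place, running the command twice would not repeat the downloads for references that are already on disk.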
1 - This will read in the
new.yaml file, append it to
master.yaml, and update the remote GitHub repository:
refchef-cook -e --new new.yaml --update
2 - This will process `master.yaml` only and won't update the remote GitHub repository: `refchef-cook -e`
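The append step in the first example can be pictured as a merge of two mappings keyed by reference name. The sketch below works on already-parsed dicts; rejecting duplicate names is an assumption, since the text does not say how RefChef handles a reference that appears in both files.

```python
def append_references(master, new):
    """Return a new mapping with the entries of `new` appended to `master`."""
    duplicates = sorted(set(master) & set(new))
    if duplicates:
        # Assumption: refuse to overwrite an existing reference silently.
        raise ValueError(f"references already in master: {duplicates}")
    merged = dict(master)  # copy, so the caller's master is left untouched
    merged.update(new)
    return merged
```

Returning a fresh mapping keeps the original contents available until the merged result has been written out and committed.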
reference_test1:
  metadata:
    name: reference_test1
    species: mouse
    organization: ucsc
    downloader: aleith
  levels:
    references:
    - component: primary
      retrieve: true
      commands:
      - curl https://s3.us-east-2.amazonaws.com/refchef-tests/chr1.fa.gz
      - md5 *.fa.gz > postdownload_checksums.md5
      - gunzip *.gz
      - md5 *.fa > final_checksums.md5
reference_test2:
  metadata:
    name: reference_test2
    species: human
    organization: ucsc
    downloader: fgelin
  levels:
    references:
    - component: primary
      retrieve: true
      commands:
      - curl https://s3.us-east-2.amazonaws.com/refchef-tests/chr1.fa.gz
      - md5 *.fa.gz > postdownload_checksums.md5
      - gunzip *.gz
      - md5 *.fa > final_checksums.md5
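To make the structure of a master.yaml entry concrete, here is a minimal sketch that walks one parsed entry and collects the commands to run. It assumes that a component is processed only when its retrieve field is true; that reading of the field, and the function name, are assumptions.

```python
def commands_for(entry):
    """Collect the shell commands of every component marked for retrieval."""
    commands = []
    for component in entry.get("levels", {}).get("references", []):
        # Assumption: `retrieve: true` means the component should be processed.
        if component.get("retrieve"):
            commands.extend(component.get("commands", []))
    return commands


# One entry from the example above, as it would look after YAML parsing.
entry = {
    "metadata": {"name": "reference_test1", "species": "mouse"},
    "levels": {
        "references": [
            {
                "component": "primary",
                "retrieve": True,
                "commands": ["gunzip *.gz", "md5 *.fa > final_checksums.md5"],
            }
        ]
    },
}
```

The commands are returned in the order they are listed, which matters here because the checksum steps depend on the download and decompression steps before them.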
The refchef-menu command lists all references present in the system, based on
master.yaml, and can filter that list by metadata options.
--filter: filters references based on metadata. Takes a key:value pair, or a comma-separated list of pairs:
refchef-menu --filter species:human
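The filter syntax above can be sketched as a small parser plus a match over each reference's metadata. The function names are hypothetical, and requiring every pair to match (a logical AND) is an assumption, since the text does not say how multiple pairs are combined.

```python
def parse_filter(expression):
    """Turn 'key:value,key:value' into a dict of required metadata values."""
    pairs = {}
    for item in expression.split(","):
        key, sep, value = item.partition(":")
        if not sep or not key.strip() or not value.strip():
            raise ValueError(f"expected key:value, got {item!r}")
        pairs[key.strip()] = value.strip()
    return pairs


def filter_references(master, expression):
    """Return the names of references whose metadata matches every pair."""
    wanted = parse_filter(expression)
    return [
        name
        for name, entry in master.items()
        if all(
            str(entry.get("metadata", {}).get(key)) == value
            for key, value in wanted.items()
        )
    ]


# Metadata taken from the two example references in master.yaml.
master = {
    "reference_test1": {"metadata": {"species": "mouse", "organization": "ucsc"}},
    "reference_test2": {"metadata": {"species": "human", "organization": "ucsc"}},
}
```

Under these assumptions, filtering with species:human would select reference_test2, while organization:ucsc would select both entries.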
Contact email@example.com - this is our general help line, so please specify that your issue is with this site's contents