Utility to fetch public and private RAW read and assembly files from the ENA
Microbiome Informatics ENA fetch tool
A set of tools that lets you fetch RAW read and assembly files from the European Nucleotide Archive (ENA).
Install fetch tool
Install from PyPI
$ pip install fetch-tool
Install from the git repo
$ pip install https://github.com/EBI-Metagenomics/fetch_tool/archive/master.zip
Configuration options
The tool has a number of options, with sensible defaults for the most common use cases.
Set up the configuration file using the template fetchdata-config-template.json.
The required fields are:
- ena_api_user
- ena_api_password
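As a minimal sketch, the configuration file could be generated from a script rather than edited by hand. Only the two keys listed above come from this README; the helper name `write_config` and the output file name are assumptions for illustration, and the template may contain further fields that this sketch does not set.

```python
import json
from pathlib import Path


def write_config(path, user, password):
    """Write a JSON config holding the two required fields from this README."""
    config = {
        "ena_api_user": user,
        "ena_api_password": password,
    }
    path = Path(path)
    path.write_text(json.dumps(config, indent=2) + "\n")
    # The file holds credentials, so restrict it to the current user
    # (the permission bits are fully honoured on POSIX systems only).
    path.chmod(0o600)
    return path
```

A file written this way, for example `write_config("fetchdata-config.json", user, password)`, can then be passed to either tool with `-c fetchdata-config.json`.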
Fetch read files (amplicon and WGS data)
Usage
$ fetch-read-tool -h
usage: fetch-read-tool [-h] [-p PROJECTS [PROJECTS ...] | -l PROJECT_LIST] [-d DIR] [-v] [--version] [-f] [--ignore-errors] [--private] [-i] [-c CONFIG_FILE] [--fix-desc-file] [-ru RUNS [RUNS ...]
| --run-list RUN_LIST]
optional arguments:
-h, --help show this help message and exit
-p PROJECTS [PROJECTS ...], --projects PROJECTS [PROJECTS ...]
Whitespace separated list of project accession(s)
-l PROJECT_LIST, --project-list PROJECT_LIST
File containing line-separated project list
-d DIR, --dir DIR Base directory for downloads
-v, --verbose Verbose
--version Version
-f, --force Ignore download errors and force re-download all files
--ignore-errors Ignore download errors and continue
--private Use when fetching private data
-i, --interactive interactive mode - allows you to skip failed downloads.
-c CONFIG_FILE, --config-file CONFIG_FILE
Alternative config file
--fix-desc-file Fixed runs in project description file
-ru RUNS [RUNS ...], --runs RUNS [RUNS ...]
Run accession(s), whitespace separated. Use to download only certain project runs
--run-list RUN_LIST File containing line-separated run accessions
Example
Download amplicon study:
$ fetch-read-tool -p SRP062869 -v -d /home/<user>/temp/
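When several projects need fetching, the `-l` option avoids a long command line. The sketch below writes a line-separated project list and assembles the matching `fetch-read-tool` invocation, using only flags shown in the help text above. The helper names and file paths are hypothetical, and the functions only build the command; they do not run it.

```python
from pathlib import Path


def write_accession_list(path, accessions):
    """Write one accession per line, the format -l and --run-list expect."""
    path = Path(path)
    path.write_text("\n".join(accessions) + "\n")
    return path


def build_read_command(project_list, download_dir, private=False, config_file=None):
    """Assemble a fetch-read-tool argument list from flags in the help text."""
    cmd = ["fetch-read-tool", "-l", str(project_list), "-d", str(download_dir), "-v"]
    if private:
        # Private data also needs the credentials from the config file.
        cmd.append("--private")
    if config_file is not None:
        cmd += ["-c", str(config_file)]
    return cmd
```

The returned list can be handed to `subprocess.run(cmd, check=True)` once the tool is installed.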
Fetch assembly files
Usage
$ fetch-assembly-tool -h
usage: fetch-assembly-tool [-h] [-p PROJECTS [PROJECTS ...] | -l PROJECT_LIST] [-d DIR] [-v] [--version] [-f] [--ignore-errors] [--private] [-i] [-c CONFIG_FILE] [--fix-desc-file]
[-as ASSEMBLIES [ASSEMBLIES ...]] [--assembly-type {primary metagenome,binned metagenome,metatranscriptome}] [--assembly-list ASSEMBLY_LIST]
optional arguments:
-h, --help show this help message and exit
-p PROJECTS [PROJECTS ...], --projects PROJECTS [PROJECTS ...]
Whitespace separated list of project accession(s)
-l PROJECT_LIST, --project-list PROJECT_LIST
File containing line-separated project list
-d DIR, --dir DIR Base directory for downloads
-v, --verbose Verbose
--version Version
-f, --force Ignore download errors and force re-download all files
--ignore-errors Ignore download errors and continue
--private Use when fetching private data
-i, --interactive interactive mode - allows you to skip failed downloads.
-c CONFIG_FILE, --config-file CONFIG_FILE
Alternative config file
--fix-desc-file Fixed runs in project description file
-as ASSEMBLIES [ASSEMBLIES ...], --assemblies ASSEMBLIES [ASSEMBLIES ...]
Assembly ERZ accession(s), whitespace separated. Use to download only certain project assemblies
--assembly-type {primary metagenome,binned metagenome,metatranscriptome}
Assembly type
--assembly-list ASSEMBLY_LIST
File containing line-separated assembly accessions
Example
Download assembly study:
$ fetch-assembly-tool -p ERP111288 -v -d /home/<user>/temp/
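The `--assembly-type` choices contain spaces, so each must reach the tool as a single argument. A minimal sketch, assuming the tool is driven from Python: the helper below checks the type against the three choices from the help text and builds an argument list, which sidesteps shell quoting. The function name is hypothetical.

```python
ASSEMBLY_TYPES = ("primary metagenome", "binned metagenome", "metatranscriptome")


def build_assembly_command(project, download_dir, assembly_type=None, assemblies=()):
    """Assemble a fetch-assembly-tool argument list from flags in the help text."""
    cmd = ["fetch-assembly-tool", "-p", project, "-d", str(download_dir), "-v"]
    if assembly_type is not None:
        if assembly_type not in ASSEMBLY_TYPES:
            raise ValueError(f"unknown assembly type: {assembly_type!r}")
        # Passed as one list element, so the embedded space needs no quoting.
        cmd += ["--assembly-type", assembly_type]
    if assemblies:
        # Restrict the download to specific ERZ accessions.
        cmd += ["-as", *assemblies]
    return cmd
```

On a shell command line the same value needs quotes, for example `--assembly-type "primary metagenome"`.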
How to set up your development environment
We recommend using miniconda or conda to manage the environment.
Clone the repo and install the requirements.
$ git clone git@github.com:EBI-Metagenomics/fetch_tool.git
$ cd fetch_tool
$ # activate env (conda activate xxx)
$ pip install .[dev]
Pre-commit hooks
Set up the git pre-commit hook:
$ pre-commit install
Why?
pre-commit runs a set of pre-configured tools before allowing you to commit files. You can find the currently configured hooks and their settings in .pre-commit-config.yaml.
Tests
This repo uses pytest.
File details
Details for the file fetch_tool-1.0.4.tar.gz.
File metadata
- Download URL: fetch_tool-1.0.4.tar.gz
- Upload date:
- Size: 17.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.9.20
File hashes
Algorithm | Hash digest
---|---
SHA256 | 71dc8dc1c980d8016bf3a35a58bcd9528ee0c69b2c643b64340b4d240a0a409a
MD5 | 600136ce1fb5b5709ab538290d88aa76
BLAKE2b-256 | 8235822cd07683f5700e78f4ccbb860d7123885f84a4bde6d9803b97c4899987
File details
Details for the file fetch_tool-1.0.4-py3-none-any.whl.
File metadata
- Download URL: fetch_tool-1.0.4-py3-none-any.whl
- Upload date:
- Size: 21.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.9.20
File hashes
Algorithm | Hash digest
---|---
SHA256 | 220dbd273945dbbbb5acd936ceffd14efa89a5cb1abbf140460f7165b7614170
MD5 | 0177c501e63a5545f2b9e0f217cdccce
BLAKE2b-256 | 568c9813f12231be83b7fba640c407a60a29c3465e920f439e21d8883781b371