Python scripts to upload primary metagenome and metatranscriptome assemblies to ENA on a per-study basis. The scripts generate the XMLs needed to register a new study and create the manifests required for submission with webin-cli.
Public ENA Assembly uploader
Upload of metagenome and metatranscriptome assemblies to ENA
Pre-requisites:
- CSV metadata file. One per study. See test/fixtures/test_metadata for an example
- Compressed assembly fasta files in the locations defined in the metadata file
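Before running any of the steps, it can help to confirm that the metadata file and the assembly files agree. The following is a minimal sketch, not part of the package: the `check_metadata` helper is hypothetical, the header names are assumed from the column list in the `assembly_manifest` help text (run_id, coverage, assembler, version, filepath), and the accepted suffixes `.gz` and `.bz2` are an assumption about what "compressed" means here. Compare against test/fixtures/test_metadata before relying on it.

```python
import csv
from pathlib import Path

# Assumed header names; the real fixture may use different names or capitalisation.
EXPECTED_COLUMNS = ["run_id", "coverage", "assembler", "version", "filepath"]
# Assumed compression suffixes; adjust to whatever your files actually use.
COMPRESSED_SUFFIXES = (".gz", ".bz2")


def check_metadata(csv_path):
    """Return a list of human-readable problems found in the metadata CSV."""
    problems = []
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        header = [name.strip().lower() for name in (reader.fieldnames or [])]
        missing = [col for col in EXPECTED_COLUMNS if col not in header]
        if missing:
            return ["missing columns: " + ", ".join(missing)]
        for row_no, raw in enumerate(reader, start=1):
            # Normalise keys and values; short rows yield None values.
            row = {
                key.strip().lower(): (value or "").strip()
                for key, value in raw.items()
                if key is not None
            }
            run = row["run_id"] or f"row {row_no}"
            for col in EXPECTED_COLUMNS:
                if not row[col]:
                    problems.append(f"{run}: empty {col}")
            if row["filepath"]:
                path = Path(row["filepath"])
                if not path.is_file():
                    problems.append(f"{run}: file not found: {path}")
                elif path.suffix not in COMPRESSED_SUFFIXES:
                    problems.append(f"{run}: file does not look compressed: {path}")
    return problems
```

An empty list means every row has all five fields and points to an existing compressed file.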
Set the following environment variables with your Webin details:
ENA_WEBIN
export ENA_WEBIN=Webin-0000
ENA_WEBIN_PASSWORD
export ENA_WEBIN_PASSWORD=password
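A wrapper script can fail early if either variable is missing. This is a small sketch under the assumption that both variables are read from the process environment; `webin_credentials` is a hypothetical helper, not a function provided by the package.

```python
import os

REQUIRED_VARS = ("ENA_WEBIN", "ENA_WEBIN_PASSWORD")


def webin_credentials(environ=None):
    """Return (user, password), or raise if a variable is unset or empty."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("unset environment variables: " + ", ".join(missing))
    return environ["ENA_WEBIN"], environ["ENA_WEBIN_PASSWORD"]
```

Calling `webin_credentials()` with no argument checks the current environment; passing a dictionary makes the function easy to test.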
Register study and generate pre-upload files
The script needs python, requests, and ena-webin-cli to run. Install the package:
python3 -m pip install -i https://test.pypi.org/simple/ --no-deps assemblyuploader==0.0.0
If you already have a registered study accession for your assembly files, skip to step 3.
Step 1. This step will generate a folder STUDY_upload and a project XML and submission XML within it:
study_xmls
--study STUDY raw reads study ID
--library LIBRARY metagenome or metatranscriptome
--center CENTER center for upload e.g. EMG
--hold HOLD hold date (private) in format dd-mm-yyyy, if it should differ from that of the provided study. Will inherit the release date of the raw read study if not provided.
--tpa whether the study is a third party assembly. Default True
--publication PUBLICATION
pubmed ID for connected publication if available
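When Step 1 is driven from Python, the options above can be assembled and checked before anything is executed. The sketch below is an illustration only: `study_xmls_command` is a hypothetical helper, it uses only the flags listed above, and it leaves out `--tpa` because the listing does not say whether that option takes a value. The accession `ERP000000` in the usage note is a placeholder.

```python
from datetime import datetime

LIBRARIES = ("metagenome", "metatranscriptome")


def study_xmls_command(study, library, center, hold=None, publication=None):
    """Build the argument list for a study_xmls call (nothing is executed)."""
    if library not in LIBRARIES:
        raise ValueError(f"library must be one of {LIBRARIES}, got {library!r}")
    cmd = ["study_xmls", "--study", study, "--library", library, "--center", center]
    if hold is not None:
        # Raises ValueError unless the date parses as dd-mm-yyyy.
        datetime.strptime(hold, "%d-%m-%Y")
        cmd += ["--hold", hold]
    if publication is not None:
        cmd += ["--publication", str(publication)]
    return cmd
```

The returned list can be passed to `subprocess.run(..., check=True)`, for example `subprocess.run(study_xmls_command("ERP000000", "metagenome", "EMG"), check=True)`.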
Step 2. This step submits the XML to ENA and generates a new assembly study accession. Keep note of the newly generated study accession:
submit_study
--study STUDY raw reads study ID
--test run test submission only
Step 3. This step will generate manifest files in the folder STUDY_upload for the runs specified in the metadata file:
assembly_manifest
--study STUDY raw reads study ID
--data DATA metadata CSV - run_id, coverage, assembler, version, filepath
--assembly_study ASSEMBLY_STUDY
pre-existing study ID to submit to if available. Must exist in the webin account
--force overwrite all existing manifests
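After Step 3 it may be worth confirming that every run in the metadata file received a manifest. The sketch below assumes that manifests are written directly into the upload folder and are named after the run ID with a `.manifest` extension, as the file name SRR12240187.manifest in the upload example suggests; both points are assumptions, and `runs_without_manifest` is a hypothetical helper.

```python
from pathlib import Path


def runs_without_manifest(run_ids, upload_dir):
    """Return the run IDs that have no <run_id>.manifest file in upload_dir."""
    present = {path.stem for path in Path(upload_dir).glob("*.manifest")}
    return [run for run in run_ids if run not in present]
```

Any run IDs returned here are candidates for rerunning `assembly_manifest`.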
Upload assemblies
Once the manifest files are generated, use ENA's webin-cli to upload the assemblies, one manifest at a time.
To test your submission add the -test argument.
A live execution example within this repo is the following:
ena-webin-cli \
-context=genome \
-manifest=SRR12240187.manifest \
-userName=$ENA_WEBIN \
-password=$ENA_WEBIN_PASSWORD \
-submit
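The same call can be repeated for every manifest in an upload folder. The sketch below builds the command shown above with Python's `subprocess` module, using only the flags that appear in this README. `webin_command` and `upload_all` are hypothetical helpers, and the assumption that the manifests sit directly in the upload folder with a `.manifest` extension should be checked against your own output.

```python
import os
import subprocess
from pathlib import Path


def webin_command(manifest, test=False):
    """Build the ena-webin-cli argument list for one manifest."""
    cmd = [
        "ena-webin-cli",
        "-context=genome",
        f"-manifest={manifest}",
        f"-userName={os.environ['ENA_WEBIN']}",
        f"-password={os.environ['ENA_WEBIN_PASSWORD']}",
        "-submit",
    ]
    if test:
        # Adds the -test argument described above for a test submission.
        cmd.append("-test")
    return cmd


def upload_all(upload_dir, test=True):
    """Run ena-webin-cli once per manifest; stops at the first failure."""
    for manifest in sorted(Path(upload_dir).glob("*.manifest")):
        subprocess.run(webin_command(manifest, test=test), check=True)
```

Defaulting `test` to `True` in `upload_all` is a deliberate choice in this sketch, so that a live submission has to be requested explicitly.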
More information on ENA's webin-cli can be found in the ENA documentation.