Python scripts to upload primary metagenome and metatranscriptome assemblies to ENA on a per-study basis. The scripts generate the XML files needed to register a new study and the manifests required for submission with webin-cli.
Public ENA Assembly uploader
Upload of metagenome and metatranscriptome assemblies to the European Nucleotide Archive (ENA)
Pre-requisites:
- CSV metadata file. One per study. See test/fixtures/test_metadata for an example
- Compressed assembly fasta files in the locations defined in the metadata file
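Both prerequisites can be checked before any submission is attempted. The following is a minimal sketch, not part of the package: `check_metadata` is a hypothetical helper, the column names are assumed from the `--data` help text further down (the real header in test/fixtures/test_metadata may differ), and treating `.gz` and `.bz2` as the "compressed" suffixes is also an assumption.

```python
import csv
from pathlib import Path

# Assumed column names, taken from the --data help text; verify against
# test/fixtures/test_metadata before relying on them.
REQUIRED_COLUMNS = ("run_id", "coverage", "assembler", "version", "filepath")
# Assumed set of suffixes that count as "compressed".
COMPRESSED_SUFFIXES = (".gz", ".bz2")


def check_metadata(csv_path):
    """Return a list of human-readable problems found in the metadata CSV."""
    problems = []
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        header = reader.fieldnames or []
        missing = [col for col in REQUIRED_COLUMNS if col not in header]
        if missing:
            return [f"missing columns: {', '.join(missing)}"]
        # The header is line 1, so data rows start at line 2.
        for line_no, row in enumerate(reader, start=2):
            fasta = Path(row["filepath"])
            if not fasta.name.endswith(COMPRESSED_SUFFIXES):
                problems.append(f"line {line_no}: {fasta} does not look compressed")
            elif not fasta.is_file():
                problems.append(f"line {line_no}: {fasta} not found")
    return problems
```

An empty list means every row points at an existing, compressed file.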
Set the following environment variables with your Webin account details:
ENA_WEBIN
export ENA_WEBIN=Webin-0000
ENA_WEBIN_PASSWORD
export ENA_WEBIN_PASSWORD=password
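A wrapper script may want to fail early when either variable is unset. A minimal sketch, assuming nothing about how the package itself reads them; `missing_webin_vars` is a hypothetical helper:

```python
import os

# The two variables named above.
REQUIRED_VARS = ("ENA_WEBIN", "ENA_WEBIN_PASSWORD")


def missing_webin_vars(environ=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_webin_vars()
    if missing:
        raise SystemExit(f"Please export: {', '.join(missing)}")
```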
Installation
Install the package:
pip install assembly-uploader
Register study and generate pre-upload files
If you already have a registered study accession for your assembly files, skip to step 3.
Step 1
This step will generate a folder STUDY_upload and a project XML and submission XML within it:
study_xmls
--study STUDY raw reads study ID
--library LIBRARY metagenome or metatranscriptome
--center CENTER center for upload e.g. EMG
--hold HOLD hold date (private) in the format dd-mm-yyyy, if it should differ from that of the raw reads study. Inherits the release date of the raw reads study if not provided.
--tpa use this flag if the study is a third-party assembly. Default False
--publication PUBLICATION
pubmed ID for connected publication if available
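Since `--hold` expects a dd-mm-yyyy date, it can help to validate the value before calling `study_xmls`. A small sketch using only the standard library; `parse_hold_date` is a hypothetical helper and only checks the format, not any rule ENA may apply to the date itself:

```python
from datetime import date, datetime


def parse_hold_date(value: str) -> date:
    """Parse a dd-mm-yyyy string; raises ValueError if the format is wrong."""
    return datetime.strptime(value, "%d-%m-%Y").date()
```

A value such as `2030-12-31` (yyyy-mm-dd) raises `ValueError` and would be caught before any XML is generated.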
Step 2
This step submits the XML to ENA and generates a new assembly study accession. Keep note of the newly generated study accession:
submit_study
--study STUDY raw reads study ID
--test run test submission only
Step 3
This step will generate manifest files in the folder STUDY_upload for runs specified in the metadata file:
assembly_manifest
--study STUDY raw reads study ID
--data DATA metadata CSV - run_id, coverage, assembler, version, filepath
--assembly_study ASSEMBLY_STUDY
pre-existing study ID to submit to if available. Must exist in the webin account
--force overwrite all existing manifests
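The three steps can be sketched as a list of command lines assembled from the flags documented above. This is an illustration only: `build_steps` is a hypothetical helper, the study IDs and file name in the usage example are made up, and passing `--library` and `--center` on every run is an assumption, since the help text does not say which options are mandatory.

```python
import shlex


def build_steps(study, library, center, data, assembly_study=None, test=False):
    """Return the argument lists for steps 1-3, skipping 1 and 2 when an
    assembly study accession already exists."""
    steps = []
    if assembly_study is None:
        # Step 1: generate the project and submission XMLs.
        steps.append(
            ["study_xmls", "--study", study, "--library", library, "--center", center]
        )
        # Step 2: submit the XMLs (optionally as a test submission).
        submit = ["submit_study", "--study", study]
        if test:
            submit.append("--test")
        steps.append(submit)
    # Step 3: generate the manifests.
    manifest = ["assembly_manifest", "--study", study, "--data", data]
    if assembly_study is not None:
        manifest += ["--assembly_study", assembly_study]
    steps.append(manifest)
    return steps


if __name__ == "__main__":
    # Hypothetical study ID and metadata file name.
    for argv in build_steps("ERP000001", "metagenome", "EMG", "metadata.csv", test=True):
        print(shlex.join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)` once the values have been reviewed.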
Upload assemblies
Once the manifest files are generated, use ENA's webin-cli tool to upload the assemblies.
To test your submission, add the -test argument.
An example of a live (non-test) submission from within this repo:
ena-webin-cli \
-context=genome \
-manifest=SRR12240187.manifest \
-userName=$ENA_WEBIN \
-password=$ENA_WEBIN_PASSWORD \
-submit
More information can be found in ENA's webin-cli documentation.
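When a study contains many runs, the same webin-cli call has to be repeated for each manifest. The sketch below builds the argument list shown above for a given manifest and appends `-test` on request. `webin_cli_args` and `find_manifests` are hypothetical helpers, and the assumption that manifests sit directly inside the upload folder with a `.manifest` extension is inferred from the example file name.

```python
from pathlib import Path


def webin_cli_args(manifest, username, password, test=False):
    """Build the ena-webin-cli argument list for one manifest."""
    args = [
        "ena-webin-cli",
        "-context=genome",
        f"-manifest={manifest}",
        f"-userName={username}",
        f"-password={password}",
        "-submit",
    ]
    if test:
        # Send the submission to the test service instead of the live one.
        args.append("-test")
    return args


def find_manifests(upload_dir):
    """List manifest files in the upload folder (assumed *.manifest naming)."""
    return sorted(Path(upload_dir).glob("*.manifest"))
```

The username and password would typically come from `os.environ["ENA_WEBIN"]` and `os.environ["ENA_WEBIN_PASSWORD"]`, and each list can be run with `subprocess.run(args, check=True)`.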