Converts an annotated DNA multi-sequence alignment (in NEXUS format) to an EMBL flatfile for submission to ENA via the Webin-CLI submission tool

annonex2embl

INSTALLATION

To get the most recent stable version of annonex2embl, run:

pip install annonex2embl

To get the latest development version of annonex2embl instead, run:

pip install git+https://github.com/michaelgruenstaeudl/annonex2embl.git

INPUT, OUTPUT AND PREREQUISITES

  • Input: an annotated DNA multiple sequence alignment in NEXUS format; and a comma-delimited (CSV) metadata table
  • Output: a submission-ready, multi-record EMBL flatfile

Requirements / Input preparation

The annotations of a NEXUS file are specified via a SETS block, which is located beneath the DATA block and defines sets of characters (charsets) in the DNA alignment. In such a SETS block, every gene charset and every exon charset must be accompanied by one CDS charset. Other charsets may be defined without an accompanying charset.

Example of a complete SETS block

BEGIN SETS;
CHARSET matK_gene_forward = 929-2530;
CHARSET matK_CDS_forward = 929-2530;
CHARSET trnK_intron_forward = 1-928 2531-2813;
END;
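
The pairing rule above can be checked before running the conversion. The following is a minimal sketch, not part of annonex2embl: the function names `parse_charsets` and `unpaired_features` are hypothetical, and it assumes that charset names follow the `<name>_<type>_<orientation>` pattern seen in the example and that "accompanied" means a CDS charset with the same `<name>` exists.

```python
import re

# Matches lines such as "CHARSET matK_gene_forward = 929-2530;"
CHARSET_RE = re.compile(
    r"^\s*CHARSET\s+(\S+)\s*=\s*([^;]+);", re.IGNORECASE | re.MULTILINE
)


def parse_charsets(sets_block):
    """Return {charset_name: [(start, end), ...]} for the text of a SETS block."""
    charsets = {}
    for name, ranges in CHARSET_RE.findall(sets_block):
        spans = []
        for token in ranges.split():
            start, _, end = token.partition("-")
            # A single position such as "42" is treated as the span (42, 42).
            spans.append((int(start), int(end or start)))
        charsets[name] = spans
    return charsets


def unpaired_features(charsets):
    """List gene/exon charsets that have no CDS charset of the same name.

    Assumption: names look like <name>_<type>_<orientation>.
    """
    names_by_type = {}
    for full_name in charsets:
        parts = full_name.split("_")
        if len(parts) < 3:
            continue  # name does not follow the assumed pattern
        name, feature_type = "_".join(parts[:-2]), parts[-2].lower()
        names_by_type.setdefault(feature_type, set()).add(name)
    cds_names = names_by_type.get("cds", set())
    missing = []
    for feature_type in ("gene", "exon"):
        for name in sorted(names_by_type.get(feature_type, set()) - cds_names):
            missing.append(f"{name}_{feature_type}")
    return missing


EXAMPLE = """BEGIN SETS;
CHARSET matK_gene_forward = 929-2530;
CHARSET matK_CDS_forward = 929-2530;
CHARSET trnK_intron_forward = 1-928 2531-2813;
END;"""

print(unpaired_features(parse_charsets(EXAMPLE)))  # prints []
```

An empty list means every gene or exon charset has a same-named CDS charset; the intron charset is ignored because it may be defined unaccompanied.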

Example of a corresponding DESCR variable

DESCR="tRNA-Lys (trnK) intron, partial sequence; maturase K (matK) gene, complete sequence"

EXAMPLE USAGE

Change (cd) into the annonex2embl package directory, then run the commands for your platform.

On Linux / MacOS

SCRPT=$PWD/scripts/annonex2embl_launcher_CLI.py
INPUT=$PWD/examples/input/TestData1.nex
METAD=$PWD/examples/input/Metadata.csv
mkdir -p "$PWD/examples/temp/"
OTPUT=$PWD/examples/temp/TestData1.embl
DESCR='description of alignment here'  # Do not use double-quotes
EMAIL=your_email_here@yourmailserver.com
AUTHR='your name here'  # Do not use double-quotes
MNFTS=PRJEB00000
# bash substitution: replace every non-alphanumeric character in DESCR with an underscore
MNFTD=${DESCR//[^[:alnum:]]/_}

# Variables are quoted so that paths or values containing spaces are passed as single arguments
python3 "$SCRPT" -n "$INPUT" -c "$METAD" -d "$DESCR" -e "$EMAIL" -a "$AUTHR" -o "$OTPUT" --qualifiername "note" --productlookup --manifeststudy "$MNFTS" --manifestdescr "$MNFTD" --compress

On Windows

REM %CD% is the current directory in cmd.exe ($PWD is not expanded there)
SET "SCRPT=%CD%\scripts\annonex2embl_launcher_CLI.py"
SET "INPUT=%CD%\examples\input\TestData1.nex"
SET "METAD=%CD%\examples\input\Metadata.csv"
mkdir "%CD%\examples\temp"
SET "OTPUT=%CD%\examples\temp\TestData1.embl"
REM Single quotes do not group words in cmd.exe; quote the values when they are used instead
SET "DESCR=description of alignment here"
SET "EMAIL=your_email_here@yourmailserver.com"
SET "AUTHR=your name here"
SET "MNFTS=PRJEB00000"
SET "MNFTD=a_unique_description_here"

python "%SCRPT%" -n "%INPUT%" -c "%METAD%" -d "%DESCR%" -e %EMAIL% -a "%AUTHR%" -o "%OTPUT%" --productlookup --manifeststudy %MNFTS% --manifestdescr %MNFTD% --compress
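
The Linux / MacOS example derives MNFTD from DESCR by replacing every non-alphanumeric character with an underscore, whereas the Windows example sets MNFTD by hand. The same substitution can be sketched in Python, which behaves identically on both platforms. The helper name `manifest_description` is hypothetical and not part of annonex2embl, and the sketch assumes ASCII letters and digits only (the bash class `[:alnum:]` may match more characters, depending on the locale).

```python
import re


def manifest_description(descr):
    """Replace every character that is not an ASCII letter or digit with '_'."""
    return re.sub(r"[^A-Za-z0-9]", "_", descr)


print(manifest_description("description of alignment here"))
# prints: description_of_alignment_here
```

The resulting string can then be passed to --manifestdescr.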

CHANGELOG

See CHANGELOG.md for a list of recent changes to the software.
