Small tools/scripts written in Python for MDU
MDU Python Tools
Background
Some simple tools in Python for MDU.
Tools
mdu-merge-ngs-lanes
Use it to correctly merge lanes from an Illumina run into a single FASTQ.
Get help:
mdu-merge-ngs-lanes --help
Basic usage:
mdu-merge-ngs-lanes -i /path/to/fastq_folder -o /path/to/output > cmd.sh
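The redirect to cmd.sh suggests the tool prints shell commands rather than merging files itself. As a rough illustration of what merging lanes involves, the sketch below groups files by sample and read, then builds one cat command per group. It is not the tool's implementation: the file-name pattern (the usual Illumina `<sample>_S<n>_L<lane>_R<read>_001.fastq.gz` scheme), the output naming, and the helper names `group_lanes` and `merge_commands` are all assumptions made for this example.

```python
import re
import shlex
from collections import defaultdict

# Assumed Illumina-style name: <sample>_S<n>_L<lane>_R<read>_001.fastq.gz
FASTQ_NAME = re.compile(
    r"(?P<sample>.+)_S\d+_L(?P<lane>\d{3})_(?P<read>R[12])_001\.fastq\.gz"
)


def group_lanes(filenames):
    """Map (sample, read) to its FASTQ files, ordered by lane number."""
    groups = defaultdict(list)
    for name in filenames:
        match = FASTQ_NAME.fullmatch(name)
        if match is None:
            continue  # skip files that do not follow the assumed scheme
        key = (match["sample"], match["read"])
        groups[key].append((int(match["lane"]), name))
    return {
        key: [name for _, name in sorted(files)]
        for key, files in groups.items()
    }


def merge_commands(filenames):
    """Build one 'cat' command per (sample, read), lanes in ascending order."""
    commands = []
    for (sample, read), files in sorted(group_lanes(filenames).items()):
        inputs = " ".join(shlex.quote(f) for f in files)
        output = shlex.quote(f"{sample}_{read}.fastq.gz")  # hypothetical naming
        commands.append(f"cat {inputs} > {output}")
    return commands


if __name__ == "__main__":
    example = [
        "A_S1_L002_R1_001.fastq.gz",
        "A_S1_L001_R1_001.fastq.gz",
        "A_S1_L001_R2_001.fastq.gz",
        "A_S1_L002_R2_001.fastq.gz",
    ]
    print("\n".join(merge_commands(example)))
```

Concatenating gzip files with cat yields a valid gzip stream, which is why no decompression step appears in the sketch. The important detail is that R1 and R2 are merged with lanes in the same order, so read pairs stay in sync.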
Advanced usage:
You can split the output into multiple subfolders of the output folder by adding --subfolder to the command line. The option can be used multiple times, and each use takes two space-separated values: path and regex. The path gives the name of the subfolder in the output folder, and the regex determines which samples go in that subfolder.
For instance, the command below puts samples starting with NTC into a subfolder called ntc, while all other samples are added to a subfolder called data.
mdu-merge-ngs-lanes -i /path/to/fastq -o /path/to/output --subfolder 'data' '(?!NTC).*' --subfolder 'ntc' '(?<=NTC).*' > cmd.sh
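To make the idea of routing samples by regular expression concrete, here is a minimal Python sketch. It does not reproduce the tool's internals: how mdu-merge-ngs-lanes applies each pattern (whole-string match or search, and in what order) is not stated here, so the sketch assumes whole-string matching with the first matching rule winning. The `NTC.*` pattern, the sample IDs, and the helper `assign_subfolder` are hypothetical.

```python
import re

# Hypothetical (subfolder, pattern) pairs, checked in order; first match wins.
RULES = [
    ("ntc", re.compile(r"NTC.*")),        # IDs that start with NTC
    ("data", re.compile(r"(?!NTC).*")),   # IDs that do not start with NTC
]


def assign_subfolder(sample_id, rules=RULES):
    """Return the first subfolder whose pattern matches the whole sample ID."""
    for subfolder, pattern in rules:
        if pattern.fullmatch(sample_id):
            return subfolder
    return None  # no rule matched


if __name__ == "__main__":
    for sample in ["NTC-01", "2019-12345", "NTC"]:
        print(sample, "->", assign_subfolder(sample))
```

Checking your own patterns against a handful of sample IDs in this way before running the real command is a cheap way to catch a regex that matches more, or less, than intended.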
mdu-sra-uploads
Use it to upload FASTQ data to NCBI SRA.
Requires a file with tab-separated values of MDU ID and AUSMDUID. For example:
mdu1\tausmdu1
mdu2\tausmdu2
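In the example above, \t stands for a single tab character. If the ID pairs already live in a script or spreadsheet export, a file in this shape can be produced with Python's csv module. This is only a sketch: the helper name `isolates_tsv` is made up for this example, and the output file name isolates.txt is taken from the basic usage shown further down.

```python
import csv
import io


def isolates_tsv(pairs):
    """Render (MDU ID, AUSMDU ID) pairs as tab-separated lines of text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(pairs)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder IDs, mirroring the example in the text
    pairs = [("mdu1", "ausmdu1"), ("mdu2", "ausmdu2")]
    with open("isolates.txt", "w", newline="") as handle:
        handle.write(isolates_tsv(pairs))
```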
Getting help:
mdu-sra-uploads --help
Usage: mdu-sra-upload [OPTIONS] ISOLATES
Options:
-f, --folder TEXT Folder on NCBI to upload. Used to find the reads
when submitting via the SRA portal. [default:
mdu]
-r, --reads-folder TEXT Where reads are located (uses MDU_READS env
variable if available).
-k, --ascp-key TEXT Path to ascp ssh upload key (uses ASCP_UPLOAD_KEY
env variable if available). This can be obtained
from the SRA Submission Portal.
-s, --sra-subfolder TEXT SRA subfolder owned by you where data will copied
to (uses SRA_SUBFOLDER env variable is available).
--help Show this message and exit.
Basic usage:
cd /path/for/upload
# copy paste isolates.txt
mdu-sra-uploads isolates.txt
# when completing the submission, search for pre-uploaded files in the folder called mdu
Environment variables that can be used to set options
MDU_READS: full path to where FASTQ data is stored
ASCP_UPLOAD_KEY: full path to where your Aspera NCBI upload key is located (obtain one from the SRA submission portal under the Aspera command line instructions)
SRA_SUBFOLDER: path to your folder at SRA. Usually composed of your email plus an "_" and some random alphanumeric characters. This can be obtained from the SRA submission portal under the Aspera command line instructions (e.g., john.doe@doe.industries.com_qEWo9).
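The fallback from a command-line option to an environment variable can be pictured with the short sketch below. It is not taken from the tool's source. The precedence shown (explicit option first, then environment variable, then default) is an assumption, and `resolve_option` is a hypothetical helper.

```python
import os


def resolve_option(cli_value, env_name, default=None, environ=None):
    """Pick an option value: explicit CLI value, else env variable, else default."""
    environ = os.environ if environ is None else environ
    if cli_value is not None:
        return cli_value
    return environ.get(env_name, default)


if __name__ == "__main__":
    # With no --reads-folder given, fall back to MDU_READS if it is set
    print(resolve_option(None, "MDU_READS", default="(not set)"))
```

Setting the variables once in your shell profile means the everyday call can stay as short as the basic usage example.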
Development
Development environment
To develop with the same environment, use vagrant and virtualbox:
vagrant up
vagrant ssh
Once logged in to the VM, the shared folder is in /vagrant.
Hashes for mdu_pytools-0.2.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 5c141c4e18859e1790561ea89a3fc9c4b3fbefd8093dfe140de9526d0713f25e
MD5 | d0493489550d0c5ecae6f47664841dde
BLAKE2b-256 | 674ae2128223c0a56c9c02eda68215935eb08dd31b7a8abf20a60c501b770ebf