A package to automatically access the inverted repeats of archived plastid genomes
airpg: Accessing the inverted repeats of archived plastid genomes
A Python package for automatically accessing the inverted repeats of thousands of plastid genomes stored on NCBI Nucleotide
INSTALLATION
To get the most recent stable version of airpg, run:
pip install airpg
Alternatively, to get the latest development version of airpg, run:
pip install git+https://github.com/michaelgruenstaeudl/airpg.git
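After either install command, it can be useful to confirm which version ended up in the active environment. The following is a minimal sketch that relies only on the standard library (`importlib.metadata`, Python 3.8+); the helper name `installed_version` is hypothetical and not part of airpg.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "airpg") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("airpg is not installed in this environment")
    else:
        print(f"airpg {version} is installed")
```

Because the check reads package metadata rather than importing the package, it does not depend on how airpg's modules are laid out.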
EXAMPLE USAGE
SCRIPT 01: Generating plastome availability table
# Angiosperms
TESTFOLDER=./03_testing/angiosperms_Start2000toEnd2019
DATE=$(date '+%Y_%m_%d')
MYQUERY='complete genome[TITLE] AND (chloroplast[TITLE] OR plastid[TITLE]) AND 2000/01/01:2019/12/31[PDAT] AND 0000050000:00000250000[SLEN] NOT unverified[TITLE] NOT partial[TITLE] AND (Embryophyta[ORGN] AND Magnoliophyta[ORGN])'
AVAILTABLE=plastome_availability_table_${DATE}.tsv
mkdir -p $TESTFOLDER
# Non-angiosperm land plants (alternative to the angiosperm settings above; set only one of the two blocks per run)
TESTFOLDER=./03_testing/nonangiosperm_landplants_Start2000toEnd2019
DATE=$(date '+%Y_%m_%d')
MYQUERY='complete genome[TITLE] AND (chloroplast[TITLE] OR plastid[TITLE]) AND 2000/01/01:2019/12/31[PDAT] AND 0000050000:00000250000[SLEN] NOT unverified[TITLE] NOT partial[TITLE] AND (Embryophyta[ORGN] NOT Magnoliophyta[ORGN])'
AVAILTABLE=plastome_availability_table_${DATE}.tsv
mkdir -p $TESTFOLDER
# Defining blacklist
if [ ! -f ./02_blacklists/BLACKLIST__master_${DATE} ]; then
    cat $(ls ./02_blacklists/BLACKLIST__* | grep -v "master") > ./02_blacklists/BLACKLIST__master_${DATE}
fi
python ./01_package/01_generate_plastome_availability_table.py -q "$MYQUERY" -o $TESTFOLDER/$AVAILTABLE --blacklist ./02_blacklists/BLACKLIST__master_${DATE} 1>>$TESTFOLDER/Script01_${DATE}.runlog 2>&1
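The two `MYQUERY` strings above differ only in their final taxon clause. As an illustration, the sketch below assembles the same NCBI Nucleotide query from its parts so that the shared filters are written once. The function `build_plastome_query` and the two constants are hypothetical names introduced here, not part of airpg; the date and length ranges are copied verbatim from the example above.

```python
# Taxon clauses taken from the two example queries above.
ANGIOSPERMS = "Embryophyta[ORGN] AND Magnoliophyta[ORGN]"
NON_ANGIOSPERM_LAND_PLANTS = "Embryophyta[ORGN] NOT Magnoliophyta[ORGN]"


def build_plastome_query(
    taxon_clause: str,
    date_range: str = "2000/01/01:2019/12/31",
    length_range: str = "0000050000:00000250000",
) -> str:
    """Assemble the search string that is passed to Script 01 via -q."""
    parts = [
        "complete genome[TITLE]",
        "AND (chloroplast[TITLE] OR plastid[TITLE])",
        f"AND {date_range}[PDAT]",      # publication date window
        f"AND {length_range}[SLEN]",    # sequence length window
        "NOT unverified[TITLE]",
        "NOT partial[TITLE]",
        f"AND ({taxon_clause})",
    ]
    return " ".join(parts)


if __name__ == "__main__":
    print(build_plastome_query(ANGIOSPERMS))
    print(build_plastome_query(NON_ANGIOSPERM_LAND_PLANTS))
```

Keeping the taxon clause as the only required argument makes it harder for the two data sets to drift apart in their date or length filters.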
SCRIPT 02: Downloading records and extracting IR information
REPRTDSTAT=reported_IR_stats_table_${DATE}.tsv
mkdir -p $TESTFOLDER/records_${DATE}
mkdir -p $TESTFOLDER/data_${DATE}
python ./01_package/02_download_records_and_extract_IRs.py -i $TESTFOLDER/$AVAILTABLE -r $TESTFOLDER/records_${DATE}/ -d $TESTFOLDER/data_${DATE}/ -o $TESTFOLDER/$REPRTDSTAT 1>>$TESTFOLDER/Script02_${DATE}.runlog 2>&1
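The Script 02 call above derives every path from the test folder and the date stamp. The sketch below reproduces that naming scheme in Python and returns the argument list without running anything. The function name `script02_command` is hypothetical; the script path, the flags (`-i`, `-r`, `-d`, `-o`), and the file-name patterns are taken from the shell example, and the sketch assumes the availability table was written with the same date stamp.

```python
from datetime import date
from pathlib import Path
from typing import List, Optional


def script02_command(test_folder: str, run_date: Optional[date] = None) -> List[str]:
    """Build the argument list for Script 02 from a test folder and a date."""
    # Same format as the shell variable DATE=$(date '+%Y_%m_%d').
    stamp = (run_date or date.today()).strftime("%Y_%m_%d")
    folder = Path(test_folder)
    avail_table = folder / f"plastome_availability_table_{stamp}.tsv"
    stats_table = folder / f"reported_IR_stats_table_{stamp}.tsv"
    records_dir = folder / f"records_{stamp}"
    data_dir = folder / f"data_{stamp}"
    return [
        "python",
        "./01_package/02_download_records_and_extract_IRs.py",
        "-i", avail_table.as_posix(),
        "-r", records_dir.as_posix() + "/",
        "-d", data_dir.as_posix() + "/",
        "-o", stats_table.as_posix(),
    ]


if __name__ == "__main__":
    print(" ".join(script02_command("03_testing/angiosperms_Start2000toEnd2019")))
```

The returned list could be handed to `subprocess.run(cmd, check=True)` once the two output directories exist; directory creation and log redirection are deliberately left out of this sketch.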
CHANGELOG
See CHANGELOG.md for a list of recent changes to the software.
Download files
Source Distribution
airpg-0.1.0.tar.gz (15.5 kB)
File details
Details for the file airpg-0.1.0.tar.gz.
File metadata
- Download URL: airpg-0.1.0.tar.gz
- Upload date:
- Size: 15.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.0 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.52.0 CPython/3.9.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | a044abfa4f70d7c1dc84a7ae4bbf3d722dc5f36ddd6c8312aab2b8c70fc02601
MD5 | 864274fd0c0472f62dc8eeef078c76bb
BLAKE2b-256 | 8843b8938d3fd0b56151e28e4c0c90a829611e87723441c3d576cf807a20c543