Miscellaneous Python-based bioinformatics utils
blindschleiche
A collection of bioinformatics / sequence utilities needed for my research, and hopefully useful for yours.
Install
pip install blindschleiche
# or for the current main branch:
# pip install git+https://github.com/kdm9/blindschleiche.git
Usage
USAGE: blsl <subtool> [options...]
Where <subtool> is one of:
telogrep: Search contigs for known telomere repeats
n50: Calculate N50 and total length of a set of contigs
falen: Tabulate the lengths of sequences in a FASTA file
mask2bed: The inverse of bedtools maskfasta: softmasked fasta -> unmasked fasta + mask.bed
pansn-rename: Add, remove, or modify PanSN-style prefixes to contig/chromosome names in references
genigvjs: Generate a simple IGV.js visualisation of some bioinf files.
ildemux: Demultiplex modern Illumina reads from read headers.
ilsample: Sample a fraction of read pairs from an interleaved fastq file
regionbed: Make a bed/region file of genome windows
uniref-acc2taxid: Make an NCBI-style acc2taxid.map file for a UniRef fasta
nstitch: Combine R1 + R2 into single sequences, with an N in the middle
gg2k: Summarise a table with GreenGenes-style lineages into a kraken-style report.
equalbestblast: Output only the best blast hits.
tabcat: Concatenate table (c/tsv) files, adding the filename as a column
esearchandfetch: Use the Entrez API to search for and download something. A CLI companion to the NCBI search box
deepclust2fa: Split a .faa by the clusters diamond deepclust finds
farename: Rename sequences in a fasta file sequentially
gffcat: Concatenate GFF3 files, respecting header lines and FASTA sections
gffparse: Format a GFF sanely
gffcsqify: Format a reasonably compliant GFF for use with bcftools csq
gfftagsane: Sanitise a messy gff attribute column to just simple tags
liftoff-gff3: Obtain an actually-useful GFF3 from Liftoff by fixing basic GFF3 format errors
ebiosra2rl2s: INTERNAL: MPI Tübingen tool. Make a runlib-to-sample map table from ebio sra files
galhist: Make a summary histogram of git-annex-list output
pairslash: Add an old-style /1 /2 pair indicator to paired-end fastq files
vcfstats: Use bcftools to calculate various statistics, outputting an R-ready table
vcfparallel: Parallelise a bcf processing pipeline across regions
shannon-entropy: Calculate Shannon's entropy (in bits) at each column of one or more alignments
fastasanitiser: Sanitise fasta IDs to something sane, then back again
tidyqc: What if MultiQC was in the tidyverse? (and much worse)
jsonl2csv: Parse jsonlines into a C/TSV
help: Print this help message
Use blsl subtool --help to get help about a specific tool
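For readers unfamiliar with the statistic that the n50 subtool reports, here is a minimal sketch of the standard N50 definition: the length of the contig at which the running sum of lengths, taken from longest to shortest, first reaches half of the total assembly length. The function name `n50` and its return shape are illustrative assumptions, not the tool's actual code or output format.

```python
def n50(lengths):
    """Return (N50, total length) for an iterable of contig lengths.

    Illustrative sketch of the textbook definition; not taken from blsl.
    """
    ordered = sorted(lengths, reverse=True)  # longest contigs first
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # N50 is the first contig length at which we cover >= 50% of the total
        if running * 2 >= total:
            return length, total
    return 0, 0  # no contigs supplied


if __name__ == "__main__":
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Comparing `running * 2` against `total` keeps the arithmetic in integers, so the half-way test never depends on floating-point rounding.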
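The shannon-entropy subtool reports Shannon's entropy in bits for each alignment column. As a rough sketch of that calculation, the entropy of a column is the sum over observed symbols of p * log2(1/p), where p is the symbol's frequency in the column. The function names below are hypothetical, and treating gap characters as ordinary symbols is an assumption; the real tool may handle gaps and ambiguity codes differently.

```python
from collections import Counter
from math import log2


def column_entropy(column):
    """Shannon entropy (bits) of the symbols in one alignment column."""
    counts = Counter(column)
    n = sum(counts.values())
    # Sum p * log2(1/p) over observed symbols; gaps count as symbols here
    # (an assumption, not necessarily what blsl does).
    return sum((c / n) * log2(n / c) for c in counts.values())


def alignment_entropy(seqs):
    """Per-column entropy for equal-length aligned sequences."""
    # zip(*seqs) transposes rows (sequences) into columns
    return [column_entropy(col) for col in zip(*seqs)]


if __name__ == "__main__":
    aln = ["ACGT", "ACGA", "ACTA", "AGTA"]
    for i, h in enumerate(alignment_entropy(aln), start=1):
        print(i, round(h, 3))
```

A fully conserved column scores 0 bits, and a column split evenly between two symbols scores 1 bit.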
Why the name Blindschleiche?
- They're awesome animals
- Their English name is Slow Worm, which is appropriate for this set of low-performance tools in Python.
- All tools implemented in Python must be named with a snake pun, and they're kinda a snake (not really, they're legless lizards)