Which genome build again?

A species-agnostic tool to figure out the probable genome build of a file. It currently supports bigWig and BED files, and provides an easy way to add your own genome build.

Install with pip

pip install wgba

Usage:

Auto detect file extension

wgba your_file.bed
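
This README does not say how the file type is inferred. As a rough illustration only, extension-based detection could look like the Python sketch below. The function `guess_format` and the mapping are hypothetical; the extensions are simply the ones used in the examples in this README, not a list taken from wgba's source.

```python
from pathlib import Path

# Hypothetical mapping, built from the extensions that appear in this README.
EXTENSION_TO_FORMAT = {
    ".bed": "bed",
    ".bw": "bigwig",
    ".bigwig": "bigwig",
}


def guess_format(filename):
    """Return 'bed' or 'bigwig' based on the extension, or None if unknown."""
    # Lower-case the suffix so that '.bigWig' and '.bigwig' are treated alike.
    suffix = Path(filename).suffix.lower()
    return EXTENSION_TO_FORMAT.get(suffix)
```

A file whose extension is not in the mapping returns None here, which is the situation where stating the type explicitly becomes necessary.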

Specify the file type explicitly with -f, for example if your file has a non-standard extension

wgba -f bed your_file.bed
wgba -f bigwig your_file.bigWig

You can use shell globs to match multiple files, all of which will be processed independently

wgba *.bed 

You can mix and match supported files

wgba a_bed_file.bed and_a_bigwig.bw and_another.bigWig

Check if all of your files are on the same build with -c, --check

wgba -c *.bed *.bw 
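
Conceptually, a consistency check only needs the build assigned to each file. The Python sketch below groups files by assigned build and reports whether they all agree. It assumes a dictionary of filename to build name (None when no build matched); `group_by_build` and `all_same_build` are hypothetical helpers, not part of wgba's interface.

```python
from collections import defaultdict


def group_by_build(assignments):
    """Group filenames by the build assigned to them (None = no build matched)."""
    groups = defaultdict(list)
    for filename, build in assignments.items():
        groups[build].append(filename)
    return dict(groups)


def all_same_build(assignments):
    """True only if every file was assigned one and the same known build."""
    groups = group_by_build(assignments)
    return len(groups) == 1 and None not in groups
```

When the check fails, the grouped dictionary shows which files disagree with the rest.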

Summarise non-conforming chromosomes with -s, --summary

wgba -s bad_file.bed
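
What counts as a non-conforming chromosome is not defined in this README. One plausible reading, used in the Python sketch below, is a chromosome that is either missing from a build or has an interval ending beyond that build's chromosome length. Both functions are hypothetical illustrations and may differ from what wgba actually does.

```python
def max_end_per_chromosome(bed_text):
    """Largest end coordinate seen for each chromosome in BED-formatted text."""
    max_end = {}
    for line in bed_text.splitlines():
        fields = line.split()
        # Skip blank lines, comments, and 'track'/'browser' header lines.
        if len(fields) < 3 or fields[0] in ("track", "browser") or fields[0].startswith("#"):
            continue
        chrom, end = fields[0], int(fields[2])
        max_end[chrom] = max(end, max_end.get(chrom, 0))
    return max_end


def non_conforming(max_end, build_sizes):
    """Chromosomes absent from the build or extending past its length."""
    # BED end coordinates are exclusive, so end == length is still valid.
    return sorted(
        chrom
        for chrom, end in max_end.items()
        if chrom not in build_sizes or end > build_sizes[chrom]
    )
```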

Adjust the tolerance for build assignment with -t, --tol. This is useful if you know that one chromosome will never match.

wgba -t 2 one_bad_chromosome.bed
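
The exact meaning of the tolerance value is not stated here. The Python sketch below assumes it is the number of chromosomes allowed to fail the match before a build is rejected. `candidate_builds` is a hypothetical function, and the build names and sizes in the usage note are toy values.

```python
def candidate_builds(max_end, builds, tol=0):
    """Names of builds with at most `tol` non-matching chromosomes.

    max_end: chromosome -> largest coordinate observed in the file.
    builds:  build name -> {chromosome -> length}.
    tol:     assumed meaning: number of mismatching chromosomes tolerated.
    """
    matches = []
    for name, sizes in builds.items():
        # A chromosome mismatches if the build lacks it or it is too short.
        misses = sum(
            1
            for chrom, end in max_end.items()
            if chrom not in sizes or end > sizes[chrom]
        )
        if misses <= tol:
            matches.append(name)
    return sorted(matches)
```

With toy builds `{"buildA": {"chr1": 1000, "chr2": 500}, "buildB": {"chr1": 900, "chr2": 500}}` and observed maxima `{"chr1": 950, "chr2": 400}`, only buildA matches at `tol=0`, while both match at `tol=1`.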

Add a genome build to the database

wgba -f add_build your_build.chrom.sizes
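
By the usual UCSC convention, a .chrom.sizes file is a two-column, whitespace-separated table of chromosome name and length, one chromosome per line. Assuming your build file follows that layout (this README does not spell it out), the Python sketch below shows how such a file can be read and sanity-checked before adding it. `parse_chrom_sizes` is a hypothetical helper.

```python
def parse_chrom_sizes(text):
    """Parse chrom.sizes content into a {chromosome: length} dictionary."""
    sizes = {}
    for line in text.splitlines():
        line = line.strip()
        # Ignore blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        name, size = line.split()[:2]
        sizes[name] = int(size)  # raises ValueError if the length is not an integer
    return sizes
```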

Issues

This was written in about an hour for my own use. If you find something not working, please raise an issue.

Version: 1.0
