gnk_fastasort

A package to create a preferred ordering of FASTA scaffolds, useful for PretextSnapshot and SAMTOOLS reordering.

This reorganises a FASTA file into groupings based on chromosome naming. Unlocalised scaffolds (unlocs) of SUPER_1, for example, are placed alongside SUPER_1 in size order. Any other unlocs are arranged in size order after everything else.
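The grouping described above can be sketched in a few lines of Python. This is an illustration only, not the package's actual implementation: the function name preferred_order, the _unloc_N suffix pattern, keeping parent scaffolds in their input order, and sorting unlocs largest first are all assumptions made for the example.

```python
import re

# Assumed naming: an unloc is "<parent>_unloc_<number>", e.g. SUPER_2_unloc_1.
UNLOC = re.compile(r"^(?P<parent>.+)_unloc_\d+$")


def preferred_order(lengths):
    """Return scaffold names grouped by parent (hypothetical helper).

    lengths maps scaffold name -> length, in the original file order.
    """
    parents = [name for name in lengths if not UNLOC.match(name)]
    children = {parent: [] for parent in parents}
    orphans = []  # unlocs whose parent scaffold is not in the input

    for name in lengths:
        match = UNLOC.match(name)
        if match is None:
            continue
        parent = match.group("parent")
        if parent in children:
            children[parent].append(name)
        else:
            orphans.append(name)

    def largest_first(name):
        # Assumption: "size order" means longest scaffold first.
        return -lengths[name]

    ordered = []
    for parent in parents:
        ordered.append(parent)
        ordered.extend(sorted(children[parent], key=largest_first))
    # Unlocs without a matching parent go after everything else.
    ordered.extend(sorted(orphans, key=largest_first))
    return ordered
```

With SUPER_1, SUPER_2 and a mix of unlocs as input, each SUPER_* scaffold is followed directly by its own unlocs, and unlocs with no matching parent end up at the tail of the list.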

Installation

git clone https://github.com/sanger-tol/gnk_fastasort.git

cd gnk_fastasort

pip install ./

fastasort -h

Usage

This script can be used in two ways; both rely on the underlying naming of the FASTA being SUPER_* style. In neither case is the FASTA file itself required: the tool is designed to work in conjunction with SAMTOOLS faidx, which can re-organise a FASTA given a tsv file of names. This tool emits either a tsv of names, original naming, parent molecule and length, or just the names.
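As a rough sketch of the SAMTOOLS step, the ordered names can be written one per line to a region file and handed to samtools faidx through its -r (region file) and -o (output) options. The helper name faidx_reorder_command and the file names are hypothetical, and the example assumes you have already extracted a plain list of names from the fastasort output.

```python
from pathlib import Path


def faidx_reorder_command(fasta, names, region_file, output):
    """Write names to region_file and build a samtools faidx command.

    Hypothetical helper: it only prepares the command, it does not run it.
    """
    # One scaffold name per line, in the desired output order.
    Path(region_file).write_text("".join(f"{name}\n" for name in names))
    return [
        "samtools", "faidx", str(fasta),
        "-r", str(region_file),
        "-o", str(output),
    ]
```

The returned list can be passed to subprocess.run(cmd, check=True) on a machine where samtools is installed; building the command separately keeps the ordering logic testable without it.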

For example, the input can be a FASTA .fai index whose scaffolds are named SUPER_*, or a GCA-accessioned FASTA which was originally named SUPER_* (the original names are visible in the sequence report on NCBI).

For accessioned genomes stored on NCBI, you can use --gca_accession to call the API and download a sequence report:

fastasort --gca_accession GCA_964017025.1

For a local FASTA whose headers are SUPER_* style (e.g. SUPER_1, SUPER_2, SUPER_2_unloc_1), you can run:

fastasort --index {ASSEMBLY}.fa.fai
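For orientation, a .fai index is a tab-separated file whose first two columns are the sequence name and its length, which is all the information needed to order scaffolds by name and size. The sketch below shows one way to read those two columns; read_fai_lengths is a hypothetical helper and not part of the package.

```python
def read_fai_lengths(fai_text):
    """Map scaffold name -> length from the text of a .fai index.

    Hypothetical helper: only the first two tab-separated columns
    (NAME, LENGTH) are used; offsets and line widths are ignored.
    """
    lengths = {}
    for line in fai_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, length = line.split("\t")[:2]
        lengths[name] = int(length)
    return lengths
```

Because Python dictionaries preserve insertion order, the result also records the original order of the scaffolds in the index.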
