gnk_fastasort

A package to create a preferred order of FASTA scaffolds, useful for PretextSnapshot and SAMTOOLS reordering.

This reorganises a FASTA file into groupings based on chromosome naming. Unlocs (unlocalised scaffolds) of SUPER_1, for example, are re-ordered alongside it in size order. All other unlocs are arranged in size order, after everything else.
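The grouping described above can be sketched in a few lines of Python. This is an illustration only, not the package's implementation. It assumes that unlocs follow a `<parent>_unloc_<n>` naming pattern, that "size order" means longest first, and that parent scaffolds keep the order they have in the input. The function name `preferred_order` is hypothetical.

```python
import re
from collections import defaultdict

# Assumed naming pattern: "<parent>_unloc_<n>", e.g. "SUPER_2_unloc_1".
UNLOC = re.compile(r"^(?P<parent>.+)_unloc_\d+$")


def preferred_order(records):
    """Return scaffold names in grouped order.

    records: list of (name, length) tuples, e.g. taken from a .fai index.
    """
    lengths = dict(records)

    # Anything that is not an unloc is treated as a parent scaffold.
    parents = [name for name, _ in records if not UNLOC.match(name)]

    # Collect unlocs under the parent name embedded in their own name.
    children = defaultdict(list)
    for name, _ in records:
        match = UNLOC.match(name)
        if match:
            children[match.group("parent")].append(name)

    def by_size(names):
        # Assumption: longest first, ties broken by name for stable output.
        return sorted(names, key=lambda n: (-lengths[n], n))

    ordered = []
    for parent in parents:
        ordered.append(parent)
        ordered.extend(by_size(children.pop(parent, [])))

    # Unlocs whose parent is not present go last, again in size order.
    leftovers = [name for group in children.values() for name in group]
    ordered.extend(by_size(leftovers))
    return ordered
```

Given SUPER_1, SUPER_2 and a mix of unlocs, the unlocs of SUPER_1 land directly after SUPER_1, and any unloc without a matching parent in the input ends up at the bottom of the list.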

Installation

git clone https://github.com/sanger-tol/gnk_fastasort.git

cd gnk_fastasort

pip install ./

fastasort -h

Usage

This script can be used in two ways; both rely on the underlying naming of the FASTA being SUPER_* style. In neither case is a FASTA file required: the tool is designed to work in conjunction with SAMTOOLS faidx, which can re-organise a FASTA given a TSV file of names. This tool emits a TSV of either names, original naming, parent molecule and length, or just names.
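To make the downstream step concrete, here is a pure-Python sketch of what reordering a FASTA by a list of names amounts to. It is a stand-in for illustration, not part of gnk_fastasort and not how SAMTOOLS works internally. In practice the names file would be handed to SAMTOOLS faidx (for instance through its `--region-file` option), which works from the index instead of loading the whole file. The helper names `read_fasta` and `reorder_fasta` and the 60-column line width are assumptions.

```python
def read_fasta(text):
    """Parse FASTA text into {name: sequence}.

    The name is the header line up to the first whitespace.
    """
    chunks = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {n: "".join(parts) for n, parts in chunks.items()}


def reorder_fasta(text, names, width=60):
    """Return FASTA text with records written in the order given by names."""
    sequences = read_fasta(text)
    out = []
    for name in names:
        seq = sequences[name]  # raises KeyError if a name is missing
        out.append(f">{name}")
        # Re-wrap the sequence at a fixed line width.
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"
```

This version holds every sequence in memory, so it only suits small files; for real assemblies the indexed SAMTOOLS route is the sensible choice.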

For example, the input can be a FASTA fai file where scaffolds are named SUPER_*, or a GCA-accessioned FASTA which was originally named SUPER_* (this is visible from the sequence report on NCBI).

For accessioned genomes stored on NCBI, you can use --gca_accession to call the API and download a sequence report.

fastasort --gca_accession GCA_964017025.1

For local FASTA files where headers are SUPER_* style (e.g. SUPER_1, SUPER_2, SUPER_2_unloc_1), you can run:

fastasort --index {ASSEMBLY}.fa.fai
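For orientation, a .fai index written by SAMTOOLS faidx is a tab-separated file whose first two columns are the sequence name and its length, which is all that name-based ordering needs. The sketch below shows one way to read those two columns; `read_fai` is a hypothetical helper, not a function exported by gnk_fastasort.

```python
def read_fai(text):
    """Parse .fai text into a list of (name, length) tuples.

    Only the first two tab-separated columns are used; the remaining
    columns (byte offset and line layout) are ignored.
    """
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        records.append((fields[0], int(fields[1])))
    return records
```

The resulting list preserves the order of the index, so it can be inspected or sorted before deciding on a final scaffold order.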

Download files

Source Distribution

gnk_fastasort-0.1.2.tar.gz (9.6 kB)

Uploaded Source

Built Distribution

gnk_fastasort-0.1.2-py3-none-any.whl (11.5 kB)

Uploaded Python 3

File details

Details for the file gnk_fastasort-0.1.2.tar.gz.

File metadata

  • Download URL: gnk_fastasort-0.1.2.tar.gz
  • Upload date:
  • Size: 9.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for gnk_fastasort-0.1.2.tar.gz
Algorithm Hash digest
SHA256 c73fdd7add9cd8e4167e1befa3a59ba47d58f56a30b7fb734956737d549a688d
MD5 4dff8fa26f00a5d7951789bef064bdcd
BLAKE2b-256 e4d356105f5897a16d29709c498cf9e33dd9e80b02bdff32c84240710e3ff154

Provenance

The following attestation bundles were made for gnk_fastasort-0.1.2.tar.gz:

Publisher: release.yml on sanger-tol/gnk_fastasort

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file gnk_fastasort-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: gnk_fastasort-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 11.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for gnk_fastasort-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 0579ba6d2d042420f2cda3cffe43ec35c30b4ee49fa838a9add1c67b4411d3d0
MD5 14d463a8b7032e6115a81bb75f4272f1
BLAKE2b-256 1d84fe3f06e5ef06c48a3f773b92ff74af14485d4c14253906b36b14ccab9723

Provenance

The following attestation bundles were made for gnk_fastasort-0.1.2-py3-none-any.whl:

Publisher: release.yml on sanger-tol/gnk_fastasort

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
