Tools for working with agp files

agptools

Full documentation is available on GitHub Pages.

Introduction

The AGP format is a tab-separated table format describing how components of a genome assembly fit together. NCBI accepts assemblies for submission in the format of a fasta file giving the sequences of components (usually contigs) along with an AGP file showing how these components are assembled into larger pieces like scaffolds or chromosomes. For this reason, scaffolders such as SALSA output this format alongside the fasta file of the scaffolded assembly containing gaps and all.
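To make the format concrete, here is a minimal sketch of reading AGP lines in plain Python. It assumes the nine-column layout of the NCBI AGP specification, where columns 6 to 9 mean different things for component lines and gap lines (component type N or U). The names `Component`, `Gap`, `parse_agp_line`, and the example scaffold are made up for illustration and are not part of agptools.

```python
from typing import NamedTuple, Union


class Component(NamedTuple):
    """A line placing a piece of a contig into a larger object."""
    obj: str
    obj_beg: int
    obj_end: int
    part_number: int
    component_type: str
    component_id: str
    component_beg: int
    component_end: int
    orientation: str


class Gap(NamedTuple):
    """A line describing a gap between two components."""
    obj: str
    obj_beg: int
    obj_end: int
    part_number: int
    component_type: str
    gap_length: int
    gap_type: str
    linkage: str
    linkage_evidence: str


def parse_agp_line(line: str) -> Union[Component, Gap]:
    """Parse one non-comment AGP line into a Component or a Gap."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    obj, ctype = fields[0], fields[4]
    beg, end, part = int(fields[1]), int(fields[2]), int(fields[3])
    if ctype in ("N", "U"):  # gap of known (N) or unknown (U) size
        return Gap(obj, beg, end, part, ctype,
                   int(fields[5]), fields[6], fields[7], fields[8])
    return Component(obj, beg, end, part, ctype,
                     fields[5], int(fields[6]), int(fields[7]), fields[8])


# Hypothetical scaffold: two contigs joined by a 100 bp gap.
EXAMPLE_AGP = (
    "scaffold_1\t1\t1000\t1\tW\tcontig_1\t1\t1000\t+\n"
    "scaffold_1\t1001\t1100\t2\tN\t100\tscaffold\tyes\tproximity_ligation\n"
    "scaffold_1\t1101\t1600\t3\tW\tcontig_2\t1\t500\t-\n"
)

records = [parse_agp_line(line) for line in EXAMPLE_AGP.splitlines()]
for record in records:
    print(record)
```

Coordinates in an AGP file are 1-based and inclusive, so the first contig above occupies positions 1 through 1000 of scaffold_1.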

Unfortunately, scaffolding never really works perfectly, so you invariably have to correct mistakes or add in other sources of data, such as synteny with a related species or a physical map, to get an assembly to chromosome level. While you could perform this manual curation process by editing the fasta file of scaffolds, I think it is a lot easier to leave the fasta file of contigs intact and move things around in the AGP file. For example, let's say the scaffolder misorients a contig. Fixing this in the fasta would require taking a chunk from the middle of a sequence, reverse-complementing it, and pasting it back together. Fixing it in the AGP file is as simple as finding the line corresponding to that contig and changing the character in the orientation column from '+' to '-'.
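The orientation fix described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the agptools implementation; `flip_orientation` and the example contig names are hypothetical, and the sketch assumes the standard nine-column AGP layout with orientation in column 9.

```python
def flip_orientation(agp_text: str, contig_id: str) -> str:
    """Return agp_text with the orientation of contig_id switched.

    Only '+' and '-' are swapped; other orientation values
    ('?', '0', 'na') are left untouched.
    """
    flipped = {"+": "-", "-": "+"}
    out_lines = []
    for line in agp_text.splitlines():
        fields = line.split("\t")
        is_component = (
            not line.startswith("#")          # skip header comments
            and len(fields) == 9
            and fields[4] not in ("N", "U")   # skip gap lines
        )
        if is_component and fields[5] == contig_id:
            fields[8] = flipped.get(fields[8], fields[8])
            line = "\t".join(fields)
        out_lines.append(line)
    return "\n".join(out_lines) + "\n"


# Hypothetical scaffold in which contig_2 was placed in the wrong orientation.
MISORIENTED_AGP = (
    "scaffold_1\t1\t1000\t1\tW\tcontig_1\t1\t1000\t+\n"
    "scaffold_1\t1001\t1100\t2\tN\t100\tscaffold\tyes\tproximity_ligation\n"
    "scaffold_1\t1101\t1600\t3\tW\tcontig_2\t1\t500\t+\n"
)

fixed_agp = flip_orientation(MISORIENTED_AGP, "contig_2")
print(fixed_agp, end="")
```

Because the contig keeps its position in the scaffold, no coordinates need to change; only the last column of one line differs between the input and the output.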

agptools is a suite of scripts for performing edits to an AGP file during this manual curation stage of genome assembly. It contains modules for operations you might want to perform on an AGP file, like splitting a contig or scaffold into multiple pieces, joining various scaffolds together into a superscaffold, reverse-complementing a piece of a scaffold, transforming a BED file from contig into scaffold coordinates, and removing or renaming scaffolds. Each of these use cases is explained in depth in the manual.
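As an illustration of one of these operations, here is a rough sketch of the coordinate arithmetic behind moving a BED interval from contig to scaffold coordinates. It is written in plain Python and does not use or reproduce the agptools code; `load_components` and `contig_to_scaffold` are hypothetical names. The sketch assumes that BED intervals are 0-based and half-open, that AGP coordinates are 1-based and inclusive, and that each contig appears on exactly one AGP line.

```python
def load_components(agp_text):
    """Map contig ID -> (scaffold, obj_beg, obj_end, comp_beg, comp_end, orientation)."""
    components = {}
    for line in agp_text.splitlines():
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        if f[4] in ("N", "U"):  # gap lines carry no contig
            continue
        components[f[5]] = (f[0], int(f[1]), int(f[2]),
                            int(f[6]), int(f[7]), f[8])
    return components


def contig_to_scaffold(contig, start, end, components):
    """Convert a BED interval [start, end) on a contig to scaffold coordinates."""
    scaffold, obj_beg, obj_end, comp_beg, comp_end, orientation = components[contig]
    if not (comp_beg - 1 <= start < end <= comp_end):
        raise ValueError("interval is not inside the part of the contig that was placed")
    # 0-based offsets relative to the first placed base of the contig
    off_start = start - (comp_beg - 1)
    off_end = end - (comp_beg - 1)
    if orientation == "-":
        # Reverse strand: measure from the end of the placement instead
        return scaffold, obj_end - off_end, obj_end - off_start
    # '+' (and unknown orientations, which are treated like '+')
    return scaffold, (obj_beg - 1) + off_start, (obj_beg - 1) + off_end


# Hypothetical scaffold: contig_1 forward, a gap, then contig_2 reversed.
BED_EXAMPLE_AGP = (
    "scaffold_1\t1\t1000\t1\tW\tcontig_1\t1\t1000\t+\n"
    "scaffold_1\t1001\t1100\t2\tN\t100\tscaffold\tyes\tproximity_ligation\n"
    "scaffold_1\t1101\t1600\t3\tW\tcontig_2\t1\t500\t-\n"
)

placed = load_components(BED_EXAMPLE_AGP)
print(contig_to_scaffold("contig_1", 5, 20, placed))   # forward contig
print(contig_to_scaffold("contig_2", 0, 10, placed))   # reversed contig
```

A feature on a reversed contig also changes strand, which a fuller version would need to record; this sketch only handles the start and end positions.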

Installation

Unfortunately, the name agptools was already taken on PyPI, so the package is published as bio_agptools instead.

pip install bio_agptools
