Tools for working with agp files

agptools

Full documentation is available on GitHub Pages.

Introduction

The AGP format is a tab-separated table format describing how the components of a genome assembly fit together. NCBI accepts assemblies for submission as a FASTA file giving the sequences of the components (usually contigs) along with an AGP file showing how these components are assembled into larger pieces such as scaffolds or chromosomes. For this reason, scaffolders such as SALSA output an AGP file alongside the FASTA file of the scaffolded assembly, gaps and all.
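To make the table layout concrete, here is a minimal sketch of reading AGP rows in plain Python. It assumes the nine-column layout of the NCBI AGP specification, where a component type of N or U marks a gap row and anything else marks a sequence row. The names Component, Gap, and parse_agp_line are hypothetical, chosen for illustration; they are not the agptools API, and the sample rows are made up.

```python
from typing import NamedTuple, Union


class Component(NamedTuple):
    """A sequence row: part of a contig placed in a larger object."""
    obj: str
    obj_beg: int
    obj_end: int
    part_number: int
    component_type: str
    component_id: str
    component_beg: int
    component_end: int
    orientation: str


class Gap(NamedTuple):
    """A gap row: component_type is 'N' (known size) or 'U' (unknown size)."""
    obj: str
    obj_beg: int
    obj_end: int
    part_number: int
    component_type: str
    gap_length: int
    gap_type: str
    linkage: str
    linkage_evidence: str


def parse_agp_line(line: str) -> Union[Component, Gap]:
    """Parse one non-comment AGP line (hypothetical helper, not agptools)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    obj, obj_beg, obj_end, part_number, component_type = fields[:5]
    common = (obj, int(obj_beg), int(obj_end), int(part_number), component_type)
    if component_type in ("N", "U"):
        return Gap(*common, int(fields[5]), fields[6], fields[7], fields[8])
    return Component(*common, fields[5], int(fields[6]), int(fields[7]), fields[8])


# Made-up example: two contigs joined by a 100 bp gap into one scaffold.
rows = [
    "scaffold_1\t1\t1000\t1\tW\tcontig_1\t1\t1000\t+",
    "scaffold_1\t1001\t1100\t2\tN\t100\tscaffold\tyes\tproximity_ligation",
    "scaffold_1\t1101\t1600\t3\tW\tcontig_2\t1\t500\t-",
]
for row in rows:
    print(parse_agp_line(row))
```

Comment lines (those starting with #) are not handled here; a real parser would skip or preserve them.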

Unfortunately, scaffolding never works perfectly, so you invariably have to correct mistakes or bring in other sources of data, such as synteny with a related species or a physical map, to get an assembly to chromosome level. While you could perform this manual curation by editing the FASTA file of scaffolds, I think it is a lot easier to leave the FASTA file of contigs intact and move things around in the AGP file. For example, let's say the scaffolder misorients a contig. Fixing this in the FASTA file would require taking a chunk from the middle of a sequence, reverse-complementing it, and pasting it back into place. Fixing it in the AGP file is as simple as finding the line corresponding to that contig and changing the character in the orientation column from '+' to '-'.
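The orientation fix described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how agptools implements it: flip_contig is a hypothetical helper, the sample rows are made up, and the sketch assumes the contig appears in the AGP file as whole, unsplit components.

```python
def flip_contig(agp_lines, contig_id):
    """Return AGP lines with the orientation of `contig_id` flipped.

    Hypothetical helper for illustration; gap rows, comment lines, and
    orientations other than '+' or '-' are passed through unchanged.
    """
    flipped = {"+": "-", "-": "+"}
    result = []
    for line in agp_lines:
        fields = line.rstrip("\n").split("\t")
        is_sequence_row = (
            not line.startswith("#")
            and len(fields) == 9
            and fields[4] not in ("N", "U")  # N and U mark gap rows
        )
        if is_sequence_row and fields[5] == contig_id:
            fields[8] = flipped.get(fields[8], fields[8])
            line = "\t".join(fields)
        result.append(line.rstrip("\n"))
    return result


agp = [
    "scaffold_1\t1\t1000\t1\tW\tcontig_1\t1\t1000\t+",
    "scaffold_1\t1001\t1100\t2\tN\t100\tscaffold\tyes\tproximity_ligation",
    "scaffold_1\t1101\t1600\t3\tW\tcontig_2\t1\t500\t+",
]
fixed = flip_contig(agp, "contig_2")
print(fixed[2])  # the contig_2 row now ends in '-'
```

No sequence data is touched: the contig FASTA stays as it was, and only one character in one row changes.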

agptools is a suite of scripts for editing an AGP file during this manual curation stage of genome assembly. It contains modules for operations you might want to perform on an AGP file, such as splitting a contig or scaffold into multiple pieces, joining several scaffolds into a superscaffold, reverse-complementing a piece of a scaffold, transforming a BED file from contig coordinates into scaffold coordinates, and removing or renaming scaffolds. Each of these use cases is explained in depth in the manual.
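As an illustration of what the contig-to-scaffold coordinate transformation involves, here is a minimal sketch of the arithmetic for a single interval. It is not the agptools implementation: contig_to_scaffold is a hypothetical function, and it assumes BED intervals are 0-based and half-open, AGP coordinates are 1-based and inclusive, and the interval lies entirely within one AGP component.

```python
def contig_to_scaffold(start, end, obj_beg, comp_beg, comp_end, orientation):
    """Map a BED interval [start, end) on a contig to scaffold coordinates.

    obj_beg, comp_beg, and comp_end are the 1-based AGP columns of the
    row that places this contig. Hypothetical helper for illustration.
    """
    if not (comp_beg - 1 <= start <= end <= comp_end):
        raise ValueError("interval is not contained in this component")
    if orientation == "-":
        # The contig is reversed, so its end maps to the component's start.
        return (obj_beg + comp_end - end - 1, obj_beg + comp_end - start - 1)
    # Forward orientation: shift by the difference of the two 0-based starts.
    offset = (obj_beg - 1) - (comp_beg - 1)
    return (start + offset, end + offset)


# Contig of 100 bp placed at scaffold positions 201-300, forward.
print(contig_to_scaffold(0, 10, 201, 1, 100, "+"))  # (200, 210)
# Same contig placed at scaffold positions 1-100, reversed.
print(contig_to_scaffold(0, 10, 1, 1, 100, "-"))    # (90, 100)
```

For a reversed component, a stranded BED feature would also need its strand column flipped; that step is left out of this sketch.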

Installation

Unfortunately, the name agptools was already taken on PyPI, so the package is published as bio_agptools instead.

pip install bio_agptools
