Tools for working with agp files
agptools
Full documentation is available on GitHub Pages.
Introduction
The AGP format is a tab-separated table format describing how the components of a genome assembly fit together. NCBI accepts assemblies for submission as a fasta file giving the sequences of the components (usually contigs), along with an AGP file showing how these components are assembled into larger pieces like scaffolds or chromosomes. For this reason, scaffolders such as SALSA output an AGP file alongside the fasta file of the scaffolded assembly, gaps and all.
Unfortunately, scaffolding never works perfectly, so you invariably have to correct mistakes or add in other sources of data, such as synteny with a related species or a physical map, to get an assembly to chromosome level. While you could perform this manual curation by editing the fasta file of scaffolds, I think it is a lot easier to leave the fasta file of contigs intact and move things around in the AGP file. For example, let's say the scaffolder misorients a contig. Fixing this in the fasta would require taking a chunk from the middle of a sequence, reverse-complementing it, and pasting it back into place. Fixing it in the AGP file is as simple as finding the line corresponding to that contig and changing the character in the orientation column from '+' to '-'.
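To make the orientation fix concrete, here is a minimal sketch in plain Python. It does not use the agptools API: the function name `flip_orientation` and the contig and scaffold names are hypothetical, and it assumes the standard nine-column AGP layout, in which column 5 is the component type, column 6 is the component ID, and column 9 holds the orientation on non-gap lines.

```python
def flip_orientation(agp_lines, contig_name):
    """Return AGP lines with the orientation of one contig flipped.

    Hypothetical helper for illustration; not part of agptools.
    """
    flipped = {"+": "-", "-": "+"}
    out = []
    for line in agp_lines:
        fields = line.rstrip("\n").split("\t")
        # Only touch component lines: skip comments and gaps (type N or U)
        is_component = (
            not line.startswith("#")
            and len(fields) == 9
            and fields[4] not in ("N", "U")
        )
        if is_component and fields[5] == contig_name and fields[8] in flipped:
            fields[8] = flipped[fields[8]]
        out.append("\t".join(fields))
    return out


# Made-up example: two contigs joined by a 100 bp gap
agp = [
    "scaffold_1\t1\t100\t1\tW\tcontig_1\t1\t100\t+",
    "scaffold_1\t101\t200\t2\tN\t100\tscaffold\tyes\tproximity_ligation",
    "scaffold_1\t201\t350\t3\tW\tcontig_2\t1\t150\t+",
]
fixed = flip_orientation(agp, "contig_2")
print("\n".join(fixed))
```

Only the last column of the `contig_2` line changes; the contig fasta is never touched.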
agptools is a suite of scripts for performing edits to an AGP file during this manual curation stage of genome assembly. It contains modules for operations you might want to perform on an agp file, like splitting a contig or scaffold into multiple pieces, joining various scaffolds together into a superscaffold, reverse-complementing a piece of a scaffold, transforming a bed file from contig into scaffold coordinates, and removing or renaming scaffolds. Each of these use cases is explained in depth in the manual.
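As an illustration of what transforming bed coordinates involves, the following sketch maps a single interval from contig to scaffold coordinates. It is not the agptools implementation: `contig_to_scaffold` and its tuple argument are hypothetical, and it assumes the usual conventions that AGP coordinates are 1-based and inclusive while bed intervals are 0-based and half-open.

```python
def contig_to_scaffold(agp_row, bed_start, bed_end):
    """Map a 0-based, half-open bed interval on a contig to its scaffold.

    agp_row is (object, object_beg, object_end, component_beg,
    component_end, orientation), using 1-based inclusive AGP coordinates.
    Hypothetical helper for illustration; not part of agptools.
    """
    scaffold, obj_beg, obj_end, comp_beg, comp_end, orientation = agp_row
    if not (comp_beg - 1 <= bed_start <= bed_end <= comp_end):
        raise ValueError("interval is outside the part of the contig used")
    if orientation == "-":
        # Reverse strand: the interval is mirrored, so its end becomes the start
        new_start = obj_end - (bed_end - (comp_beg - 1))
        new_end = obj_end - (bed_start - (comp_beg - 1))
    else:
        # '+' (and unknown orientations, which AGP treats like '+'): plain shift
        offset = (obj_beg - 1) - (comp_beg - 1)
        new_start = bed_start + offset
        new_end = bed_end + offset
    return scaffold, new_start, new_end


# contig_2 occupies bases 201-350 of scaffold_1 (made-up example)
forward = ("scaffold_1", 201, 350, 1, 150, "+")
reverse = ("scaffold_1", 201, 350, 1, 150, "-")
print(contig_to_scaffold(forward, 0, 10))  # ('scaffold_1', 200, 210)
print(contig_to_scaffold(reverse, 0, 10))  # ('scaffold_1', 340, 350)
```

A complete conversion would also need to look up the right AGP line for each bed record and flip the strand column for features on reverse-oriented contigs; both are left out here to keep the coordinate arithmetic visible.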
Installation
Unfortunately, the name agptools was already taken on PyPI, so the package is called bio_agptools instead.
pip install bio_agptools
Download files
File details
Details for the file bio_agptools-0.0.3.tar.gz.
File metadata
- Download URL: bio_agptools-0.0.3.tar.gz
- Upload date:
- Size: 26.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 74348b693cfe0a476844a8910c3c747db61a1af91bb7c44eff6184f90305a140 |
| MD5 | 1a0354a7e44696c9bad4335668c09159 |
| BLAKE2b-256 | 9bb52a965e7057db245873dd1e68eba1bd8aef9f5d35249cd6f22c25c4137a82 |
Provenance
The following attestation bundles were made for bio_agptools-0.0.3.tar.gz:
Publisher: python-publish.yml on WarrenLab/agptools
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: bio_agptools-0.0.3.tar.gz
- Subject digest: 74348b693cfe0a476844a8910c3c747db61a1af91bb7c44eff6184f90305a140
- Sigstore transparency entry: 187753829
- Sigstore integration time:
- Permalink: WarrenLab/agptools@a681336a78e9b0729084e4d95b7e52f5c9afffb9
- Branch / Tag: refs/tags/v0.0.3
- Owner: https://github.com/WarrenLab
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@a681336a78e9b0729084e4d95b7e52f5c9afffb9
- Trigger Event: push
File details
Details for the file bio_agptools-0.0.3-py3-none-any.whl.
File metadata
- Download URL: bio_agptools-0.0.3-py3-none-any.whl
- Upload date:
- Size: 23.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 83a37f47abcf4b26f8329e03670887a36f0d7849f71e2193b073cb93d44ab3a0 |
| MD5 | bcc839d3f49e87b31260f9f32f2ed9eb |
| BLAKE2b-256 | 5195199f9b28d20b7e0f914c645920cc226e6ff4ada48df8832113d47b2a09e4 |
Provenance
The following attestation bundles were made for bio_agptools-0.0.3-py3-none-any.whl:
Publisher: python-publish.yml on WarrenLab/agptools
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: bio_agptools-0.0.3-py3-none-any.whl
- Subject digest: 83a37f47abcf4b26f8329e03670887a36f0d7849f71e2193b073cb93d44ab3a0
- Sigstore transparency entry: 187753832
- Sigstore integration time:
- Permalink: WarrenLab/agptools@a681336a78e9b0729084e4d95b7e52f5c9afffb9
- Branch / Tag: refs/tags/v0.0.3
- Owner: https://github.com/WarrenLab
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@a681336a78e9b0729084e4d95b7e52f5c9afffb9
- Trigger Event: push