Library for handling data in the GFA1 and GFA2 formats
Gfapy
The Graphical Fragment Assembly (GFA) formats represent sequence graphs, including assembly, variation and splicing graphs. Two versions of GFA have been defined (GFA1 and GFA2), and several sequence analysis programs have adopted them as interchange formats, which makes it easy to combine different sequence analysis tools.
This library implements the GFA1 and GFA2 specifications described at https://github.com/GFA-spec/GFA-spec/blob/master/GFA-spec.md. It allows you to create a Gfa object from a file in the GFA format or from scratch, to enumerate the graph elements (segments, links, containments, paths and header lines), to traverse the graph (by following all links outgoing from or incoming to a segment), to search for elements (e.g. which links connect two segments) and to manipulate the graph (e.g. to eliminate a link or a segment, or to duplicate a segment, distributing the read counts evenly among the copies).
Users can easily extend the GFA format by defining their own custom tags and record types. In Gfapy, it is easy to write extension modules, which define custom record types and datatypes for the parsing and validation of custom fields. The custom lines can be connected, using references, to each other and to lines of the standard record types.
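To make the element types listed above concrete, the following sketch shows how they appear in a small GFA1 document: each line is tab-separated and its first field names the record type (H, S, L, C or P). The sketch uses only the Python standard library and is an illustration of the format, not the way Gfapy parses files; the names GFA1_TEXT, RECORD_NAMES and count_record_types are made up for this example.

```python
from collections import Counter

# A tiny GFA1 document. Fields are tab-separated; the first field is the
# record type. The segment names and sequences are invented for illustration.
GFA1_TEXT = "\n".join([
    "H\tVN:Z:1.0",              # header line with the version tag
    "S\ts1\tACGT",              # segment s1
    "S\ts2\tGTCA",              # segment s2
    "L\ts1\t+\ts2\t-\t2M",      # link from s1 (forward) to s2 (reverse)
    "P\tp1\ts1+,s2-\t2M",       # path visiting s1 and then s2
])

# Record type codes of GFA1 and the element names used in this document.
RECORD_NAMES = {
    "H": "header",
    "S": "segment",
    "L": "link",
    "C": "containment",
    "P": "path",
}


def count_record_types(text):
    """Count the lines of each record type, skipping blanks and comments."""
    counts = Counter()
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        record_type = line.split("\t", 1)[0]
        counts[RECORD_NAMES.get(record_type, "other")] += 1
    return dict(counts)


print(count_record_types(GFA1_TEXT))
```

A real application would let Gfapy do the parsing, since it also validates the fields and connects the lines that reference each other.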
Requirements
Gfapy has been written for Python 3 and tested using Python version 3.7. It does not require any additional Python packages or other software.
Installation
Gfapy is distributed as a Python package and can be installed using the Python package manager pip, as well as conda (in the Bioconda channel). It is also available as a package in some Linux distributions (Debian, Ubuntu).
The following command installs the current stable version from the Python Package Index (PyPI):
pip install gfapy
If you would like to install the current development version from GitHub, use the following command:
pip install -e git+https://github.com/ggonnella/gfapy.git#egg=gfapy
Alternatively, Gfapy can be installed using conda. It is included in the Bioconda channel (https://bioconda.github.io/):
conda install -c bioconda gfapy
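After either installation route, one way to confirm that the package is visible to the interpreter is to ask for its installed version. This is a minimal sketch, assuming Python 3.8 or later (importlib.metadata is not available in the standard library of Python 3.7); the helper name installed_version is hypothetical.

```python
from importlib import metadata


def installed_version(package="gfapy"):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("gfapy is not installed in this environment")
else:
    print("gfapy version:", version)
```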
Usage
If you installed gfapy as described above, you can import it in your script using the conventional Python syntax:
>>> import gfapy
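The operations described in the introduction (creating a graph from scratch, enumerating elements, following the links of a segment) can be sketched as follows. This is a minimal example under stated assumptions rather than a complete tour: the calls used here (Gfa(), add_line, segment_names, segment and the dovetails attribute of a segment) are written as recalled from the Gfapy user manual and should be checked against the API documentation. The segment names and sequences are invented, and build_small_graph is a hypothetical helper. The import guard only keeps the snippet runnable where Gfapy is missing.

```python
try:
    import gfapy
except ImportError:
    gfapy = None  # Gfapy is not installed; see the Installation section


def build_small_graph():
    """Create a GFA1 graph with two segments joined by one link."""
    gfa = gfapy.Gfa()
    gfa.add_line("S\ts1\tACGT")          # segment s1
    gfa.add_line("S\ts2\tGTCA")          # segment s2
    gfa.add_line("L\ts1\t+\ts2\t-\t2M")  # link between s1 and s2
    return gfa


if gfapy is not None:
    gfa = build_small_graph()
    # Enumerate the segment names.
    print(gfa.segment_names)
    # Traverse the graph: print every link attached to segment s1.
    for link in gfa.segment("s1").dovetails:
        print(link)
    # Printing the Gfa object gives its text representation.
    print(gfa)
```

To load an existing file instead of building the graph line by line, the manual describes gfapy.Gfa.from_file, which takes the filename as its argument.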
Documentation
The documentation, including this introduction to Gfapy, a user manual and the API documentation, is hosted on ReadTheDocs at http://gfapy.readthedocs.io/en/latest/. It can also be downloaded as a PDF from https://github.com/ggonnella/gfapy/blob/master/manual/gfapy-manual.pdf.
References
Giorgio Gonnella and Stefan Kurtz “GfaPy: a flexible and extensible software library for handling sequence graphs in Python”, Bioinformatics (2017) btx398 https://doi.org/10.1093/bioinformatics/btx398
File details
Details for the file gfapy-1.2.3.tar.gz.
File metadata
- Download URL: gfapy-1.2.3.tar.gz
- Upload date:
- Size: 417.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.7.0 requests/2.25.1 setuptools/49.6.0.post20210108 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.7.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | 718eb58401a55a4174833ecc4cb020bcccf6ebec6a0ba969302a515ad65c8a4f
MD5 | 3ba04e9f6e6d8c37a69c7be20f401ee8
BLAKE2b-256 | 2fbea360f77258e972b343cf49e4d96dd4daee2f0aaee39d8de5d3b3f29959ae
File details
Details for the file gfapy-1.2.3-py3-none-any.whl.
File metadata
- Download URL: gfapy-1.2.3-py3-none-any.whl
- Upload date:
- Size: 238.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.7.0 requests/2.25.1 setuptools/49.6.0.post20210108 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.7.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | 17ec0284340064dbf5ef9b60aa79b8410161c78ea5cf91922fd8ceee325679cf
MD5 | 4e85c1279238b73096cfe3a3797fdbf7
BLAKE2b-256 | e16d14d65ce8e2c522968501a866a99cba59a7ee16a10517d90606dbfed2eaf8
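The SHA256 digests listed above can be used to check a manually downloaded file before installing it. The sketch below uses only the standard library; the function names and the chunk size are choices made for this example, and the digests are copied from the file hash tables for the two 1.2.3 files.

```python
import hashlib

# SHA256 digests copied from the "File hashes" tables in this document.
EXPECTED_SHA256 = {
    "gfapy-1.2.3.tar.gz":
        "718eb58401a55a4174833ecc4cb020bcccf6ebec6a0ba969302a515ad65c8a4f",
    "gfapy-1.2.3-py3-none-any.whl":
        "17ec0284340064dbf5ef9b60aa79b8410161c78ea5cf91922fd8ceee325679cf",
}


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, filename):
    """Compare a local file against the digest published for `filename`."""
    return sha256_of_file(path) == EXPECTED_SHA256[filename]
```

For example, with the source distribution saved in a hypothetical downloads directory, matches_published_digest("downloads/gfapy-1.2.3.tar.gz", "gfapy-1.2.3.tar.gz") returns True only if the local file is byte-for-byte identical to the published one.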