
The Python GTF toolkit (pygtftk) package: easy handling of GTF files


Python GTF toolkit (pygtftk)

The Python GTF toolkit (pygtftk) package is intended to ease the handling of GTF/GFF2.0 files (Gene Transfer Format). It does not currently support the GFF3 file format. The pygtftk package is compatible with Python >=3.5,<3.7 and relies on libgtftk, a library of functions written in C.
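For readers unfamiliar with the format, a GTF record is a single line of nine tab-separated columns, the last of which holds `key "value";` attribute pairs. The sketch below illustrates that layout with plain Python only. It is not the pygtftk API: the function name `parse_gtf_line` and the example record are hypothetical, and the naive split on `;` assumes that attribute values contain no semicolons.

```python
def parse_gtf_line(line):
    """Split one GTF record into its nine columns plus an attribute dict.

    Illustrative only: a repeated attribute key keeps its last value, and
    values containing ';' are not supported by this naive split.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("expected 9 tab-separated columns, got %d" % len(fields))
    seqname, source, feature, start, end, score, strand, frame, attr_text = fields

    attributes = {}
    for chunk in attr_text.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue  # skip the empty piece after the trailing ';'
        key, _, value = chunk.partition(" ")
        attributes[key] = value.strip().strip('"')

    return {
        "seqname": seqname,
        "source": source,
        "feature": feature,
        "start": int(start),  # GTF coordinates are 1-based and inclusive
        "end": int(end),
        "score": score,
        "strand": strand,
        "frame": frame,
        "attributes": attributes,
    }


# Hypothetical record, for illustration only.
example = 'chr1\tdemo\tgene\t100\t500\t.\t+\t.\tgene_id "G0001"; gene_name "demo_gene";'
record = parse_gtf_line(example)
print(record["feature"], record["attributes"]["gene_id"])
```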

The package comes with a set of UNIX commands that can be accessed through the gtftk program. The gtftk program provides several atomic tools to filter, convert, or extract data from GTF files. This set of commands can be easily extended through a basic plugin architecture. All these aspects are covered in the help sections.

While the gtftk Unix program comes with hundreds of unit and functional tests, it is still under active development and may thus suffer from bugs that remain to be discovered. Feel free to report any problem or request an enhancement in the issue section of the GitHub repository.

System requirements

Depending on the size of the GTF file, pygtftk and gtftk may require a lot of memory to perform selected tasks. A computer with 16 GB of memory is recommended in order to be able to pipe several commands when working with human annotations from an Ensembl release (e.g. release 91). When working on a cluster, remember to reserve sufficient memory.

At the moment, the gtftk program has been tested on:

  • Linux (Ubuntu 12.04 and 18.04)

  • OS X / macOS (Yosemite, El Capitan, Mojave).

Installation

Installation through conda package building

Installation through conda is the preferred solution. The pygtftk package and the gtftk command-line tool require external dependencies with some version constraints.

If conda is not available on your system, first install miniconda from the official web site and make sure you have bioconda and conda-forge channels set up in the order below.

conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
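The order of these commands matters because `conda config --add` places the new channel at the top of the channel list, so the channel added last ends up with the highest priority. The small model below only mimics that behaviour in plain Python to make the resulting order explicit. The helper `add_channel` is hypothetical and does not call conda.

```python
def add_channel(channels, name):
    """Mimic `conda config --add channels NAME`: NAME goes to the top of the list."""
    return [name] + [c for c in channels if c != name]


channels = []
for name in ("defaults", "bioconda", "conda-forge"):
    channels = add_channel(channels, name)

# Highest priority first.
print(channels)  # ['conda-forge', 'bioconda', 'defaults']
```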

Then you can simply install pygtftk in its own isolated environment and activate it.

conda create -n pygtftk -c guillaumecharbonnier pygtftk
conda activate pygtftk

Installation through setup.py

This is not the preferred way for installation. Choose conda whenever possible. We have observed several issues with dependencies that still need to be fixed.

git clone git@github.com:dputhier/pygtftk.git pygtftk
cd pygtftk
# Check your Python version (>=3.5,<3.7)
pip install -r requirements.txt
python setup.py install
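The Python version check mentioned in the comment above can be scripted. The following is a minimal sketch, assuming only the `>=3.5,<3.7` constraint stated in this document. The function name `python_supported` is hypothetical and is not part of pygtftk.

```python
import sys

MIN_VERSION = (3, 5)  # inclusive lower bound
MAX_VERSION = (3, 7)  # exclusive upper bound


def python_supported(version_info=None):
    """Return True if the (major, minor) version satisfies >=3.5,<3.7."""
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return MIN_VERSION <= major_minor < MAX_VERSION


if __name__ == "__main__":
    if python_supported():
        print("Python version is within the supported range.")
    else:
        print("Unsupported Python %d.%d: pygtftk expects >=3.5,<3.7"
              % tuple(sys.version_info[:2]))
```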

Installation through pip

Prerequisites

Again, this is not the preferred way for installation. Please choose conda whenever possible. We have observed several issues with dependencies that still need to be fixed.

Running pip

Installation through pip can be done as follows.

pip install -r requirements.txt
pip install pygtftk
# Before going further, call gtftk -h once
# so that gtftk looks for plugins and their
# CLI in ~/.gtftk
gtftk -h

Documentation

Documentation for the latest release is dynamically produced and available on the Read the Docs server.

Testing

Running functional tests

Many functional tests have been developed to ensure consistency with expected results. This does not rule out that bugs may still hide in the code. To check that the installation is functional, you may want to run these functional tests. The definition of all functional tests declared in gtftk commands is accessible using the -p/--plugin-tests argument:

gtftk -p

To run the tests, you will need to install bats (Bash Automated Testing System). Once bats is installed run the following commands:

# The tests should be run in the pygtftk git
# directory because several tests contain references (relative paths)
# to files located in the pygtftk/data directory.
gtftk -p > gtftk_test.bats
bats gtftk_test.bats
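The same two steps can be driven from Python, for instance in a post-install check. This is a sketch under stated assumptions: it relies only on the `gtftk -p` and `bats` commands shown above, the helper names `plugin_test_commands` and `run_plugin_tests` are hypothetical, and, like the shell version, it should be run from the pygtftk git directory.

```python
import shutil
import subprocess


def plugin_test_commands(bats_file="gtftk_test.bats"):
    """Return the two commands from the shell snippet as argument lists."""
    return ["gtftk", "-p"], ["bats", bats_file]


def run_plugin_tests(bats_file="gtftk_test.bats"):
    """Write the test definitions to bats_file, then run them with bats.

    Returns the exit code of bats (0 when all tests pass).
    """
    for tool in ("gtftk", "bats"):
        if shutil.which(tool) is None:
            raise RuntimeError("%s was not found on PATH" % tool)

    dump_cmd, bats_cmd = plugin_test_commands(bats_file)
    # Equivalent of: gtftk -p > gtftk_test.bats
    with open(bats_file, "w") as handle:
        subprocess.run(dump_cmd, stdout=handle, check=True)
    # Equivalent of: bats gtftk_test.bats
    return subprocess.run(bats_cmd).returncode
```

Calling `run_plugin_tests()` from the repository root then mirrors the two shell commands. The function raises early if either tool is missing rather than failing halfway through.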

Alternatively, you may call the tests directly using the Makefile.

make clean
make test

Or run tests in parallel using:

make clean
make test_para -j 10 # Using 10 cores

Running unit tests

Several unit tests have been implemented using doctests. You can run them using nose through the following command line:

make nose

Download files

Source Distributions

No source distribution files available for this release.

Built Distributions

  • pygtftk-1.0.0-cp36-cp36m-manylinux1_x86_64.whl (11.2 MB), CPython 3.6m
  • pygtftk-1.0.0-cp36-cp36m-macosx_10_7_x86_64.whl (16.2 MB), CPython 3.6m, macOS 10.7+ x86-64
  • pygtftk-1.0.0-cp35-cp35m-manylinux1_x86_64.whl (11.2 MB), CPython 3.5m
  • pygtftk-1.0.0-cp35-cp35m-macosx_10_9_x86_64.whl (16.2 MB), CPython 3.5m, macOS 10.9+ x86-64

File details

Details for the file pygtftk-1.0.0-cp36-cp36m-manylinux1_x86_64.whl.

File metadata

  • Download URL: pygtftk-1.0.0-cp36-cp36m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 11.2 MB
  • Tags: CPython 3.6m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.4.1 requests-toolbelt/0.8.0 tqdm/4.29.1 CPython/3.6.7

File hashes

Hashes for pygtftk-1.0.0-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 a4018a6de1b6599228c87c7a66c3363f92266aaa053e6263e820048835b07ee1
MD5 e7105592eb54623b38bc6dc387684f2c
BLAKE2b-256 5ec0c6a9288badd001dd251065c831c923574676261db1dcf08cfe339385ff7c
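A downloaded wheel can be checked against the SHA256 digest listed above. The sketch below uses only Python's standard `hashlib` module. The function names `sha256_of` and `matches_published_hash` are hypothetical helpers, not part of pygtftk or pip.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hexadecimal SHA256 digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so that large wheels are not loaded into memory at once.
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare the file digest with a published hex digest (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("pygtftk-1.0.0-cp36-cp36m-manylinux1_x86_64.whl", "a4018a6de1b6599228c87c7a66c3363f92266aaa053e6263e820048835b07ee1")` should return `True` for an intact download of that file.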

File details

Details for the file pygtftk-1.0.0-cp36-cp36m-macosx_10_7_x86_64.whl.

File metadata

  • Download URL: pygtftk-1.0.0-cp36-cp36m-macosx_10_7_x86_64.whl
  • Upload date:
  • Size: 16.2 MB
  • Tags: CPython 3.6m, macOS 10.7+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.4.1 requests-toolbelt/0.8.0 tqdm/4.29.1 CPython/3.6.7

File hashes

Hashes for pygtftk-1.0.0-cp36-cp36m-macosx_10_7_x86_64.whl
Algorithm Hash digest
SHA256 39554ca4893affccfca2122329c64f5596797f972674a827cdde6bdf32fb4641
MD5 24c4be62ca37c22b6efe3b3f49d4703f
BLAKE2b-256 b0dcb12671ae0eab2a1a9118def643d4e8da6d6a2bcda3326ef3a74eec326a42

File details

Details for the file pygtftk-1.0.0-cp35-cp35m-manylinux1_x86_64.whl.

File metadata

  • Download URL: pygtftk-1.0.0-cp35-cp35m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 11.2 MB
  • Tags: CPython 3.5m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.4.1 requests-toolbelt/0.8.0 tqdm/4.29.1 CPython/3.6.7

File hashes

Hashes for pygtftk-1.0.0-cp35-cp35m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 dc885e5094db19e3f0c30f213f4f1bba198f946a0d2c27e9a51c53cae35e5692
MD5 da0413f399a81d9c5e8b83079b4d15e2
BLAKE2b-256 b8e8fefc6c123561c5137a1e9f6dfff235a77fcff523f98dfcc38f6ff2c1b81e

File details

Details for the file pygtftk-1.0.0-cp35-cp35m-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: pygtftk-1.0.0-cp35-cp35m-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 16.2 MB
  • Tags: CPython 3.5m, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.4.1 requests-toolbelt/0.8.0 tqdm/4.29.1 CPython/3.6.7

File hashes

Hashes for pygtftk-1.0.0-cp35-cp35m-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 6833e69f03adcbdc30a608047b32040f95efb7ff692eb1ea281d10d71f0873f0
MD5 2f1aadeb4447e2b1f9417e13b9c3cdbf
BLAKE2b-256 0c82d2fa7d38c3973267feee62f392d0d5fc0f6393d12ffaf66a0aea1c852d51
