
panISa is a program that searches for insertion sequences (IS) in resequencing data (BAM files)


panISa is a program that identifies insertion sequences (IS) in resequencing data (BAM files) from bacterial genomes.

Idea

panISa searches for insertion sequences ab initio (i.e. with a database-free approach) in bacterial genomes from short-read NGS data. Briefly, the software identifies an insertion signature in the alignment by counting clipped reads at the start and end positions of the potential IS. These clipped reads overlap the direct repeats created by the IS insertion. Finally, using a reconstruction of the beginning of both sides of the IS (IRL and IRR), panISa validates the IS by searching for inverted repeat regions.
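The clipped-read counting step can be illustrated with a minimal sketch. This is not the panISa implementation: the function names, the `(pos, mapq, cigar)` input tuples, and the pairing rule are assumptions made for illustration. Only the default thresholds (quality 20, 10 reads, 20 bp) are taken from the options documented in this README. Positions are 1-based as in SAM, and only soft clips (`S`) as the first or last CIGAR operation are considered.

```python
import re
from collections import Counter

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def clip_positions(pos, cigar):
    """Return (start_clip, end_clip) reference positions; None where unclipped."""
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]
    if not ops:
        return None, None
    ref_len = sum(n for n, op in ops if op in REF_CONSUMING)
    # Leading soft clip: the boundary is the first aligned base.
    start = pos if ops[0][1] == "S" else None
    # Trailing soft clip: the boundary is the last aligned base.
    end = pos + ref_len - 1 if ops[-1][1] == "S" else None
    return start, end


def count_clipped(alignments, min_mapq=20):
    """Count clipped reads per boundary position, skipping low-quality alignments."""
    starts, ends = Counter(), Counter()
    for pos, mapq, cigar in alignments:
        if mapq < min_mapq:
            continue
        s, e = clip_positions(pos, cigar)
        if s is not None:
            starts[s] += 1
        if e is not None:
            ends[e] += 1
    return starts, ends


def candidate_sites(starts, ends, min_reads=10, max_repeat=20):
    """Pair start/end positions whose span could be a direct repeat."""
    sites = []
    for s, n_start in sorted(starts.items()):
        if n_start < min_reads:
            continue
        for e, n_end in sorted(ends.items()):
            # The putative direct repeat covers s..e inclusive.
            if n_end >= min_reads and 1 <= e - s + 1 <= max_repeat:
                sites.append((s, e, n_start, n_end))
    return sites


# Made-up alignments: 12 reads clipped at their start, 11 clipped at their end,
# and one low-quality read that is ignored.
reads = [(1000, 60, "30S70M")] * 12 + [(938, 60, "70M30S")] * 11 + [(1000, 5, "30S70M")]
starts, ends = count_clipped(reads)
sites = candidate_sites(starts, ends)
print(sites)
```

In this toy input the start-clipped reads pile up at position 1000 and the end-clipped reads at position 1007, so the region between them is the candidate direct repeat.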

Principle of panISa

Requirements and Installation

Conda installation

You can easily install panISa and its requirements using conda:

conda install -c bioconda panisa

Requirements

The program uses the Python libraries pysam (>=0.9) and requests (>=2.12).

You also need to install the EMBOSS package.

On Debian, type:

sudo apt-get install python-pysam python-requests emboss

Installation

Download the current tarball and unzip it.

Verify the installation using the test file:

python panISa.py test/test.bam

Alternatively, you can install from the PyPI repository:

pip install panisa

Command and Options

python panISa.py [options] bam

Options

-h

show this help message and exit

-o

Return the list of IS insertions found in the alignment [stdout]

-q

Minimum alignment quality value required to keep a clipped read [20]

-m

Minimum number of clipped reads required to look for an IS at a position [10]

-s

Maximum size of the direct repeat region [20 bp]

-p

Minimum proportion of identical bases required to build the consensus [0.8]

-v

show program’s version number and exit
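To make the consensus threshold (option -p) concrete, here is a minimal sketch of a per-column majority consensus. It is an illustration under assumptions, not the code panISa runs: it assumes the clipped sequences are already aligned at the clipping boundary (index 0 is the base next to the boundary), and both the `consensus` function and the `N` placeholder for unresolved positions are hypothetical.

```python
from collections import Counter


def consensus(sequences, min_fraction=0.8, unknown="N"):
    """Build a consensus; a column is called only if one base reaches min_fraction."""
    if not sequences:
        return ""
    length = max(len(s) for s in sequences)
    called = []
    for i in range(length):
        # Only sequences long enough to cover this column take part in the vote.
        column = [s[i] for s in sequences if i < len(s)]
        base, count = Counter(column).most_common(1)[0]
        called.append(base if count / len(column) >= min_fraction else unknown)
    return "".join(called)


# Made-up clipped sequences with two mismatching reads.
clipped = ["GGCTA", "GGCTA", "GGCTT", "GGCTA", "GGATA"]
print(consensus(clipped))                    # default threshold of 0.8
print(consensus(clipped, min_fraction=0.9))  # stricter threshold
```

With the default threshold every column in this toy input is resolved, because the most frequent base is supported by at least 4 of 5 reads. Raising the threshold to 0.9 leaves the two columns with a mismatch unresolved.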

Output

panISa returns its results in tabular format with the following columns:

Chromosome:

chromosome id

End position:

position of the last base of the direct repeat and the left boundary of the potential IS (IRL)

End clipped reads:

number of clipped reads (end position)

Direct repeat:

nucleotide sequence of the direct repeat

Start position:

position of the first base of the direct repeat and the right boundary of the potential IS (IRR)

Start clipped reads:

number of clipped reads (start position)

Inverted repeats:

nucleotide sequences of the inverted repeats and their positions

IS left sequence:

reconstruction of the left boundary of the potential IS (IRL)

IS right sequence:

reconstruction of the right boundary of the potential IS (IRR)
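The columns above can be loaded into typed records for downstream filtering. The sketch below rests on several assumptions: that the output is tab-separated, that the columns appear in the order listed here, and that a header line starting with "Chromosome" may be present. The field names, the `read_insertions` helper, and the sample rows are made up for illustration.

```python
import csv
import io
from typing import NamedTuple


class Insertion(NamedTuple):
    chromosome: str
    end_position: int
    end_clipped_reads: int
    direct_repeat: str
    start_position: int
    start_clipped_reads: int
    inverted_repeats: str
    is_left_sequence: str
    is_right_sequence: str


def read_insertions(handle):
    """Yield one Insertion per data row of a tab-separated result file."""
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and the (assumed) header line.
        if not row or row[0] == "Chromosome":
            continue
        yield Insertion(row[0], int(row[1]), int(row[2]), row[3],
                        int(row[4]), int(row[5]), row[6], row[7], row[8])


# Made-up example content; real values will differ.
HEADER = ["Chromosome", "End position", "End clipped reads", "Direct repeat",
          "Start position", "Start clipped reads", "Inverted repeats",
          "IS left sequence", "IS right sequence"]
ROWS = [
    ["chr1", "1007", "11", "ACGTACGT", "1000", "12", "-", "GGCTA", "TAGCC"],
    ["chr1", "5020", "3", "ACGT", "5017", "2", "-", "AAT", "ATT"],
]
SAMPLE = "\n".join("\t".join(fields) for fields in [HEADER] + ROWS) + "\n"

insertions = list(read_insertions(io.StringIO(SAMPLE)))
# Keep sites supported by at least 10 clipped reads on both sides.
supported = [i for i in insertions
             if min(i.end_clipped_reads, i.start_clipped_reads) >= 10]
for site in supported:
    print(site.chromosome, site.start_position, site.end_position, site.direct_repeat)
```

For a real run, replace `io.StringIO(SAMPLE)` with an open file handle on the panISa result file.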

Validation

panISa results can be searched for homology against ISFinder to identify the IS family, using the script ISFinder_search.py:

python ISFinder_search.py [options] panISa results

Recommendation

panISa works well with alignments produced by the bwa software.

Citation

If you use the panISa software, please cite the following paper:

panISa: ab initio detection of insertion sequences in bacterial genomes from short read sequence data. Treepong P, Guyeux C, Meunier A, Couchoud C, Hocquet D, Valot B. Bioinformatics. 2018, 34(22):3795-3800.

doi: 10.1093/bioinformatics/bty479

