Pansa

Pansa is a command-line tool and a Python package to build a syntelog-based pan-genome matrix.

Overview

  • Extract Longest Protein: Extract the longest amino acid sequence for each gene from the DNA sequence file using the GFF3 annotation file for subsequent synteny identification.
  • Run DAGchainer: Perform synteny analysis to identify syntenic gene pairs.
  • Merge Syntelog Results: Merge syntenic results to generate the final pan-genome matrix.
  • Polish Matrix: Generate a more readable and simplified pan-genome matrix, and export it in various formats (PNG, SVG, PDF).
  • Other Functions: Includes reading parameters from configuration files, controlling log output levels, and more.

Installation

You can install Pansa using pip:

pip install pansa

Usage

We recommend creating a new, dedicated folder and running every Pansa step inside it to avoid unnecessary errors.

Extract Longest Protein

pansa extract_longest_protein <config_file> --verbose <ERROR|INFO|DEBUG>
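The selection performed in this step can be illustrated with a minimal sketch. It is not Pansa's actual implementation: the function name `longest_protein_per_gene`, the input layout (a mapping from gene ID to its translated transcripts), and the IDs in the example are assumptions made for illustration only.

```python
from typing import Dict, Tuple


def longest_protein_per_gene(
    proteins: Dict[str, Dict[str, str]]
) -> Dict[str, Tuple[str, str]]:
    """Pick the longest protein isoform for every gene.

    `proteins` maps gene ID -> {transcript ID -> amino acid sequence}
    (a hypothetical layout). Returns gene ID -> (transcript ID, sequence).
    Ties are broken by transcript ID so the result is deterministic.
    """
    longest = {}
    for gene_id, isoforms in proteins.items():
        if not isoforms:
            continue  # gene without any translated transcript
        # Longest sequence first; on equal length, smallest transcript ID.
        best_id = min(isoforms, key=lambda t: (-len(isoforms[t]), t))
        longest[gene_id] = (best_id, isoforms[best_id])
    return longest


# Hypothetical gene and transcript IDs.
example = {
    "geneA": {"geneA_T001": "MKT", "geneA_T002": "MKTAYIAK"},
    "geneB": {"geneB_T001": "MSL"},
}
result = longest_protein_per_gene(example)
for gene_id, (transcript_id, seq) in result.items():
    print(gene_id, transcript_id, len(seq))
```

Keeping one representative protein per gene means each gene contributes a single sequence to the later synteny search.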

Run DAGchainer

pansa blastp_2_dagchainer <config_file> --thread <number_of_threads> --D <distance> --g <gap_length> --A <aligned_pairs> --evalue <evalue_threshold> --verbose <ERROR|INFO|DEBUG>

Merge Syntelog Results

pansa merge <config_file> --output <output_file> --verbose <ERROR|INFO|DEBUG>
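The idea of merging pairwise syntenic results into a matrix can be sketched as follows. This illustrates one common approach (single-linkage grouping with a union-find structure) and is not confirmed to be the algorithm Pansa uses; the function name `build_matrix`, the `(sample, gene)` tuple layout, and the gene IDs are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Gene = Tuple[str, str]  # (sample name, gene ID)


def build_matrix(
    pairs: Iterable[Tuple[Gene, Gene]], samples: List[str]
) -> List[Dict[str, List[str]]]:
    """Group syntenic gene pairs and return one row per group."""
    parent: Dict[Gene, Gene] = {}

    def find(g: Gene) -> Gene:
        parent.setdefault(g, g)
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path halving
            g = parent[g]
        return g

    # Union the two genes of every syntenic pair.
    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Collect the members of each group, keyed by sample.
    groups = defaultdict(lambda: defaultdict(list))
    for sample, gene_id in list(parent):
        groups[find((sample, gene_id))][sample].append(gene_id)

    # One row per group, one column per sample; empty list = gene absent.
    return [
        {s: sorted(groups[root].get(s, [])) for s in samples}
        for root in sorted(groups)
    ]


pairs = [
    (("B73", "b1"), ("Mo17", "m1")),
    (("Mo17", "m1"), ("Oh7B", "o1")),
    (("B73", "b2"), ("Mo17", "m2")),
]
samples = ["B73", "Mo17", "Oh7B"]
rows = build_matrix(pairs, samples)
for row in rows:
    print("\t".join(",".join(row[s]) or "-" for s in samples))
```

Each row is one syntelog group and each column one sample, so an empty cell marks a gene that has no syntenic counterpart in that sample.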

Polish Matrix

pansa polish <config_file> --pan_matrix <pan_matrix_file> --outpng --outsvg --outpdf --verbose <ERROR|INFO|DEBUG>

Input File Format

Configuration File Format

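Each line of the configuration file describes one sample in three columns: the sample name, the path to the genome FASTA file, and the path to the GFF3 annotation file: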

<sample_name>    <fasta_path>    <gff3_path>

For example:

Mo17    data/Zm-Mo17-REFERENCE-CAU-2.0.fa       data/Zm-Mo17-REFERENCE-CAU-2.0_Zm00014ba.gff3
B73     data/Zm-B73-REFERENCE-NAM-5.0.fa        data/Zm-B73-REFERENCE-NAM-5.0_Zm00001eb.1.gff3
Oh7B    data/Zm-Oh7B-REFERENCE-NAM-1.0.fa       data/Zm-Oh7B-REFERENCE-NAM-1.0_Zm00038ab.1.gff3

or:

Mo17    Mo17/Zm-Mo17-REFERENCE-CAU-2.0.fa       Mo17/Zm-Mo17-REFERENCE-CAU-2.0_Zm00014ba.gff3
B73     B73/Zm-B73-REFERENCE-NAM-5.0.fa        B73/Zm-B73-REFERENCE-NAM-5.0_Zm00001eb.1.gff3
Oh7B    Oh7B/Zm-Oh7B-REFERENCE-NAM-1.0.fa       Oh7B/Zm-Oh7B-REFERENCE-NAM-1.0_Zm00038ab.1.gff3
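Before starting a long run, it can help to check the configuration file for malformed lines. The following is a minimal sketch, not part of Pansa: the names `Sample` and `parse_config` are hypothetical, and it assumes the three columns are separated by whitespace and that paths contain no spaces.

```python
from pathlib import Path
from typing import List, NamedTuple


class Sample(NamedTuple):
    name: str
    fasta: Path
    gff3: Path


def parse_config(text: str) -> List[Sample]:
    """Parse configuration text into a list of samples.

    Assumes whitespace-separated columns; blank lines are skipped.
    """
    samples = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) != 3:
            raise ValueError(
                f"line {lineno}: expected 3 fields, got {len(fields)}"
            )
        name, fasta, gff3 = fields
        samples.append(Sample(name, Path(fasta), Path(gff3)))
    return samples


config_text = """\
Mo17    data/Zm-Mo17-REFERENCE-CAU-2.0.fa       data/Zm-Mo17-REFERENCE-CAU-2.0_Zm00014ba.gff3
B73     data/Zm-B73-REFERENCE-NAM-5.0.fa        data/Zm-B73-REFERENCE-NAM-5.0_Zm00001eb.1.gff3
"""
samples = parse_config(config_text)
for s in samples:
    print(s.name, s.fasta.name, s.gff3.name)
```

To also verify that the files exist, one could call `s.fasta.is_file()` and `s.gff3.is_file()` on each parsed sample.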

Example

Suppose you have a configuration file config.txt with the following content:

sample1    /data/sample1.fasta    /data/sample1.gff3
sample2    /data/sample2.fasta    /data/sample2.gff3

  1. Extract the longest protein sequence:

pansa extract_longest_protein config.txt --verbose INFO

  2. Run DAGchainer:

pansa blastp_2_dagchainer config.txt --thread 8 --D 1000000 --g 40000 --A 5 --evalue 1e-5 --verbose INFO

  3. Merge syntelog results:

pansa merge config.txt --output SG_test --verbose INFO

  4. Polish the matrix:

pansa polish config.txt --pan_matrix SG_test --outpng --outsvg --outpdf --verbose INFO
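The four steps can also be chained from a small driver script. The sketch below only reuses the commands and flags shown in this example; the helper names `build_commands` and `run_pipeline` are hypothetical, and the script defaults to a dry run that prints the commands without executing them.

```python
import shlex
import subprocess
from typing import List


def build_commands(
    config: str, output: str, threads: int = 8, verbose: str = "INFO"
) -> List[List[str]]:
    """Return the four Pansa commands of the example, in order."""
    common = ["--verbose", verbose]
    return [
        ["pansa", "extract_longest_protein", config, *common],
        ["pansa", "blastp_2_dagchainer", config, "--thread", str(threads),
         "--D", "1000000", "--g", "40000", "--A", "5",
         "--evalue", "1e-5", *common],
        ["pansa", "merge", config, "--output", output, *common],
        ["pansa", "polish", config, "--pan_matrix", output,
         "--outpng", "--outsvg", "--outpdf", *common],
    ]


def run_pipeline(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it unless dry_run is set."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, check=True)


commands = build_commands("config.txt", "SG_test")
run_pipeline(commands, dry_run=True)
```

Passing `dry_run=False` runs the steps for real, which requires `pansa` to be installed and available on the `PATH`.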

Log Output Levels

You can control the log output level using the --verbose parameter. The available options are: CRITICAL, ERROR, WARNING, INFO, DEBUG, NOTSET.
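These six names match the level names of Python's standard `logging` module. Assuming Pansa passes the value through to `logging` (an assumption; the source does not confirm it), the numeric thresholds behind the names can be inspected as follows. The helper `resolve_level` is hypothetical.

```python
import logging

LEVEL_NAMES = ["CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG", "NOTSET"]


def resolve_level(name: str) -> int:
    """Translate a level name into the numeric value used by logging."""
    if name not in LEVEL_NAMES:
        raise ValueError(f"unknown log level: {name!r}")
    return getattr(logging, name)


for name in LEVEL_NAMES:
    print(f"{name:<8} {resolve_level(name)}")

# A logger set to INFO emits INFO and above and drops DEBUG messages.
logger = logging.getLogger("verbosity_demo")
logger.setLevel(resolve_level("INFO"))
print(logger.isEnabledFor(logging.DEBUG))  # False
print(logger.isEnabledFor(logging.ERROR))  # True
```

In the standard module, a lower numeric value means more verbose output, so `DEBUG` prints the most detail and `CRITICAL` the least.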

Developer Information

For more details and technical support, please contact the developers or visit the project homepage.

E-mail: caocao@cau.edu.cn
