Pansa

Pansa is a command-line tool and a Python package to build a syntelog-based pan-genome matrix.

Overview

  • Extract Longest Protein: Using the GFF3 annotation file, extract the longest amino acid sequence for each gene from the DNA sequence file, for use in subsequent synteny identification.
  • Run DAGchainer: Perform synteny analysis to identify syntenic gene pairs.
  • Merge Syntelog Results: Merge syntenic results to generate the final pan-genome matrix.
  • Polish Matrix: Generate a more readable and simplified pan-genome matrix, and export it in various formats (PNG, SVG, PDF).
  • Other Functions: Includes reading parameters from configuration files, controlling log output levels, and more.
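
The first step above keeps one protein per gene. As a minimal sketch of that selection rule only (not Pansa's actual implementation), assume the proteins have already been translated and are held in a dictionary keyed by transcript ID. The function name, the input layout, and the tie-breaking rule are all hypothetical:

```python
from typing import Dict, Tuple


def longest_protein_per_gene(
    proteins: Dict[str, Tuple[str, str]]
) -> Dict[str, Tuple[str, str]]:
    """Map each gene ID to (transcript ID, protein) of its longest protein.

    `proteins` maps a transcript ID to (gene ID, amino acid sequence).
    This input layout is an assumption made for illustration.
    """
    best: Dict[str, Tuple[str, str]] = {}
    for transcript_id, (gene_id, seq) in proteins.items():
        current = best.get(gene_id)
        # On a length tie, keep the transcript that was seen first.
        if current is None or len(seq) > len(current[1]):
            best[gene_id] = (transcript_id, seq)
    return best


# Made-up IDs and sequences, for illustration only.
example = {
    "geneA_T001": ("geneA", "MKV"),
    "geneA_T002": ("geneA", "MKVLAA"),
    "geneB_T001": ("geneB", "MST"),
}
result = longest_protein_per_gene(example)
```

Here `result["geneA"]` is the entry for `geneA_T002`, because its sequence is the longer of the two candidates for that gene.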

Installation

You can install Pansa using pip:

pip install pansa

Usage

We recommend creating a new folder and running all of the steps below inside it, to avoid unnecessary bugs.

Extract Longest Protein

pansa extract_longest_protein <config_file> --verbose <ERROR|INFO|DEBUG>

Run DAGchainer

pansa blastp_2_dagchainer <config_file> --thread <number_of_threads> --D <distance> --g <gap_length> --A <aligned_pairs> --evalue <evalue_threshold> --verbose <ERROR|INFO|DEBUG>

Merge Syntelog Results

pansa merge <config_file> --output <output_file> --verbose <ERROR|INFO|DEBUG>
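
The merge step turns pairwise syntenic results into a single matrix. How Pansa does this internally is not described here. One common way to group pairwise links is a union-find (disjoint-set) structure, sketched below under that assumption. The `(sample, gene ID)` tuples, the function name, and the row layout are hypothetical:

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Gene = Tuple[str, str]  # (sample name, gene ID); a hypothetical layout


def build_matrix(
    pairs: Iterable[Tuple[Gene, Gene]], samples: List[str]
) -> List[Dict[str, List[str]]]:
    """Group syntenic gene pairs into rows with one column per sample.

    Every sample name that appears in `pairs` must be listed in `samples`.
    """
    parent: Dict[Gene, Gene] = {}

    def find(gene: Gene) -> Gene:
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]  # path halving
            gene = parent[gene]
        return gene

    # Union the two genes of every syntenic pair.
    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Collect the members of each group, one list of gene IDs per sample.
    groups = defaultdict(lambda: {s: [] for s in samples})
    for gene in list(parent):
        sample, gene_id = gene
        groups[find(gene)][sample].append(gene_id)
    return [{s: sorted(ids) for s, ids in row.items()} for row in groups.values()]


# Made-up gene IDs, for illustration only.
pairs = [
    (("Mo17", "g1"), ("B73", "g7")),
    (("B73", "g7"), ("Oh7B", "g3")),
    (("Mo17", "g2"), ("B73", "g9")),
]
matrix = build_matrix(pairs, ["Mo17", "B73", "Oh7B"])
```

Each row of `matrix` is one group of genes linked by synteny. A sample with no member in a group gets an empty list, which corresponds to an absent gene in a pan-genome matrix.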

Polish Matrix

pansa polish <config_file> --pan_matrix <pan_matrix_file> --outpng --outsvg --outpdf --verbose <ERROR|INFO|DEBUG>

Input File Format

Configuration File Format

The configuration file lists one sample per line. Each line has three columns, in the following format:

<sample_name>    <fasta_path>    <gff3_path>

For example:

Mo17    data/Zm-Mo17-REFERENCE-CAU-2.0.fa       data/Zm-Mo17-REFERENCE-CAU-2.0_Zm00014ba.gff3
B73     data/Zm-B73-REFERENCE-NAM-5.0.fa        data/Zm-B73-REFERENCE-NAM-5.0_Zm00001eb.1.gff3
Oh7B    data/Zm-Oh7B-REFERENCE-NAM-1.0.fa       data/Zm-Oh7B-REFERENCE-NAM-1.0_Zm00038ab.1.gff3

or:

Mo17    Mo17/Zm-Mo17-REFERENCE-CAU-2.0.fa       Mo17/Zm-Mo17-REFERENCE-CAU-2.0_Zm00014ba.gff3
B73     B73/Zm-B73-REFERENCE-NAM-5.0.fa        B73/Zm-B73-REFERENCE-NAM-5.0_Zm00001eb.1.gff3
Oh7B    Oh7B/Zm-Oh7B-REFERENCE-NAM-1.0.fa       Oh7B/Zm-Oh7B-REFERENCE-NAM-1.0_Zm00038ab.1.gff3
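
Before starting a long run, it can help to check that every line of the configuration file has exactly three columns. The following is a minimal sketch of such a check. It assumes the columns are separated by whitespace and that paths contain no spaces. `Sample` and `parse_config` are hypothetical helpers, not part of Pansa:

```python
from typing import List, NamedTuple


class Sample(NamedTuple):
    name: str
    fasta_path: str
    gff3_path: str


def parse_config(text: str) -> List[Sample]:
    """Parse configuration text into one Sample per non-blank line."""
    samples = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = line.split()  # assumption: whitespace-separated columns
        if len(fields) != 3:
            raise ValueError(
                f"line {line_number}: expected 3 columns, got {len(fields)}"
            )
        samples.append(Sample(*fields))
    return samples


config_text = """\
Mo17    data/Zm-Mo17-REFERENCE-CAU-2.0.fa       data/Zm-Mo17-REFERENCE-CAU-2.0_Zm00014ba.gff3
B73     data/Zm-B73-REFERENCE-NAM-5.0.fa        data/Zm-B73-REFERENCE-NAM-5.0_Zm00001eb.1.gff3
"""
samples = parse_config(config_text)
```

A line with a missing or extra column raises `ValueError` with its line number, so the problem can be fixed before any step is run.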

Example

Suppose you have a configuration file config.txt with the following content:

sample1    /data/sample1.fasta    /data/sample1.gff3
sample2    /data/sample2.fasta    /data/sample2.gff3

  1. Extract the longest protein sequence:
pansa extract_longest_protein config.txt --verbose INFO
  2. Run DAGchainer:
pansa blastp_2_dagchainer config.txt --thread 8 --D 1000000 --g 40000 --A 5 --evalue 1e-5 --verbose INFO
  3. Merge syntelog results:
pansa merge config.txt --output SG_test --verbose INFO
  4. Polish the matrix:
pansa polish config.txt --pan_matrix SG_test --outpng --outsvg --outpdf --verbose INFO
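
The four commands above can also be chained from a small Python script. The sketch below is one possible driver, not something shipped with Pansa. It uses only the subcommands and flags shown in this example. `build_commands` and `run_pipeline` are hypothetical helper names, and the default is a dry run that prints the commands without executing them:

```python
import shlex
import subprocess
from typing import List


def build_commands(
    config: str, matrix: str, threads: int = 8, verbose: str = "INFO"
) -> List[List[str]]:
    """Return the four pipeline commands, in order, as argument lists."""
    return [
        ["pansa", "extract_longest_protein", config, "--verbose", verbose],
        ["pansa", "blastp_2_dagchainer", config, "--thread", str(threads),
         "--D", "1000000", "--g", "40000", "--A", "5", "--evalue", "1e-5",
         "--verbose", verbose],
        ["pansa", "merge", config, "--output", matrix, "--verbose", verbose],
        ["pansa", "polish", config, "--pan_matrix", matrix,
         "--outpng", "--outsvg", "--outpdf", "--verbose", verbose],
    ]


def run_pipeline(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(shlex.join(command))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(command, check=True)


commands = build_commands("config.txt", "SG_test")
run_pipeline(commands, dry_run=True)
```

Calling `run_pipeline(commands, dry_run=False)` runs the steps for real. Because of `check=True`, a failure in one step prevents later steps from running on incomplete output.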

Log Output Levels

You can control the log output level using the --verbose parameter. The available options are: CRITICAL, ERROR, WARNING, INFO, DEBUG, NOTSET.
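
These six names match the standard level names of Python's `logging` module. Assuming Pansa passes the chosen name to `logging` (the names suggest this, but it is not confirmed here), the sketch below shows how each name maps to a numeric threshold. `resolve_level` is a hypothetical helper:

```python
import logging

LEVEL_NAMES = ["CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG", "NOTSET"]


def resolve_level(name: str) -> int:
    """Return the numeric logging level for one of the six level names."""
    if name not in LEVEL_NAMES:
        raise ValueError(f"unknown log level: {name!r}")
    return getattr(logging, name)


levels = {name: resolve_level(name) for name in LEVEL_NAMES}
# levels == {"CRITICAL": 50, "ERROR": 40, "WARNING": 30,
#            "INFO": 20, "DEBUG": 10, "NOTSET": 0}
```

With `logging`, a logger emits messages at or above its threshold, so under this assumption DEBUG is the most verbose of the named levels and CRITICAL the least.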

Developer Information

For more details and technical support, please contact the developers or visit the project homepage.

E-mail: caocao@cau.edu.cn
