Pansa

Pansa is a command-line tool and a Python package to build a syntelog-based pan-genome matrix.

Overview

  • Extract Longest Protein: For each gene, extract the longest amino acid sequence from the DNA sequence file, using the GFF3 annotation file. These sequences are the input for the subsequent synteny identification.
  • Run DAGchainer: Perform synteny analysis to identify syntenic gene pairs.
  • Merge Syntelog Results: Merge syntenic results to generate the final pan-genome matrix.
  • Polish Matrix: Generate a more readable and simplified pan-genome matrix, and export a plot in various formats (PNG, SVG, PDF).
  • Other Functions: Includes reading parameters from configuration files, controlling log output levels, and more.
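The "longest protein per gene" selection in the first step can be illustrated with a minimal sketch. It is not Pansa's implementation: the function name, the (gene, transcript, protein) input triples, and the tie-breaking rule are assumptions made for illustration, and the gene and transcript IDs are invented.

```python
from typing import Dict, Iterable, Tuple


def longest_protein_per_gene(
    records: Iterable[Tuple[str, str, str]]
) -> Dict[str, Tuple[str, str]]:
    """Map each gene ID to the (transcript ID, protein) pair with the longest protein.

    `records` yields (gene_id, transcript_id, protein_sequence) triples.
    On a tie, the transcript seen first is kept (an arbitrary choice).
    """
    best: Dict[str, Tuple[str, str]] = {}
    for gene_id, transcript_id, protein in records:
        current = best.get(gene_id)
        if current is None or len(protein) > len(current[1]):
            best[gene_id] = (transcript_id, protein)
    return best


# Hypothetical records: gene1 has two isoforms, gene2 has one.
example = [
    ("gene1", "gene1_T001", "MKTAYIAK"),
    ("gene1", "gene1_T002", "MKTAYIAKQRQISFVK"),
    ("gene2", "gene2_T001", "MSLE"),
]
longest = longest_protein_per_gene(example)
```

Keeping one representative protein per gene means each gene contributes a single sequence to the later synteny search.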

Installation

You can install Pansa using pip:

pip install pansa

Usage

It is recommended to run all Pansa steps in a new, dedicated folder to avoid unexpected problems.

Extract Longest Protein

pansa extract_longest_protein <config_file> --verbose <ERROR|INFO|DEBUG>

Run DAGchainer

pansa blastp_2_dagchainer <config_file> --thread <number_of_threads> --D <distance> --g <gap_length> --A <aligned_pairs> --evalue <evalue_threshold> --verbose <ERROR|INFO|DEBUG>

Merge Syntelog Results

pansa merge <config_file> --output <output_file> --verbose <ERROR|INFO|DEBUG>
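To give an idea of what merging syntenic pairs into a matrix can look like, here is a rough sketch that groups pairwise links with a union-find structure and lays each group out as one row with a column per sample. This is an assumed approach for illustration only. The README does not describe Pansa's actual merge algorithm or its output format, and the gene IDs below are invented.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Gene = Tuple[str, str]  # (sample_name, gene_id)
Row = Dict[str, List[str]]  # sample_name -> gene IDs in this group


def build_matrix(pairs: Iterable[Tuple[Gene, Gene]], samples: List[str]) -> List[Row]:
    """Group genes connected by syntenic pairs and return one row per group."""
    parent: Dict[Gene, Gene] = {}

    def find(gene: Gene) -> Gene:
        # Register unseen genes as their own root, then walk up with path halving.
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]
            gene = parent[gene]
        return gene

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups: Dict[Gene, Row] = defaultdict(lambda: {s: [] for s in samples})
    for sample, gene_id in list(parent):
        groups[find((sample, gene_id))][sample].append(gene_id)

    rows = [{s: sorted(ids) for s, ids in row.items()} for row in groups.values()]
    # Sort rows so the output order does not depend on input order.
    return sorted(rows, key=lambda row: [row[s] for s in samples])


samples = ["Mo17", "B73", "Oh7B"]
pairs = [
    (("Mo17", "g1"), ("B73", "gA")),
    (("B73", "gA"), ("Oh7B", "gX")),
    (("Mo17", "g2"), ("B73", "gB")),
]
matrix = build_matrix(pairs, samples)
```

In this sketch an empty list marks a sample with no member in the group, which is how presence and absence across samples would show up in such a matrix.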

Polish Matrix

pansa polish <config_file> --pan_matrix <pan_matrix_file> --outpng --outsvg --outpdf --verbose <ERROR|INFO|DEBUG>

Input File Format

Configuration File Format

The configuration file should list one sample per line, using the following format:

<sample_name>    <fasta_path>    <gff3_path>

For example:

Mo17    data/Zm-Mo17-REFERENCE-CAU-2.0.fa       data/Zm-Mo17-REFERENCE-CAU-2.0_Zm00014ba.gff3
B73     data/Zm-B73-REFERENCE-NAM-5.0.fa        data/Zm-B73-REFERENCE-NAM-5.0_Zm00001eb.1.gff3
Oh7B    data/Zm-Oh7B-REFERENCE-NAM-1.0.fa       data/Zm-Oh7B-REFERENCE-NAM-1.0_Zm00038ab.1.gff3

or:

Mo17    Mo17/Zm-Mo17-REFERENCE-CAU-2.0.fa       Mo17/Zm-Mo17-REFERENCE-CAU-2.0_Zm00014ba.gff3
B73     B73/Zm-B73-REFERENCE-NAM-5.0.fa        B73/Zm-B73-REFERENCE-NAM-5.0_Zm00001eb.1.gff3
Oh7B    Oh7B/Zm-Oh7B-REFERENCE-NAM-1.0.fa       Oh7B/Zm-Oh7B-REFERENCE-NAM-1.0_Zm00038ab.1.gff3
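Before launching a long run, it can help to check that the configuration file is well formed. The sketch below reads the three-column layout shown above. It assumes the columns are separated by whitespace and that paths contain no spaces. The `Sample` type and `parse_config` function are hypothetical helpers, not part of Pansa.

```python
from pathlib import Path
from typing import List, NamedTuple


class Sample(NamedTuple):
    name: str
    fasta: Path
    gff3: Path


def parse_config(text: str) -> List[Sample]:
    """Parse '<sample_name> <fasta_path> <gff3_path>' lines, skipping blank lines."""
    samples: List[Sample] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue
        if len(fields) != 3:
            raise ValueError(f"line {lineno}: expected 3 columns, got {len(fields)}")
        name, fasta, gff3 = fields
        samples.append(Sample(name, Path(fasta), Path(gff3)))
    return samples


CONFIG_TEXT = """\
Mo17    data/Zm-Mo17-REFERENCE-CAU-2.0.fa       data/Zm-Mo17-REFERENCE-CAU-2.0_Zm00014ba.gff3
B73     data/Zm-B73-REFERENCE-NAM-5.0.fa        data/Zm-B73-REFERENCE-NAM-5.0_Zm00001eb.1.gff3
"""
samples = parse_config(CONFIG_TEXT)
for sample in samples:
    print(sample.name, sample.fasta, sample.gff3)
```

A further check that each path exists (`sample.fasta.is_file()`) could be added once the files are in place.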

Example

Suppose you have a configuration file config.txt with the following content:

sample1    /data/sample1.fasta    /data/sample1.gff3
sample2    /data/sample2.fasta    /data/sample2.gff3

  1. Extract the longest protein sequence:
pansa extract_longest_protein config.txt --verbose INFO
  2. Run DAGchainer:
pansa blastp_2_dagchainer config.txt --thread 8 --D 1000000 --g 40000 --A 5 --evalue 1e-5 --verbose INFO
  3. Merge syntelog results:
pansa merge config.txt --output SG_test --verbose INFO
  4. Polish the matrix:
pansa polish config.txt --pan_matrix SG_test --outpng --outsvg --outpdf --verbose INFO
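The four commands above can also be chained from a small Python script. This is a sketch that only reuses the subcommands, flags, and values shown in this example. The `build_commands` and `run_pipeline` helpers are hypothetical, and the script defaults to a dry run that prints the commands without executing them.

```python
import shlex
import subprocess
from typing import List


def build_commands(
    config: str, matrix: str, threads: int = 8, verbose: str = "INFO"
) -> List[List[str]]:
    """Return the four pipeline commands, in order, as argument lists."""
    log = ["--verbose", verbose]
    return [
        ["pansa", "extract_longest_protein", config, *log],
        ["pansa", "blastp_2_dagchainer", config,
         "--thread", str(threads), "--D", "1000000", "--g", "40000",
         "--A", "5", "--evalue", "1e-5", *log],
        ["pansa", "merge", config, "--output", matrix, *log],
        ["pansa", "polish", config, "--pan_matrix", matrix,
         "--outpng", "--outsvg", "--outpdf", *log],
    ]


def run_pipeline(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, check=True)


commands = build_commands("config.txt", "SG_test")
run_pipeline(commands)  # dry run: prints the commands only
```

Calling `run_pipeline(commands, dry_run=False)` would execute the steps in sequence, assuming the `pansa` executable is on the PATH.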

Log Output Levels

You can control the log output level using the --verbose parameter. The available options are: CRITICAL, ERROR, WARNING, INFO, DEBUG, NOTSET.
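The level names listed here coincide with the level constants of Python's standard `logging` module. The sketch below validates a level name and resolves it to its numeric value. Whether Pansa maps the `--verbose` value in exactly this way is an assumption, and `resolve_level` is a hypothetical helper.

```python
import logging

VALID_LEVELS = ("CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG", "NOTSET")


def resolve_level(name: str) -> int:
    """Return the numeric logging level for a level name (case-insensitive)."""
    upper = name.upper()
    if upper not in VALID_LEVELS:
        raise ValueError(f"unknown log level: {name!r}")
    return getattr(logging, upper)


# Lower numeric values are more verbose: DEBUG (10) shows more than INFO (20).
logging.basicConfig(level=resolve_level("INFO"))
logging.getLogger("example").info("messages at INFO and above are shown")
```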

Developer Information

For more details and technical support, please contact the developers or visit the project homepage.

E-mail: caocao@cau.edu.cn
