Pansa

Pansa is a command-line tool and Python package for building a syntelog-based pan-genome matrix.

Overview

  • Extract Longest Protein: For each gene, extract the longest protein (amino acid) sequence, using the genome FASTA file and the GFF3 annotation file, for subsequent synteny identification.
  • Run DAGchainer: Perform synteny analysis to identify syntenic gene pairs.
  • Merge Syntelog Results: Merge syntenic results to generate the final pan-genome matrix.
  • Polish Matrix: Generate a more readable and simplified pan-genome matrix, and export it in various formats (PNG, SVG, PDF).
  • Other Functions: Includes reading parameters from configuration files, controlling log output levels, and more.

Installation

You can install Pansa using pip:

pip install pansa

Usage

We recommend running all Pansa commands in a new, dedicated folder to avoid unexpected errors.

Extract Longest Protein

pansa extract_longest_protein <config_file> --verbose <ERROR|INFO|DEBUG>
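
The selection rule behind this step can be illustrated with a minimal sketch. It assumes the proteins have already been translated and that each transcript has been mapped to its parent gene (in GFF3, via the `Parent` attribute). The function name `longest_protein_per_gene` and the input layout are hypothetical and are not Pansa's actual API.

```python
def longest_protein_per_gene(transcripts):
    """Return {gene_id: (transcript_id, protein)}, keeping the longest protein.

    `transcripts` is an iterable of (gene_id, transcript_id, protein) tuples.
    On ties, the transcript seen first is kept.
    Hypothetical helper for illustration, not part of Pansa.
    """
    best = {}
    for gene_id, transcript_id, protein in transcripts:
        current = best.get(gene_id)
        if current is None or len(protein) > len(current[1]):
            best[gene_id] = (transcript_id, protein)
    return best


# Made-up identifiers and sequences, for illustration only.
example = [
    ("gene1", "gene1_T001", "MKT"),
    ("gene1", "gene1_T002", "MKTAYIAK"),
    ("gene2", "gene2_T001", "MSS"),
]
longest = longest_protein_per_gene(example)
```

Here `gene1` keeps `gene1_T002` because its protein is longer, so each gene contributes exactly one sequence to the later synteny search.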

Run DAGchainer

pansa blastp_2_dagchainer <config_file> --thread <number_of_threads> --D <distance> --g <gap_length> --A <aligned_pairs> --evalue <evalue_threshold> --verbose <ERROR|INFO|DEBUG>

Merge Syntelog Results

pansa merge <config_file> --output <output_file> --verbose <ERROR|INFO|DEBUG>
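
One way to picture the merge step is as grouping pairwise syntenic genes into syntelog groups, with one matrix row per group and one column per sample. The sketch below uses a union-find structure for that grouping. This is an illustrative assumption, not necessarily how Pansa merges its results, and the gene IDs and the `build_matrix` helper are hypothetical.

```python
from collections import defaultdict


def build_matrix(pairs, samples):
    """Group syntenic gene pairs and return one row per group.

    `pairs` holds ((sample, gene), (sample, gene)) tuples. Each row maps
    every sample to a sorted list of its genes in that group (empty if absent).
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    # Union the two genes of every syntenic pair.
    for left, right in pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left

    # Collect the members of each group, split by sample.
    groups = defaultdict(lambda: defaultdict(list))
    for node in list(parent):
        sample, gene = node
        groups[find(node)][sample].append(gene)

    rows = [
        {sample: sorted(members.get(sample, [])) for sample in samples}
        for members in groups.values()
    ]
    # Sort rows so the output order is deterministic.
    return sorted(rows, key=lambda row: [row[s] for s in samples])


# Made-up gene IDs, for illustration only.
pairs = [
    (("B73", "b1"), ("Mo17", "m1")),
    (("Mo17", "m1"), ("Oh7B", "o1")),
    (("B73", "b2"), ("Oh7B", "o2")),
]
matrix = build_matrix(pairs, ["B73", "Mo17", "Oh7B"])
```

In this toy input the first group has a gene in all three samples, while the second group has no Mo17 member, which shows up as an empty cell in that row.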

Polish Matrix

pansa polish <config_file> --pan_matrix <pan_matrix_file> --outpng --outsvg --outpdf --verbose <ERROR|INFO|DEBUG>

Input File Format

Configuration File Format

Each line of the configuration file describes one sample in three columns:

<sample_name>    <fasta_path>    <gff3_path>

For example:

Mo17    data/Zm-Mo17-REFERENCE-CAU-2.0.fa       data/Zm-Mo17-REFERENCE-CAU-2.0_Zm00014ba.gff3
B73     data/Zm-B73-REFERENCE-NAM-5.0.fa        data/Zm-B73-REFERENCE-NAM-5.0_Zm00001eb.1.gff3
Oh7B    data/Zm-Oh7B-REFERENCE-NAM-1.0.fa       data/Zm-Oh7B-REFERENCE-NAM-1.0_Zm00038ab.1.gff3

or:

Mo17    Mo17/Zm-Mo17-REFERENCE-CAU-2.0.fa       Mo17/Zm-Mo17-REFERENCE-CAU-2.0_Zm00014ba.gff3
B73     B73/Zm-B73-REFERENCE-NAM-5.0.fa        B73/Zm-B73-REFERENCE-NAM-5.0_Zm00001eb.1.gff3
Oh7B    Oh7B/Zm-Oh7B-REFERENCE-NAM-1.0.fa       Oh7B/Zm-Oh7B-REFERENCE-NAM-1.0_Zm00038ab.1.gff3
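
A minimal sketch of reading such a file follows. It assumes the three columns are separated by whitespace (tabs or spaces) and that paths contain no spaces; `read_config` and `Sample` are hypothetical names, not part of Pansa.

```python
from typing import NamedTuple


class Sample(NamedTuple):
    name: str
    fasta_path: str
    gff3_path: str


def read_config(text):
    """Parse configuration text into Sample records, one per non-empty line."""
    samples = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = line.split()
        if len(fields) != 3:
            raise ValueError(
                f"line {line_number}: expected 3 columns, found {len(fields)}"
            )
        samples.append(Sample(*fields))
    return samples


config_text = """\
Mo17    data/Zm-Mo17-REFERENCE-CAU-2.0.fa       data/Zm-Mo17-REFERENCE-CAU-2.0_Zm00014ba.gff3
B73     data/Zm-B73-REFERENCE-NAM-5.0.fa        data/Zm-B73-REFERENCE-NAM-5.0_Zm00001eb.1.gff3
"""
samples = read_config(config_text)
```

Checking the column count up front gives a clear error for a malformed line before any long-running step starts.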

Example

Suppose you have a configuration file config.txt with the following content:

sample1    /data/sample1.fasta    /data/sample1.gff3
sample2    /data/sample2.fasta    /data/sample2.gff3

  1. Extract the longest protein sequence:
pansa extract_longest_protein config.txt --verbose INFO
  2. Run DAGchainer:
pansa blastp_2_dagchainer config.txt --thread 8 --D 1000000 --g 40000 --A 5 --evalue 1e-5 --verbose INFO
  3. Merge syntelog results:
pansa merge config.txt --output SG_test --verbose INFO
  4. Polish the matrix:
pansa polish config.txt --pan_matrix SG_test --outpng --outsvg --outpdf --verbose INFO
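
The four steps above can also be chained from a script. The sketch below only assembles the command lines shown in this example and prints them; the `build_commands` helper is hypothetical, and actually running the steps (for instance with `subprocess.run`) assumes that `pansa` is installed and on the `PATH`.

```python
import shlex


def build_commands(config, output, threads=8):
    """Return the four Pansa commands from the example as argument lists."""
    verbose = ["--verbose", "INFO"]
    return [
        ["pansa", "extract_longest_protein", config, *verbose],
        ["pansa", "blastp_2_dagchainer", config, "--thread", str(threads),
         "--D", "1000000", "--g", "40000", "--A", "5",
         "--evalue", "1e-5", *verbose],
        ["pansa", "merge", config, "--output", output, *verbose],
        ["pansa", "polish", config, "--pan_matrix", output,
         "--outpng", "--outsvg", "--outpdf", *verbose],
    ]


commands = build_commands("config.txt", "SG_test")
for command in commands:
    print(shlex.join(command))
    # To execute each step and stop at the first failure, one could call:
    # subprocess.run(command, check=True)
```

Keeping the commands as argument lists avoids shell quoting problems, and the same output name is passed to both `merge` and `polish`, as in the example.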

Log Output Levels

You can control the log output level using the --verbose parameter. The available options are: CRITICAL, ERROR, WARNING, INFO, DEBUG, NOTSET.
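
These names match the standard level names of Python's `logging` module. The sketch below shows how such a name maps to a numeric level. Whether Pansa passes the value to `logging` in exactly this way is an assumption, and `parse_level` is a hypothetical helper.

```python
import logging

LEVEL_NAMES = ("CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG", "NOTSET")


def parse_level(name):
    """Translate a --verbose value into the numeric logging level."""
    upper = name.upper()
    if upper not in LEVEL_NAMES:
        raise ValueError(f"unknown log level: {name!r}")
    return getattr(logging, upper)


# "pansa_example" is an arbitrary logger name chosen for this sketch.
logger = logging.getLogger("pansa_example")
logger.setLevel(parse_level("INFO"))
```

With the level set to INFO, DEBUG messages are suppressed, while INFO, WARNING, ERROR, and CRITICAL messages are emitted.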

Developer Information

For more details and technical support, please contact the developers or visit the project homepage.

E-mail: caocao@cau.edu.cn
