A Python tool for panresistome analysis
Project description
PanR2: Panresistome Analysis Tool
Overview
PanR2 is a comprehensive Python-based tool for analyzing panresistome data. It processes NCBI and Abricate summary files, merges the data, and generates a wide range of visualizations including heatmaps, bar plots, boxplots, and interactive HTML plots. The tool is designed to help researchers analyze and visualize antibiotic resistance gene presence, prevalence, and distribution patterns across different geographic locations and temporal scales.
Prerequisites:
Key Features:
- Merges and analyzes NCBI and Abricate outputs
- Calculates gene presence/absence across multiple categories (continent, location, subcontinent, collection date)
- Performs comprehensive statistical tests and correlation analyses
- Generates multiple visualization types: heatmaps, bar plots, boxplots, lollipop plots, and correlation plots
- Creates interactive HTML visualizations for enhanced data exploration
- Generates an interactive HTML index for easy navigation of all results
- Provides detailed statistical analysis outputs
Installation
Method 1: Using pip with conda (Recommended)
conda create -n panr_env python=3.9
conda activate panr_env
pip install panR2
Method 2: Direct installation from GitHub
conda create -n panr_env python=3.8
conda activate panr_env
pip install git+https://github.com/Tasnimul-Arabi-Anik/PanR2.git
Method 3: Manual Installation from Source
git clone https://github.com/Tasnimul-Arabi-Anik/PanR2.git
cd PanR2
pip install -r requirements.txt
Confirm Installation
panr --help
Usage
Command-Line Interface
panr --ncbi-dir <NCBI_DIRECTORY> --abricate-dir <ABRICATE_DIRECTORY> --output-dir <OUTPUT_DIRECTORY> [OPTIONS]
Required Arguments
| Argument | Description |
|---|---|
--ncbi-dir |
Path to the ncbi_clean.csv file from FetchM |
--abricate-dir |
Directory containing Abricate summary .tab or .csv files |
--output-dir |
Directory to store merged results and visualizations |
Optional Arguments
| Argument | Type | Default | Description |
|---|---|---|---|
--genep |
float | - | Minimum % gene presence to include in heatmap |
--nseq |
int | - | Minimum number of sequences required per group in heatmaps |
--format |
str | png |
Output format for figures (tiff, svg, png, pdf) |
--version |
- | - | Show program's version number and exit |
-h, --help |
- | - | Show help message and exit |
Example Usage
# Basic usage
panr --ncbi-dir ./data/ncbi_clean.csv --abricate-dir ./data/abricate --output-dir ./output
# With optional parameters
panr --ncbi-dir ./data/ncbi_clean.csv --abricate-dir ./data/abricate --output-dir ./output --format pdf --genep 10 --nseq 5
Output Structure
PanR2 generates a comprehensive set of outputs organized in the following directory structure:
output/
├── figures/
│ ├── heatmap/ # Geographic heatmaps
│ ├── html_files/ # Interactive HTML plots
│ ├── mean_ARG/ # Mean antibiotic resistance gene plots
│ ├── Stat_analysis/ # Statistical analysis results
│ ├── index.html # Main navigation page
│ └── [Various static plots]
└── [Processed CSV files]
Output Files Description
1. Static Visualizations
Resistance_gene_presence.{format}- Bar plot showing gene presence across samplesResistance_gene_percentage.{format}- Lollipop plot showing gene percentage distributionResistance_gene_identity_boxplot.{format}- Boxplot of resistance gene identity scoresResistance_percentage_by_Antibiotics.{format}- Bar plot of resistance by antibiotic classes
2. Heatmaps (heatmap/ directory)
ncbi_ncbi_Continent_heatmap.{format}- Resistance gene distribution by continentncbi_ncbi_Geographic_Location_heatmap.{format}- Distribution by geographic locationncbi_ncbi_Subcontinent_heatmap.{format}- Distribution by subcontinentncbi_ncbi_Collection_Date_heatmap.{format}- Temporal distribution patterns
3. Mean ARG Analysis (mean_ARG/ directory)
Mean_ARG_by_Continent.{format}- Average antibiotic resistance genes by continentMean_ARG_by_Geographic Location.{format}- Average ARGs by geographic locationMean_ARG_by_Subcontinent.{format}- Average ARGs by subcontinentMean_ARG_by_Collection Date.{format}- Temporal trends in ARG abundance
4. Interactive HTML Visualizations (html_files/ directory)
Resistance_gene_distribution_heatmap.html- Interactive heatmap of gene distributionResistance_gene_geographic_distribution.html- Geographic distribution mapResistance_gene_frequency_boxplot.html- Interactive frequency analysisResistance_gene_identity_boxplot.html- Interactive identity score analysisResistance_gene_presence.html- Interactive presence/absence visualizationResistance_gene_percentage.html- Interactive percentage analysisResistance_percentage_by_Antibiotics.html- Interactive antibiotic class analysisMean_Frequency_Antibiotic_Resistance_genes.html- Mean frequency analysisContinent_correlation_plot.html- Continental correlation analysisGeographic_Location_correlation_plot.html- Location-based correlationsSubcontinent_correlation_plot.html- Subcontinental correlation patterns
5. Statistical Analysis (Stat_analysis/ directory)
combined_geographic_correlation_summary.csv- Geographic correlation statisticscombined_overall_tests.csv- Overall statistical test resultscombined_pairwise_comparisons.csv- Pairwise comparison resultscombined_summary_statistics.csv- Comprehensive summary statisticsncbi_gene_presence_count_percentage.csv- Gene presence counts and percentages
6. Navigation
index.html- Interactive HTML index page for easy navigation of all generated visualizations
Example Visualizations
Static Plots
Resistance Gene Presence Analysis:
Bar plot showing the presence of resistance genes across samples
Gene Percentage Distribution:
Lollipop plot displaying gene percentage distribution
Geographic Distribution Heatmap:
Heatmap showing resistance gene distribution across continents
Antibiotic Resistance by Classes:
Bar plot showing resistance patterns by antibiotic classes
Gene Identity Analysis:
Boxplot analysis of resistance gene identity scores
Sample Output Directory Structure
figures/
├── Resistance_gene_presence.png
├── Resistance_gene_percentage.png
├── Resistance_gene_identity_boxplot.png
├── Resistance_percentage_by_Antibiotics.png
├── heatmap/
│ ├── ncbi_ncbi_Continent_heatmap.png
│ ├── ncbi_ncbi_Geographic_Location_heatmap.png
│ ├── ncbi_ncbi_Subcontinent_heatmap.png
│ └── ncbi_ncbi_Collection_Date_heatmap.png
├── mean_ARG/
│ ├── Mean_ARG_by_Continent.png
│ ├── Mean_ARG_by_Geographic Location.png
│ ├── Mean_ARG_by_Subcontinent.png
│ └── Mean_ARG_by_Collection Date.png
└── html_files/
├── index.html (Main navigation page)
└── [Interactive HTML plots - open in browser]
Interactive Features
The tool generates interactive HTML visualizations that provide enhanced data exploration capabilities:
- Dynamic Interaction: Zooming, panning, and selection tools
- Detailed Tooltips: Hover for comprehensive data information
- Filtering Options: Dynamic data filtering and highlighting
- Export Capabilities: High-quality image export functionality
- Responsive Design: Optimized for different screen sizes
To Access Interactive Plots:
- Navigate to your output directory
- Open
figures/index.htmlin a web browser - Click on any visualization link to explore interactive plots
Available Interactive Visualizations:
- Geographic distribution maps with zoom capabilities
- Interactive heatmaps with data filtering
- Dynamic correlation plots with hover details
- Responsive boxplots and bar charts
- Time-series analysis with date selection
Note: Interactive HTML files must be opened in a web browser to view. GitHub README cannot display interactive HTML content directly.
Statistical Analysis Features
PanR2 provides comprehensive statistical analysis including:
- Correlation Analysis: Geographic and temporal correlations
- Comparative Statistics: Between-group comparisons
- Summary Statistics: Descriptive statistics for all variables
- Pairwise Comparisons: Detailed pairwise statistical tests
- Geographic Patterns: Spatial distribution analysis
Requirements
- Python 3.8+
- Required Python packages (automatically installed):
- pandas
- numpy
- matplotlib
- seaborn
- plotly
- scipy
- Other dependencies listed in
requirements.txt
Contributing
Contributions are welcome! Please feel free to submit a Pull Request or open an Issue for bugs, feature requests, or improvements.
License
This tool is provided under the MIT License. See LICENSE file for details.
Citation
If you use PanR2 in your research, please cite:
PanR2: A comprehensive tool for panresistome analysis and visualization
Author: Tasnimul Arabi Anik
GitHub: https://github.com/Tasnimul-Arabi-Anik/PanR2
Support
For questions, issues, or feature requests, please:
- Check the existing Issues
- Create a new issue with detailed information
- Contact the author: Tasnimul Arabi Anik
Changelog
Latest Updates
- Added interactive HTML visualizations
- Enhanced statistical analysis capabilities
- Improved output organization and navigation
- Added support for multiple figure formats
- Enhanced correlation analysis features
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file panr2-0.1.2.tar.gz.
File metadata
- Download URL: panr2-0.1.2.tar.gz
- Upload date:
- Size: 26.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
9c7a604ee1dd9ee689531d14a543b6dace330acdd111c5842e1583184d3deeb9
|
|
| MD5 |
7cdbc84a8b24e37d683bf9aa16c358c9
|
|
| BLAKE2b-256 |
3aac54c23bf1dc7661fa0ecad6e62a54719b4e1c471b79a9ad47790db7256e2e
|
File details
Details for the file panr2-0.1.2-py3-none-any.whl.
File metadata
- Download URL: panr2-0.1.2-py3-none-any.whl
- Upload date:
- Size: 26.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
d01a930403b038cafaf4046cbd445480db9d9a366b94f2bb7b5a93c186fa6fca
|
|
| MD5 |
549a3baf3530a2fbbe13279eb9ee4579
|
|
| BLAKE2b-256 |
5ef2ef141d8d8193d8eadb8ea7b118710f064b919a5f21ede79a7c703e10e548
|