A Python library for tornado chart generation and analysis
Project description
TornadoPy
A Python library for tornado chart generation and analysis. TornadoPy provides tools for processing Excel-based tornado data and generating professional tornado charts for uncertainty analysis.
Features
-
TornadoProcessor: Process Excel files containing tornado analysis data
- Parse multi-sheet Excel files with complex headers
- Extract and compute statistics (p90p10, mean, median, minmax, percentiles)
- Filter data by properties and dynamic fields
- Case selection with weighted criteria
- Batch processing for multiple parameters
- Optimized for performance with native numpy operations
-
tornado_plot: Generate professional tornado charts
- Customizable colors, fonts, and styling
- Support for p90/p10 ranges with automatic label placement
- Reference case lines
- Custom parameter ordering
- Export to various image formats
-
distribution_plot: Generate distribution histograms with cumulative curves
- Beautiful bin sizing with round numbers
- Cumulative distribution curve showing % of cases above value
- P90/P50/P10 percentile markers and subtitle
- Optional reference case line
- Multiple color schemes available
- Export to various image formats
Installation
Install from PyPI:
pip install tornadopy
Quick Start
Processing Tornado Data
from tornadopy import TornadoProcessor
# Load Excel file with tornado data
processor = TornadoProcessor("tornado_data.xlsb")
# Get available parameters
parameters = processor.get_parameters()
print(f"Parameters: {parameters}")
# Get properties for a parameter
properties = processor.get_properties(parameter="Parameter1")
print(f"Properties: {properties}")
# Compute statistics
result = processor.compute(
stats="p90p10",
parameter="Parameter1",
filters={"property": "npv"},
multiplier=1e-6 # Convert to millions
)
print(f"P90/P10: {result['p90p10']}")
Generating Tornado Charts
from tornadopy import TornadoProcessor, tornado_plot
# Get tornado data
processor = TornadoProcessor("tornado_data.xlsb")
tornado_data = processor.get_tornado_data(
parameters="all",
filters={"property": "npv"},
multiplier=1e-6
)
# Convert to sections format for plotting
sections = []
for param, data in tornado_data.items():
sections.append({
"parameter": param,
"minmax": [data["p10"], data["p90"]],
"p90p10": [data["p10"], data["p90"]]
})
# Generate tornado chart
fig, ax, saved = tornado_plot(
sections=sections,
title="NPV Tornado Chart",
subtitle="Base case = 100.0 MM USD",
base=100.0,
unit="MM USD",
outfile="tornado_chart.png"
)
Generating Distribution Charts
from tornadopy import TornadoProcessor, distribution_plot
# Get distribution data
processor = TornadoProcessor("tornado_data.xlsb")
distribution = processor.get_distribution(
parameter="Parameter1",
filters={"property": "npv"},
multiplier=1e-6
)
# Generate distribution chart
fig, ax, saved = distribution_plot(
distribution,
title="NPV Distribution",
unit="MM USD",
color="blue",
reference_case=100.0,
outfile="npv_distribution.png"
)
Advanced Usage
Multi-Zone Analysis with Batch Processing
Process multiple parameters at once with zone filtering and custom options:
from tornadopy import TornadoProcessor, tornado_plot
processor = TornadoProcessor("reservoir_data.xlsb")
# Compute statistics for all parameters with zone filtering
results = processor.compute_batch(
stats=["minmax", "p90p10"],
parameters="all",
filters={
"zones": ["Zone A - Reservoir", "Zone B - Reservoir"],
"property": "STOIIP"
},
multiplier=1e-3, # Convert to thousands
options={
"p90p10_threshold": 150, # Minimum cases required
"skip": ["sources"] # Skip source tracking for cleaner output
}
)
# Convert results to tornado plot format
sections = []
for result in results:
if "p90p10" in result and "errors" not in result:
p10, p90 = result["p90p10"]
sections.append({
"parameter": result["parameter"],
"minmax": result.get("minmax", [p10, p90]),
"p90p10": [p10, p90]
})
# Generate tornado chart
fig, ax, saved = tornado_plot(
sections,
title="STOIIP Tornado - Multi-Zone Analysis",
base=14.5, # Base case value
reference_case=14.2, # Reference case line
unit="MM m³",
outfile="stoiip_tornado.svg"
)
Distribution Plot with Custom Gridlines
Create distribution charts with percentile markers and custom grid settings:
from tornadopy import TornadoProcessor, distribution_plot
processor = TornadoProcessor("reservoir_data.xlsb")
# Get distribution data for specific zones
distribution = processor.get_distribution(
parameter="Uncertainty_Analysis",
filters={
"zones": ["Zone A - Reservoir", "Zone B - Reservoir"],
"property": "STOIIP"
},
multiplier=1e-3 # Convert to thousands
)
# Generate distribution chart with custom settings
fig, ax, saved = distribution_plot(
data=distribution,
title="STOIIP Distribution - Uncertainty Analysis",
unit="MM m³",
color="blue",
reference_case=14.5,
target_bins=20,
settings={
"show_percentile_markers": True, # Show P90/P50/P10 markers
"marker_size": 8,
"show_minor_grid": True,
# Custom gridline intervals
"x_major_interval": 5, # Major x-gridlines every 5 units
"x_minor_interval": 1, # Minor x-gridlines every 1 unit
"y_major_interval": 50, # Major y-gridlines every 50 frequency
"y_minor_interval": 10, # Minor y-gridlines every 10 frequency
},
outfile="stoiip_distribution.svg"
)
Working with Multiple Properties
Analyze multiple properties simultaneously:
# Compute statistics for multiple properties
result = processor.compute(
stats=["p90p10", "mean", "median"],
parameter="Reservoir_Model",
filters={
"zones": ["Main_Reservoir"],
"property": ["STOIIP", "GIIP"] # Multiple properties
},
multiplier=1e-6 # Convert to millions
)
# Access results by property
stoiip_p90, stoiip_p10 = result["p90p10"][0] # First property (STOIIP)
giip_p90, giip_p10 = result["p90p10"][1] # Second property (GIIP)
print(f"STOIIP P90/P10: {stoiip_p90:.2f} / {stoiip_p10:.2f} MM m³")
print(f"GIIP P90/P10: {giip_p90:.2f} / {giip_p10:.2f} bcm")
Case Selection with Weighted Criteria
Find specific cases that match target percentiles:
# Find closest cases to p90/p10 with custom weights
result = processor.compute(
stats="p90p10",
parameter="Reservoir_Model",
filters={
"zones": ["Main_Reservoir"],
"property": "STOIIP"
},
multiplier=1e-6,
case_selection=True, # Enable case selection
selection_criteria={
"weights": {"STOIIP": 0.6, "GIIP": 0.4} # Weighted criteria
}
)
# Access closest cases
for case in result["closest_cases"]:
print(f"Case {case['case']}: index={case['idx']}, STOIIP={case['STOIIP']:.2f}")
print(f" Properties: {case['properties']}")
Skipping Specific Parameters
Exclude certain parameters from batch processing:
# Process all parameters except specific ones
results = processor.compute_batch(
stats="p90p10",
parameters="all",
filters={"property": "STOIIP"},
multiplier=1e-3,
options={
"skip_parameters": ["Reference_Case", "Full_Uncertainty"], # Skip these
"skip": ["sources", "errors"] # Skip these fields in output
}
)
Custom Tornado Chart Styling
Full control over chart appearance:
# Custom styling for professional reports
settings = {
"figsize": (12, 8),
"dpi": 200,
"pos_dark": "#1E88E5", # Blue for positive
"neg_dark": "#D32F2F", # Red for negative
"show_values": ["min", "max", "p10", "p90"],
"show_percentage_diff": True,
}
fig, ax, saved = tornado_plot(
sections=sections,
title="Reservoir Volume Sensitivity Analysis",
subtitle="Base Case: 100 MM m³",
base=100.0,
reference_case=95.0,
unit="MM m³",
preferred_order=["Porosity", "NTG", "Area"], # Custom parameter order
settings=settings,
outfile="sensitivity_analysis.png"
)
Common Workflows
Complete Reservoir Uncertainty Analysis
End-to-end workflow for reservoir analysis with tornado and distribution charts:
from tornadopy import TornadoProcessor, tornado_plot, distribution_plot
import matplotlib.pyplot as plt
# Load data
processor = TornadoProcessor("reservoir_uncertainty.xlsb")
# Define common filters
zones = ["Main Reservoir - SST1", "Main Reservoir - SST2"]
multiplier = 1e-3 # Convert to thousands
# 1. Generate STOIIP Tornado Chart
stoiip_results = processor.compute_batch(
stats=["minmax", "p90p10"],
parameters="all",
filters={
"zones": zones,
"property": "STOIIP"
},
multiplier=multiplier,
options={
"p90p10_threshold": 150,
"skip_parameters": ["Reference_Case", "Full_Uncertainty"]
}
)
# Convert to tornado format
sections = []
for result in stoiip_results:
if "p90p10" in result and "errors" not in result:
p10, p90 = result["p90p10"]
min_val, max_val = result.get("minmax", [p10, p90])
sections.append({
"parameter": result["parameter"],
"minmax": [min_val, max_val],
"p90p10": [p10, p90]
})
# Create tornado chart
fig1, ax1, saved1 = tornado_plot(
sections,
title="STOIIP Sensitivity Analysis",
base=14.5,
reference_case=14.2,
unit="MM m³",
outfile="stoiip_tornado.svg"
)
# 2. Generate Distribution Chart
distribution = processor.get_distribution(
parameter="Full_Uncertainty",
filters={
"zones": zones,
"property": "STOIIP"
},
multiplier=multiplier
)
fig2, ax2, saved2 = distribution_plot(
data=distribution,
title="STOIIP Distribution - Full Uncertainty",
unit="MM m³",
color="blue",
reference_case=14.5,
settings={
"show_percentile_markers": True,
"x_major_interval": 5,
"x_minor_interval": 1,
},
outfile="stoiip_distribution.svg"
)
# Show both charts
plt.show()
print(f"Charts saved: {saved1}, {saved2}")
Comparing Multiple Scenarios
Compare different reservoir scenarios side by side:
from tornadopy import TornadoProcessor, distribution_plot
import matplotlib.pyplot as plt
import numpy as np
processor = TornadoProcessor("scenarios.xlsb")
# Define scenarios
scenarios = [
{"name": "Base Case", "param": "Base_Case", "color": "blue"},
{"name": "Optimistic", "param": "Optimistic", "color": "green"},
{"name": "Pessimistic", "param": "Pessimistic", "color": "red"},
]
# Create subplots for comparison
fig, axes = plt.subplots(1, 3, figsize=(18, 6))
for idx, scenario in enumerate(scenarios):
dist = processor.get_distribution(
parameter=scenario["param"],
filters={"property": "NPV"},
multiplier=1e-6
)
distribution_plot(
data=dist,
title=f"{scenario['name']} Scenario",
unit="MM USD",
color=scenario["color"],
target_bins=15,
outfile=None # Don't save individual plots
)
# Move the plot to the subplot
plt.close()
plt.tight_layout()
plt.savefig("scenario_comparison.png", dpi=200)
plt.show()
Tips and Best Practices
Working with Filters
Zone Filtering:
# Single zone
filters = {"zones": "Main Reservoir", "property": "STOIIP"}
# Multiple zones (will sum values across zones)
filters = {"zones": ["Zone A", "Zone B"], "property": "STOIIP"}
Property Filtering:
# Single property
filters = {"property": "STOIIP"}
# Multiple properties (returns separate results for each)
filters = {"property": ["STOIIP", "GIIP"]}
Using Multipliers
Convert units easily with the multiplier parameter:
# Convert to thousands (mcm → MM m³)
multiplier = 1e-3
# Convert to millions (m³ → MM m³)
multiplier = 1e-6
# Convert to billions (m³ → bcm)
multiplier = 1e-9
Skipping Parameters
Exclude specific parameters from batch processing:
options = {
"skip_parameters": ["Reference_Case", "Full_Uncertainty"], # Skip these parameters
"skip": ["sources", "errors"] # Skip these fields in results
}
Handling Errors
results = processor.compute_batch(
stats="p90p10",
parameters="all",
filters={"property": "STOIIP"},
options={"skip": ["errors"]} # Hide error messages
)
# Check for errors in results
for result in results:
if "errors" in result:
print(f"Parameter {result['parameter']} had errors: {result['errors']}")
elif "p90p10" in result:
print(f"Parameter {result['parameter']}: P90/P10 = {result['p90p10']}")
Performance Tips
-
Use batch processing for multiple parameters:
# Good: Single call for all parameters results = processor.compute_batch(stats="p90p10", parameters="all", ...) # Avoid: Multiple calls for param in parameters: result = processor.compute(stats="p90p10", parameter=param, ...)
-
Skip unnecessary data:
options = { "skip": ["sources", "errors"], # Reduces memory usage }
-
Set appropriate thresholds:
options = { "p90p10_threshold": 150, # Require minimum cases for reliable statistics }
Excel File Format
TornadoPy expects Excel files with the following structure:
[Info rows - optional metadata]
Header Row 1 | Dynamic Field 1 | Dynamic Field 1 | ...
Header Row 2 | Value A | Value B | ...
Case | Property 1 | Property 2 | ...
1 | 123.45 | 67.89 | ...
2 | 234.56 | 78.90 | ...
...
- Multiple header rows are supported and will be combined
- The "Case" row marks the start of data
- Dynamic fields in column A define metadata columns
- Property names are extracted from the last header row
API Reference
TornadoProcessor
Methods
get_parameters(): Get list of available parameters (sheet names)get_properties(parameter): Get available properties for a parameterget_unique(field, parameter): Get unique values for a dynamic fieldget_info(parameter): Get metadata for a parameterget_case(index, parameter): Get data for a specific casecompute(stats, parameter, filters, multiplier, options, case_selection, selection_criteria): Compute statisticscompute_batch(...): Batch compute for multiple parametersget_tornado_data(...): Get tornado chart formatted data
tornado_plot
Parameters
sections: List of section dictionaries with parameter datatitle: Chart titlesubtitle: Chart subtitleoutfile: Output file pathbase: Base case valuereference_case: Reference case line valueunit: Unit labelpreferred_order: List of parameter names for custom orderingsettings: Dictionary of visual settings
Returns
fig: Matplotlib figure objectax: Matplotlib axes objectsaved: Path to saved file (if outfile specified)
distribution_plot
Parameters
data: Array-like data (numpy array, list, or from get_distribution)title: Chart title (default "Distribution")unit: Unit label for x-axis and subtitleoutfile: Output file path (if specified, saves the figure)target_bins: Target number of bins for histogram (default 20)color: Color scheme - "red", "blue", "green", "orange", "purple", "fuchsia", "yellow"reference_case: Optional reference case value to plot as vertical linesettings: Dictionary of visual settings to override defaults
Settings Options
Common settings for customizing distribution plots:
settings = {
# Layout
"figsize": (10, 6),
"dpi": 160,
# Percentile markers
"show_percentile_markers": True, # Show P90/P50/P10 on cumulative curve
"marker_size": 8,
# Grid customization
"show_minor_grid": True,
"x_major_interval": 5, # Major x-gridlines every 5 units
"x_minor_interval": 1, # Minor x-gridlines every 1 unit
"y_major_interval": 50, # Major y-gridlines every 50 frequency
"y_minor_interval": 10, # Minor y-gridlines every 10 frequency
# Font sizes
"title_fontsize": 15,
"subtitle_fontsize": 11,
"label_fontsize": 10,
}
Returns
fig: Matplotlib figure objectax: Matplotlib axes object (primary)saved: Path to saved file (if outfile specified)
Requirements
- Python >= 3.9
- numpy >= 1.20.0
- polars >= 0.18.0
- fastexcel >= 0.9.0
- matplotlib >= 3.5.0
License
MIT License - see LICENSE file for details
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Support
For issues and questions, please open an issue on GitHub.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file tornadopy-0.1.7.tar.gz.
File metadata
- Download URL: tornadopy-0.1.7.tar.gz
- Upload date:
- Size: 31.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
a3bc7bcec529febb123447885707640bba36da17b9152d20bcabfe96059ae353
|
|
| MD5 |
0cf64c2432f30ca955a503a3a552e541
|
|
| BLAKE2b-256 |
d27394dbd6045f9dc6cc29ab7c7631073afa9853ba4941fd1362fb3b9f7a2655
|
File details
Details for the file tornadopy-0.1.7-py3-none-any.whl.
File metadata
- Download URL: tornadopy-0.1.7-py3-none-any.whl
- Upload date:
- Size: 25.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
a54c614de357444499c1f395930b70b079189ab7b4c8a69559b17c4a0102d616
|
|
| MD5 |
242ca7f37c4195bf0bfbca72486ff6ac
|
|
| BLAKE2b-256 |
bbdd8e2537ff64a93023ce2a7b9467a263f28aa692cd8877775082948b29760e
|