miToolsPro is a collection of tools for data analysis and visualization.

miToolsPro

A comprehensive Python toolkit for data analysis, visualization, and research workflows. Features 17 specialized modules covering plotting, econometric modeling, clustering analysis, economic complexity, document processing, and modern API integrations.

Installation

pip install mitoolspro

Requirements: Python 3.12+

Quick Start

from mitoolspro.plotting import LinePlotter
from mitoolspro.clustering import kmeans_clustering
import numpy as np

# Create sample data
data = np.random.rand(100, 2)

# Create a line plot
plotter = LinePlotter(x_data=data[:, 0], y_data=data[:, 1])
plotter.plot()

# Perform clustering
model, labels = kmeans_clustering(data, n_clusters=3)

Core Modules

📊 Plotting (mitoolspro.plotting)

Professional-grade plotting with matplotlib compatibility, type validation, and composable architecture.

Available Plotters:

  • BarPlotter, BoxPlotter, ScatterPlotter, LinePlotter
  • HistogramPlotter, PiePlotter, SankeyPlotter
  • DistributionPlotter, ErrorPlotter

Features:

  • Type-safe parameter validation with Pydantic
  • Plot composition with PlotComposer
  • Mixin architecture for shared functionality
  • matplotlib API compatibility

from mitoolspro.plotting import PlotComposer, BarPlotter, LinePlotter

# Create individual plots
bar = BarPlotter(x_data=['A', 'B', 'C'], y_data=[1, 2, 3])
line = LinePlotter(x_data=[1, 2, 3], y_data=[1, 4, 2])

# Compose multiple plots
composer = PlotComposer()
composer.add_plot(bar, position=(0, 0))
composer.add_plot(line, position=(0, 1))
composer.compose(figsize=(12, 6))
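
For readers who know matplotlib, the composition above corresponds roughly to a grid with one row and two columns. The sketch below uses plain matplotlib only; it is not how PlotComposer is implemented, and the helper name compose_bar_and_line is hypothetical.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def compose_bar_and_line(figsize=(12, 6)):
    """Draw a bar chart and a line chart side by side (hypothetical helper)."""
    fig, (ax_bar, ax_line) = plt.subplots(nrows=1, ncols=2, figsize=figsize)
    ax_bar.bar(["A", "B", "C"], [1, 2, 3])   # grid position (0, 0)
    ax_line.plot([1, 2, 3], [1, 4, 2])       # grid position (0, 1)
    fig.tight_layout()
    return fig


fig = compose_bar_and_line()
```

The returned object is an ordinary matplotlib Figure, so it can be saved with fig.savefig or adjusted further through its axes.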

📈 Econometric Analysis (mitoolspro.regressions)

Professional econometric modeling with comprehensive diagnostic tools.

Model Types:

  • OLS: Ordinary Least Squares with robust standard errors
  • Panel: Fixed/Random effects, time series cross-sectional data
  • IV: Instrumental Variables, Two-Stage Least Squares
  • Quantile: Quantile regression across multiple quantiles
  • Regime: Regime switching and structural break models
  • Factor: Multi-factor models and principal components

from mitoolspro.regressions.linear_models import OLSModel
import pandas as pd

# Load your data
data = pd.read_csv('your_data.csv')

# Fit OLS model
model = OLSModel(
    data=data,
    dependent_variable='price',
    independent_variables=['size', 'location'],
    control_variables=['year']
)
results = model.fit()
print(results.results.summary())
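
To make "OLS with robust standard errors" concrete, here is a minimal NumPy sketch of ordinary least squares with heteroskedasticity-robust (HC1) standard errors. It illustrates the general technique under stated assumptions and is not OLSModel's internals; the function name ols_with_robust_se and the synthetic size/price data are made up for this example.

```python
import numpy as np


def ols_with_robust_se(X, y):
    """Return OLS coefficients and HC1 robust standard errors.

    X is an (n, k) matrix of regressors without an intercept column;
    a constant column is added here.
    """
    X = np.column_stack([np.ones(len(X)), X])
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    residuals = y - X @ beta
    # Sandwich estimator: (X'X)^-1 X' diag(e^2) X (X'X)^-1, scaled by n / (n - k)
    meat = X.T @ (X * residuals[:, None] ** 2)
    cov = xtx_inv @ meat @ xtx_inv * n / (n - k)
    return beta, np.sqrt(np.diag(cov))


# Synthetic data (assumption): price depends linearly on size plus noise
rng = np.random.default_rng(0)
size = rng.uniform(50, 200, 500)
price = 10.0 + 2.0 * size + rng.normal(0, 5, 500)
beta, se = ols_with_robust_se(size[:, None], price)
```

Here beta[0] is the intercept and beta[1] the slope on size; se holds the matching robust standard errors.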

Advanced Quantile Regression:

from mitoolspro.regressions.managers import QuantilesRegression
from mitoolspro.regressions.wrappers.linear_models import QuantilesRegressionSpecs

# Define regression specifications
specs = QuantilesRegressionSpecs(
    dataframe=data,
    dependent_variable='price',
    independent_variables=['size', 'location']
)

# Run quantile regression across multiple quantiles
qr = QuantilesRegression(specs)
results = qr.fit(quantiles=[0.25, 0.5, 0.75])

🎯 Clustering Analysis (mitoolspro.clustering)

Complete clustering pipeline with evaluation metrics and visualization.

Algorithms:

  • K-means clustering with automatic optimization
  • Agglomerative hierarchical clustering
  • Automatic cluster number detection

Evaluation & Visualization:

  • Silhouette analysis and scoring
  • Centroid calculation and visualization
  • Distance metrics and similarity measures
  • Growth analysis and cluster size tracking

from mitoolspro.clustering import clustering_ncluster_search, plot_clusters_growth
import numpy as np

# Generate sample data
data = np.random.rand(200, 4)

# Find optimal number of clusters
best_n, results = clustering_ncluster_search(data, n_range=(2, 10))
print(f"Optimal clusters: {best_n}")

# Visualize cluster analysis
plot_clusters_growth(results)
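
One common way to detect the number of clusters automatically is to fit K-means for each candidate count and keep the one with the highest silhouette score. The sketch below does this with scikit-learn directly. It is an assumption about the general approach, not a description of clustering_ncluster_search; the function search_n_clusters and its inclusive n_range are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def search_n_clusters(data, n_range=(2, 10), random_state=0):
    """Return the cluster count with the highest silhouette score, plus all scores."""
    scores = {}
    for n in range(n_range[0], n_range[1] + 1):  # upper bound treated as inclusive
        model = KMeans(n_clusters=n, n_init=10, random_state=random_state)
        labels = model.fit_predict(data)
        scores[n] = silhouette_score(data, labels)
    best_n = max(scores, key=scores.get)
    return best_n, scores


# Three well-separated blobs, so the search is expected to favor 3 clusters
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
data = np.vstack([c + rng.normal(0, 0.5, size=(50, 2)) for c in centers])
best_n, scores = search_n_clusters(data, n_range=(2, 6))
```

The scores dictionary maps each candidate count to its silhouette score, which is handy for plotting the search curve.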

🌍 Economic Complexity Analysis (mitoolspro.economic_complexity)

Advanced tools for trade analysis and economic complexity calculations.

Core Functions:

  • ECI/PCI Calculation: Economic and Product Complexity Indices
  • RCA Analysis: Revealed Comparative Advantage matrices
  • Proximity Networks: Product and country similarity analysis
  • GPU Acceleration: PyTorch integration for large datasets

from mitoolspro.economic_complexity import (
    calculate_economic_complexity,
    calculate_exports_matrix_rca,
    calculate_proximity_matrix,
    exports_data_to_matrix,
)
import pandas as pd

# Process trade data
trade_data = pd.read_csv('trade_data.csv')
rca_matrix = calculate_exports_matrix_rca(trade_data, 'country', 'product', 'value')

# Calculate complexity indices
eci, pci = calculate_economic_complexity(rca_matrix.values)

# Build proximity networks
proximity = calculate_proximity_matrix(rca_matrix.values)
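
For orientation, the standard definitions behind RCA and product proximity can be written in a few lines of NumPy. This is a sketch of the textbook formulas (Balassa RCA, and proximity as the minimum conditional probability of co-exporting two products). Whether miToolsPro follows exactly these conventions is an assumption, and the function names and toy export matrix are made up for the example.

```python
import numpy as np


def balassa_rca(exports):
    """Balassa RCA for a countries x products export matrix."""
    country_share = exports / exports.sum(axis=1, keepdims=True)
    world_share = exports.sum(axis=0) / exports.sum()
    return country_share / world_share


def product_proximity(rca, threshold=1.0):
    """Proximity: countries exporting both products over the larger ubiquity."""
    m = (rca >= threshold).astype(float)  # binary specialization matrix
    co_exports = m.T @ m                  # countries specialized in both products
    ubiquity = m.sum(axis=0)              # countries specialized in each product
    return co_exports / np.maximum.outer(ubiquity, ubiquity)


# Toy data (assumption): 3 countries (rows) x 3 products (columns)
exports = np.array([[10.0, 0.0, 5.0],
                    [0.0, 8.0, 2.0],
                    [4.0, 4.0, 4.0]])
rca = balassa_rca(exports)
phi = product_proximity(rca)
```

The sketch assumes every country exports something and every product has at least one specialized exporter; real trade data needs guards against division by zero.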

🤖 LLM Integration (mitoolspro.llms)

Production-ready LLM clients with usage tracking and cost management.

Supported Providers:

  • OpenAI: GPT models with structured output support
  • Ollama: Local LLM deployment and management

Features:

  • Token usage tracking and cost calculation
  • Persistent usage history across sessions
  • Model registry with pricing information
  • Beta features support (structured outputs)

from mitoolspro.llms import OpenAIClient, PersistentTokensCounter

# Set up usage tracking
counter = PersistentTokensCounter(
    file_path="usage.json",
    source="openai",
    model="gpt-4o-mini"
)

# Create client with usage tracking
client = OpenAIClient(
    api_key="your-api-key",
    model="gpt-4o-mini",
    counter=counter
)

# Make requests with automatic cost tracking
response = client.request("Analyze this data trend...")
print(f"Total cost: ${counter.calculate_total_cost():.4f}")
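
The idea of persistent usage tracking can be sketched with the standard library alone: append each request's token counts to a JSON file and compute cost from a price table. Everything below is illustrative. JsonUsageCounter, the "example-model" name, and the prices are hypothetical and do not reflect PersistentTokensCounter's actual file format or any provider's real pricing.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical prices in USD per 1M tokens; real prices vary by provider and date
PRICES_PER_MILLION = {"example-model": {"input": 0.50, "output": 1.50}}


class JsonUsageCounter:
    """Append token usage to a JSON file and total the cost (illustrative only)."""

    def __init__(self, file_path, model):
        self.path = Path(file_path)
        self.model = model
        # Reload earlier records so totals persist across sessions
        self.records = json.loads(self.path.read_text()) if self.path.exists() else []

    def update(self, input_tokens, output_tokens):
        self.records.append(
            {"model": self.model, "input": input_tokens, "output": output_tokens}
        )
        self.path.write_text(json.dumps(self.records))

    def total_cost(self):
        total = 0.0
        for record in self.records:
            price = PRICES_PER_MILLION[record["model"]]
            total += record["input"] * price["input"] / 1e6
            total += record["output"] * price["output"] / 1e6
        return total


usage_file = Path(tempfile.mkdtemp()) / "usage.json"
counter = JsonUsageCounter(usage_file, model="example-model")
counter.update(input_tokens=1_000_000, output_tokens=2_000_000)

# A new instance pointed at the same file sees the earlier usage
reloaded = JsonUsageCounter(usage_file, model="example-model")
```

Rewriting the whole file on every update keeps the sketch short; a production counter would more likely append records or use a database.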

📄 Document Processing (mitoolspro.document)

Extract and analyze content from PDF and DOCX documents.

PDF Processing:

  • Text extraction with layout preservation
  • Document structure analysis (pages, blocks, lines)
  • Font and formatting detection
  • Metadata extraction

Document Generation:

  • DOCX file creation and manipulation
  • Text styling and formatting
  • Table and image insertion

from mitoolspro.document import pdf_to_document
from mitoolspro.document.write_document import create_docx_document

# Extract structured content from PDF
document = pdf_to_document("report.pdf")
for page in document.pages:
    print(f"Page {page.page_number}: {len(page.lines)} lines")

# Create new DOCX document
create_docx_document(
    filename="output.docx",
    title="Analysis Report",
    content_blocks=[("heading", "Results"), ("paragraph", "Analysis complete.")]
)

Specialized Modules

🌐 Google API Integration (mitoolspro.google_utils)

Places API:

  • Location search and analysis
  • Geospatial data processing
  • Business saturation studies

YouTube API:

  • Video download and conversion
  • Metadata extraction
  • Batch processing workflows

🕸️ Networks (mitoolspro.networks)

Interactive network visualization with pyvis integration.

🗄️ Databases (mitoolspro.databases)

SQLAlchemy and SQLite utilities for data persistence.

📁 Files (mitoolspro.files)

Multi-format file handlers: Excel, PDF, ICS, and document conversion.

🔤 NLP (mitoolspro.nlp)

Text processing with spaCy and Transformers integration.

🕷️ Scraping (mitoolspro.scraping)

Web scraping tools with Selenium automation.

🛠️ Utilities (mitoolspro.utils)

Development tools, decorators, and context managers.

Example Notebooks

Comprehensive examples in the examples/ directory:

Plotting: bar_plotter.ipynb, composer.ipynb

Analysis: clustering.ipynb, networks.ipynb

Regression: ols.ipynb, ivars.ipynb

Development

# Clone and install for development
git clone https://github.com/montanon/miToolsPro.git
cd miToolsPro
uv sync --group dev

# Run tests with coverage
uv run pytest tests/ --cov=mitoolspro

# Generate coverage report
uv run coverage html

Technical Details

  • 77% Test Coverage across 94 test files
  • Type Safety with comprehensive annotations
  • Exception Handling with 83 custom exception classes
  • Modern Architecture with lazy loading and abstract base classes
  • Performance optimized with parallel processing support

Dependencies

Core Stack:

  • Data: pandas, numpy, scipy
  • Visualization: matplotlib, seaborn, plotly
  • ML/Stats: scikit-learn, statsmodels, torch
  • Documents: pymupdf, python-docx, pdfminer
  • Web: seleniumbase, requests

Full dependency list: See pyproject.toml

License

MIT License - see LICENSE file.

Copyright (c) 2025 Sebastián Montagna
