Skip to main content

EIR Estimation using Machine learning INTerventions - Python port

Project description

estiMINT (Python)

Python port of the estiMINT R package for EIR (Entomological Inoculation Rate) estimation using machine learning.

Installation

pip install -e .

Or install dependencies directly:

pip install -r requirements.txt

File Mapping (R → Python)

R File Python File Description
estiMINT-package.R __init__.py Package initialization and exports
globals.R globals.py Global variables and constants
utils.R utils.py Utility functions (metrics, QMAP, etc.)
data_processing.R data_processing.py Data loading and preprocessing
models.R models.py XGBoost model training
train.R train.py Main training pipeline with K-fold CV
plotting.R plotting.py Visualization functions
storage.R storage.py Model persistence and loading
run.R run.py Model inference

API Reference

Training

from estimint import train_xgb_model

model = train_xgb_model(
    in_parquet="data/input.parquet",
    out_dir="output/",
    thr_lo=0.02,           # Lower prevalence threshold
    thr_hi=0.95,           # Upper prevalence threshold
    k_strata=16,           # K-means strata for EIR
    K=10,                  # CV folds
    seed=42,
    save_pkl=True,
    save_plots=True,
    save_artifacts=True
)

Inference

from estimint import load_xgb_model, run_xgb_model
import pandas as pd

# Load model
model = load_xgb_model("output/models/estiMINT_model.pkl")

# Prepare input data
new_data = pd.DataFrame({
    "dn0_use": [0.5],
    "Q0": [0.3],
    "phi_bednets": [0.6],
    "seasonal": [1],
    "itn_use": [0.7],
    "irs_use": [0.2],
    "prev_y9": [0.15]  # or "prevalence"
})

# Run prediction
eir_predictions = run_xgb_model(new_data, model)
print(f"Predicted EIR: {eir_predictions[0]:.2f}")

Using Global Model

from estimint import load_xgb_model, run_xgb_model, set_global_model

# Set global model once
model = load_xgb_model("output/models/estiMINT_model.pkl")
set_global_model(model)

# Run predictions without passing model
predictions = run_xgb_model(new_data)  # Uses global model

Utility Functions

from estimint import (
    r2, rmse, mse, mae, median_ae, mae_rel, rmsle, smape,
    fit_qmap_w, predict_qmap_w, scale_pos
)

# Calculate metrics
y_true = [1, 2, 3, 4, 5]
y_pred = [1.1, 2.2, 2.9, 4.1, 4.8]

print(f"R²: {r2(y_true, y_pred):.4f}")
print(f"RMSE: {rmse(y_true, y_pred):.4f}")
print(f"MAE: {mae(y_true, y_pred):.4f}")

# Quantile mapping calibration
cal = fit_qmap_w(y_pred, y_true)
y_calibrated = predict_qmap_w(y_pred, cal)

Data Processing

from estimint import load_and_filter, make_value_weights, strata_and_split

# Load and filter parquet data
result = load_and_filter("data.parquet", thr_lo=0.02, thr_hi=0.95)
df = result["DT"]
df_excluded = result["DT_excluded"]

# Create inverse-frequency weights
weights = make_value_weights(df["eir"].values, digits=3)

# Stratified split
df["eir_log10"] = np.log10(df["eir"])
df = strata_and_split(df, k_strata=16, seed=42)

Key Differences from R Version

  1. File format: Models saved as .pkl (pickle) instead of .rds
  2. Data handling: Uses pandas instead of data.table
  3. Plotting: Uses matplotlib instead of ggplot2
  4. Global model: Use set_global_model() / get_global_model() instead of .GlobalEnv

Dependencies

  • numpy >= 1.20.0
  • pandas >= 1.3.0
  • duckdb >= 0.8.0
  • xgboost >= 1.6.0
  • scikit-learn >= 1.0.0
  • matplotlib >= 3.4.0
  • requests >= 2.28.0 (optional, for model download)
  • appdirs >= 1.4.0 (optional, for cache directory)

License

MIT License

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

estimint-1.1.0.tar.gz (6.4 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

estimint-1.1.0-py3-none-any.whl (6.5 MB view details)

Uploaded Python 3

File details

Details for the file estimint-1.1.0.tar.gz.

File metadata

  • Download URL: estimint-1.1.0.tar.gz
  • Upload date:
  • Size: 6.4 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for estimint-1.1.0.tar.gz
Algorithm Hash digest
SHA256 1de21bacd1c2f7fc1a1df58f8bb491677fc9edf228208c4ace7f71cb26e0ebbd
MD5 4ffe12881958a3147c88b8276cd094d8
BLAKE2b-256 4eb7b8717048667e9395e0f2d1082f2d4691bee8a2d7f96b4c9fd17a6e2d6178

See more details on using hashes here.

File details

Details for the file estimint-1.1.0-py3-none-any.whl.

File metadata

  • Download URL: estimint-1.1.0-py3-none-any.whl
  • Upload date:
  • Size: 6.5 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for estimint-1.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 cbbddf07b476f169171a68fc75eb9bd4b9ebcf2b0bee007ece50d62983b9b186
MD5 abfb41fd2758cd6159ed7fd715c196b3
BLAKE2b-256 b76e72f9ded58d539668d43a3fd70f1e7ad64414a73199735d744747ae7ed0c0

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page