EIR Estimation using Machine learning INTerventions - Python port

estiMINT (Python)

Python port of the estiMINT R package for EIR (Entomological Inoculation Rate) estimation using machine learning.

Installation

pip install -e .

Or install only the dependencies:

pip install -r requirements.txt

File Mapping (R → Python)

R File              Python File         Description
estiMINT-package.R  __init__.py         Package initialization and exports
globals.R           globals.py          Global variables and constants
utils.R             utils.py            Utility functions (metrics, QMAP, etc.)
data_processing.R   data_processing.py  Data loading and preprocessing
models.R            models.py           XGBoost model training
train.R             train.py            Main training pipeline with K-fold CV
plotting.R          plotting.py         Visualization functions
storage.R           storage.py          Model persistence and loading
run.R               run.py              Model inference

API Reference

Training

from estimint import train_xgb_model

model = train_xgb_model(
    in_parquet="data/input.parquet",
    out_dir="output/",
    thr_lo=0.02,           # Lower prevalence threshold
    thr_hi=0.95,           # Upper prevalence threshold
    k_strata=16,           # K-means strata for EIR
    K=10,                  # CV folds
    seed=42,
    save_pkl=True,
    save_plots=True,
    save_artifacts=True
)
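The k_strata and K parameters suggest that cross-validation folds are balanced across EIR strata. As a rough illustration only, the sketch below assigns folds within strata of log10(EIR). It uses equal-count quantile bins as a simple stand-in for the package's K-means strata. The function name assign_stratified_folds and the synthetic EIR values are assumptions made for this example and are not part of estiMINT.

```python
import math
import random
from collections import Counter


def assign_stratified_folds(eir, n_strata=4, n_folds=5, seed=42):
    """Return (strata, folds): one stratum id and one fold id per EIR value."""
    log_eir = [math.log10(v) for v in eir]
    # Rank observations by log10(EIR) and cut the ranking into equal-count bins.
    order = sorted(range(len(log_eir)), key=lambda i: log_eir[i])
    strata = [0] * len(eir)
    for rank, idx in enumerate(order):
        strata[idx] = min(rank * n_strata // len(eir), n_strata - 1)
    # Within each stratum, shuffle and deal members out to folds round-robin.
    rng = random.Random(seed)
    folds = [0] * len(eir)
    for s in range(n_strata):
        members = [i for i, st in enumerate(strata) if st == s]
        rng.shuffle(members)
        for pos, idx in enumerate(members):
            folds[idx] = pos % n_folds
    return strata, folds


# 40 synthetic EIR values spanning 0.1 to roughly 800 (illustrative only)
eir_values = [10 ** (i / 10) for i in range(-10, 30)]
strata, folds = assign_stratified_folds(eir_values, n_strata=4, n_folds=5)
print(Counter(strata))
print(Counter(folds))
```

Dealing members out within each stratum keeps every fold's EIR distribution close to the overall one, which matters when EIR spans several orders of magnitude.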

Inference

from estimint import load_xgb_model, run_xgb_model
import pandas as pd

# Load model
model = load_xgb_model("output/models/estiMINT_model.pkl")

# Prepare input data
new_data = pd.DataFrame({
    "dn0_use": [0.5],
    "Q0": [0.3],
    "phi_bednets": [0.6],
    "seasonal": [1],
    "itn_use": [0.7],
    "irs_use": [0.2],
    "prev_y9": [0.15]  # this column may instead be named "prevalence"
})

# Run prediction
eir_predictions = run_xgb_model(new_data, model)
print(f"Predicted EIR: {eir_predictions[0]:.2f}")
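Before calling run_xgb_model, it can help to confirm that the input has the columns used in the example above. The helper below is a hypothetical pre-check written for this README. The name missing_features and both constants are assumptions, and whether run_xgb_model performs its own validation is not stated here.

```python
# Column names taken from the inference example; the helper itself is hypothetical.
REQUIRED_FEATURES = ["dn0_use", "Q0", "phi_bednets", "seasonal", "itn_use", "irs_use"]
PREVALENCE_ALIASES = ("prev_y9", "prevalence")


def missing_features(columns):
    """Return the expected input columns that are absent from `columns`."""
    present = set(columns)
    missing = [name for name in REQUIRED_FEATURES if name not in present]
    # The prevalence column is accepted under either of its two names.
    if not any(alias in present for alias in PREVALENCE_ALIASES):
        missing.append(" or ".join(PREVALENCE_ALIASES))
    return missing


example_columns = ["dn0_use", "Q0", "phi_bednets", "seasonal", "itn_use", "prevalence"]
print(missing_features(example_columns))  # prints ['irs_use']
```

With a pandas DataFrame, the same check would be missing_features(new_data.columns).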

Using a Global Model

from estimint import load_xgb_model, run_xgb_model, set_global_model

# Set global model once
model = load_xgb_model("output/models/estiMINT_model.pkl")
set_global_model(model)

# Run predictions without passing a model; new_data is the DataFrame
# built in the Inference example above
predictions = run_xgb_model(new_data)  # Uses the global model
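For readers curious how a "set once, use everywhere" model can work in Python, here is a minimal sketch of the general pattern: a module-level variable with a setter, a getter, and a fallback in the prediction function. It is not estiMINT's actual code. The names set_default_model, get_default_model, and predict are hypothetical, and the toy model is a plain function.

```python
_default_model = None  # module-level slot holding the shared model


def set_default_model(model):
    """Store a model for later calls that do not pass one explicitly."""
    global _default_model
    _default_model = model


def get_default_model():
    """Return the stored model, failing loudly if none was set."""
    if _default_model is None:
        raise RuntimeError("No default model set; call set_default_model() first.")
    return _default_model


def predict(rows, model=None):
    """Use the explicit model if given, otherwise fall back to the default."""
    active = model if model is not None else get_default_model()
    return [active(row) for row in rows]


# Toy stand-in for a trained model: any callable taking one input row.
set_default_model(lambda row: 2.0 * row["prev_y9"])
print(predict([{"prev_y9": 0.15}]))
```

An explicitly passed model always takes precedence, so the global acts only as a convenience default.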

Utility Functions

from estimint import (
    r2, rmse, mse, mae, median_ae, mae_rel, rmsle, smape,
    fit_qmap_w, predict_qmap_w, scale_pos
)

# Calculate metrics
y_true = [1, 2, 3, 4, 5]
y_pred = [1.1, 2.2, 2.9, 4.1, 4.8]

print(f"R²: {r2(y_true, y_pred):.4f}")
print(f"RMSE: {rmse(y_true, y_pred):.4f}")
print(f"MAE: {mae(y_true, y_pred):.4f}")

# Quantile mapping calibration
cal = fit_qmap_w(y_pred, y_true)
y_calibrated = predict_qmap_w(y_pred, cal)
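To make the metrics and the calibration step more concrete, the sketch below gives textbook NumPy versions of R² and RMSE, plus a simple unweighted empirical quantile mapping. These are assumptions about what such functions typically compute, not the package's implementation. In particular, fit_qmap_w presumably supports weights, which this sketch omits, and all function names ending in _sketch are hypothetical.

```python
import numpy as np


def r2_sketch(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def rmse_sketch(y_true, y_pred):
    """Root mean squared error."""
    diff = np.asarray(y_true, float) - np.asarray(y_pred, float)
    return float(np.sqrt(np.mean(diff ** 2)))


def fit_qmap_sketch(pred, obs, n_quantiles=5):
    """Record matching quantiles of the predictions and the observations."""
    probs = np.linspace(0.0, 1.0, n_quantiles)
    return {"pred_q": np.quantile(pred, probs), "obs_q": np.quantile(obs, probs)}


def predict_qmap_sketch(pred, cal):
    """Map each prediction to the observed value at the same quantile."""
    return np.interp(pred, cal["pred_q"], cal["obs_q"])


y_true = [1, 2, 3, 4, 5]
y_pred = [1.1, 2.2, 2.9, 4.1, 4.8]
print(f"R2: {r2_sketch(y_true, y_pred):.4f}")
print(f"RMSE: {rmse_sketch(y_true, y_pred):.4f}")

cal = fit_qmap_sketch(y_pred, y_true)
print(predict_qmap_sketch(y_pred, cal))
```

Because np.interp clamps inputs outside the fitted range, predictions below the smallest or above the largest fitted quantile map to the end values in this sketch.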

Data Processing

import numpy as np
from estimint import load_and_filter, make_value_weights, strata_and_split

# Load and filter parquet data
result = load_and_filter("data.parquet", thr_lo=0.02, thr_hi=0.95)
df = result["DT"]
df_excluded = result["DT_excluded"]

# Create inverse-frequency weights
weights = make_value_weights(df["eir"].values, digits=3)

# Stratified split
df["eir_log10"] = np.log10(df["eir"])
df = strata_and_split(df, k_strata=16, seed=42)
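The idea behind inverse-frequency weights is that values appearing often in the training data receive less weight each, so rare EIR values are not drowned out. The following sketch is one plausible reading of that idea, assuming values are rounded to `digits` decimals before counting and that weights are rescaled to a mean of 1. Neither detail is confirmed by this README, and inverse_frequency_weights is a hypothetical name.

```python
from collections import Counter


def inverse_frequency_weights(values, digits=3):
    """Weight each value by 1 / (count of its rounded value), rescaled to mean 1."""
    keys = [round(v, digits) for v in values]
    counts = Counter(keys)
    raw = [1.0 / counts[k] for k in keys]
    mean_raw = sum(raw) / len(raw)
    return [w / mean_raw for w in raw]


# Three values round to 0.1 and share a bin; 2.0 sits alone in its bin.
weights = inverse_frequency_weights([0.1, 0.1, 0.1004, 2.0], digits=3)
print(weights)
```

In this example the lone value ends up with three times the weight of each value in the crowded bin.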

Key Differences from R Version

  1. File format: Models saved as .pkl (pickle) instead of .rds
  2. Data handling: Uses pandas instead of data.table
  3. Plotting: Uses matplotlib instead of ggplot2
  4. Global model: Use set_global_model() / get_global_model() instead of .GlobalEnv
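As a small illustration of the .pkl format mentioned in item 1, the sketch below round-trips an object through Python's standard pickle module. The dictionary is a toy stand-in for a trained model, and the file name is arbitrary. How estiMINT's storage module wraps this is not shown here.

```python
import os
import pickle
import tempfile

# Toy stand-in for a trained model object
toy_model = {"kind": "toy", "coefficients": [0.1, 0.2, 0.3]}

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "toy_model.pkl")
    # Serialize to disk in binary mode
    with open(model_path, "wb") as fh:
        pickle.dump(toy_model, fh)
    # Read it back
    with open(model_path, "rb") as fh:
        restored = pickle.load(fh)

print(restored == toy_model)  # prints True
```

Unpickling can execute arbitrary code, so only load .pkl files from sources you trust.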

Dependencies

  • numpy >= 1.20.0
  • pandas >= 1.3.0
  • duckdb >= 0.8.0
  • xgboost >= 1.6.0
  • scikit-learn >= 1.0.0
  • matplotlib >= 3.4.0
  • requests >= 2.28.0 (optional, for model download)
  • appdirs >= 1.4.0 (optional, for cache directory)
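To see which of these packages are present in the current environment, a quick report can be built with the standard library's importlib.metadata. This is a convenience sketch, not part of estiMINT. It only lists installed versions next to the stated minimums and leaves the comparison to the reader.

```python
from importlib import metadata

# Minimum versions copied from the list above
MINIMUM_VERSIONS = {
    "numpy": "1.20.0",
    "pandas": "1.3.0",
    "duckdb": "0.8.0",
    "xgboost": "1.6.0",
    "scikit-learn": "1.0.0",
    "matplotlib": "3.4.0",
    "requests": "2.28.0",  # optional
    "appdirs": "1.4.0",    # optional
}


def installed_versions(minimums):
    """Map each package name to its installed version, or None if absent."""
    report = {}
    for name in minimums:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


for name, found in installed_versions(MINIMUM_VERSIONS).items():
    print(f"{name}: installed={found or 'missing'}, required>={MINIMUM_VERSIONS[name]}")
```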

License

MIT License
