EIR Estimation using Machine learning INTerventions - Python port
estiMINT (Python)
Python port of the estiMINT R package for EIR (Entomological Inoculation Rate) estimation using machine learning.
Installation
pip install -e .
Or install dependencies directly:
pip install -r requirements.txt
File Mapping (R → Python)
| R File | Python File | Description |
|---|---|---|
| `estiMINT-package.R` | `__init__.py` | Package initialization and exports |
| `globals.R` | `globals.py` | Global variables and constants |
| `utils.R` | `utils.py` | Utility functions (metrics, QMAP, etc.) |
| `data_processing.R` | `data_processing.py` | Data loading and preprocessing |
| `models.R` | `models.py` | XGBoost model training |
| `train.R` | `train.py` | Main training pipeline with K-fold CV |
| `plotting.R` | `plotting.py` | Visualization functions |
| `storage.R` | `storage.py` | Model persistence and loading |
| `run.R` | `run.py` | Model inference |
API Reference
Training
from estimint import train_xgb_model

model = train_xgb_model(
    in_parquet="data/input.parquet",
    out_dir="output/",
    thr_lo=0.02,         # Lower prevalence threshold
    thr_hi=0.95,         # Upper prevalence threshold
    k_strata=16,         # K-means strata for EIR
    K=10,                # CV folds
    seed=42,
    save_pkl=True,
    save_plots=True,
    save_artifacts=True,
)
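The `k_strata` and `K` arguments suggest that EIR values are grouped into K-means strata and that CV folds are then balanced across those strata. The following is a minimal sketch of that idea, not the package's implementation: `sketch_strata_folds` is a hypothetical helper, the data are synthetic, and clustering on log10(EIR) with scikit-learn's `KMeans` and `StratifiedKFold` is an assumption.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.model_selection import StratifiedKFold


def sketch_strata_folds(eir, k_strata=4, n_folds=5, seed=42):
    """Cluster log10(EIR) into strata, then assign CV folds balanced across strata.

    Hypothetical helper for illustration; not part of estimint.
    """
    log_eir = np.log10(np.asarray(eir, dtype=float)).reshape(-1, 1)
    strata = KMeans(n_clusters=k_strata, n_init=10, random_state=seed).fit_predict(log_eir)

    # Each sample receives the index of the fold in which it is held out.
    fold_id = np.full(len(log_eir), -1, dtype=int)
    splitter = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=seed)
    for fold, (_, test_idx) in enumerate(splitter.split(log_eir, strata)):
        fold_id[test_idx] = fold
    return strata, fold_id


rng = np.random.default_rng(0)
eir = 10 ** rng.uniform(-1, 2.5, size=200)  # synthetic EIR values, illustration only
strata, fold_id = sketch_strata_folds(eir)
print(np.bincount(fold_id))  # number of held-out samples per fold
```

Stratifying on the clustered target keeps low and high EIR values represented in every fold, which matters when the target spans several orders of magnitude.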
Inference
from estimint import load_xgb_model, run_xgb_model
import pandas as pd
# Load model
model = load_xgb_model("output/models/estiMINT_model.pkl")
# Prepare input data
new_data = pd.DataFrame({
    "dn0_use": [0.5],
    "Q0": [0.3],
    "phi_bednets": [0.6],
    "seasonal": [1],
    "itn_use": [0.7],
    "irs_use": [0.2],
    "prev_y9": [0.15],  # or "prevalence"
})
# Run prediction
eir_predictions = run_xgb_model(new_data, model)
print(f"Predicted EIR: {eir_predictions[0]:.2f}")
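The comment in the example indicates that the prevalence column may be named either `prev_y9` or `prevalence`. One way to normalise such input before prediction is sketched below. The helper `sketch_prepare_input`, the canonical name `prevalence`, and the column order are assumptions made for illustration; the package may handle this differently.

```python
import pandas as pd

# Feature names taken from the inference example above.
FEATURES = ["dn0_use", "Q0", "phi_bednets", "seasonal", "itn_use", "irs_use"]


def sketch_prepare_input(df, prevalence_aliases=("prev_y9", "prevalence")):
    """Validate feature columns and expose prevalence under a single name.

    Hypothetical helper for illustration; not part of estimint.
    """
    present = [c for c in prevalence_aliases if c in df.columns]
    if not present:
        raise ValueError(f"Expected one of {prevalence_aliases} in the input columns.")
    missing = [c for c in FEATURES if c not in df.columns]
    if missing:
        raise ValueError(f"Missing feature columns: {missing}")

    out = df.copy()
    out["prevalence"] = out[present[0]]  # first alias found wins
    return out[FEATURES + ["prevalence"]]


example = pd.DataFrame({
    "dn0_use": [0.5], "Q0": [0.3], "phi_bednets": [0.6], "seasonal": [1],
    "itn_use": [0.7], "irs_use": [0.2], "prev_y9": [0.15],
})
prepared = sketch_prepare_input(example)
print(list(prepared.columns))
```

Failing early on a missing column gives a clearer error than letting the model raise on a malformed feature matrix.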
Using Global Model
from estimint import load_xgb_model, run_xgb_model, set_global_model
# Set global model once
model = load_xgb_model("output/models/estiMINT_model.pkl")
set_global_model(model)
# Run predictions without passing the model explicitly
# (new_data is the DataFrame prepared in the Inference example)
predictions = run_xgb_model(new_data)  # Uses global model
Utility Functions
from estimint import (
    r2, rmse, mse, mae, median_ae, mae_rel, rmsle, smape,
    fit_qmap_w, predict_qmap_w, scale_pos,
)
# Calculate metrics
y_true = [1, 2, 3, 4, 5]
y_pred = [1.1, 2.2, 2.9, 4.1, 4.8]
print(f"R²: {r2(y_true, y_pred):.4f}")
print(f"RMSE: {rmse(y_true, y_pred):.4f}")
print(f"MAE: {mae(y_true, y_pred):.4f}")
# Quantile mapping calibration
cal = fit_qmap_w(y_pred, y_true)
y_calibrated = predict_qmap_w(y_pred, cal)
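For readers unfamiliar with these quantities, the sketch below spells out the standard definitions of R², RMSE and MAE, and shows a plain (unweighted) empirical quantile mapping built with `np.quantile` and `np.interp`. The `sketch_*` functions are hypothetical and are not the package's code; in particular, `fit_qmap_w` may use weights or a different quantile grid.

```python
import numpy as np


def sketch_r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def sketch_rmse(y_true, y_pred):
    """Root mean squared error."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def sketch_mae(y_true, y_pred):
    """Mean absolute error."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(diff)))


def sketch_fit_qmap(pred, obs, n_quantiles=5):
    """Store matching quantiles of predictions and observations."""
    probs = np.linspace(0.0, 1.0, n_quantiles)
    return {"pred_q": np.quantile(pred, probs), "obs_q": np.quantile(obs, probs)}


def sketch_predict_qmap(pred, cal):
    """Map predictions onto the observed distribution by linear interpolation."""
    return np.interp(pred, cal["pred_q"], cal["obs_q"])


y_true = [1, 2, 3, 4, 5]
y_pred = [1.1, 2.2, 2.9, 4.1, 4.8]
print(f"R2:   {sketch_r2(y_true, y_pred):.4f}")
print(f"RMSE: {sketch_rmse(y_true, y_pred):.4f}")
print(f"MAE:  {sketch_mae(y_true, y_pred):.4f}")

cal_sketch = sketch_fit_qmap(y_pred, y_true)
calibrated = sketch_predict_qmap(y_pred, cal_sketch)
print(calibrated)
```

Quantile mapping corrects systematic distortion in the distribution of predictions; it leaves their rank order unchanged.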
Data Processing
import numpy as np
from estimint import load_and_filter, make_value_weights, strata_and_split

# Load and filter parquet data
result = load_and_filter("data.parquet", thr_lo=0.02, thr_hi=0.95)
df = result["DT"]                    # rows kept after filtering
df_excluded = result["DT_excluded"]  # rows removed by the prevalence thresholds

# Create inverse-frequency weights
weights = make_value_weights(df["eir"].values, digits=3)

# Stratified split
df["eir_log10"] = np.log10(df["eir"])
df = strata_and_split(df, k_strata=16, seed=42)
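The two preprocessing ideas above, prevalence thresholding and inverse-frequency weighting, can be sketched in a few lines of pandas and NumPy. This is an illustration under stated assumptions, not the package's code: the helper names are hypothetical, the prevalence column is assumed to be `prev_y9`, the bounds are assumed inclusive, and normalising the weights to mean 1 is a choice made for this example.

```python
import numpy as np
import pandas as pd


def sketch_filter_prevalence(df, thr_lo=0.02, thr_hi=0.95, col="prev_y9"):
    """Split rows into kept / excluded by prevalence (inclusive bounds assumed)."""
    keep = df[col].between(thr_lo, thr_hi)
    return df[keep].reset_index(drop=True), df[~keep].reset_index(drop=True)


def sketch_value_weights(values, digits=3):
    """Weight each sample by 1 / (count of its rounded value), scaled to mean 1."""
    rounded = np.round(np.asarray(values, dtype=float), digits)
    _, inverse, counts = np.unique(rounded, return_inverse=True, return_counts=True)
    weights = 1.0 / counts[inverse]
    return weights / weights.mean()


toy = pd.DataFrame({
    "eir": [0.1, 0.1, 0.5, 2.0],
    "prev_y9": [0.01, 0.20, 0.50, 0.99],  # synthetic values, illustration only
})
kept, excluded = sketch_filter_prevalence(toy)
print(len(kept), len(excluded))

toy_weights = sketch_value_weights([0.1, 0.1, 0.5])
print(toy_weights)  # the repeated value 0.1 gets half the weight of 0.5
```

Inverse-frequency weights stop heavily repeated target values from dominating the training loss.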
Key Differences from R Version
- File format: Models saved as `.pkl` (pickle) instead of `.rds`
- Data handling: Uses pandas instead of data.table
- Plotting: Uses matplotlib instead of ggplot2
- Global model: Use `set_global_model()` / `get_global_model()` instead of `.GlobalEnv`
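Because models are stored as pickle files, a round trip through the standard `pickle` module shows what the `.pkl` format involves. The sketch below uses hypothetical helpers (`sketch_save`, `sketch_load`) and a plain dictionary as a stand-in for a trained model; it makes no claim about what the package's model file actually contains.

```python
import pickle
import tempfile
from pathlib import Path


def sketch_save(obj, path):
    """Pickle an object to disk, creating parent directories as needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as fh:
        pickle.dump(obj, fh, protocol=pickle.HIGHEST_PROTOCOL)
    return path


def sketch_load(path):
    """Load a pickled object. Only unpickle files from a trusted source."""
    with Path(path).open("rb") as fh:
        return pickle.load(fh)


# Stand-in for a trained model bundle (hypothetical content).
bundle = {"features": ["dn0_use", "Q0", "phi_bednets"], "note": "stand-in, not a real model"}

with tempfile.TemporaryDirectory() as tmp:
    saved_path = sketch_save(bundle, Path(tmp) / "models" / "sketch_model.pkl")
    restored = sketch_load(saved_path)

print(restored == bundle)  # True
```

Unlike `.rds` files, pickle files can execute arbitrary code when loaded, so model files should only be loaded from sources you trust.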
Dependencies
- numpy >= 1.20.0
- pandas >= 1.3.0
- duckdb >= 0.8.0
- xgboost >= 1.6.0
- scikit-learn >= 1.0.0
- matplotlib >= 3.4.0
- requests >= 2.28.0 (optional, for model download)
- appdirs >= 1.4.0 (optional, for cache directory)
License
MIT License