Machine-learning INtegrated Analysis with photometric Astronomical Surveys
MINAS — Machine-learning INtegrated Analysis with photometric Astronomical Surveys
MINAS is a Python package for the complete Machine Learning workflow applied to photometric astronomical surveys. It integrates all stages — from preprocessing to final model application — in a single, modular interface.
Fun fact: MINAS is also the name of a Brazilian state (Minas Gerais), the home state of Icaro Meidem, the package creator. The name reflects both the package's astronomical focus and the personal heritage of its author, a proud mineiro.
Installation
pip install minas
Quick Start — Full ML Workflow
import minas as mg
# 1. Load catalog
catalog = mg.read_csv('my_catalog.csv')
# 2. Assemble feature DataFrame (magnitudes + pairwise colors)
work_df = mg.preprocess.assemble_work_df(
    df=catalog,
    filters=mg.FILTERS['JPLUS'],
    correction_pairs=dict(zip(mg.FILTERS['JPLUS'], mg.CORRECTIONS['JPLUS'])),
    add_colors=True,
)
# 3. Select most important features
features, df_importance = mg.evaluation.get_important_features(
    X=work_df,
    y=catalog['Teff'],
    n_features_to_save=20,
)
work_df = work_df[features]
# 4. Tune hyperparameters
param_dist = {
    'selectkbest__k': [10, 15, 20],
    'randomforestregressor__n_estimators': [100, 300, 500],
    'randomforestregressor__min_samples_leaf': [1, 5, 10],
    'randomforestregressor__max_features': ['sqrt', 'log2'],
    'randomforestregressor__bootstrap': [True, False],
}
best_pipeline, search = mg.hyperparameter_search(
    X=work_df,
    Y=catalog['Teff'],
    model_type='RF',
    param_dist=param_dist,
    tuning_id='teff_rf',
    n_iter=30,
    save_dir='pipelines/',
)
# 5. Apply model with Monte Carlo error propagation
predictor = mg.models.Predictor(
    id_col='ID',
    mag_cols=mg.FILTERS['JPLUS'],
    err_cols=mg.ERRORS['JPLUS'],
    dist_col=None,
    correction_pairs=dict(zip(mg.FILTERS['JPLUS'], mg.CORRECTIONS['JPLUS'])),
    models={'Teff': best_pipeline},
    mc_reps=100,
    batch_partitions=10,
)
predictor.predict_parameters((catalog, 'results/teff_predictions.csv', ['ID'], 'w', True))
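Step 5 relies on Monte Carlo error propagation. The sketch below illustrates the general idea in plain Python and is not MINAS's actual implementation: each magnitude is redrawn from a Gaussian centred on the measured value with the quoted error as its width, a model is evaluated on every draw, and the spread of the predictions serves as the uncertainty. `toy_model`, `mc_predict`, and the sample values are hypothetical stand-ins.

```python
import random
import statistics

def toy_model(mags):
    """Hypothetical stand-in for a trained pipeline: magnitudes in, one value out."""
    # A made-up linear relation on the colour mags[0] - mags[1].
    return 5800.0 - 2000.0 * (mags[0] - mags[1])

def mc_predict(mags, errs, model, reps=100, seed=0):
    """Return (median, standard deviation) of model predictions over `reps` noisy draws."""
    rng = random.Random(seed)
    preds = []
    for _ in range(reps):
        # Perturb every magnitude with Gaussian noise scaled by its own error.
        draw = [rng.gauss(m, e) for m, e in zip(mags, errs)]
        preds.append(model(draw))
    return statistics.median(preds), statistics.stdev(preds)

mags = [15.20, 14.85]   # illustrative magnitudes, not real photometry
errs = [0.02, 0.03]     # illustrative 1-sigma photometric errors
value, error = mc_predict(mags, errs, toy_model, reps=100)
print(f"prediction = {value:.0f} +/- {error:.0f}")
```

Using the median rather than the mean keeps the central value robust against occasional extreme draws; whether MINAS reports a median or a mean is not stated here.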
Bolometric Correction
MINAS includes pre-trained bolometric correction (BC) models based on Jordi et al. (2010). They were trained on Gaia-observed stars and take Teff, log g, and [Fe/H] as inputs.
Model Performance
| Model | R² | MAD | Std Deviation |
|---|---|---|---|
| XGBoost | 0.9983 | 0.0062 mag | 0.0430 mag |
| Random Forest | 0.9970 | 0.0067 mag | 0.0573 mag |
Figure: Performance of XGBoost (left) and Random Forest (right) for bolometric correction prediction.
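For readers who want to reproduce metrics of this kind on their own test set, here is a minimal sketch. It assumes MAD means the median absolute deviation of the residuals about their median and that the standard deviation is taken over the residuals; the table does not define either, so treat both as assumptions. `regression_metrics` and the sample values are hypothetical.

```python
import statistics

def regression_metrics(y_true, y_pred):
    """Return (R^2, MAD, standard deviation) computed from residuals pred - true."""
    residuals = [p - t for t, p in zip(y_true, y_pred)]
    mean_true = statistics.fmean(y_true)
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    med = statistics.median(residuals)
    mad = statistics.median([abs(r - med) for r in residuals])
    std = statistics.pstdev(residuals)
    return r2, mad, std

# Illustrative BC values in mag, not taken from the MINAS test set.
y_true = [0.10, -0.20, 0.05, -0.50, 0.30]
y_pred = [0.12, -0.21, 0.05, -0.47, 0.29]
r2, mad, std = regression_metrics(y_true, y_pred)
print(f"R2 = {r2:.4f}, MAD = {mad:.4f} mag, STD = {std:.4f} mag")
```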
Usage
import minas as mg
df = mg.bolometric.apply_bc(
    data='catalog.csv',
    teff_col='Teff',
    logg_col='logg',
    feh_col='[M/H]',
    model_type='XGB',      # 'XGB' or 'RF'
    sigma_multiplier=3.0,  # uncertainty = multiplier x STD
    output_file='catalog_bc.csv',
)
print(df[['Teff', 'BC_pred', 'err_BC_pred']].head())
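The comment on `sigma_multiplier` states that the uncertainty is the multiplier times a standard deviation. As a worked example, and assuming the STD in question is the per-model value quoted in the Model Performance table (this is an assumption, not confirmed by the usage snippet), a multiplier of 3.0 gives roughly 0.129 mag for XGBoost and 0.172 mag for Random Forest. `MODEL_STD` and `bc_uncertainty` are hypothetical names used only for this illustration.

```python
# Standard deviations quoted in the Model Performance table, in mag.
MODEL_STD = {"XGB": 0.0430, "RF": 0.0573}

def bc_uncertainty(model_type, sigma_multiplier=3.0):
    """Uncertainty as multiplier x STD, following the comment in the usage example."""
    return sigma_multiplier * MODEL_STD[model_type]

for name in MODEL_STD:
    print(f"{name}: err_BC = {bc_uncertainty(name):.3f} mag")
```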
Reference
Jordi, C. et al. (2010). Gaia broad band photometry. A&A 523, A48. DOI: 10.1051/0004-6361/200913234
Supported Surveys and Filters
MINAS provides built-in filter definitions for the following photometric surveys.
All filter lists are accessible via mg.FILTERS, mg.ERRORS, and mg.CORRECTIONS.
| Survey | Filters | mg.FILTERS key |
|---|---|---|
| J-PLUS | uJAVA, J0378, J0395, J0410, J0430, gSDSS, J0515, rSDSS, J0660, iSDSS, J0861, zSDSS | 'JPLUS' |
| S-PLUS | uJAVA, J0378, J0395, J0410, J0430, gSDSS, J0515, rSDSS, J0660, iSDSS, J0861, zSDSS | 'SPLUS' |
| J-PAS | uJAVA + 56 narrow bands (J0378-J1007) + iSDSS | 'JPAS' |
| WISE | W1, W2, J, H, K | 'WISE' |
| GALEX | NUVmag | 'GALEX' |
| Gaia | G, BP, RP | 'GAIA' |
import minas as mg
print(mg.FILTERS['JPLUS']) # magnitude column names
print(mg.ERRORS['JPLUS']) # photometric error column names
print(mg.CORRECTIONS['JPLUS']) # extinction correction column names
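These filter lists are what the Quick Start turns into magnitude and pairwise-color features when `add_colors=True`. The sketch below shows one way to enumerate pairwise colors from a filter list, assuming a color is the difference between two magnitudes and that one color is built per unordered pair in list order. `pairwise_colors` is a hypothetical helper; the column naming and ordering MINAS uses may differ.

```python
from itertools import combinations

# Filter names copied from the J-PLUS row of the table above.
JPLUS_FILTERS = ["uJAVA", "J0378", "J0395", "J0410", "J0430", "gSDSS",
                 "J0515", "rSDSS", "J0660", "iSDSS", "J0861", "zSDSS"]

def pairwise_colors(mags):
    """Map 'A-B' to mags[A] - mags[B] for every unordered pair of filters."""
    return {f"{a}-{b}": mags[a] - mags[b] for a, b in combinations(mags, 2)}

# Illustrative magnitudes only, not real photometry.
example = {name: 16.0 - 0.1 * i for i, name in enumerate(JPLUS_FILTERS)}
colors = pairwise_colors(example)
print(len(colors))                         # number of colors for 12 filters
print(round(colors["gSDSS-rSDSS"], 3))
```

With 12 filters this yields 66 colors, which is why a feature-selection step such as the one in the Quick Start is useful before training.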
Model Comparison — RF vs XGBoost
| Feature | Random Forest | XGBoost |
|---|---|---|
| Pipeline steps | Imputer → SelectKBest → RF | SelectKBest → XGB |
| Missing value handling | Built-in (median imputation) | Must be handled externally |
| Training speed | Moderate | Fast |
| Typical accuracy | Good | Excellent |
| Model key | 'RF-REG' / 'RF-CLA' | 'XGB-REG' / 'XGB-CLA' |
| Saved format | .sav (joblib) | .json |
import minas as mg
# Default models
rf_model = mg.models.create_model('RF-REG')
xgb_model = mg.models.create_model('XGB-REG')
# With tuned hyperparameters
hp = (0.8, 0.05, 6, 500, 0.8, 0.1) # colsample, lr, depth, n_est, subsample, gamma
xgb_tuned = mg.models.create_model('XGB-REG', hp_combination=hp)
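A six-value positional tuple is easy to mis-order. One option, sketched below, is to label the values with a `NamedTuple` whose fields follow the order given in the comment above. The class name `XGBHyperparams` is hypothetical and not part of MINAS. Because a `NamedTuple` is a tuple subclass, it should be usable wherever a plain tuple is expected, though this has not been checked against `create_model`.

```python
from typing import NamedTuple

class XGBHyperparams(NamedTuple):
    """Labels for the six-value tuple, in the order given in the comment above."""
    colsample: float
    lr: float
    depth: int
    n_est: int
    subsample: float
    gamma: float

hp = XGBHyperparams(colsample=0.8, lr=0.05, depth=6,
                    n_est=500, subsample=0.8, gamma=0.1)
print(tuple(hp))  # plain tuple in positional order
```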
Package Structure
minas/
├── preprocess/ magnitude correction, color creation, work DataFrame assembly
├── models/ ML pipeline factory (RF, XGB) and Monte Carlo predictor
├── tuning/ hyperparameter search with RandomizedSearchCV
├── evaluation/ metrics (MAD, R2), plots, feature importance
└── bolometric/ bolometric correction with pre-trained models
Key Functions
| Function | Description |
|---|---|
| mg.preprocess.assemble_work_df() | Build feature DataFrame from magnitudes |
| mg.preprocess.correct_magnitudes() | Apply extinction corrections |
| mg.preprocess.calculate_abs_mag() | Convert apparent to absolute magnitudes |
| mg.models.create_model() | Create RF or XGBoost pipeline |
| mg.models.Predictor | Monte Carlo predictor with uncertainty estimation |
| mg.hyperparameter_search() | RandomizedSearchCV for RF or XGB |
| mg.evaluation.get_important_features() | Impurity-based feature importance (RF) |
| mg.evaluation.get_permutation_importance_rf() | Permutation importance (RF) |
| mg.evaluation.get_permutation_importance_xgb() | Permutation importance (XGB) |
| mg.evaluation.calculate_mad() | MAD per bin |
| mg.evaluation.plot_test_graphs() | Scatter + KDE error plot |
| mg.evaluation.plot_comparison_graph() | Bar chart comparison across models |
| mg.bolometric.apply_bc() | Apply pre-trained bolometric correction model |
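As background for the apparent-to-absolute conversion listed above, the standard distance-modulus relation is M = m - 5 log10(d / 10 pc). The sketch below applies it assuming the distance is given in parsecs and extinction has already been corrected. `abs_mag` is a hypothetical helper, not the signature of `calculate_abs_mag()`, which may take different inputs or units.

```python
import math

def abs_mag(apparent_mag, distance_pc):
    """Absolute magnitude from the distance modulus: M = m - 5*log10(d / 10 pc)."""
    if distance_pc <= 0:
        raise ValueError("distance must be positive")
    return apparent_mag - 5.0 * math.log10(distance_pc / 10.0)

# A star of apparent magnitude 10 at 100 pc.
print(abs_mag(10.0, 100.0))
```

By construction, a star at exactly 10 pc has equal apparent and absolute magnitudes.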
Examples
The examples/ folder contains complete Jupyter notebooks covering the full workflow:
| Folder | Contents |
|---|---|
| data/ | Catalog creation and preprocessing |
| tuning/ | Hyperparameter search and feature importance |
| training/ | Model training, evaluation, and visualization |
| apply/ | Model application with Monte Carlo error propagation |
Citation
If you use MINAS in your research, please cite:
@software{minas,
author = {Meidem, Icaro},
title = {{MINAS}: Machine-learning INtegrated Analysis with photometric Astronomical Surveys},
year = {2025},
url = {https://github.com/icaromeidem/minas},
}
Bolometric correction reference:
- Jordi, C. et al. (2010), A&A 523, A48 — doi:10.1051/0004-6361/200913234
License
MIT © Icaro Meidem