
An Ensemble Framework for Explainable Geospatial Machine Learning Models

Project description

An Ensemble Framework for Geospatial Machine Learning Models

GitHub: https://github.com/UrbanGISer/XGeoML

PyPI Homepage: https://pypi.org/project/XGeoML/0.1.4/

Installation: pip install XGeoML

This package addresses the critical challenge of analyzing and interpreting spatially varying effects in geographic analysis, stemming from the complexity and non-linearity of geospatial data. We introduce an innovative integrated framework that combines local spatial weights, Explainable Artificial Intelligence (XAI), and advanced machine learning technologies. This approach significantly bridges the gap between traditional geographic analysis models and contemporary machine learning methodologies.

Introduction

Geospatial data is inherently complex and non-linear, presenting significant challenges in analysis and interpretation. Traditional geographic analysis models often struggle to address these challenges, leading to gaps in understanding and interpretation.

Our Approach

We propose an innovative integrated framework that leverages local spatial weights, Explainable Artificial Intelligence (XAI), and advanced machine learning technologies. Our approach aims to bridge the gap between traditional methods and modern machine learning techniques, offering a more comprehensive tool for geographic analysis.

Features

  • Local Spatial Weights: Incorporates the spatial context of data, enhancing model sensitivity to geographical nuances.
  • Explainable Artificial Intelligence (XAI): Provides clarity on the decision-making process, improving the interpretability of the model's predictions.
  • Advanced Machine Learning Technologies: Utilizes cutting-edge algorithms to manage the complexity and non-linearity of geospatial data effectively.
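
As an illustration of the local-spatial-weights idea (not the package's own implementation, which is shown in Key Functions below), a Gaussian kernel weight between observation points can be sketched in plain NumPy:

```python
import numpy as np

def gaussian_weights(coords, bandwidth):
    """Illustrative Gaussian kernel weights: w_ij = exp(-0.5 * (d_ij / bandwidth)^2)."""
    # Pairwise Euclidean distances between all points
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return np.exp(-0.5 * (dist / bandwidth) ** 2)

coords = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
W = gaussian_weights(coords, bandwidth=5.0)
# Each point has weight 1 with itself; weights decay smoothly with distance
```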

Key Functions

  • Use built-in spatial weights: generates Gaussian, Binary, and GaussianBinary weights.
weights = w_matrix.spatial_weight(df, "u", "v", fix=False, bandwidth=80, kernel_type='Binary')
  • Import libpysal spatial weights: accepts any libpysal spatial weights object.
import libpysal.weights as lw
points = df[['u', 'v']].values
w = lw.DistanceBand(points, threshold=6, binary=False)
weightpysal = w_matrix.from_libpysal(w)
  • Predict or search bandwidth with a fast training model: accepts any scikit-learn model.
# 01 Define key variables
feature_names = ['x1', 'x2', 'x3', 'x4']
target_name = "y"
explainer_names = ["LIME", "SHAP", "Importance"]
truebeta = ['b_linear', 'b_circular', 'b_cos_basic', 'b_poly']  # true-coefficient columns of the synthetic data

# 02 Import a scikit-learn ML model
from sklearn.ensemble import GradientBoostingRegressor
model = GradientBoostingRegressor

# 03 Import the R2 metric
from sklearn.metrics import r2_score

# 04 Generate weights
weights = w_matrix.spatial_weight(df, "u", "v", fix=False, bandwidth=80, kernel_type='Binary')

# 05 Bandwidth searching
eval_bandwidth = pd.DataFrame()
for i in range(10):
    k = 40 + i * 40  # candidate bandwidths: 40, 80, ..., 400
    for j in range(3):  # repeat each candidate three times
        weights = w_matrix.spatial_weight(df, "u", "v", fix=False, bandwidth=k, kernel_type='Binary')
        dfx = fast_train.predict(df, feature_names, target_name, weights, model)
        r22 = r2_score(dfx.y, dfx.predy)
        eval_bandwidth.loc[i, j] = r22
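
Once the loop above has filled `eval_bandwidth` (rows indexed by bandwidth step, columns by repeat), the best bandwidth can be chosen by averaging the repeats and taking the argmax. A minimal sketch with made-up R² values standing in for the real search results:

```python
import pandas as pd

# Hypothetical R2 scores: 10 candidate bandwidths (40, 80, ..., 400), 3 repeats each
eval_bandwidth = pd.DataFrame(
    [[0.61, 0.60, 0.62], [0.70, 0.71, 0.69], [0.75, 0.74, 0.76],
     [0.73, 0.72, 0.74], [0.68, 0.67, 0.69], [0.66, 0.65, 0.67],
     [0.64, 0.63, 0.65], [0.62, 0.61, 0.63], [0.60, 0.59, 0.61],
     [0.58, 0.57, 0.59]]
)
mean_r2 = eval_bandwidth.mean(axis=1)   # average over the 3 repeats
best_row = mean_r2.idxmax()             # row with the highest mean R2
best_bandwidth = 40 + best_row * 40     # same formula as in the loop: k = 40 + i*40
# Here row 2 wins, so best_bandwidth == 120
```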
  • Predict and evaluate with updated spatial weights: uses the spatial weights selected by the bandwidth search.
# 06 Predict
df_pred = fast_train.predict(df, feature_names, target_name, weights, model)
# 07 Evaluate
r2_score(df_pred.y, df_pred.predy)
  • Explain models: explainer_names must be a list drawn from ["LIME", "SHAP", "Importance"].
# 08 Explain
df_explain = fast_train.explain(df, feature_names, target_name, weights, model, explainer_names)
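
The "Importance" explainer corresponds to feature-importance scores. As an illustration of that idea outside the package (XGeoML computes it per location using the spatial weights), scikit-learn's `permutation_importance` can be sketched on a toy model:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
# Feature 0 dominates, feature 1 matters a little, feature 2 is irrelevant
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

model = GradientBoostingRegressor(random_state=0).fit(X, y)
result = permutation_importance(model, X, y, n_repeats=5, random_state=0)
# result.importances_mean ranks feature 0 highest, feature 2 near zero
```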
  • Partial dependence estimation: sample bins are used here, with two modes, evenly spaced values (even=True) or the original values (even=False).
# 09 Partial dependence
df_pd = fast_train.partial_dependence(df, model, feature_names, target_name, weights, num_samples=50, even=False)
  • Use trained models: be careful, this can be time-consuming when using HyperOpt.
# 10 Train models
sk_models, predictions = train_model.train_sklearn(df, feature_names, target_name, weights, model)
# 11 Explain with trained scikit-learn models
df_sk = train_model.explain_models(df, feature_names, target_name, weights, sk_models, explainer_names)
# 12 Partial dependence with trained scikit-learn models
df_sk_pd_even = train_model.partial_dependence_model(df, sk_models, feature_names, target_name, weights, num_samples=50)

# 13 Train HyperOpt models: very time-consuming
from hpsklearn import HyperoptEstimator, xgboost_regression, mlp_regressor
from hyperopt import tpe
hymodel = xgboost_regression
# max_evals=5 on 900 points takes about 3 hours
hy_models, predictions = train_model.train_hysklearn(df, feature_names, target_name, weights, hymodel, max_evals=1, trial_timeout=60)

# 14 Explain with trained HyperOpt models: skleanrmodel=False MUST be set
df_hy = train_model.explain_models(df, feature_names, target_name, weights, hy_models, explainer_names, skleanrmodel=False)
# 15 Partial dependence with trained HyperOpt models, same as step 12
df_hy_pd_even = train_model.partial_dependence_model(df, hy_models, feature_names, target_name, weights, num_samples=50)
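
The pattern behind these per-location trained models (one estimator per observation, fitted with sample weights taken from that observation's row of the spatial weight matrix) can be sketched generically. This is a simplified, hypothetical illustration, not the package's own code:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def train_local_models(X, y, W):
    """Fit one weighted model per location, using row i of W as sample weights."""
    models = []
    for i in range(len(X)):
        m = LinearRegression()
        m.fit(X, y, sample_weight=W[i])
        models.append(m)
    return models

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 2))
y = X @ np.array([1.0, -2.0]) + rng.normal(scale=0.05, size=50)
W = np.exp(-0.5 * rng.uniform(0, 2, size=(50, 50)))  # stand-in for a spatial weight matrix
np.fill_diagonal(W, 1.0)

local_models = train_local_models(X, y, W)
# Each local model recovers coefficients close to [1, -2]
```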

Through rigorous testing on synthetic and real-world datasets, our framework has been shown to enhance the interpretability and accuracy of geospatial predictions in both regression and classification tasks. It effectively elucidates spatial variability, representing a significant advancement in the precision of predictions and offering a novel perspective for understanding spatial phenomena.

Conclusion

Our integrated framework marks a significant step forward in geographic analysis. By combining local spatial weights, XAI, and advanced machine learning, we offer a powerful tool for analyzing and interpreting complex geospatial data. This approach not only improves the accuracy and interpretability of geospatial predictions but also provides a fresh perspective on spatial phenomena.

Contact

For further information, inquiries, or collaborations, please contact us at lingboliu@fas.harvard.edu.

Download files

Source Distribution

XGeoML-0.1.5.tar.gz (9.3 kB)

Built Distribution

XGeoML-0.1.5-py3-none-any.whl (10.8 kB)

File details

Details for the file XGeoML-0.1.5.tar.gz.

File metadata

  • Download URL: XGeoML-0.1.5.tar.gz
  • Size: 9.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.9.12

File hashes

  • SHA256: db89a7479e8e13a49dc85699f2e3365549236dce1dad444c3bd9687ef1af41b4
  • MD5: caf14fcf072fc466953a2ede469edd4a
  • BLAKE2b-256: dc0cf01963870ef47e9a4ee7f4be38457be4242c9282bb25531bb45396fb8f8e

File details

Details for the file XGeoML-0.1.5-py3-none-any.whl.

File metadata

  • Download URL: XGeoML-0.1.5-py3-none-any.whl
  • Size: 10.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.9.12

File hashes

  • SHA256: 93420e47629e2a06978cc09a0aca1e6492c259eb19392a225284d73ac7682397
  • MD5: e2ce1231b1012d4752359c3080e02cfb
  • BLAKE2b-256: 6b970e45dc91005f796281f9b3f8cd1637a2118fe9359b53c724164bbbc0d2be
