isolation-forest-onnx
A converter for the LinkedIn Spark/Scala isolation forest model format to ONNX format for broad portability across platforms and languages.
Note: ONNX conversion is supported for the standard IsolationForestModel only. The ExtendedIsolationForestModel uses hyperplane-based splits that are not compatible with the axis-aligned tree ensemble representation used by the ONNX converter.
Installation
pip install isolation-forest-onnx
Recommended: install the converter version that matches the version of the isolation-forest library used to train the model.
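To compare versions, it helps to know which converter version is actually installed. The following is a minimal sketch using the standard library's `importlib.metadata`; `installed_version` is a hypothetical helper name, and the version of the isolation-forest library used for training still has to be looked up separately on the training side.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent.

    Hypothetical helper for illustration; not part of isolation-forest-onnx.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the installed converter version, or None if it is not installed
print(installed_version("isolation-forest-onnx"))
```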
Converting a trained model to ONNX
import os
from isolationforestonnx.isolation_forest_converter import IsolationForestConverter
# Path where the trained IsolationForestModel was saved in Scala
path = '/user/testuser/isolationForestWriteTest'
# Get model data path
data_dir_path = path + '/data'
avro_model_file = os.listdir(data_dir_path)
model_file_path = data_dir_path + '/' + avro_model_file[0]
# Get model metadata file path
metadata_dir_path = path + '/metadata'
metadata_file = os.listdir(metadata_dir_path)
metadata_file_path = metadata_dir_path + '/' + metadata_file[0]
# Convert the model to ONNX format (returns the ONNX model in memory)
converter = IsolationForestConverter(model_file_path, metadata_file_path)
onnx_model = converter.convert()
# Convert and save the model in ONNX format
onnx_model_path = '/user/testuser/isolationForestWriteTest.onnx'
converter.convert_and_save(onnx_model_path)
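The directory listings above can contain more than the model file: Spark typically writes marker files such as `_SUCCESS` and, on local filesystems, hidden `.crc` checksum files next to its `part-*` output. The following is a minimal sketch, assuming that layout, of a hypothetical helper `find_part_file` (not part of isolation-forest-onnx) that selects the first `part-*` file instead of relying on `os.listdir(...)[0]`.

```python
import os


def find_part_file(dir_path, suffix=""):
    """Return the path of the first Spark part file in dir_path.

    Hypothetical helper, not part of isolation-forest-onnx. Assumes Spark's
    usual output layout: data files named 'part-*', plus marker files such as
    '_SUCCESS' and hidden '.crc' checksum files, which are skipped.
    """
    candidates = sorted(
        name for name in os.listdir(dir_path)
        if name.startswith("part-") and name.endswith(suffix)
    )
    if not candidates:
        raise FileNotFoundError(f"No part file found in {dir_path}")
    return os.path.join(dir_path, candidates[0])
```

With it, `model_file_path = find_part_file(data_dir_path, suffix='.avro')` (if the data file carries an `.avro` extension) and `metadata_file_path = find_part_file(metadata_dir_path)` could stand in for the two `os.listdir` lookups.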
Using the ONNX model for inference
import numpy as np
import onnx
from onnxruntime import InferenceSession
onnx_model_path = '/user/testuser/isolationForestWriteTest.onnx'
dataset_path = 'shuttle.csv'
# Load data
input_data = np.loadtxt(dataset_path, delimiter=',')
# The last column is the label column, so keep only the feature columns
features = input_data[:, :-1].astype(np.float32)
input_dict = {'features': features}
# Load the ONNX model and run inference
onx = onnx.load(onnx_model_path)
sess = InferenceSession(onx.SerializeToString())
res = sess.run(None, input_dict)
# Print scores
outlier_scores = res[0]
print(np.transpose(outlier_scores[:10])[0])
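The scores printed above are continuous, so turning them into outlier flags needs a threshold. As a minimal sketch, assuming higher scores mean more anomalous and that the expected outlier fraction is known (`contamination` is a hypothetical parameter name here), one option is to flag the top fraction of scores. `flag_outliers` is an illustrative helper, not part of the converter or the ONNX model.

```python
import numpy as np


def flag_outliers(scores, contamination=0.1):
    """Flag the highest-scoring fraction of rows as outliers.

    Illustrative helper. Assumes a higher score means a more anomalous row.
    Returns a boolean array of flags and the threshold that was used.
    """
    if not 0.0 < contamination < 1.0:
        raise ValueError("contamination must be in (0, 1)")
    # Flatten an (n, 1) score array into shape (n,)
    scores = np.asarray(scores, dtype=np.float64).reshape(-1)
    # Scores strictly above the (1 - contamination) quantile are flagged
    threshold = np.quantile(scores, 1.0 - contamination)
    return scores > threshold, threshold
```

For example, `flags, threshold = flag_outliers(outlier_scores, contamination=0.05)` would mark roughly the top 5% of rows.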
License
BSD 2-Clause License. See LICENSE for details.