isolation-forest-onnx

A converter for the LinkedIn Spark/Scala isolation forest model format to ONNX format for broad portability across platforms and languages.

Note: ONNX conversion is supported for the standard IsolationForestModel only. The ExtendedIsolationForestModel uses hyperplane-based splits that are not compatible with the axis-aligned tree ensemble representation used by the ONNX converter.

Installation

pip install isolation-forest-onnx

We recommend using the same converter version as the version of the isolation-forest library used to train the model.
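To compare versions, it helps to know which converter version is actually installed. The following is a minimal sketch using the standard library's `importlib.metadata`. The helper name `installed_version` is hypothetical and is not part of this package. The version of the isolation-forest library used for training has to be looked up on the Spark/Scala side and compared by hand.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution name as used in the pip install command above
converter_version = installed_version("isolation-forest-onnx")
print("isolation-forest-onnx:", converter_version or "not installed")
```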

Converting a trained model to ONNX

import os
from isolationforestonnx.isolation_forest_converter import IsolationForestConverter

# Path where the trained IsolationForestModel was saved in Scala
path = '/user/testuser/isolationForestWriteTest'

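# Get the model data file path.
# Note: this takes the first directory entry, so it assumes the directory
# holds only the Avro model file (os.listdir returns entries in arbitrary order).
data_dir_path = os.path.join(path, 'data')
avro_model_file = os.listdir(data_dir_path)
model_file_path = os.path.join(data_dir_path, avro_model_file[0])

# Get the model metadata file path (same single-file assumption)
metadata_dir_path = os.path.join(path, 'metadata')
metadata_file = os.listdir(metadata_dir_path)
metadata_file_path = os.path.join(metadata_dir_path, metadata_file[0])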

# Convert the model to ONNX format (returns the ONNX model in memory)
converter = IsolationForestConverter(model_file_path, metadata_file_path)
onnx_model = converter.convert()

# Convert and save the model in ONNX format
onnx_model_path = '/user/testuser/isolationForestWriteTest.onnx'
converter.convert_and_save(onnx_model_path)
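The example above picks the first entry returned by `os.listdir`, which is fragile if the saved model directory also contains bookkeeping files. The sketch below shows one way to select the file more defensively. It assumes the usual Spark output layout, in which marker files such as `_SUCCESS` start with an underscore and checksum files start with a dot. That layout is an assumption and is not stated by this project. The helper name `find_part_file` is hypothetical.

```python
import os


def find_part_file(dir_path, suffix=""):
    """Return the path of the first data file in dir_path (sorted by name).

    Assumption: bookkeeping files start with '_' (e.g. _SUCCESS) or '.'
    (e.g. checksum files) and should be skipped.
    """
    candidates = sorted(
        name
        for name in os.listdir(dir_path)
        if not name.startswith(("_", "."))
        and name.endswith(suffix)
        and os.path.isfile(os.path.join(dir_path, name))
    )
    if not candidates:
        raise FileNotFoundError(f"No data file found in {dir_path!r}")
    return os.path.join(dir_path, candidates[0])
```

With this helper, the two paths could be obtained as `find_part_file(os.path.join(path, 'data'), '.avro')` and `find_part_file(os.path.join(path, 'metadata'))`, then passed to `IsolationForestConverter` as before. The `.avro` suffix is likewise an assumption about how the model data file is named.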

Using the ONNX model for inference

import numpy as np
import onnx
from onnxruntime import InferenceSession

onnx_model_path = '/user/testuser/isolationForestWriteTest.onnx'
dataset_path = 'shuttle.csv'

# Load data; the last column is the label column, so drop it from the features
input_data = np.loadtxt(dataset_path, delimiter=',')
input_dict = {'features': input_data[:, :-1].astype(np.float32)}

# Load the ONNX model and run inference
onx = onnx.load(onnx_model_path)
sess = InferenceSession(onx.SerializeToString())
res = sess.run(None, input_dict)

# Print the first 10 outlier scores
outlier_scores = res[0]
print(outlier_scores[:10, 0])
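Beyond printing the first few scores, it is often more useful to look at the rows with the highest scores, since a higher isolation forest outlier score indicates a more anomalous point. The following is a small sketch using only the standard library. The helper name `top_k_outliers` is hypothetical, and the score values are made up for illustration.

```python
import heapq


def top_k_outliers(scores, k=10):
    """Return (row_index, score) pairs for the k highest scores, highest first."""
    return heapq.nlargest(k, enumerate(scores), key=lambda pair: pair[1])


# Illustrative values only, not real model output
example_scores = [0.41, 0.73, 0.39, 0.66, 0.52]
print(top_k_outliers(example_scores, k=2))  # prints [(1, 0.73), (3, 0.66)]
```

With the inference code above, the scores could be passed in as `outlier_scores[:, 0].tolist()`, assuming the output has one score per row in a single column, as the indexing in the print statement suggests.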

License

BSD 2-Clause License. See LICENSE for details.
