isolation-forest-onnx

Converts models saved in the LinkedIn Spark/Scala isolation forest format to ONNX, so they can be used across platforms and languages.

Note: ONNX conversion is supported for the standard IsolationForestModel only. The ExtendedIsolationForestModel uses hyperplane-based splits that are not compatible with the axis-aligned tree ensemble representation used by the ONNX converter.
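To make the distinction concrete, here is a minimal sketch of the two kinds of split test. It is not code from either library: the function names and the `normal`/`offset` parameterization of the hyperplane are illustrative assumptions, and the extended model may store its splits differently.

```python
from typing import Sequence


def axis_aligned_goes_left(x: Sequence[float], feature: int, threshold: float) -> bool:
    """Axis-aligned split: compare a single feature to a threshold."""
    return x[feature] < threshold


def hyperplane_goes_left(x: Sequence[float], normal: Sequence[float], offset: float) -> bool:
    """Hyperplane split: compare a weighted sum of several features to an offset."""
    return sum(w * v for w, v in zip(normal, x)) < offset


point = [2.0, 5.0]

# Only feature 1 is inspected: 5.0 < 4.0 is False.
print(axis_aligned_goes_left(point, feature=1, threshold=4.0))  # False

# Both features contribute: 1.0 * 2.0 + (-0.5) * 5.0 = -0.5, which is below 0.0.
print(hyperplane_goes_left(point, normal=[1.0, -0.5], offset=0.0))  # True
```

A tree representation that stores one feature index and one threshold per branch can encode the first test directly, but not the second in general, because the second depends on several features at once.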

Installation

pip install isolation-forest-onnx

Use the converter version that matches the version of the isolation-forest library used to train the model.
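One way to check this before converting is sketched below, using only the Python standard library. The helper names are hypothetical, and `TRAINING_LIBRARY_VERSION` is a placeholder you would replace with the version actually used for training; the converter itself is not known to perform this check.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def versions_match(converter_version: str, training_version: str) -> bool:
    """Strict comparison: the two version strings must be identical after trimming."""
    return converter_version.strip() == training_version.strip()


# Placeholder (assumption): the isolation-forest library version used at training time.
TRAINING_LIBRARY_VERSION = "4.1.2"

converter_version = installed_version("isolation-forest-onnx")
if converter_version is None:
    print("isolation-forest-onnx is not installed")
elif versions_match(converter_version, TRAINING_LIBRARY_VERSION):
    print(f"Converter {converter_version} matches the training library version")
else:
    print(f"Mismatch: converter {converter_version}, trained with {TRAINING_LIBRARY_VERSION}")
```

The comparison is deliberately strict. Whether a looser rule, such as matching only the major and minor version, is safe depends on the library's compatibility guarantees, which are not stated here.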

Converting a trained model to ONNX

import os
from isolationforestonnx.isolation_forest_converter import IsolationForestConverter

# Path where the trained IsolationForestModel was saved in Scala
path = '/user/testuser/isolationForestWriteTest'

# Get model data path
data_dir_path = path + '/data'
avro_model_file = os.listdir(data_dir_path)
model_file_path = data_dir_path + '/' + avro_model_file[0]

# Get model metadata file path
metadata_dir_path = path + '/metadata'
metadata_file = os.listdir(metadata_dir_path)
metadata_file_path = metadata_dir_path + '/' + metadata_file[0]

# Convert the model to ONNX format (returns the ONNX model in memory)
converter = IsolationForestConverter(model_file_path, metadata_file_path)
onnx_model = converter.convert()

# Convert and save the model in ONNX format
onnx_model_path = '/user/testuser/isolationForestWriteTest.onnx'
converter.convert_and_save(onnx_model_path)
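The example above takes the first entry returned by `os.listdir`, which returns names in arbitrary order. Spark output directories often also contain marker and checksum files such as `_SUCCESS` and `.crc` files, so the first entry may not be the model file. The sketch below is a more defensive lookup. The helper `find_part_file` is hypothetical, and it assumes the directory holds exactly one file whose name starts with `part-`; check this against your own saved model.

```python
import os
import tempfile


def find_part_file(dir_path: str, suffix: str = "") -> str:
    """Return the path of the single 'part-*' file in dir_path.

    Marker files ('_SUCCESS') and hidden checksum files ('.part-*.crc') are
    skipped because they do not start with 'part-'. Raises ValueError when
    zero or several candidates remain.
    """
    candidates = sorted(
        name for name in os.listdir(dir_path)
        if name.startswith("part-") and name.endswith(suffix)
    )
    if len(candidates) != 1:
        raise ValueError(f"Expected exactly one part file in {dir_path}, found {candidates}")
    return os.path.join(dir_path, candidates[0])


# Demonstration on a throwaway directory that mimics a Spark output layout.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("_SUCCESS", ".part-00000.avro.crc", "part-00000.avro"):
        open(os.path.join(tmp, name), "w").close()
    print(os.path.basename(find_part_file(tmp, suffix=".avro")))  # part-00000.avro
```

With this helper, the two paths in the example could be obtained as `find_part_file(data_dir_path, suffix='.avro')` and `find_part_file(metadata_dir_path)`, assuming the model data is stored as a single Avro file.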

Using the ONNX model for inference

import numpy as np
import onnx
from onnxruntime import InferenceSession

onnx_model_path = '/user/testuser/isolationForestWriteTest.onnx'
dataset_path = 'shuttle.csv'

# Load data; the last column is the label, the remaining columns are the features
input_data = np.loadtxt(dataset_path, delimiter=',')
features = input_data[:, :-1].astype(np.float32)
input_dict = {'features': features}

# Load the ONNX model and run inference
onx = onnx.load(onnx_model_path)
sess = InferenceSession(onx.SerializeToString())
res = sess.run(None, input_dict)

# Print scores
outlier_scores = res[0]
print(np.transpose(outlier_scores[:10])[0])
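The raw scores usually need to be turned into outlier/inlier labels. The sketch below shows one simple way to do that by choosing a threshold from an expected contamination rate. It assumes that higher scores mean more anomalous points, which is the usual isolation forest convention but should be checked against the library documentation. The function names and the sample scores are made up for illustration, and this is not necessarily how the library computes its own threshold.

```python
import math
from typing import List, Sequence


def threshold_for_contamination(scores: Sequence[float], contamination: float) -> float:
    """Return a score such that roughly `contamination` of the points score at or above it."""
    if not 0.0 < contamination < 1.0:
        raise ValueError("contamination must be strictly between 0 and 1")
    if len(scores) == 0:
        raise ValueError("scores must not be empty")
    ordered = sorted(scores, reverse=True)
    # Flag at least one point so the threshold is always an observed score.
    num_outliers = max(1, math.floor(contamination * len(ordered)))
    return ordered[num_outliers - 1]


def label_outliers(scores: Sequence[float], threshold: float) -> List[int]:
    """Label a point 1 (outlier) when its score is at or above the threshold, else 0."""
    return [1 if s >= threshold else 0 for s in scores]


# Made-up scores for illustration only.
scores = [0.41, 0.72, 0.39, 0.44, 0.65, 0.40, 0.38, 0.43, 0.42, 0.37]
threshold = threshold_for_contamination(scores, contamination=0.2)
print(threshold)                          # 0.65
print(label_outliers(scores, threshold))  # [0, 1, 0, 0, 1, 0, 0, 0, 0, 0]
```

To apply this to the inference output above, the scores can be passed as a flat list, for example `outlier_scores[:, 0].tolist()`.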

License

BSD 2-Clause License. See LICENSE for details.
