isolation-forest-onnx

Converts models from the LinkedIn Spark/Scala isolation forest format to ONNX, for broad portability across platforms and languages.

Note: ONNX conversion is supported for the standard IsolationForestModel only. The ExtendedIsolationForestModel uses hyperplane-based splits that are not compatible with the axis-aligned tree ensemble representation used by the ONNX converter.
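
To make the distinction in the note concrete, here is a minimal sketch of the two kinds of split. The function names and the direction of the comparisons are illustrative assumptions and are not taken from the isolation-forest library. An axis-aligned node needs only one feature index and one threshold, whereas a hyperplane node needs a full normal vector, which a one-feature-per-node tree representation cannot store.

```python
def axis_aligned_goes_left(x, feature_index, threshold):
    """Standard isolation tree split: compare ONE feature to a threshold."""
    return x[feature_index] < threshold


def hyperplane_goes_left(x, normal, offset):
    """Extended-style split: compare a linear combination of ALL features to an offset."""
    projection = sum(n * v for n, v in zip(normal, x))
    return projection < offset


point = [2.0, 1.0]

# Axis-aligned: only feature 0 matters (2.0 < 1.5 is False).
axis_result = axis_aligned_goes_left(point, feature_index=0, threshold=1.5)

# Hyperplane: both features contribute (2.0 * 1.0 + 1.0 * -1.0 = 1.0 < 1.5 is True).
hyper_result = hyperplane_goes_left(point, normal=[1.0, -1.0], offset=1.5)

print(axis_result, hyper_result)
```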

Installation

pip install isolation-forest-onnx

We recommend installing the converter version that matches the version of the isolation-forest library used to train the model.
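
One way to check which converter version is installed is the standard-library `importlib.metadata` module. The sketch below only reports the installed version of the `isolation-forest-onnx` distribution. The helper name `installed_version` is hypothetical, and comparing the result against the isolation-forest version used for training is left as a manual step.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


converter_version = installed_version("isolation-forest-onnx")
print(f"isolation-forest-onnx: {converter_version or 'not installed'}")
```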

Converting a trained model to ONNX

import os
from isolationforestonnx.isolation_forest_converter import IsolationForestConverter

# Path where the trained IsolationForestModel was saved in Scala
path = '/user/testuser/isolationForestWriteTest'

# Get model data path
data_dir_path = path + '/data'
avro_model_file = os.listdir(data_dir_path)
model_file_path = data_dir_path + '/' + avro_model_file[0]

# Get model metadata file path
metadata_dir_path = path + '/metadata'
metadata_file = os.listdir(metadata_dir_path)
metadata_file_path = metadata_dir_path + '/' + metadata_file[0]

# Convert the model to ONNX format (returns the ONNX model in memory)
converter = IsolationForestConverter(model_file_path, metadata_file_path)
onnx_model = converter.convert()

# Convert and save the model in ONNX format
onnx_model_path = '/user/testuser/isolationForestWriteTest.onnx'
converter.convert_and_save(onnx_model_path)
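
The example above takes the first entry returned by `os.listdir`, whose order is arbitrary. Spark output directories often also contain marker or checksum files such as `_SUCCESS` or `.crc` files, so a more defensive lookup can help. The following is a minimal sketch, assuming each directory holds exactly one real data file; the helper name `find_part_file` and the file names in the demo are hypothetical.

```python
import os
import tempfile


def find_part_file(directory, suffix=""):
    """Return the path of the single data file in `directory`.

    Names starting with '.' or '_' (for example checksum and marker files)
    are skipped. Raises FileNotFoundError unless exactly one candidate remains.
    """
    candidates = sorted(
        name
        for name in os.listdir(directory)
        if not name.startswith((".", "_"))
        and name.endswith(suffix)
        and os.path.isfile(os.path.join(directory, name))
    )
    if len(candidates) != 1:
        raise FileNotFoundError(
            f"expected exactly one data file in {directory!r}, found {candidates}"
        )
    return os.path.join(directory, candidates[0])


# Demo on a temporary directory that mimics a Spark output layout (assumed names).
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("_SUCCESS", ".part-00000.avro.crc", "part-00000.avro"):
        open(os.path.join(demo_dir, name), "w").close()
    found = os.path.basename(find_part_file(demo_dir, suffix=".avro"))

print(found)  # part-00000.avro
```

With this helper, `model_file_path` could be obtained as `find_part_file(data_dir_path, suffix='.avro')` and `metadata_file_path` as `find_part_file(metadata_dir_path)`, provided the saved model follows the layout assumed here.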

Using the ONNX model for inference

import numpy as np
import onnx
from onnxruntime import InferenceSession

onnx_model_path = '/user/testuser/isolationForestWriteTest.onnx'
dataset_path = 'shuttle.csv'

# Load data; the last column is the label column, so keep only the feature columns
input_data = np.loadtxt(dataset_path, delimiter=',')
num_features = input_data.shape[1] - 1
input_dict = {'features': input_data[:, :num_features].astype(np.float32)}

# Load the ONNX model and run inference
onx = onnx.load(onnx_model_path)
sess = InferenceSession(onx.SerializeToString())
res = sess.run(None, input_dict)

# Print scores
outlier_scores = res[0]
print(np.transpose(outlier_scores[:10])[0])
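
If binary labels are needed from the raw scores, one option is to threshold them by an expected outlier fraction. The sketch below assumes that higher scores mean more anomalous points and uses made-up demo scores rather than real model output. The helper `label_outliers` and its `contamination` parameter are hypothetical. If the exported model already provides a label output, prefer that over a hand-picked threshold.

```python
import numpy as np


def label_outliers(scores, contamination=0.1):
    """Flag roughly the top `contamination` fraction of scores as outliers.

    Returns the threshold used and an int array with 1 for outliers, 0 for inliers.
    """
    flat_scores = np.asarray(scores, dtype=np.float64).reshape(-1)
    threshold = np.quantile(flat_scores, 1.0 - contamination)
    labels = (flat_scores >= threshold).astype(np.int32)
    return threshold, labels


# Demo scores shaped like an (N, 1) score column; the values are invented.
demo_scores = np.array([[0.41], [0.39], [0.72], [0.44], [0.40]], dtype=np.float32)
threshold, labels = label_outliers(demo_scores, contamination=0.2)
print(labels)
```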

License

BSD 2-Clause License. See LICENSE for details.
