A library to simplify model deployment to Vertex AI

Orient Express

A library to accelerate model deployments to Vertex AI directly from Colab notebooks.

Orient Express provides two main capabilities:

Vertex Model Deployment and Retrieval: Capabilities for uploading, downloading, or deploying models to Vertex AI Model Registry.
ONNX Image Model Deployment: Built-in predictor classes for easily running image classification, object detection, instance segmentation, and semantic segmentation models exported to ONNX format.

Both workflows handle versioning, artifact storage in GCS, and integration with Vertex AI Model Registry.

Installation

pip install orient_express

For local development:

pip install -e .

Or with Poetry:

poetry install

Workflows

ONNX Image Model Workflow

This workflow is for deploying image models (classification, detection, segmentation) exported to ONNX format.

from orient_express.predictors import ClassificationPredictor
from orient_express.vertex import upload_model, get_vertex_model

# 1. Create predictor from your exported ONNX model
predictor = ClassificationPredictor(
    onnx_path="model.onnx",
    classes={1: "cat", 2: "dog", 3: "bird"}
)

# 2. Upload to Vertex AI Model Registry
vertex_model = upload_model(
    model=predictor,
    model_name="my-classifier",
    project_name="my-project",
    region="us-central1",
    bucket_name="my-artifacts-bucket",
)

# 3. Later, retrieve and run locally
vertex_model = get_vertex_model(
    model_name="my-classifier",
    project_name="my-project",
    region="us-central1",
)
local_predictor = vertex_model.get_local_predictor()

from PIL import Image
images = [Image.open("test.jpg")]
predictions = local_predictor.predict(images)

# 4. Or deploy to an endpoint for remote inference
vertex_model.deploy_to_endpoint(
    endpoint_name="my-classifier-endpoint",
    machine_type="n1-standard-4",
    min_replica_count=1,
    max_replica_count=3,
)

# remote prediction API depends on the endpoint container deployed with the model
predictions = vertex_model.remote_predict(
    [{"image": "https://storage.googleapis.com/ssm-media-uploads/example.jpg"}], 
    endpoint_name="my-classifier-endpoint"
)

Joblib Model Workflow

This workflow is for deploying models that can be serialized with joblib, such as scikit-learn pipelines or XGBoost models.

from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.impute import SimpleImputer
import xgboost as xgb
import seaborn as sns

from orient_express.vertex import upload_model_joblib, get_vertex_model

# 1. Train your model
data = sns.load_dataset('titanic').dropna(subset=['survived'])
X = data[['pclass', 'sex', 'age', 'sibsp', 'parch', 'fare', 'embarked']]
y = data['survived']

numeric_features = ['age', 'fare', 'sibsp', 'parch']
numeric_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='median')),
    ('scaler', StandardScaler())
])

categorical_features = ['pclass', 'sex', 'embarked']
categorical_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='most_frequent')),
    ('onehot', OneHotEncoder(handle_unknown='ignore'))
])

preprocessor = ColumnTransformer(transformers=[
    ('num', numeric_transformer, numeric_features),
    ('cat', categorical_transformer, categorical_features)
])

model = Pipeline(steps=[
    ('preprocessor', preprocessor),
    ('classifier', xgb.XGBClassifier(use_label_encoder=False, eval_metric='logloss'))
])

model.fit(X, y)

# 2. Upload to Vertex AI Model Registry
vertex_model = upload_model_joblib(
    model=model,
    model_name="titanic-classifier",
    project_name="my-project",
    region="us-central1",
    bucket_name="my-artifacts-bucket",
    serving_container_image_uri="your-serving-container:latest",
    serving_container_health_route="/health",
    serving_container_predict_route="/predict",
)

# 3. Later, retrieve the model
vertex_model = get_vertex_model(
    model_name="titanic-classifier",
    project_name="my-project",
    region="us-central1",
)

# 4. Run locally
local_predictor = vertex_model.get_local_predictor()
predictions = local_predictor.predict(X.head())  # any DataFrame with the training columns

ONNX Runtime and Device Support

Platform Support Matrix

Platform	Architecture	ONNX Runtime Package	CUDA Available
Linux	x86_64	onnxruntime-gpu	Yes
Linux	aarch64	onnxruntime	No
Windows	x64 (AMD64)	onnxruntime-gpu	Yes
Windows	ARM64	onnxruntime	No
macOS	x86_64	onnxruntime	No
macOS	arm64	onnxruntime	No

The appropriate package is installed automatically based on your platform.
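
The matrix above can be read as a simple lookup. The sketch below is a hypothetical helper (`expected_onnxruntime_package` is not part of Orient Express) that mirrors the table using the standard library's `platform` module. It only illustrates the mapping and makes no claim about how the package actually resolves its dependency.

```python
import platform

# Platform/architecture pairs from the support matrix that get the GPU build.
# Names are lowercased before comparison (Windows reports "AMD64").
GPU_PLATFORMS = {("linux", "x86_64"), ("windows", "amd64")}


def expected_onnxruntime_package(system: str, machine: str) -> str:
    """Return the ONNX Runtime package the support matrix lists for a platform."""
    key = (system.lower(), machine.lower())
    return "onnxruntime-gpu" if key in GPU_PLATFORMS else "onnxruntime"


if __name__ == "__main__":
    # Report the package expected for the machine running this script.
    print(expected_onnxruntime_package(platform.system(), platform.machine()))
```

Running the script on a given machine is a quick way to check which row of the matrix applies before debugging a missing CUDA provider.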

Selecting CPU vs CUDA Execution

When loading a predictor, use the device parameter to specify the execution provider:

from orient_express.predictors import BoundingBoxPredictor

# Class ID to name mapping for the model (example values)
classes = {1: "person", 2: "car", 3: "bicycle"}

# CPU inference (works on all platforms)
predictor = BoundingBoxPredictor("/path/to/model", classes, device="cpu")

# CUDA inference (requires Linux x64 or Windows x64 with CUDA drivers)
predictor = BoundingBoxPredictor("/path/to/model", classes, device="cuda")

When using a Vertex AI model:

# CPU inference
predictor = model.get_local_predictor(device="cpu")

# CUDA inference
predictor = model.get_local_predictor(device="cuda")
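
If the same script has to run on both GPU and CPU machines, the device can be chosen at runtime. The following is a minimal sketch, assuming ONNX Runtime's `get_available_providers()` is a reasonable signal for CUDA support; `pick_device` is a hypothetical helper, not part of Orient Express.

```python
def pick_device(available_providers: list[str]) -> str:
    """Prefer CUDA when ONNX Runtime reports the CUDA execution provider."""
    return "cuda" if "CUDAExecutionProvider" in available_providers else "cpu"


try:
    import onnxruntime as ort  # optional: only needed to query real providers

    providers = ort.get_available_providers()
except ImportError:
    providers = ["CPUExecutionProvider"]  # fallback when ONNX Runtime is absent

device = pick_device(providers)
print(device)  # the value to pass as the `device` argument
```

A listed CUDA provider does not guarantee that the drivers work at runtime, so falling back to `device="cpu"` on a failed load may still be necessary.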

Pinning Model Versions

By default, get_vertex_model returns the most recently updated version. To pin to a specific version:

vertex_model = get_vertex_model(
    model_name="my-classifier",
    project_name="my-project",
    region="us-central1",
    version=3,  # Pin to version 3
)
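
Note that "most recently updated" is not necessarily the highest version number. The sketch below illustrates the difference with a hypothetical `VersionRecord` type and `select_version` helper. Neither is part of Orient Express, and the actual selection logic inside `get_vertex_model` may differ.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class VersionRecord:  # hypothetical stand-in for a registry entry
    version: int
    updated: datetime


def select_version(
    records: list[VersionRecord], version: Optional[int] = None
) -> VersionRecord:
    """Pick the pinned version if given, else the most recently updated one."""
    if version is not None:
        for record in records:
            if record.version == version:
                return record
        raise LookupError(f"version {version} not found")
    return max(records, key=lambda r: r.updated)


records = [
    VersionRecord(3, datetime(2026, 1, 10)),
    VersionRecord(4, datetime(2026, 1, 5)),  # higher number, older update
]
print(select_version(records).version)             # 3: most recently updated
print(select_version(records, version=4).version)  # 4: explicitly pinned
```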

Built-in Predictor Types

Orient Express provides five built-in predictor classes for ONNX image models. Each has specific requirements for the ONNX graph structure.

General ONNX Requirements

All ONNX image models share these requirements:

Input images are resized using simple stretch (no letterboxing/padding) to the model's expected resolution before inference.
Normalization must be baked into the ONNX graph. The library passes uint8 RGB images directly to the model; any normalization (e.g., ImageNet mean/std) must be handled inside the graph.
Batch dimension: Models receive batched inputs with shape [batch, height, width, 3].

ClassificationPredictor

For image classification models that output class probabilities.

ONNX Graph Requirements


Inputs	`images`: `[batch, height, width, 3]` uint8 RGB
Outputs	`scores`: `[batch, num_classes]` float32 class probabilities

The graph must handle normalization internally. No target_sizes input is needed.

Usage

from orient_express.predictors import ClassificationPredictor

predictor = ClassificationPredictor(
    onnx_path="classifier.onnx",
    classes={1: "cat", 2: "dog", 3: "bird"}
)

predictions = predictor.predict(images)
# Returns: list[ClassificationPrediction]

Output Structure

@dataclass
class ClassificationPrediction:
    clss: str                      # Predicted class name
    score: float                   # Confidence score for predicted class
    class_scores: dict[str, float] # Scores for all classes

# to_dict() output:
{
    "class": "cat",
    "score": 0.95,
    "class_scores": {"cat": 0.95, "dog": 0.03, "bird": 0.02}
}
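
To make the relationship between the `scores` output and the prediction fields concrete, here is a minimal sketch of how one row of scores could be turned into the dict shown above. `to_classification` is a hypothetical helper, and the assumption that score columns follow the class IDs in ascending order is not confirmed by the library.

```python
def to_classification(scores: list[float], classes: dict[int, str]) -> dict:
    """Build the dict form of a classification result from one row of scores."""
    # Assumption: score columns follow the class IDs in ascending order.
    names = [classes[class_id] for class_id in sorted(classes)]
    class_scores = dict(zip(names, scores))
    best = max(class_scores, key=class_scores.get)
    return {"class": best, "score": class_scores[best], "class_scores": class_scores}


result = to_classification([0.95, 0.03, 0.02], {1: "cat", 2: "dog", 3: "bird"})
print(result["class"], result["score"])  # cat 0.95
```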

MultiLabelClassificationPredictor

For image multi-label classification models that output a set of binary class probabilities.

ONNX Graph Requirements


Inputs	`images`: `[batch, height, width, 3]` uint8 RGB
Outputs	`scores`: `[batch, num_classes]` float32 class probabilities

The graph must handle normalization internally. No target_sizes input is needed.

Usage

from orient_express.predictors import MultiLabelClassificationPredictor

predictor = MultiLabelClassificationPredictor(
    onnx_path="classifier.onnx",
    classes={1: "contains_cat", 2: "contains_dog", 3: "contains_bird"}
)

predictions = predictor.predict(images, confidence=0.5)
# Returns: list[MultiLabelClassificationPrediction]

Output Structure

@dataclass
class MultiLabelClassificationPrediction:
    classes: list[str]             # Predicted class names based on confidence threshold
    class_scores: dict[str, float] # Scores for all classes

# to_dict() output:
{
    "classes": ["contains_cat", "contains_bird"],
    "class_scores": {"contains_cat": 0.95, "contains_dog": 0.03, "contains_bird": 0.82}
}
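
The effect of the `confidence` argument can be sketched as a per-class threshold over `class_scores`. `select_labels` below is a hypothetical helper, and whether the library compares with `>=` or `>` is an assumption.

```python
def select_labels(class_scores: dict[str, float], confidence: float = 0.5) -> list[str]:
    """Keep every class whose score reaches the confidence threshold."""
    # Assumption: the comparison is inclusive (>=); the library may differ.
    return [name for name, score in class_scores.items() if score >= confidence]


scores = {"contains_cat": 0.95, "contains_dog": 0.03, "contains_bird": 0.82}
print(select_labels(scores, confidence=0.5))  # ['contains_cat', 'contains_bird']
print(select_labels(scores, confidence=0.9))  # ['contains_cat']
```

Unlike single-label classification, the result can be empty when no class reaches the threshold.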

BoundingBoxPredictor

For object detection models that output bounding boxes.

ONNX Graph Requirements


Inputs	`images`: `[batch, height, width, 3]` uint8 RGB
	`target_sizes`: `[batch, 2]` float32 containing `[height, width]` of original images
Outputs	`boxes`: `[batch, num_detections, 4]` float32 as `[x1, y1, x2, y2]` in original image coordinates
	`scores`: `[batch, num_detections]` float32 confidence scores
	`labels`: `[batch, num_detections]` int64 class indices

The ONNX graph must rescale bounding boxes to the original image dimensions using target_sizes. The library does not perform any box coordinate transformation.
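
For reference, this is the arithmetic the graph is expected to perform, written in plain Python. It is a sketch that assumes the detector emits boxes in model-input pixel coordinates (some models emit normalized coordinates instead); `rescale_box` is a hypothetical name.

```python
def rescale_box(box, model_size, target_size):
    """Map [x1, y1, x2, y2] from model-input pixels to original-image pixels.

    model_size and target_size are (height, width), matching target_sizes.
    """
    model_h, model_w = model_size
    target_h, target_w = target_size
    scale_x = target_w / model_w  # stretch resize: axes scale independently
    scale_y = target_h / model_h
    x1, y1, x2, y2 = box
    return [x1 * scale_x, y1 * scale_y, x2 * scale_x, y2 * scale_y]


# A 640x640 model input produced from a 480x1280 (height x width) original image.
print(rescale_box([160, 320, 320, 640], (640, 640), (480, 1280)))
# [320.0, 240.0, 640.0, 480.0]
```

Because images are stretched rather than letterboxed, the x and y scale factors differ whenever the original aspect ratio does not match the model input.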

Usage

from orient_express.predictors import BoundingBoxPredictor

predictor = BoundingBoxPredictor(
    onnx_path="detector.onnx",
    classes={1: "person", 2: "car", 3: "bicycle"}
)

predictions = predictor.predict(images, confidence=0.5, nms_threshold=0.4)
# Returns: list[list[BoundingBoxPrediction]]
# Outer list: per image, inner list: detections for that image
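
The `confidence` and `nms_threshold` arguments can be understood through a generic greedy non-maximum suppression pass. The sketch below is a class-agnostic textbook version in plain Python, not the library's implementation; `iou` and `filter_detections` are hypothetical names.

```python
def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_detections(boxes, scores, confidence=0.5, nms_threshold=0.4):
    """Return indices of detections kept after thresholding and greedy NMS."""
    # Drop low-confidence detections, then visit the rest from best to worst.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= confidence),
        key=lambda i: scores[i],
        reverse=True,
    )
    kept = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept box.
        if all(iou(boxes[i], boxes[j]) <= nms_threshold for j in kept):
            kept.append(i)
    return kept


boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30], [0, 0, 5, 5]]
scores = [0.9, 0.8, 0.7, 0.3]
print(filter_detections(boxes, scores))  # [0, 2]
```

In this example the second box overlaps the first with an IoU of about 0.68 and is suppressed, while the last box is removed by the confidence threshold.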

Output Structure

@dataclass
class BoundingBoxPrediction:
    clss: str           # Class name
    score: float        # Confidence score
    bbox: np.ndarray    # [x1, y1, x2, y2] in original image coordinates

# to_dict() output:
{
    "class": "person",
    "score": 0.92,
    "bbox": {"x1": 100.5, "y1": 50.2, "x2": 300.8, "y2": 400.1}
}

Annotation

annotated_image = predictor.get_annotated_image(image, predictions[0])
# Returns PIL.Image with bounding boxes drawn

InstanceSegmentationPredictor

For instance segmentation models that output bounding boxes and per-instance masks.

ONNX Graph Requirements


Inputs	`images`: `[batch, height, width, 3]` uint8 RGB
	`target_sizes`: `[batch, 2]` float32 containing `[height, width]` of original images
Outputs	`boxes`: `[batch, num_detections, 4]` float32 as `[x1, y1, x2, y2]` in original image coordinates
	`scores`: `[batch, num_detections]` float32 confidence scores
	`labels`: `[batch, num_detections]` int64 class indices
	`masks`: `[batch, num_detections, mask_height, mask_width]` float32 mask logits

The ONNX graph must rescale bounding boxes to original image dimensions using target_sizes. Masks can be any resolution—they are resized to original image dimensions in Python postprocessing using bilinear interpolation.
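
The mask postprocessing described above can be sketched in plain Python as a bilinear upsample followed by a threshold. This is an illustration only: the corner-aligned interpolation and the threshold at logit 0 (sigmoid probability 0.5) are assumptions, and `resize_bilinear` and `logits_to_mask` are hypothetical names.

```python
def resize_bilinear(grid, out_h, out_w):
    """Resize a 2D list of floats with bilinear interpolation (corner-aligned)."""
    in_h, in_w = len(grid), len(grid[0])
    out = []
    for y in range(out_h):
        # Map the output row to a fractional source row.
        sy = y * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(sy)
        y1 = min(y0 + 1, in_h - 1)
        fy = sy - y0
        row = []
        for x in range(out_w):
            sx = x * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(sx)
            x1 = min(x0 + 1, in_w - 1)
            fx = sx - x0
            # Blend horizontally on both rows, then vertically between them.
            top = grid[y0][x0] * (1 - fx) + grid[y0][x1] * fx
            bottom = grid[y1][x0] * (1 - fx) + grid[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


def logits_to_mask(logits, out_h, out_w):
    """Upsample mask logits, then threshold at 0 (sigmoid probability 0.5)."""
    return [[v > 0.0 for v in row] for row in resize_bilinear(logits, out_h, out_w)]


mask = logits_to_mask([[-4.0, 4.0], [-4.0, 4.0]], out_h=2, out_w=4)
print(mask[0])  # [False, False, True, True]
```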

Usage

from orient_express.predictors import InstanceSegmentationPredictor

predictor = InstanceSegmentationPredictor(
    onnx_path="instance_seg.onnx",
    classes={1: "person", 2: "car", 3: "bicycle"}
)

predictions = predictor.predict(images, confidence=0.5)
# Returns: list[list[InstanceSegmentationPrediction]]

Output Structure

@dataclass
class InstanceSegmentationPrediction:
    clss: str           # Class name
    score: float        # Confidence score
    bbox: np.ndarray    # [x1, y1, x2, y2] in original image coordinates
    mask: np.ndarray    # Boolean mask at original image resolution

# to_dict(include_mask=False) output:
{
    "class": "person",
    "score": 0.89,
    "bbox": {"x1": 100.5, "y1": 50.2, "x2": 300.8, "y2": 400.1}
}

# to_dict(include_mask=True) adds:
{
    ...
    "mask": [[True, True, False, ...], ...]  # 2D boolean list
}

Annotation

annotated_image = predictor.get_annotated_image(image, predictions[0])
# Returns PIL.Image with mask overlays and instance indices

SemanticSegmentationPredictor

For semantic segmentation models that output per-pixel class predictions.

ONNX Graph Requirements


Inputs	`images`: `[batch, height, width, 3]` uint8 RGB
Outputs	`masks`: `[batch, num_classes, mask_height, mask_width]` float32 class logits

Masks can be any resolution—they are resized to original image dimensions in Python postprocessing. The class dimension is reduced via argmax to produce a single class ID per pixel.
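
The argmax reduction can be sketched in plain Python as follows. Treating the per-class confidences as softmax probabilities is an assumption (the library may compute them differently), and `softmax` and `reduce_classes` are hypothetical names. Resizing is omitted here.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of logits."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def reduce_classes(logits):
    """Turn [num_classes][H][W] logits into a class mask and confidence masks."""
    num_classes, height, width = len(logits), len(logits[0]), len(logits[0][0])
    class_mask = [[0] * width for _ in range(height)]
    conf_masks = [[[0.0] * width for _ in range(height)] for _ in range(num_classes)]
    for y in range(height):
        for x in range(width):
            probs = softmax([logits[c][y][x] for c in range(num_classes)])
            # Argmax over the class dimension gives one class index per pixel.
            class_mask[y][x] = max(range(num_classes), key=probs.__getitem__)
            for c in range(num_classes):
                conf_masks[c][y][x] = probs[c]
    return class_mask, conf_masks


logits = [
    [[2.0, 0.1]],  # class 0 logits for a 1x2 image
    [[0.5, 3.0]],  # class 1 logits
]
class_mask, conf_masks = reduce_classes(logits)
print(class_mask)  # [[0, 1]]
```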

Usage

from orient_express.predictors import SemanticSegmentationPredictor

predictor = SemanticSegmentationPredictor(
    onnx_path="semantic_seg.onnx",
    classes={0: "background", 1: "road", 2: "building", 3: "vegetation"}
)

predictions = predictor.predict(images)
# Returns: list[SemanticSegmentationPrediction]

Output Structure

@dataclass
class SemanticSegmentationPrediction:
    class_mask: np.ndarray   # [height, width] int array of class indices
    conf_masks: np.ndarray   # [num_classes, height, width] float confidence per class

# to_dict(include_conf_masks=False) output:
{
    "class_mask": [[0, 0, 1, 2, ...], ...]  # 2D int array
}

# to_dict(include_conf_masks=True) adds:
{
    ...
    "conf_masks": [[[0.1, 0.2, ...], ...], ...]  # 3D float array
}

Annotation

annotated_image = predictor.get_annotated_image(image, predictions[0].class_mask)
# Returns PIL.Image with color-coded segmentation overlay

Color Schemes

For predictors that support annotation (BoundingBoxPredictor, InstanceSegmentationPredictor, SemanticSegmentationPredictor), you can set a custom color scheme:

predictor.color_scheme = {
    "person": (255, 0, 0),    # Red (RGB)
    "car": (0, 255, 0),       # Green
    "bicycle": (0, 0, 255),   # Blue
}

Colors are specified as RGB tuples.

Legacy API [Still Maintained]

Example

Train Model

Train a model as usual. The example below trains an XGBoost classifier on the Titanic dataset.

# Import necessary libraries
import seaborn as sns  # provides the Titanic dataset via load_dataset
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.impute import SimpleImputer

# Load the Titanic dataset
data = sns.load_dataset('titanic').dropna(subset=['survived'])  # Dropping rows with missing target labels

# Select features and target
X = data[['pclass', 'sex', 'age', 'sibsp', 'parch', 'fare', 'embarked']]
y = data['survived']

# Define preprocessing for numeric columns (impute missing values and scale features)
numeric_features = ['age', 'fare', 'sibsp', 'parch']
numeric_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='median')),
    ('scaler', StandardScaler())
])

# Define preprocessing for categorical columns (impute missing values and one-hot encode)
categorical_features = ['pclass', 'sex', 'embarked']
categorical_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='most_frequent')),
    ('onehot', OneHotEncoder(handle_unknown='ignore'))
])

# Combine preprocessing steps
preprocessor = ColumnTransformer(
    transformers=[
        ('num', numeric_transformer, numeric_features),
        ('cat', categorical_transformer, categorical_features)
    ])

# Create a pipeline that first transforms the data, then trains an XGBoost model
model = Pipeline(steps=[
    ('preprocessor', preprocessor),
    ('classifier', xgb.XGBClassifier(use_label_encoder=False, eval_metric='logloss'))
])

# Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the model
model.fit(X_train, y_train)

Upload Model To Model Registry

model_wrapper  = ModelExpress(model=model,
                             project_name='my-project-name',
                             region='us-central1',
                             bucket_name='my-artifacts-bucket',
                             model_name='titanic')
model_wrapper.upload()

Local Inference (Without Online Prediction Endpoint)

The following code downloads the latest model version from the model registry and runs inference locally.

import pandas as pd

# create input dataframe
titanic_data = {
    "pclass": [1],          # Passenger class (1st, 2nd, 3rd)
    "sex": ["female"],      # Gender
    "age": [29],            # Age
    "sibsp": [0],           # Number of siblings/spouses aboard
    "parch": [0],           # Number of parents/children aboard
    "fare": [100.0],        # Ticket fare
    "embarked": ["S"]       # Port of Embarkation (C = Cherbourg, Q = Queenstown, S = Southampton)
}
input_df = pd.DataFrame(titanic_data)

# init the model wrapper
model_wrapper  = ModelExpress(project_name='my-project-name',
                             region='us-central1',
                             model_name='titanic')

# Run inference locally
# It will download the most recent version from the model registry automatically
model_wrapper.local_predict(input_df)

Pin Model Version

In many cases, the pipeline should be pinned to a specific model version so the model can only be updated explicitly. Just pass a model_version parameter when instantiating the ModelExpress wrapper.

# init the model wrapper
model_wrapper  = ModelExpress(project_name='my-project-name',
                             region='us-central1',
                             model_name='titanic',
                             model_version=11)

Remote Inference (With Online Prediction Endpoint)

Make sure the model is deployed:

model_wrapper  = ModelExpress(model=model,
                             project_name='my-project-name',
                             region='us-central1',
                             bucket_name='my-artifacts-bucket',
                             model_name='titanic')

# upload the version to the registry and deploy it to the endpoint
model_wrapper.deploy()

Run inference with the remote_predict method. It makes a remote call to the endpoint without fetching the model locally.

titanic_data = {
    "pclass": [1],             # Passenger class (1st, 2nd, 3rd)
    "sex": ["female"],         # Gender
    "age": [29],               # Age
    "sibsp": [0],              # Number of siblings/spouses aboard
    "parch": [0],              # Number of parents/children aboard
    "fare": [100.0],           # Ticket fare
    "embarked": ["S"]          # Port of Embarkation (C = Cherbourg, Q = Queenstown, S = Southampton)
}
df = pd.DataFrame(titanic_data)

model_wrapper.remote_predict(df)

Pipeline Deployment Function

The Orient Express library also has a helper function to simplify Vertex AI pipeline deployment.

Create a deploy.py script:

import argparse
import conf

from pipeline import pipeline
from orient_express.deployment import deploy_pipeline


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--run-type", required=True)

    args = parser.parse_args()
    deploy_pipeline(run_type=args.run_type,
                    pipeline_dsl=pipeline,
                    pipeline_root=conf.PIPELINE_ROOT,
                    pipeline_name=conf.PIPELINE_NAME,
                    pipeline_display_name=conf.PIPELINE_DISPLAY_NAME,
                    pipeline_schedule_name=conf.SCHEDULE_NAME,
                    gcp_project=conf.PROJECT_ID,
                    gcp_location='us-central1',
                    gcp_service_account=conf.SERVICE_ACCOUNT,
                    gcp_network=conf.NETWORK_NAME,
                    gcp_labels={"team": "ml"})

Then create conf.py, replacing the sample values with your own:

import os

BASE_PATH = "gs://pipelines-bucket/vertex-ai/pipelines"

PIPELINE_NAME = "my-pipeline"
PIPELINE_ROOT = f"{BASE_PATH}/{PIPELINE_NAME}"
PIPELINE_TEMP_ROOT = f"{BASE_PATH}/{PIPELINE_NAME}-temp"

PIPELINE_DISPLAY_NAME = "My Pipeline"
PIPELINE_DESCRIPTION = "My example pipeline"

NETWORK_NAME = "project network id"

DOCKER_IMAGE = "us-docker.pkg.dev/my-project/my-artifactory/my-pipeline:latest"
BASE_IMAGE = "python:3.11"
PROJECT_ID = "my-project"
PROJECT_REGION = "us-central1"

SERVICE_ACCOUNT = "my-service-account@my-project.iam.gserviceaccount.com"
SCHEDULE_NAME = "My Pipeline"

To test it on a local machine, authenticate with GCP first:

gcloud auth application-default login

Finally, run the pipeline once:

python deploy.py --run-type single-run

Or create a schedule so that it runs on a recurring basis:

python deploy.py --run-type scheduled
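
Since only two run types appear above, the argument parser in deploy.py could reject typos early. The sketch below uses argparse's `choices` option. The `RUN_TYPES` tuple and `build_parser` helper are hypothetical, and `deploy_pipeline` may accept other values not shown in this document.

```python
import argparse

# Run types shown in this document; deploy_pipeline may accept others.
RUN_TYPES = ("single-run", "scheduled")


def build_parser() -> argparse.ArgumentParser:
    """Create a parser that only accepts the known run types."""
    parser = argparse.ArgumentParser(description="Deploy a Vertex AI pipeline")
    parser.add_argument("--run-type", required=True, choices=RUN_TYPES)
    return parser


# Parse an explicit argument list here so the example runs anywhere.
args = build_parser().parse_args(["--run-type", "scheduled"])
print(args.run_type)  # scheduled
```

With this in place, an unknown value such as `--run-type once` exits with a usage error instead of reaching the deployment call.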
