# Haute

Open-source pricing engine for insurance teams on Databricks.

Build, visualise, and deploy pricing pipelines as Python code — with a browser-based GUI that stays in sync.

```bash
pip install haute
```
## What is Haute?
Haute gives insurance pricing teams a code-first, GUI-friendly way to build rating pipelines. Write standard Python with Polars, see it instantly in a visual editor, and deploy to a live API with one command.
- Build pipelines in code or the GUI — both stay in sync
- Run the same pipeline for 1-row live quotes and million-row batch jobs
- Deploy to Databricks MLflow Model Serving with `haute deploy`

Python code is always the source of truth. The GUI is a live, editable view.
## Quick Start

### 1. Install

```bash
pip install haute

# For deployment to Databricks:
pip install haute[databricks]
```
### 2. Create a project

```bash
mkdir my_project && cd my_project
haute init
```

This scaffolds everything you need in the current directory:

```
haute.toml      ← project & deploy config
.env.example    ← Databricks credentials template
main.py         ← starter pipeline
data/           ← your data files
test_quotes/    ← JSON payloads for pre-deploy testing
```
### 3. Write a pipeline

```python
# main.py
import polars as pl

import haute

pipeline = haute.Pipeline("motor_pricing", description="Motor premium calculation")

@pipeline.node(path="data/policies.parquet", deploy_input=True)
def policies() -> pl.DataFrame:
    """Read policy data — this is the live API input."""
    return pl.scan_parquet("data/policies.parquet")

@pipeline.node(external="models/freq.cbm", file_type="catboost", model_class="regressor")
def frequency_model(policies: pl.DataFrame) -> pl.DataFrame:
    """Predict claim frequency."""
    df = policies.with_columns(
        freq_pred=pl.Series(obj.predict(policies.select("Area", "VehPower", "DrivAge").to_numpy()))
    )
    return df

@pipeline.node
def calculate_premium(frequency_model: pl.DataFrame) -> pl.DataFrame:
    """Calculate the technical premium."""
    return frequency_model.with_columns(
        premium=(pl.col("freq_pred") * 500).round(2)
    )

@pipeline.node(output=True)
def output(calculate_premium: pl.DataFrame) -> pl.DataFrame:
    """Final output returned by the API."""
    return calculate_premium
```
### 4. Run it

```bash
haute run
```

### 5. Open the GUI

```bash
haute serve
```
This opens a browser-based visual editor where you can:
- Drag and drop nodes from a palette
- Connect them with edges to define data flow
- Write Polars code in each transform node
- Click any node to preview its output data
- Toggle API Input on a data source to mark it as the live input
- Hit Run to execute the full pipeline
- Hit Save to write back to `.py`
### 6. Deploy

```bash
cp .env.example .env   # fill in your Databricks credentials
haute deploy
```
That's it. Your pipeline is now a live API on Databricks Model Serving.
## Deployment
Haute deploys your pipeline as an MLflow model on Databricks Model Serving. One command, no DevOps.
### How it works

- **Marks** — you tag one data source as `deploy_input=True` (the live API input) and one node as `output=True` (the API response)
- **Prunes** — Haute traces backwards from the output node and deploys only the scoring path. Training-data branches, sinks, and exploratory nodes are automatically excluded (see the sketch below)
- **Bundles** — model files (`.cbm`, `.pkl`, etc.) and static data are packaged as MLflow artifacts
- **Validates** — every JSON file in `test_quotes/` is scored through the pruned pipeline before deployment. If anything fails, deployment is blocked
- **Deploys** — the pipeline is logged as an MLflow pyfunc model and registered in the Model Registry
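Conceptually, the pruning step is an ancestor trace over the DAG. A minimal sketch of the idea, with `output_ancestors` as a hypothetical helper rather than Haute's internals:

```python
# Pruning sketch: keep only the output node and everything it
# transitively depends on. Illustrative only, not Haute's code.
def output_ancestors(edges: list[tuple[str, str]], output: str) -> set[str]:
    """Return the output node plus every node it transitively depends on."""
    parents: dict[str, list[str]] = {}
    for src, dst in edges:
        parents.setdefault(dst, []).append(src)
    keep: set[str] = set()
    stack = [output]
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(parents.get(node, []))
    return keep

# A training branch ("claims" -> "training_set") never feeds "output",
# so it is excluded from the deployed scoring path:
edges = [
    ("policies", "frequency_model"),
    ("frequency_model", "calculate_premium"),
    ("calculate_premium", "output"),
    ("claims", "training_set"),
]
assert output_ancestors(edges, "output") == {
    "policies", "frequency_model", "calculate_premium", "output"
}
```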
### Configuration

All deploy settings live in `haute.toml` (committed to git):

```toml
[project]
name = "motor-pricing"
pipeline = "main.py"

[deploy]
target = "databricks"
model_name = "motor-pricing"
endpoint_name = "motor-pricing"

[deploy.databricks]
experiment_name = "/Shared/haute/motor-pricing"
catalog = "main"
schema = "pricing"
serving_workload_size = "Small"
serving_scale_to_zero = true

[test_quotes]
dir = "test_quotes"
```

Secrets go in `.env` (gitignored):

```
DATABRICKS_HOST=https://adb-xxxxx.azuredatabricks.net
DATABRICKS_TOKEN=your_token_here
```
### Test quotes

Put JSON files in `test_quotes/` with example requests. These are scored before every deploy:

```json
[
  {"IDpol": 99001, "VehPower": 7, "DrivAge": 42, "Area": "C", "VehBrand": "B12"}
]
```
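A test file can also hold several records to exercise batch scoring; the dry run below scores one such file. A hypothetical `batch_policies.json` with illustrative values might look like:

```json
[
  {"IDpol": 99002, "VehPower": 5, "DrivAge": 35, "Area": "A", "VehBrand": "B1"},
  {"IDpol": 99003, "VehPower": 9, "DrivAge": 58, "Area": "E", "VehBrand": "B3"}
]
```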
### Dry run

Validate everything without actually deploying:

```bash
haute deploy --dry-run
```

```
✓ Loaded config from haute.toml
✓ Parsed pipeline (12 nodes, 14 edges)
✓ Pruned to output ancestors (5 nodes)
✓ Collected 2 artifacts (freq.cbm, sev.cbm)
✓ Inferred input schema (10 columns)
✓ Test quotes: single_policy.json 1 rows ok (18ms)
✓ Test quotes: batch_policies.json 5 rows ok (24ms)
✓ Validation passed

Dry run complete — no model was deployed.
```
### Calling the deployed API

```bash
curl -X POST https://<workspace>.databricks.net/serving-endpoints/motor-pricing/invocations \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"dataframe_records": [{"Area": "A", "VehPower": 5, "DrivAge": 35}]}'
```
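The same call from Python, assuming the `requests` package is installed and `DATABRICKS_TOKEN` is set (the placeholder workspace URL is yours to fill in):

```python
# Same request as the curl above, from Python.
import os

import requests

url = "https://<workspace>.databricks.net/serving-endpoints/motor-pricing/invocations"
headers = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}
payload = {"dataframe_records": [{"Area": "A", "VehPower": 5, "DrivAge": 35}]}

response = requests.post(url, headers=headers, json=payload)
response.raise_for_status()
print(response.json())
```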
## Key Concepts

### Pipelines

A pipeline is a DAG of decorated Python functions. Each function is a node that takes DataFrames in and returns a DataFrame out.

```python
pipeline = haute.Pipeline("my_pipeline")

@pipeline.node
def transform(read_data: pl.DataFrame) -> pl.DataFrame:
    return read_data.filter(pl.col("age") > 25)
```
### Edges

Edges define data flow. Function parameter names match upstream node names:

```python
pipeline.connect("read_data", "transform")
pipeline.connect("transform", "output")
```
### Fan-out / Fan-in

One node can feed multiple downstream nodes, and a node can receive multiple inputs:

```python
@pipeline.node
def joined(claims: pl.DataFrame, exposure: pl.DataFrame) -> pl.DataFrame:
    return claims.join(exposure, on="IDpol", how="left")

pipeline.connect("claims", "joined")
pipeline.connect("exposure", "joined")
```
### Scoring

The same pipeline code works for batch and live scoring:

```python
# Batch: run the full pipeline
result = pipeline.run()

# Live: score a single row (same code path as the deployed API)
row = pl.DataFrame({"Area": ["A"], "DrivAge": [35]})
prediction = pipeline.score(row)
```
## Node Types

### Data Source

Reads data from a file. No code needed — just configure the path.

```python
@pipeline.node(path="data/policies.parquet", deploy_input=True)
def policies() -> pl.DataFrame:
    return pl.scan_parquet("data/policies.parquet")
```

- `deploy_input=True` — marks this source as the live API input for deployment
- Supported formats: Parquet, CSV, JSON
### Transform

The workhorse node. Write Polars code to filter, join, aggregate, or reshape data.

```python
@pipeline.node
def frequency_set(policies: pl.DataFrame, claims: pl.DataFrame) -> pl.DataFrame:
    return policies.join(claims, on="IDpol", how="left")
```

In the GUI, two shorthand syntaxes are available:

- Chain syntax — start with `.` to chain off the first input: `.filter(pl.col("Area") == "A").select("IDpol", "premium")`
- Expression syntax — reference multiple inputs by name: `policies.join(claims, on="IDpol", how="left")`
### External File

Load a model or config file, then use it in your code. The loaded object is available as `obj`.

```python
@pipeline.node(external="models/freq.cbm", file_type="catboost", model_class="regressor")
def frequency_model(policies: pl.DataFrame) -> pl.DataFrame:
    df = policies.with_columns(
        freq_pred=pl.Series(obj.predict(policies.select("Area", "VehAge").to_numpy()))
    )
    return df
```

| Type | Extension | How it loads |
|---|---|---|
| Pickle | `.pkl` | `pickle.load()` |
| JSON | `.json` | `json.load()` |
| Joblib | `.joblib` | `joblib.load()` |
| CatBoost | `.cbm` | `CatBoostClassifier` / `CatBoostRegressor` |
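For intuition, the table amounts to a dispatch on `file_type`. A minimal sketch of such a loader, as an illustration rather than Haute's actual implementation:

```python
# Loader sketch: dispatch on file_type as the table above implies.
# load_external is a hypothetical helper, not Haute's code.
import json
import pickle

def load_external(path: str, file_type: str, model_class: str | None = None):
    if file_type == "pickle":
        with open(path, "rb") as f:
            return pickle.load(f)
    if file_type == "json":
        with open(path) as f:
            return json.load(f)
    if file_type == "joblib":
        import joblib
        return joblib.load(path)
    if file_type == "catboost":
        from catboost import CatBoostClassifier, CatBoostRegressor
        model = CatBoostRegressor() if model_class == "regressor" else CatBoostClassifier()
        return model.load_model(path)
    raise ValueError(f"Unsupported file_type: {file_type}")
```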
### Output

Marks the final node whose result becomes the API response:

```python
@pipeline.node(output=True)
def output(calculate_premium: pl.DataFrame) -> pl.DataFrame:
    return calculate_premium
```
### Data Sink

Writes data to disk. Sinks are pass-through during normal runs — writing only happens when you click Write in the GUI.

```python
@pipeline.node(sink="output/frequency.parquet", format="parquet")
def frequency_write(frequency_set: pl.DataFrame) -> pl.DataFrame:
    return frequency_set
```
## GUI

The visual editor runs in your browser at `http://localhost:5173`.
| Area | Description |
|---|---|
| Left palette | Drag node types onto the canvas |
| Center canvas | Visual DAG with drag, zoom, connect |
| Right panel | Configure the selected node |
| Bottom panel | Data preview — click any node to see its output |
Nodes marked `deploy_input=True` show a green API badge. Toggle it on/off in the node's config panel.
### Code ↔ GUI sync

Everything round-trips:

- Edit in the GUI → saves back to `.py`
- Edit the `.py` in your text editor → the GUI picks it up on next load
- Custom imports, helper functions, and constants are preserved in both directions
## Pipeline Imports & Helpers

Every pipeline starts with `import polars as pl` and `import haute`. Add extra imports or helper functions via the Imports button (⚙) in the GUI toolbar, or write them directly in the `.py` file between the standard imports and the first `@pipeline.node`.

```python
import numpy as np
from catboost import CatBoostClassifier

DISCOUNT_RATE = 0.95

def apply_discount(df, col):
    return df.with_columns(pl.col(col) * DISCOUNT_RATE)
```
## CLI Reference

| Command | Description |
|---|---|
| `haute init` | Scaffold a new project in the current directory |
| `haute run [file]` | Execute a pipeline and print results |
| `haute serve` | Start the visual editor |
| `haute deploy [file]` | Deploy the pipeline as a live API |
| `haute deploy --dry-run` | Validate and score test quotes without deploying |
| `haute status [model]` | Check the status of a deployed model |
### haute serve options

| Flag | Default | Description |
|---|---|---|
| `--host` | `127.0.0.1` | Host to bind to |
| `--port` | `8000` | Backend API port |
| `--no-browser` | off | Don't auto-open the browser |
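For example, to serve on a different port without opening a browser:

```bash
haute serve --port 9000 --no-browser
```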
### haute deploy options

| Flag | Description |
|---|---|
| `--model-name` | Override the model name from `haute.toml` |
| `--dry-run` | Validate and score test quotes without deploying |
## Project Structure

After `haute init`, your project looks like:

```
haute.toml         ← project & deploy config (committed)
.env.example       ← Databricks credentials template (committed)
.env               ← actual credentials (gitignored)
.gitignore
main.py            ← pipeline code (source of truth)
main.haute.json    ← GUI layout state (node positions)
data/              ← data files (.parquet, .csv)
test_quotes/       ← JSON payloads for pre-deploy validation
  example.json
```

- `.py` files are the source of truth — diffable, reviewable, testable
- `.haute.json` files store GUI layout (node positions) — not execution logic
- `haute.toml` is the single config file for project settings and deployment
## Design Principles

- **Code is the source of truth** — the `.py` file is the pipeline. The GUI is a view.
- **Same pipeline, every context** — the same code runs for 1-row live quotes and million-row batch jobs.
- **Real Python, real Polars** — no proprietary formula language. Your skills transfer.
- **Git-native** — pipelines are plain files. Diff, review, branch, merge.
- **One-command deploy** — `haute deploy` handles pruning, bundling, validation, and MLflow registration.
- **Testable** — every node is a plain function. `pytest` just works (see the sketch below).
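As an illustration of the last point, a node such as `calculate_premium` from the Quick Start can be unit-tested directly. A minimal sketch, assuming the `@pipeline.node` decorator leaves the function callable as a plain Python function:

```python
# test_pipeline.py: a minimal sketch; assumes decorated nodes stay
# callable as plain functions, as the "Testable" principle suggests.
import polars as pl

from main import calculate_premium

def test_premium_is_frequency_times_500_rounded():
    frequency_model = pl.DataFrame({"freq_pred": [0.1234]})
    result = calculate_premium(frequency_model)
    assert result["premium"][0] == round(0.1234 * 500, 2)  # 61.7
```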
## Requirements

- Python >= 3.11
- For deployment: `pip install haute[databricks]` (adds MLflow + Databricks SDK)
- Works on Linux, macOS, and Windows
## License

MIT