onnx-web-optimizer

One-command ONNX model optimizer for web deployment.

Runs a three-step pipeline (graph simplification → FP16 quantization → ORT conversion) to produce a smaller, faster-loading model for use with onnxruntime-web.
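
The three steps can be approximated directly with the libraries this README names (onnxsim, onnxconverter-common, onnxruntime). The following is a minimal sketch, not the tool's actual implementation: the function names `step_outputs` and `optimize` are hypothetical, the file naming is modeled on the sample log in this README, and the exact name of the `.ort` file written by the onnxruntime converter script may vary between onnxruntime versions.

```python
import subprocess
import sys
from pathlib import Path


def step_outputs(model_path, out_dir):
    """Return the file each step is expected to write.

    Hypothetical naming, modeled on the sample log in this README
    (e.g. model_simplified_fp16.ort).
    """
    stem = Path(model_path).stem
    out = Path(out_dir)
    return {
        "simplify": out / f"{stem}_simplified.onnx",
        "fp16": out / f"{stem}_simplified_fp16.onnx",
        "ort": out / f"{stem}_simplified_fp16.ort",
    }


def optimize(model_path, out_dir):
    """Sketch of simplify -> FP16 -> ORT using the underlying libraries."""
    # Third-party imports stay local so step_outputs() works without them.
    import onnx
    from onnxsim import simplify
    from onnxconverter_common import float16

    paths = step_outputs(model_path, out_dir)
    Path(out_dir).mkdir(parents=True, exist_ok=True)

    # Step 1: graph simplification (removes redundant nodes).
    model = onnx.load(str(model_path))
    simplified, ok = simplify(model)
    if not ok:
        raise RuntimeError("simplified model failed onnxsim's validation check")
    onnx.save(simplified, str(paths["simplify"]))

    # Step 2: convert FP32 tensors to FP16.
    fp16_model = float16.convert_float_to_float16(simplified)
    onnx.save(fp16_model, str(paths["fp16"]))

    # Step 3: convert to ORT format with onnxruntime's bundled converter.
    # With default settings it writes its output next to the input model.
    subprocess.run(
        [sys.executable, "-m", "onnxruntime.tools.convert_onnx_models_to_ort",
         str(paths["fp16"])],
        check=True,
    )
    return paths["ort"]
```

Calling `optimize("model.onnx", "./output")` would need all three libraries installed; `step_outputs` on its own only computes paths and touches no files.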

Why

Loading a raw ONNX model in the browser is slow. This tool automates the steps most teams do manually:

Step	Tool	Typical Result
Graph simplification	`onnxsim`	Removes redundant nodes
FP16 quantization	`onnxconverter-common`	~50% size reduction
ORT format conversion	`onnxruntime`	Faster in-browser parse time

A 168 MB U2Net model becomes ~85 MB after the full pipeline.
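
The roughly 50% figure follows from storage width: an FP32 value takes 4 bytes and an FP16 value takes 2. As a rough sketch, assuming nearly all of the file is FP32 tensor data, the expected size can be estimated as below. The function name and the `float_fraction` parameter are hypothetical and only serve the illustration.

```python
def estimate_fp16_size_mb(fp32_size_mb, float_fraction=1.0):
    """Estimate file size after FP32 -> FP16 conversion.

    float_fraction is the assumed share of the file made up of FP32 tensor
    data; the remainder (graph structure, non-float tensors) is unchanged.
    """
    if not 0.0 <= float_fraction <= 1.0:
        raise ValueError("float_fraction must be between 0 and 1")
    float_part = fp32_size_mb * float_fraction
    other_part = fp32_size_mb - float_part
    # Only the float part is halved.
    return float_part / 2 + other_part


print(estimate_fp16_size_mb(168))  # 84.0
```

The estimate of 84 MB for a 168 MB input is in line with the ~85 MB reported above; the small difference is plausibly data that FP16 conversion does not shrink.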

Installation

pip install onnx-web-optimizer

Usage

# Full pipeline (recommended)
onnx-web-opt model.onnx -o ./output

# With verbose output
onnx-web-opt model.onnx -o ./output --verbose

# Skip individual steps
onnx-web-opt model.onnx --skip-simplify
onnx-web-opt model.onnx --skip-fp16
onnx-web-opt model.onnx --skip-ort

# Check version
onnx-web-opt --version
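
To call the optimizer from a Python build script, the command line can be assembled from the flags shown above. This is a sketch under assumptions: `build_command` and `run_optimizer` are hypothetical helpers, only flags documented in this README are used, and whether several `--skip-*` flags can be combined in one call is not confirmed here.

```python
import subprocess

# Maps a step name (hypothetical keys) to the documented CLI flag.
SKIP_FLAGS = {
    "simplify": "--skip-simplify",
    "fp16": "--skip-fp16",
    "ort": "--skip-ort",
}


def build_command(model, output_dir=None, verbose=False, skip=()):
    """Build the argv list for onnx-web-opt without running it."""
    cmd = ["onnx-web-opt", str(model)]
    if output_dir is not None:
        cmd += ["-o", str(output_dir)]
    if verbose:
        cmd.append("--verbose")
    for step in skip:
        cmd.append(SKIP_FLAGS[step])  # KeyError on an unknown step name
    return cmd


def run_optimizer(model, **kwargs):
    """Run the CLI; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_command(model, **kwargs), check=True)
```

For example, `build_command("model.onnx", output_dir="./output", verbose=True)` yields the same arguments as the verbose invocation shown above.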

Output

onnx-web-optimizer 0.1.0
Input  : /path/to/model.onnx
Output : /path/to/output
────────────────────────────────────────────────────
[1/3] Simplifying model graph...
✓ Simplification complete: output/model_simplified.onnx
[2/3] Quantizing to FP16...
✓ FP16 quantization complete: output/model_simplified_fp16.onnx
[3/3] Converting to .ort format...
✓ ORT conversion complete: output/model_simplified_fp16.ort
────────────────────────────────────────────────────
✅ Done. Output: output/model_simplified_fp16.ort

Loading the Optimized Model in the Browser

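import * as ort from 'onnxruntime-web';

// Fetch the optimized model and fail early on HTTP errors.
const resp = await fetch('https://your-cdn.com/model_simplified_fp16.ort');
if (!resp.ok) throw new Error(`Model download failed: HTTP ${resp.status}`);
const buffer = await resp.arrayBuffer(); // pass the model bytes (ArrayBuffer), not a ReadableStream

// Execution providers are tried in order: WebGPU first, WebAssembly as fallback.
const session = await ort.InferenceSession.create(buffer, {
  executionProviders: ['webgpu', 'wasm'],
  graphOptimizationLevel: 'all',
});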

Requirements

Python >= 3.8
onnxruntime >= 1.16.0

Real-World Usage

BulkPicTools uses onnx-web-optimizer as part of its AI model deployment pipeline.

BulkPicTools is a privacy-first, browser-based bulk image processing tool — all processing runs locally in the user's browser via WebAssembly. No images are ever uploaded to a server.

The AI-powered features (e.g. Background Remover) rely on ONNX models optimized with this tool before being served to end users.

Optimized Model Registry

Models currently optimized and deployed via this tool in production:

Model	Task	License	Original Source	Optimizations Applied
U2Net	Background Removal	Apache 2.0	xuebinqin/U-2-Net	onnxsim → FP16 → .ort

✅ All models are open-source with MIT or Apache 2.0 licenses.
🔒 Models are served via Cloudflare R2 and run entirely in the user's browser — no data is sent to any server.

Who Is This For?

If you are building a browser-based AI tool and need to:

Reduce model download size for end users
Speed up model initialization in the browser
Serve optimized models from a CDN (Cloudflare R2, S3, etc.)

...then this tool is for you.

License

Project details

This version

0.1.0

Apr 21, 2026

Download files

Source Distribution

onnx_web_optimizer-0.1.0.tar.gz (7.1 kB)

Uploaded Apr 21, 2026 Source

Built Distribution

onnx_web_optimizer-0.1.0-py3-none-any.whl (7.9 kB)

Uploaded Apr 21, 2026 Python 3

File details

Details for the file onnx_web_optimizer-0.1.0.tar.gz.

File metadata

Download URL: onnx_web_optimizer-0.1.0.tar.gz
Upload date: Apr 21, 2026
Size: 7.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.7

File hashes

Hashes for onnx_web_optimizer-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`b5a0e703d8aaeb6e97c20ceb951140f6868a2c6e38c54697f213274f776e9228`
MD5	`c9ff39fbde836eba5badd7a3d1ab846a`
BLAKE2b-256	`40740ee5cdb61ebc165ab893c8667023853cf4bef757d288e5a421f13ddbc7f4`

File details

Details for the file onnx_web_optimizer-0.1.0-py3-none-any.whl.

File metadata

Download URL: onnx_web_optimizer-0.1.0-py3-none-any.whl
Upload date: Apr 21, 2026
Size: 7.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.7

File hashes

Hashes for onnx_web_optimizer-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`69ddcbb3cc0b8a1d94b798d2e42db688ad6c45495cb377fc8d1d5208e8abd6aa`
MD5	`f82d7e55cc96bc1fdcd184fcd0b33dbe`
BLAKE2b-256	`f23b0f0d4ccf185c58a68b77df5a58ceb636d47b3e44dc00802780db953b0d83`
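
The SHA256 digests listed above can be checked locally after downloading a release file. The sketch below uses only the standard library; `sha256_of` and `verify` are hypothetical helper names, and the file is assumed to have been downloaded under its published name.

```python
import hashlib
from pathlib import Path

# SHA256 digests as published for release 0.1.0.
PUBLISHED_SHA256 = {
    "onnx_web_optimizer-0.1.0.tar.gz":
        "b5a0e703d8aaeb6e97c20ceb951140f6868a2c6e38c54697f213274f776e9228",
    "onnx_web_optimizer-0.1.0-py3-none-any.whl":
        "69ddcbb3cc0b8a1d94b798d2e42db688ad6c45495cb377fc8d1d5208e8abd6aa",
}


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=None):
    """Return True if the file's SHA256 matches the expected digest.

    When expected is omitted, the digest is looked up by file name in
    PUBLISHED_SHA256 (KeyError if the name is not listed).
    """
    if expected is None:
        expected = PUBLISHED_SHA256[Path(path).name]
    return sha256_of(path) == expected.lower()
```

For example, `verify("onnx_web_optimizer-0.1.0.tar.gz")` returns `True` only if the downloaded archive matches the published digest.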
