onnx-web-optimizer

One-command ONNX model optimizer for web deployment.

Runs a three-step pipeline — graph simplification → FP16 quantization → ORT conversion — to produce the smallest, fastest-loading model for use with onnxruntime-web.


Why

A raw ONNX model is large to download and slow to initialize in the browser. This tool automates the optimization steps most teams otherwise run by hand:

Step                    Tool                   Typical result
Graph simplification    onnxsim                Removes redundant nodes
FP16 quantization       onnxconverter-common   ~50% size reduction
ORT format conversion   onnxruntime            Faster in-browser parse time
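
For reference, the three steps in the table can be run by hand roughly as follows. This is a minimal sketch of the manual equivalent, not this tool's actual implementation. The function names `simplify_and_convert_fp16` and `ort_convert_command` are hypothetical, and `keep_io_types=True` is an assumed choice that keeps model inputs and outputs in FP32.

```python
import sys


def simplify_and_convert_fp16(src: str, dst: str) -> None:
    """Steps 1 and 2: simplify the graph, then cast FP32 tensors to FP16."""
    # Imported lazily so this file still loads where the packages are absent.
    import onnx
    from onnxsim import simplify
    from onnxconverter_common import float16

    model = onnx.load(src)
    simplified, ok = simplify(model)  # returns (model, validation flag)
    if not ok:
        raise RuntimeError("simplified model failed onnxsim's validation check")
    # Assumption: keep FP32 inputs/outputs so calling code needs no changes.
    half = float16.convert_float_to_float16(simplified, keep_io_types=True)
    onnx.save(half, dst)


def ort_convert_command(onnx_path: str) -> list:
    """Step 3: command line for onnxruntime's bundled ONNX-to-ORT converter."""
    return [sys.executable, "-m",
            "onnxruntime.tools.convert_onnx_models_to_ort", onnx_path]


if __name__ == "__main__" and len(sys.argv) == 3:
    import subprocess
    simplify_and_convert_fp16(sys.argv[1], sys.argv[2])
    subprocess.run(ort_convert_command(sys.argv[2]), check=True)
```

Running the script with an input path and an output path performs all three steps. The converter is invoked with its default settings here, so where the resulting .ort file lands depends on those defaults.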

A 168 MB U2Net model becomes ~85 MB after the full pipeline.
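
As a quick check of that figure, the reduction works out to just under half. That is what one would expect when most of the file is weights and each weight shrinks from 4 bytes (FP32) to 2 bytes (FP16). The helper name `size_reduction` is illustrative only.

```python
def size_reduction(before_mb: float, after_mb: float) -> float:
    """Return the fraction of the original size that was removed."""
    return (before_mb - after_mb) / before_mb


# Sizes quoted above for U2Net: 168 MB before, ~85 MB after.
reduction = size_reduction(168, 85)
print(f"{reduction:.1%}")  # 49.4%
```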


Installation

pip install onnx-web-optimizer

Usage

# Full pipeline (recommended)
onnx-web-opt model.onnx -o ./output

# With verbose output
onnx-web-opt model.onnx -o ./output --verbose

# Skip individual steps
onnx-web-opt model.onnx --skip-simplify
onnx-web-opt model.onnx --skip-fp16
onnx-web-opt model.onnx --skip-ort

# Check version
onnx-web-opt --version
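
To run the CLI over several models, the documented flags can be assembled programmatically. The sketch below is hypothetical glue code, not part of the package. The names `build_command` and `batch_commands` are made up for illustration, and it assumes that `-o` may be combined with the `--skip-*` flags and that more than one `--skip-*` flag may be passed at once.

```python
from pathlib import Path
from typing import List, Sequence

ALLOWED_SKIPS = {"simplify", "fp16", "ort"}  # the three documented --skip-* flags


def build_command(model: Path, out_dir: Path, *, verbose: bool = False,
                  skip: Sequence[str] = ()) -> List[str]:
    """Build one onnx-web-opt invocation using only the documented flags."""
    unknown = set(skip) - ALLOWED_SKIPS
    if unknown:
        raise ValueError(f"unknown steps: {sorted(unknown)}")
    cmd = ["onnx-web-opt", str(model), "-o", str(out_dir)]
    if verbose:
        cmd.append("--verbose")
    cmd += [f"--skip-{step}" for step in skip]
    return cmd


def batch_commands(models_dir: Path, out_dir: Path) -> List[List[str]]:
    """One full-pipeline command per .onnx file in models_dir, sorted by name."""
    return [build_command(p, out_dir) for p in sorted(models_dir.glob("*.onnx"))]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` so that a failing model stops the batch.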

Output

onnx-web-optimizer 0.1.0
Input  : /path/to/model.onnx
Output : /path/to/output
────────────────────────────────────────────────────
[1/3] Simplifying model graph...
✓ Simplification complete: output/model_simplified.onnx
[2/3] Quantizing to FP16...
✓ FP16 quantization complete: output/model_simplified_fp16.onnx
[3/3] Converting to .ort format...
✓ ORT conversion complete: output/model_simplified_fp16.ort
────────────────────────────────────────────────────
✅ Done. Output: output/model_simplified_fp16.ort
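
The log above suggests that each step appends a suffix to the input file's stem. A deployment script could predict the artifact paths along these lines. This is a sketch that assumes the naming pattern generalizes beyond this one log; `expected_outputs` is a hypothetical helper, and file names when a step is skipped are not documented, so they are not modeled.

```python
from pathlib import Path
from typing import List


def expected_outputs(model: str, out_dir: str) -> List[Path]:
    """Predict full-pipeline artifact paths, in the order they are written."""
    stem = Path(model).stem  # "model" for "/path/to/model.onnx"
    out = Path(out_dir)
    return [
        out / f"{stem}_simplified.onnx",       # after step 1
        out / f"{stem}_simplified_fp16.onnx",  # after step 2
        out / f"{stem}_simplified_fp16.ort",   # after step 3, the file to deploy
    ]
```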

Loading the Optimized Model in the Browser

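import * as ort from 'onnxruntime-web';

// Fetch the optimized model produced by the pipeline.
const resp = await fetch('https://your-cdn.com/model_simplified_fp16.ort');
if (!resp.ok) {
  throw new Error(`Model download failed: HTTP ${resp.status}`);
}
const buffer = await resp.arrayBuffer(); // must be ArrayBuffer, not a ReadableStream

// Providers are tried in order: WebGPU first, falling back to WASM.
const session = await ort.InferenceSession.create(buffer, {
  executionProviders: ['webgpu', 'wasm'],
  graphOptimizationLevel: 'all',
});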

Requirements

  • Python >= 3.8
  • onnxruntime >= 1.16.0
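
A preflight check for these two requirements might look like the sketch below. It is not shipped with the package, and the helper names `version_tuple` and `check_requirements` are hypothetical. It compares only the numeric part of the version, so pre-release suffixes such as `rc1` are ignored, and it looks up the `onnxruntime` distribution only.

```python
import sys


def version_tuple(version: str) -> tuple:
    """Parse leading numeric components: '1.16.3' -> (1, 16, 3), '1.16' -> (1, 16, 0)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:  # pad so that '1.16' compares equal to '1.16.0'
        parts.append(0)
    return tuple(parts)


def check_requirements() -> list:
    """Return a list of human-readable problems; an empty list means all good."""
    if sys.version_info < (3, 8):
        return ["Python >= 3.8 is required"]
    from importlib import metadata  # in the standard library since Python 3.8

    try:
        installed = metadata.version("onnxruntime")
    except metadata.PackageNotFoundError:
        return ["onnxruntime is not installed"]
    if version_tuple(installed) < (1, 16, 0):
        return [f"onnxruntime >= 1.16.0 is required, found {installed}"]
    return []
```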

Real-World Usage

BulkPicTools uses onnx-web-optimizer as part of its AI model deployment pipeline.

BulkPicTools is a privacy-first, browser-based bulk image processing tool — all processing runs locally in the user's browser via WebAssembly. No images are ever uploaded to a server.

The AI-powered features (e.g. Background Remover) rely on ONNX models optimized with this tool before being served to end users.

Optimized Model Registry

Models currently optimized and deployed via this tool in production:

Model   Task                 License      Original source     Optimizations applied
U2Net   Background Removal   Apache 2.0   xuebinqin/U-2-Net   onnxsim → FP16 → .ort

✅ All models are open source, under MIT or Apache 2.0 licenses.
🔒 Models are served via Cloudflare R2 and run entirely in the user's browser; no user data is sent to any server.


Who Is This For?

If you are building a browser-based AI tool and need to:

  • Reduce model download size for end users
  • Speed up model initialization in the browser
  • Serve optimized models from a CDN (Cloudflare R2, S3, etc.)

...then this tool is for you.


License

MIT © 2026 kbmjj123
