Flouds Model Exporter: ONNX export and optimization utilities

Maintainers

gmalakar

flouds_model_exporter

Production-grade ONNX model export toolkit for Hugging Face Transformers models.

Overview

flouds_model_exporter provides a unified pipeline for converting Hugging Face models to optimized ONNX format:

Universal Export - Supports embedding models, seq2seq, classification, and large language models (LLMs)
Smart Optimization - Automatic ONNX optimization with configurable levels and portability modes
Robust Validation - Numeric verification ensuring export accuracy before deployment
Large Model Support - External-data format, subprocess isolation, and memory management for multi-GB models
Batch Orchestration - Python-native batch subcommand with YAML-driven presets for automated multi-model export workflows
Fallback Strategies - Opset retry, explicit remote-code trust handling, and export error recovery

Quick Start

Installation

From PyPI (recommended)

pip install flouds-model-exporter

From source

# Clone the repository
git clone https://github.com/gmalakar/flouds_model_exporter.git
cd flouds_model_exporter

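# Create and activate a Python 3.12 virtual environment (3.11 is also supported)
# Windows PowerShell:
py -3.12 -m venv .venv
.\.venv\Scripts\Activate.ps1
# Linux/macOS equivalent:
#   python3.12 -m venv .venv
#   source .venv/bin/activate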

# Install the package and all dependencies
pip install -e .

# (Optional) Install developer tooling
pip install -e ".[dev]"

After installation the CLI entry point is available:

flouds-export export --help

Environment Variables

Use environment variables to control default output location and Hugging Face authentication.

ONNX_PATH

ONNX_PATH sets the default ONNX output root used by export workflows.

Windows PowerShell (current session):

$Env:ONNX_PATH = "C:\path\to\onnx\models"

Linux/macOS (current shell):

export ONNX_PATH="/path/to/onnx/models"

HUGGINGFACE_TOKEN

HUGGINGFACE_TOKEN provides an access token for private/gated Hugging Face model downloads.

Windows PowerShell (current session):

$Env:HUGGINGFACE_TOKEN = "hf_xxx_your_token"

Linux/macOS (current shell):

export HUGGINGFACE_TOKEN="hf_xxx_your_token"

You can also pass a token directly per command with --hf-token.

Persisting Variables

Windows (future terminals):

setx ONNX_PATH "C:\path\to\onnx\models"
setx HUGGINGFACE_TOKEN "hf_xxx_your_token"

Linux/macOS (bash profile shown; for zsh, append to ~/.zshrc instead):

echo 'export ONNX_PATH="/path/to/onnx/models"' >> ~/.bashrc
echo 'export HUGGINGFACE_TOKEN="hf_xxx_your_token"' >> ~/.bashrc

Verify Values

Windows PowerShell:

echo $Env:ONNX_PATH
echo $Env:HUGGINGFACE_TOKEN

Linux/macOS:

echo "$ONNX_PATH"
echo "$HUGGINGFACE_TOKEN"

Security note: never commit real tokens to source control. Rotate any exposed token immediately.
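
Echoing the token leaves the full value in terminal scrollback. A minimal Python sketch of a gentler check follows: it reports whether each variable is set and masks the token. The helper name describe_env and the masking format are illustrative assumptions, not part of flouds_model_exporter.

```python
import os


def describe_env(env=None):
    """Summarize ONNX_PATH and HUGGINGFACE_TOKEN without revealing the token."""
    env = os.environ if env is None else env
    onnx_path = env.get("ONNX_PATH")
    token = env.get("HUGGINGFACE_TOKEN")
    # Show only the first three characters of the token, never the full value.
    masked = f"{token[:3]}***" if token else "(not set)"
    return {
        "ONNX_PATH": onnx_path or "(not set)",
        "HUGGINGFACE_TOKEN": masked,
    }


if __name__ == "__main__":
    for name, value in describe_env().items():
        print(f"{name} = {value}")
```

Passing a plain dict as env makes the helper easy to try without touching the real environment.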

Export a Model

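The examples below use PowerShell backtick (`) line continuations; in bash or zsh, end each continued line with a backslash (\) instead.

Embedding model (Feature Extraction):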

flouds-export export `
  --model-name sentence-transformers/all-MiniLM-L6-v2 `
  --model-for fe `
  --task feature-extraction `
  --optimize

Seq2seq model (T5, BART):

flouds-export export `
  --model-name t5-small `
  --model-for s2s `
  --task seq2seq-lm `
  --optimize

Ranker model (Cross-Encoder):

flouds-export export `
  --model-name cross-encoder/ms-marco-MiniLM-L-12-v2 `
  --model-for ranker `
  --task sequence-classification `
  --optimize

Large Language Model (with KV-cache):

flouds-export export `
  --model-name deepseek-ai/deepseek-coder-1.3b-instruct `
  --model-for llm `
  --task text-generation-with-past `
  --use-external-data-format `
  --use-sub-process `
  --use-fallback-if-failed `
  --optimize `
  --merge
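
The commands above rely on shell-specific line continuations. One way to sidestep that is to assemble the argument list in Python and hand it to subprocess. The sketch below only builds and prints the command; build_export_command is a hypothetical helper, and the flags it emits are the ones shown in the examples above.

```python
import shlex


def build_export_command(model_name, model_for, task, *flags):
    """Return the argv list for a `flouds-export export` call."""
    return [
        "flouds-export", "export",
        "--model-name", model_name,
        "--model-for", model_for,
        "--task", task,
        *flags,
    ]


cmd = build_export_command(
    "sentence-transformers/all-MiniLM-L6-v2", "fe", "feature-extraction", "--optimize"
)
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Because the command is a list rather than a single string, it behaves the same on Windows, Linux, and macOS without any quoting changes.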

Batch Export

Export all configured models with optimizations:

flouds-export batch --preset recommended --optimize --cleanup --portable

Wrapper script reference: see docs/WRAPPER_SCRIPTS.md for complete parameter documentation for run_exports.ps1 and run_exports.sh.

Windows users can still use .\run_exports.ps1, which forwards to the Python CLI batch subcommand. Batch presets are loaded from src/model_exporter/config/policy.yaml, and you can point to a custom YAML file with --config.

Linux/macOS users can use ./run_exports.sh with the same batch concepts:

chmod +x ./run_exports.sh
./run_exports.sh --config ./docs/batch_presets_example.yaml --preset text-import --fail-fast

Note: the --suppress-warning option has been removed from both the wrapper scripts and the CLI. By default the exporter logs to the terminal only. To also write a per-export log file that captures stdout and stderr, pass --log-to-file (or -LogToFile for the PowerShell wrapper).

Batch Examples (YAML and Text File)

YAML preset example file:

docs/batch_presets_example.yaml

Run using YAML preset:

.\run_exports.ps1 -Config .\docs\batch_presets_example.yaml -Preset text-import -FailFast

Text command list example file:

docs/batch_commands.txt

Run using text file import:

.\run_exports.ps1 -TextFile .\docs\batch_commands.txt -Preset text-import -FailFast

Note: text file entries must use the new hyphenated CLI flags, such as --opset-version; underscored flag names are rejected.
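
Since underscored flags are rejected, it can help to scan a command list before running a batch. The following is a rough sketch, assuming each line of the text file holds one set of CLI arguments and that blank lines and lines starting with # can be skipped; the real file format may differ, and find_underscored_flags is a hypothetical helper.

```python
import shlex


def find_underscored_flags(lines):
    """Return (line_number, flag) pairs for long flags that contain an underscore."""
    problems = []
    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # assumption: blank lines and '#' comments are ignorable
        for token in shlex.split(stripped):
            flag = token.split("=", 1)[0]  # also handles --flag=value
            if flag.startswith("--") and "_" in flag:
                problems.append((number, flag))
    return problems


sample = [
    "--model-name t5-small --model-for s2s --opset-version 17",
    "--model-name my_org/my_model --opset_version 17",
]
print(find_underscored_flags(sample))
```

Only tokens that start with -- are inspected, so underscores inside model names or paths are not flagged.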

Validate An Export

Validate an exported ONNX model against its reference Hugging Face model:

flouds-export validate --model-dir onnx/models/fe/all-MiniLM-L6-v2 --reference-model sentence-transformers/all-MiniLM-L6-v2 --normalize-embeddings
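
To give a feel for what a numeric check with --normalize-embeddings involves, here is a small pure-Python sketch that compares two embedding vectors with and without L2 normalization. The vectors, the tolerance, and the function names are illustrative assumptions; the validator's actual metric and threshold are not documented here.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit L2 norm; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return list(vector) if norm == 0 else [x / norm for x in vector]


def max_abs_diff(reference, candidate, normalize=False):
    """Largest element-wise absolute difference between two equal-length vectors."""
    if len(reference) != len(candidate):
        raise ValueError("vectors must have the same length")
    if normalize:
        reference, candidate = l2_normalize(reference), l2_normalize(candidate)
    return max(abs(r - c) for r, c in zip(reference, candidate))


# Hypothetical outputs: same direction, different scale.
reference = [3.0, 4.0]
candidate = [6.0, 8.0]
TOLERANCE = 1e-4  # illustrative threshold, not the tool's actual setting
print(max_abs_diff(reference, candidate) <= TOLERANCE)                  # False
print(max_abs_diff(reference, candidate, normalize=True) <= TOLERANCE)  # True
```

The example shows why normalization matters for embedding models: two outputs that differ only in scale fail a raw comparison but pass once both are normalized.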

Optimize Existing Exported Models

Run the shared optimizer service against an already-exported ONNX directory:

flouds-export optimize --model-dir onnx/models/fe/all-MiniLM-L6-v2 --model-for fe --optimization-level 2 --portable

Python API

After installing the package you can call the exporter directly from Python without using the CLI.

Basic usage

If ONNX_PATH is set, you can omit onnx_path and the exporter will use it automatically:

import os
os.environ["ONNX_PATH"] = "/path/to/onnx/models"  # or set it before launching Python

from model_exporter.export.pipeline import export

output_dir = export(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    model_for="fe",
    task="feature-extraction",
    optimize=True,
    # onnx_path not needed; picked up from ONNX_PATH env var
)
print(f"Exported to: {output_dir}")

Or pass onnx_path explicitly to override the environment variable:

output_dir = export(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    model_for="fe",
    task="feature-extraction",
    onnx_path="./custom/onnx",  # overrides ONNX_PATH
    optimize=True,
)
print(f"Exported to: {output_dir}")
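
The precedence described in this section (an explicit onnx_path wins over ONNX_PATH, with "onnx" listed as the default in the API reference) can be pictured with a few lines of Python. This is a hypothetical helper for illustration only; the package's internal resolution logic may differ.

```python
import os


def resolve_onnx_path(onnx_path=None, env=None, default="onnx"):
    """Pick the output directory: explicit argument, then ONNX_PATH, then default."""
    env = os.environ if env is None else env
    if onnx_path:
        return onnx_path
    return env.get("ONNX_PATH") or default


print(resolve_onnx_path("./custom/onnx", env={"ONNX_PATH": "/data/onnx"}))
print(resolve_onnx_path(env={"ONNX_PATH": "/data/onnx"}))
print(resolve_onnx_path(env={}))
```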

Seq2seq (T5, BART)

export(
    model_name="t5-small",
    model_for="s2s",
    task="seq2seq-lm",
    optimize=True,
)

Large model with subprocess isolation

export(
    model_name="meta-llama/Llama-2-7b-hf",
    model_for="llm",
    task="text-generation-with-past",
    use_external_data_format=True,
    use_subprocess=True,
    use_fallback_if_failed=True,
    merge=True,
    hf_token="hf_xxx_your_token",  # for gated models
)

API reference

Parameter	Type	Default	Description
`model_name`	`str`	required	HuggingFace model ID or local path
`model_for`	`str`	`"fe"`	`fe`, `s2s`, `sc`, `llm`, `ranker`
`task`	`str`	`None`	e.g. `feature-extraction`, `seq2seq-lm`, `sequence-classification`
`onnx_path`	`str`	`"onnx"`	Output directory; if omitted, `ONNX_PATH` is used when set
`optimize`	`bool`	`False`	Run ONNX optimizer after export
`optimization_level`	`int`	`99`	ORT optimization level (`0`, `1`, `2`, or `99`)
`opset_version`	`int`	auto	ONNX opset version
`device`	`str`	`"cpu"`	`cpu` or `cuda`
`framework`	`str`	`None`	`pt` or `tf`
`trust_remote_code`	`bool`	`False`	Allow custom model code
`use_external_data_format`	`bool`	`False`	Split model for >2GB exports
`use_subprocess`	`bool`	`None`	Run export in isolated subprocess
`use_fallback_if_failed`	`bool`	`False`	Enable legacy fallback only if primary export fails
`merge`	`bool`	`False`	Merge decoder artifacts (LLMs)
`pack_single_file`	`bool`	`False`	Repack external-data into single file
`normalize_embeddings`	`bool`	`False`	L2-normalize before validation
`skip_validator`	`bool`	`False`	Skip numeric validation
`require_validator`	`bool`	`False`	Fail if validation cannot run
`quantize`	any	`False`	Quantization configuration
`hf_token`	`str`	`None`	HuggingFace auth token (via `**kwargs`)

CLI Reference

Core Parameters

Parameter	Values	Description
`--model-name`	`str`	HuggingFace model ID or local path
`--model-for`	`fe`, `s2s`, `sc`, `ranker`, `llm`	Model type: embedding, seq2seq, classification, ranker (cross-encoder), or language model
`--task`	`str`	Export task: `feature-extraction`, `seq2seq-lm`, `sequence-classification`, `text-generation-with-past`, etc.
`--framework`	`pt`, `tf`	Framework: PyTorch or TensorFlow
`--device`	`cpu`, `cuda`	Target device
`--opset-version`	`11`, `14`, `17`, `18`	ONNX opset version (default: 17)
`--trust-remote-code`	flag	Allow custom model code execution

Export Configuration

Parameter	Default	Description
`--framework`	`pt`	Framework: `pt` (PyTorch) or `tf` (TensorFlow)
`--device`	`cpu`	Target device: `cpu` or `cuda`
`--opset-version`	`17`	ONNX opset version (accepted values are listed under Core Parameters)
`--trust-remote-code`	`false`	Allow custom model code execution. Review the model code first.
`--force`	`false`	Overwrite existing exports

Optimization & Validation

Parameter	Description
`--optimize`	Enable post-export ONNX optimization
`--optimization-level`	Optimization level: `0`, `1`, `2`, or `99`. Requires `--optimize`; default when optimizing is 99
`--portable`	Use conservative optimizations for cross-platform compatibility. Requires `--optimize`
`--skip-validator`	Skip numeric validation
`--require-validator`	Fail build if validation fails
`--normalize-embeddings`	L2-normalize embeddings during validation

--skip-validator cannot be combined with --require-validator or --normalize-embeddings.

The standalone optimize subcommand accepts --model-dir, --model-for, --optimization-level, and --portable so you can re-run optimization without repeating export. The batch subcommand also accepts --optimization-level as a global override for all batch exports. Batch global overrides follow the same dependency rules: --optimization-level and --portable require --optimize, --prune-canonical requires --cleanup, and --no-local-prep must be set per LLM entry instead of globally.

Large Model Options

Parameter	Description
`--use-external-data-format`	Split model into .onnx + .onnx_data files (for >2GB models)
`--use-sub-process`	Run export in isolated subprocess (safer for large models)
`--use-fallback-if-failed`	Enable legacy fallback exporter only if primary export fails
`--no-post-process`	Skip ONNX post-processing (reduces memory usage)
`--low-memory-env`	Apply low-memory export settings; implies external data format and disabled post-processing
`--pack-single-file`	Repack external-data model into single file during validation

Do not combine --low-memory-env with --use-external-data-format or --no-post-process; those settings are already implied.

Advanced Options

Parameter	Description
`--merge`	Merge decoder artifacts for LLMs (with-past only). Requires `--model-for llm`
`--no-local-prep`	Skip local model preparation for LLMs. Requires `--model-for llm`
`--cleanup`	Remove temporary/extraneous files post-export
`--prune-canonical`	Remove canonical models when merged version exists. Requires `--cleanup`
`--hf-token`	HuggingFace API token for private models
`--onnx-path`	Custom output directory (default: `./onnx`)
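
The dependency and exclusion rules stated across the CLI reference tables can be gathered into one pre-flight check. The sketch below encodes only the rules written in this document; check_flag_rules is a hypothetical helper and does not reproduce the CLI's own error messages.

```python
def check_flag_rules(flags, model_for=None):
    """Return a list of rule violations for a set of CLI flags."""
    flags = set(flags)
    errors = []

    # Flags that only make sense together with another flag.
    requires = {
        "--optimization-level": "--optimize",
        "--portable": "--optimize",
        "--prune-canonical": "--cleanup",
    }
    for flag, needed in requires.items():
        if flag in flags and needed not in flags:
            errors.append(f"{flag} requires {needed}")

    # Flags that are limited to LLM exports.
    for flag in ("--merge", "--no-local-prep"):
        if flag in flags and model_for != "llm":
            errors.append(f"{flag} requires --model-for llm")

    # Combinations the documentation says must not be used together.
    conflicts = [
        ("--skip-validator", "--require-validator"),
        ("--skip-validator", "--normalize-embeddings"),
        ("--low-memory-env", "--use-external-data-format"),
        ("--low-memory-env", "--no-post-process"),
    ]
    for first, second in conflicts:
        if first in flags and second in flags:
            errors.append(f"{first} cannot be combined with {second}")

    return errors


print(check_flag_rules({"--portable", "--merge"}, model_for="fe"))
print(check_flag_rules({"--optimize", "--portable", "--merge"}, model_for="llm"))
```

An empty list means none of the documented rules were violated.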

Output Structure

Exported models are organized by type and name:

onnx/models/
+-- fe/                              # Feature extraction (embeddings)
|   +-- all-MiniLM-L6-v2/
|   |   +-- model.onnx
|   +-- bge-small-en-v1.5/
|       +-- model.onnx
|       +-- model.onnx_data          # External data (if >2GB)
+-- s2s/                             # Seq2seq models
|   +-- t5-small/
|   |   +-- encoder_model.onnx
|   |   +-- decoder_model.onnx
|   |   +-- decoder_with_past_model.onnx
|   +-- bart-large-cnn/
+-- llm/                             # Large language models
    +-- deepseek-coder-1.3b-instruct/
    |   +-- model.onnx
    |   +-- model.onnx_data
    |   +-- model_merged.onnx        # Merged version (if --merge used)
    +-- phi-3-mini-4k-instruct/
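
A short Python sketch for listing what an export tree contains, assuming the <type>/<model>/<file>.onnx layout shown above. The inventory function is a hypothetical helper; the demo builds a throwaway directory with empty placeholder files rather than real models.

```python
import tempfile
from pathlib import Path


def inventory(models_root):
    """Map 'type/model' to the sorted .onnx file names found beneath it."""
    root = Path(models_root)
    found = {}
    for onnx_file in sorted(root.glob("*/*/*.onnx")):
        model_type, model_name = onnx_file.relative_to(root).parts[:2]
        found.setdefault(f"{model_type}/{model_name}", []).append(onnx_file.name)
    return found


# Demo on a temporary tree that mimics the documented layout.
with tempfile.TemporaryDirectory() as tmp:
    for rel in (
        "fe/all-MiniLM-L6-v2/model.onnx",
        "s2s/t5-small/encoder_model.onnx",
        "s2s/t5-small/decoder_model.onnx",
        "llm/demo/model.onnx_data",  # external data files are not listed
    ):
        path = Path(tmp, rel)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    result = inventory(tmp)

print(result)
```

Pointing inventory at a real onnx/models directory gives a quick overview of which models have been exported.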

Architecture

Directory Structure

src/model_exporter/
+-- cli/                            # CLI entrypoints and subcommands
+-- config/                         # Logging and batch policy
+-- export/                         # Export pipeline, helpers, optimizer, subprocess runner
+-- utils/                          # Diagnostics and helper utilities
+-- validation/                     # Structural and numeric validation

Export Pipeline

1. Preparation - Token setup, model validation, output directory creation
2. Export - optimum.exporters.onnx.main_export with fallback strategies
3. Validation - Structural checks + numeric validation (input/output comparison)
4. Optimization - ONNX Runtime optimization passes (optional)
5. Cleanup - Remove temporary files, prune redundant artifacts

Memory Management

Subprocess Isolation

For large models, use subprocess isolation to prevent parent process crashes:

flouds-export export `
  --model-name meta-llama/Llama-2-7b-hf `
  --use-sub-process `
  --use-fallback-if-failed `
  --use-external-data-format

Batch Export Memory Monitoring

The batch subcommand monitors available RAM before each export:

# Require at least 4GB free RAM before each export
flouds-export batch --preset recommended --min-free-memory-gb 4
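
The idea behind a free-memory floor can be sketched in a few lines of Python. This document does not say whether the batch subcommand waits, skips, or aborts when memory is low, so the sketch simply separates models that may start from those deferred. The function names are hypothetical, and the probe is injected so the example has no third-party dependency.

```python
def enough_free_memory(available_bytes, min_free_gb):
    """True when available RAM meets the floor (1 GB is taken as 1024**3 bytes)."""
    return available_bytes >= min_free_gb * 1024**3


def plan_exports(model_names, probe, min_free_gb):
    """Split models into those that may start and those deferred for low memory.

    `probe` is any zero-argument callable returning available bytes, for example
    `lambda: psutil.virtual_memory().available` if psutil is installed.
    """
    started, deferred = [], []
    for name in model_names:
        target = started if enough_free_memory(probe(), min_free_gb) else deferred
        target.append(name)
    return started, deferred


# Simulated readings: 8 GB free before the first export, 2 GB before the second.
readings = iter([8 * 1024**3, 2 * 1024**3])
print(plan_exports(["t5-small", "gpt2-large"], lambda: next(readings), min_free_gb=4))
```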

Config-Driven Batch Workflow

The batch runner loads presets from YAML:

flouds-export batch --config src/model_exporter/config/policy.yaml --preset recommended

Each preset entry maps directly to export CLI arguments, which makes export pipelines deterministic and versionable.
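
To illustrate that mapping, the sketch below turns a preset entry into export arguments. The schema of policy.yaml is not shown in this document, so the entry here is a hypothetical dict keyed by underscored option names. The override table covers the one spelling that differs from a plain underscore-to-hyphen conversion (use_subprocess in the Python API versus --use-sub-process on the CLI).

```python
# CLI spellings that differ from a plain underscore-to-hyphen conversion.
FLAG_OVERRIDES = {"use_subprocess": "--use-sub-process"}


def entry_to_argv(entry):
    """Convert a preset entry (a dict) into arguments for `flouds-export export`."""
    argv = ["flouds-export", "export"]
    for key, value in entry.items():
        flag = FLAG_OVERRIDES.get(key, "--" + key.replace("_", "-"))
        if value is True:
            argv.append(flag)  # boolean switch: emit the flag alone
        elif value is False or value is None:
            continue  # switch is off: omit the flag
        else:
            argv.extend([flag, str(value)])
    return argv


entry = {
    "model_name": "t5-small",
    "model_for": "s2s",
    "task": "seq2seq-lm",
    "optimize": True,
    "merge": False,
    "use_subprocess": True,
}
print(entry_to_argv(entry))
```

Keeping entries as plain data like this is what makes a batch run reproducible: the same file always expands to the same commands.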

Large Model Best Practices

For models >2GB:

Enable external data format - Splits model into .onnx + .onnx_data
Use subprocess isolation - Prevents memory leaks affecting subsequent exports
Skip post-processing - Reduces peak memory during export
Lower opset version - Simplifies optimization (try opset 11)

flouds-export export `
  --model-name gpt2-large `
  --use-external-data-format `
  --use-sub-process `
  --use-fallback-if-failed `
  --no-post-process `
  --opset-version 11
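
As a rough illustration of the practices above, the following sketch suggests flags from an estimated model size. The threshold follows the 2 GiB figure used in this document; the function name, the low_memory_host switch, and the decision logic are assumptions for illustration rather than anything the exporter does automatically.

```python
TWO_GIB = 2 * 1024**3  # size limit referenced in the Troubleshooting table


def suggested_flags(model_size_bytes, low_memory_host=False):
    """Suggest large-model CLI flags from an estimated model size."""
    flags = []
    if model_size_bytes > TWO_GIB:
        # Split weights into external data and isolate the export process.
        flags += ["--use-external-data-format", "--use-sub-process"]
        if low_memory_host:
            # Trade post-processing for lower peak memory.
            flags.append("--no-post-process")
    return flags


print(suggested_flags(3 * 1024**3, low_memory_host=True))
print(suggested_flags(500 * 1024**2))
```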

Troubleshooting

Issue	Solution
`ModuleNotFoundError: optimum`	Install runtime dependencies: `pip install -r requirements-prod.txt`
`MemoryError` or OOM crashes	Use `--use-sub-process` and `--use-external-data-format`; reduce `--optimization-level`
Primary export fails on edge models	Retry with `--use-fallback-if-failed` to enable legacy fallback path
`RuntimeError: > 2GiB protobuf`	Enable `--use-external-data-format`
`ValueError: Unsupported opset`	Lower `--opset-version` to 14 or 11
`TracerWarning: Converting tensor`	Model tracing limitation (usually safe to ignore)
Validation failures	Check numeric precision; try `--skip-validator` for known issues
`trust_remote_code required`	Add `--trust-remote-code` flag (review model code first)

Export Logs

By default, export logs go to the terminal only. When --log-to-file or log_to_file=True is set, per-model timestamped logs are written to logs/onnx_exports/ under the current working directory. Set FLOUDS_LOG_DIR to write file logs somewhere else.
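
The directory rule above can be expressed as a small helper. In this sketch only the directory resolution (FLOUDS_LOG_DIR, else logs/onnx_exports/ under the working directory) comes from this document; the file-name pattern and the log_file_for name are assumptions made for illustration.

```python
import os
from datetime import datetime
from pathlib import Path


def log_file_for(model_name, env=None, cwd=None, now=None):
    """Build a per-model, timestamped log path under the resolved log directory."""
    env = os.environ if env is None else env
    base = Path(cwd) if cwd is not None else Path.cwd()
    if env.get("FLOUDS_LOG_DIR"):
        log_dir = Path(env["FLOUDS_LOG_DIR"])
    else:
        log_dir = base / "logs" / "onnx_exports"
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    safe_name = model_name.replace("/", "_")  # keep the file name flat
    return log_dir / f"{safe_name}_{stamp}.log"


print(log_file_for("sentence-transformers/all-MiniLM-L6-v2", env={}))
```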

Requirements

Python: 3.11 or 3.12
System: 8GB+ RAM (16GB+ for large models)
Dependencies: See requirements-prod.txt (runtime) and requirements-dev.txt (development)

Contributing

See CONTRIBUTING.md for contribution workflow and local development checks. For expected behavior and standards, see CODE_OF_CONDUCT.md and SECURITY.md. Maintainer release steps are documented in docs/RELEASE_PROCESS.md.

License

Licensed under the Apache License, Version 2.0 (Apache-2.0). See LICENSE for details.

Release history

0.2.0 - Jun 9, 2026 (this version)
0.1.0 - Apr 19, 2026
