A lightweight library of verified inference pipelines for HuggingFace models.

Bolo: Curated, Verified, and Ready-to-run Inference Pipelines for HuggingFace Models

Bolo is a lightweight Python library that gives you curated, verified, and ready-to-run inference pipelines for HuggingFace models with zero effort.

Development

  • Every issue must carry a label.
  • A bug issue is closed only after the fix has been released on PyPI.
  • For bug issues, refer to the tracker issue for progress.
  • For TODO issues, refer to the Notion sprint for progress (each TODO issue lists its specific sprint page).

📦 Curated templates: every supported model ships with a tested Jinja2 inference template maintained by the Illinois CreateLab team.

🔒 Isolated venvs: each model runs inside its own uv-managed virtual environment, so dependency conflicts between models are impossible.

One-call API: bolo.pipeline(repo_id, device="cuda", ...) runs inference in a single call, which is simpler than the two-stage HuggingFace pipeline API.

🖥️ CLI included: a bolo command lets you manage venvs and run inference directly from the shell.

🌐 Auto-fetching templates: template bundles are downloaded on first use from the bolo-templates GitHub release and cached locally — no manual setup needed.

Quick demo

Install bolo:

pip install pybolo

Run inference with two lines of Python:

import bolo

result = bolo.pipeline("<REPO_ID>", device="cuda:0")
print(result)

Before running inference you can inspect what parameters the template accepts:

bolo.list_params("<REPO_ID>")

And manage venvs explicitly:

python_bin = bolo.create_a_venv("<REPO_ID>")
# ... activate the created venv and do inference ...
bolo.remove_venv("<REPO_ID>")
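
The snippet above leaves the activation step open. One way to use a venv without activating it is to call its interpreter directly. The sketch below assumes that create_a_venv returns the path to the venv's Python interpreter (the name python_bin suggests this, but it is not confirmed here). run_in_venv is a hypothetical helper, and sys.executable stands in for the returned path so that the example runs without bolo installed.

```python
import subprocess
import sys


def run_in_venv(python_bin: str, code: str) -> str:
    """Run a snippet with the given interpreter and return its stdout."""
    completed = subprocess.run(
        [python_bin, "-c", code],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the snippet fails
    )
    return completed.stdout.strip()


# Assumption: in real use this would be the value returned by
# bolo.create_a_venv("<REPO_ID>"); sys.executable is only a stand-in.
python_bin = sys.executable
output = run_in_venv(python_bin, "print('hello from the venv')")
print(output)
```

Calling the interpreter by path keeps the parent process's environment untouched, which is usually easier to script than sourcing an activate file.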

CLI

The bolo command mirrors the Python API from your shell.

Create an isolated venv for a model:

bolo create-venv <REPO_ID>
bolo create-venv <REPO_ID> --venv-path /path/to/venv
# then activate the venv
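
For reference, the recipe described under "How does bolo work?" (uv venv followed by uv pip install -r requirements.txt) can be approximated by hand. This is a sketch, not bolo's actual implementation: build_venv_commands is a hypothetical helper, the paths are placeholders, and the bin/python location assumes a Linux or macOS venv layout. The function only builds the command lists; on a machine with uv installed, each one could be passed to subprocess.run(cmd, check=True).

```python
from pathlib import PurePosixPath


def build_venv_commands(venv_path: str, requirements: str) -> list[list[str]]:
    """Return the uv commands that create a venv and install pinned deps."""
    venv = PurePosixPath(venv_path)
    python_bin = venv / "bin" / "python"  # assumption: POSIX venv layout
    return [
        # Step 1: create the virtual environment at the given path.
        ["uv", "venv", str(venv)],
        # Step 2: install the requirements into that venv's interpreter.
        ["uv", "pip", "install", "--python", str(python_bin), "-r", requirements],
    ]


for cmd in build_venv_commands("/path/to/venv", "requirements.txt"):
    print(" ".join(cmd))
```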

Run inference:

bolo run <REPO_ID> device=cuda:0
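
The device=cuda:0 argument follows a key=value convention. How bolo parses these arguments is not documented here; the sketch below shows one plausible approach, with parse_kv_args as a hypothetical helper and note as a made-up parameter name. It splits on the first "=" only, so values that themselves contain "=" or ":" survive intact.

```python
def parse_kv_args(args: list[str]) -> dict[str, str]:
    """Turn ['device=cuda:0', ...] into {'device': 'cuda:0', ...}."""
    params: dict[str, str] = {}
    for arg in args:
        key, sep, value = arg.partition("=")  # split on the first '=' only
        if not sep or not key:
            raise ValueError(f"expected key=value, got {arg!r}")
        params[key] = value
    return params


# 'note' is a hypothetical parameter used only to show a value containing '='.
print(parse_kv_args(["device=cuda:0", "note=a=b"]))
```

Values stay as strings in this sketch; converting them to the types a template expects would be a separate step.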

Pre-download the templates cache (optional; useful when preparing a machine that will later run offline):

bolo fetch-templates

How does bolo work?

bolo separates what to run (the Jinja2 template) from where to run it (the model's isolated venv).

flowchart LR
    A[User calls bolo.pipeline] --> B[Fetch / load templates]
    B --> C[Render Jinja2 template\nwith user params]
    C --> D[Execute rendered script\ninside model venv]
    D --> E[Return RESULT]

  1. Templates: stored in the bolo-templates release bundle. Each model folder contains a template.j2 (the inference script template) and a requirements.txt (the model's exact dependencies). Templates are downloaded once and cached at ~/.cache/bolo/templates/.

  2. Venvs: created with uv venv followed by uv pip install -r requirements.txt. Every model gets its own venv, so you can safely use models with conflicting PyTorch or CUDA versions side by side.

  3. Rendering: template parameters are collected from the leading {% set key = default %} blocks in each template.j2. bolo.list_params shows you every parameter with its type and default value.

  4. Execution: the rendered script is executed inside the model's venv, and its RESULT variable is returned to the caller.
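
Steps 3 and 4 can be sketched with the standard library alone. The code below is an illustration under stated assumptions, not bolo's implementation: extract_params and run_and_get_result are hypothetical helpers, the regular expression handles only simple single-line {% set key = default %} statements whose defaults are Python literals, and sending RESULT back as JSON on the last line of stdout is just one possible transport. The sample template and its max_tokens parameter are made up, and sys.executable stands in for a model venv's interpreter.

```python
import ast
import json
import re
import subprocess
import sys
import tempfile
from pathlib import Path

SET_RE = re.compile(r"\{%-?\s*set\s+(\w+)\s*=\s*(.+?)\s*-?%\}")


def extract_params(template_text: str) -> dict[str, object]:
    """Collect defaults from the leading {% set key = default %} lines."""
    params: dict[str, object] = {}
    for line in template_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines may separate the leading blocks
        match = SET_RE.fullmatch(stripped)
        if match is None:
            break  # the first non-set line ends the parameter header
        params[match.group(1)] = ast.literal_eval(match.group(2))
    return params


def run_and_get_result(python_bin: str, script: str) -> object:
    """Run a rendered script in a subprocess and return its RESULT variable."""
    # Assumption: RESULT is JSON-serialisable and is sent back on stdout.
    footer = "\nimport json as _json\nprint(_json.dumps(RESULT))\n"
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "rendered.py"
        path.write_text(script + footer, encoding="utf-8")
        done = subprocess.run(
            [python_bin, str(path)], capture_output=True, text=True, check=True
        )
    # The footer prints last, so RESULT is on the final stdout line.
    return json.loads(done.stdout.strip().splitlines()[-1])


# Hypothetical template header; not taken from the bolo-templates bundle.
template = '{% set device = "cuda:0" %}\n{% set max_tokens = 64 %}\nRESULT = "ok"\n'
params = extract_params(template)
result = run_and_get_result(sys.executable, 'RESULT = {"answer": 6 * 7}')
print(params)
print(result)
```

Running the script in a separate interpreter is what lets each model keep its own dependencies; the price is that the result has to cross a process boundary, hence the serialisation assumption.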

Install from source

git clone https://github.com/illinoisdata/Bolo.git
cd Bolo
pip install -e .

Custom templates directory

Set BOLO_TEMPLATES_DIR to point bolo at your own templates folder:

export BOLO_TEMPLATES_DIR=/path/to/my/templates
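
A lookup consistent with this description might resolve the directory as follows. This is a sketch rather than bolo's code: resolve_templates_dir and list_models are hypothetical helpers, the fallback is the cache path mentioned earlier (~/.cache/bolo/templates/), and the one-folder-per-model layout directly under the templates directory is an assumption (nested folders would need a recursive search).

```python
import os
from pathlib import Path

DEFAULT_TEMPLATES_DIR = Path.home() / ".cache" / "bolo" / "templates"


def resolve_templates_dir(env=None) -> Path:
    """Prefer BOLO_TEMPLATES_DIR, else fall back to the default cache."""
    env = os.environ if env is None else env
    override = env.get("BOLO_TEMPLATES_DIR")
    return Path(override).expanduser() if override else DEFAULT_TEMPLATES_DIR


def list_models(templates_dir: Path) -> list[str]:
    """Names of folders that hold both template.j2 and requirements.txt."""
    if not templates_dir.is_dir():
        return []
    return sorted(
        d.name
        for d in templates_dir.iterdir()
        if (d / "template.j2").is_file() and (d / "requirements.txt").is_file()
    )


print(resolve_templates_dir({"BOLO_TEMPLATES_DIR": "/path/to/my/templates"}))
print(list_models(resolve_templates_dir()))
```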

Platforms

bolo is developed and tested on Red Hat Enterprise Linux with CUDA 12.8, so all torch-related dependencies assume cu128 builds.

Contributing

Contributions are welcome! Please open an issue or pull request on GitHub.

License

MIT — see LICENSE for details.

Acknowledgement

  • Thanks to Jojo and her sister for designing the mascot.
