bitHuman avatar runtime + bundled CLI — `bithuman run model.imx` for talk-to-your-avatar quickstart (cloud), `BITHUMAN_LOCAL=1 bithuman run` with the [local] extra for the on-device brain.

bithuman

This is the Python flavor of Layer 3: a platform-specific library for app developers. It wraps the Layer 1 libessence engine. For the CLI tool see docs/CLI.md.

┌─────────────────────────────────────────────────────────────┐
│ Layer 3: Platform-specific libraries (app developers)       │
│   - Python wheel       pip install bithuman    ◄──── you are here
│   - Swift package      SwiftPM Bithuman                     │
│   - Kotlin AAR         ai.bithuman:sdk                      │
│   - (future) Rust crate, JS/TS, Go, ...                     │
└─────────────────────────────────────────────────────────────┘
                          ▼ embeds
┌─────────────────────────────────────────────────────────────┐
│ Layer 2: bithuman CLI (end-user tool)                       │
│   - one cross-platform binary on macOS / Linux / Windows    │
│   - brew install bithuman · curl-pipe installer             │
└─────────────────────────────────────────────────────────────┘
                          ▼ links
┌─────────────────────────────────────────────────────────────┐
│ Layer 1: libessence engine (cross-platform C++ core)        │
│   - portable C ABI, same source on every target             │
│   - macOS · iOS · Android · Linux · Windows                 │
│   - never imported directly by app developers               │
└─────────────────────────────────────────────────────────────┘

Python bindings for the bitHuman SDK — the portable C++ avatar engine (libessence) that powers our cross-platform lipsync pipeline. The wheel ships a native pybind11 module that talks directly to libessence, so you get the same per-frame cost as our Swift and Kotlin clients with none of the GIL noise.

On an Apple M5 with 24 GB unified memory we measure ~640 FPS sustained compose (1.56 ms/frame mean, 2.03 ms p99) for a 1248×704 avatar, with ~206 MB peak RSS end-to-end. Cold load is ~14 ms for the fixture and ~400 ms for the first compose tick (lazy ONNX init).

This package is namespace-isolated from the v0 bithuman SDK; you can install both side-by-side.

Install

pip install bithuman

Status. The PyPI bithuman wheel is at v2.2.1 (2026-05-25) shipping the bundled Rust CLI + conversation brain. bithuman run is the fast-path live-avatar command; bithuman[local] adds a fully on-device brain (whisper.cpp + llama.cpp + Supertonic). The legacy low-level streaming API (AsyncBithuman.push_audio + async for ... in runtime.run()) is still exported for library users — see the legacy quickstart.

Compatibility

  • Platforms: macOS arm64, Linux x86_64, Linux arm64 — all ship as wheels. Windows is tracked for a follow-up.
  • Python: 3.10 – 3.13 (cp310, cp311, cp312, cp313). CPython only.
  • ABI: the published wheel wraps libessence ABI v4. The libessence engine itself is on ABI v6; that surface is currently exposed only via the Swift / Kotlin / Rust bindings. A PyO3 wheel migration is in flight.
  • Auth: ships with live heartbeat against api.bithuman.ai baked into libessence. Avatar.load(api_secret=...) is the entry point; BITHUMAN_API_SECRET env var works too. Set BITHUMAN_UNMETERED=1 for dev / parity-test runs.

What you get

The package exposes three API tiers (all importable from bithuman):

| Tier | Types | Use when… |
| --- | --- | --- |
| Async | AsyncAvatar, AudioChunk, VideoControl, VideoFrame | Hosting a service / parity with legacy AsyncBithuman |
| Sync facade | Avatar, ComposedFrame, EP | Offline / batch / CLI rendering |
| Low-level | Fixture, Runtime, EP_CPU/EP_AUTO/EP_COREML/EP_NNAPI/EP_QNN | Direct C ABI access, custom audio pipeline |

Error types: BithumanError (base), TokenError / TokenExpiredError / TokenValidationError / TokenRequestError / AccountStatusError (auth), ModelError / ModelNotFoundError / ModelLoadError / ModelSecurityError / ExpressionModelNotSupported (fixture), RuntimeNotReadyError.

Version info: bithuman.__version__ (Python package), bithuman.__core_version__ (linked libessence), bithuman.__abi_version__.

Quickstart (legacy AsyncBithuman — PyPI)

This is the shape of the current published wheel. It ports directly from the v0 bithuman SDK: feed PCM with push_audio, drain frames from the runtime.run() async generator.

import asyncio
from bithuman import AsyncBithuman

async def main():
    runtime = await AsyncBithuman(
        model_path="model.imx",
        api_secret="...",  # or BITHUMAN_API_SECRET env var
    ).start()

    await runtime.push_audio(pcm_16k_mono_int16_bytes,
                             sample_rate=16000, last_chunk=True)

    async for frame in runtime.run():
        # frame.bgr_image is (H, W, 3) uint8 in BGR order
        ...

asyncio.run(main())

Accepted PCM is int16 little-endian bytes at 16 kHz mono. Decoding WAV / MP3 / FLAC / OGG is the caller's responsibility (use soundfile).
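As a rough sketch of that caller-side step, the helper below reads a WAV file that is already 16-bit mono at 16 kHz and returns the raw bytes in the layout described above. It swaps in the standard-library wave module in place of soundfile to stay dependency-free. The name load_pcm16_mono_16k is hypothetical (it is not part of bithuman), and the helper rejects other formats rather than resampling or downmixing them.

```python
import wave


def load_pcm16_mono_16k(path: str) -> bytes:
    """Return raw int16 little-endian PCM from a 16 kHz mono WAV file.

    Hypothetical helper, not part of bithuman. It validates the format
    and raises ValueError instead of resampling or downmixing.
    """
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError(f"expected mono, got {wav.getnchannels()} channels")
        if wav.getsampwidth() != 2:
            raise ValueError(f"expected 16-bit samples, got {8 * wav.getsampwidth()}-bit")
        if wav.getframerate() != 16000:
            raise ValueError(f"expected 16000 Hz, got {wav.getframerate()} Hz")
        # WAV stores 16-bit PCM little-endian, so the frames pass through unchanged.
        return wav.readframes(wav.getnframes())
```

Files in other formats or sample rates would need a decoding and resampling step first, which this sketch deliberately leaves out.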

Quickstart (low-level, C-level streaming surface)

The Rust PyO3 wheel will expose the ABI-v6 streaming pair (runtime.push_audio + runtime.pull_frame) with the same shape as the Swift / Kotlin bindings. Until it ships to PyPI, use the legacy Fixture / Runtime types in the published wheel, shown under Low-level API below.

Live avatar — bithuman run (browser + voice chat)

The wheel also ships the bithuman Rust CLI plus an embedded conversation brain (a livekit-agents worker). bithuman run avatar.imx stands up the whole stack — embedded livekit-server, libessence runtime, brain, browser player — and prints a URL you open to talk to the avatar.

pip install bithuman
export BITHUMAN_API_KEY=...     # avatar-runtime auth (https://www.bithuman.ai/#developer)
export OPENAI_API_KEY=sk-...    # the conversational brain (OpenAI Realtime)

bithuman run ~/.cache/bithuman/models/sample-avatar.imx
# → open the printed http://127.0.0.1:8088/<CODE> URL in a browser

Fully on-device — bithuman[local]

For zero-cloud operation, install the [local] extra and set BITHUMAN_LOCAL=1. No OpenAI key, no outbound network — the brain swaps to whisper.cpp (STT) + llama.cpp (LLM) + Supertonic (TTS), all in-process, all auto-downloaded from HuggingFace on first run.

pip install 'bithuman[local]'

export BITHUMAN_API_KEY=...
BITHUMAN_LOCAL=1 bithuman run ~/.cache/bithuman/models/sample-avatar.imx

| Slot | Library (mobile-portable C++ core) | Default model | Disk | RAM |
| --- | --- | --- | --- | --- |
| STT | pywhispercpp → whisper.cpp | tiny.en | 77 MB | ~150 MB |
| LLM | llama-cpp-python → llama.cpp | Qwen 2.5 0.5B-Instruct Q4_K_M | 400 MB | ~600 MB |
| TTS | supertonic → ONNX Runtime | Supertonic 3 (voice M1, 31 languages) | 380 MB | ~600 MB |
| VAD | livekit-plugins-silero | Silero | 5 MB | ~50 MB |

Total ~860 MB on disk, ~1.5 GB RAM, ~717 ms warm load, ~1.4 s warm end-to-end (STT + LLM + TTS) on Apple Silicon. Cold start adds ~90 s once for first-run model downloads.

Optional knobs (env vars)

| Var | Default | What |
| --- | --- | --- |
| BITHUMAN_LOCAL | unset | =1 flips the brain to the local stack. |
| BITHUMAN_LOCAL_WHISPER | tiny.en | whisper.cpp model size (tiny.en / base.en / small / large-v3-turbo). |
| BITHUMAN_LOCAL_LLM | Qwen/Qwen2.5-0.5B-Instruct-GGUF | HuggingFace repo id of a GGUF LLM. |
| BITHUMAN_LOCAL_LLM_FILE | qwen2.5-0.5b-instruct-q4_k_m.gguf | GGUF file within the repo. |
| BITHUMAN_LOCAL_VOICE | M1 | Supertonic voice preset (M1-M5 / F1-F5). |
| BITHUMAN_LOCAL_LANG | en | Supertonic language (31 supported: en, ko, ja, es, de, …). |
| BITHUMAN_INSTRUCTIONS | short default | Override the system prompt. |
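
To make the defaults above concrete, here is a minimal sketch of resolving these variables from the environment. It only mirrors the table. LocalBrainConfig and read_local_brain_config are hypothetical names, this is not how bithuman itself parses its configuration, and treating only the exact value "1" as enabling the local stack is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class LocalBrainConfig:
    """Hypothetical container mirroring the documented env vars."""
    enabled: bool
    whisper_model: str
    llm_repo: str
    llm_file: str
    voice: str
    language: str
    instructions: Optional[str]  # None means "use the built-in short default"


def read_local_brain_config(env: Mapping[str, str] = os.environ) -> LocalBrainConfig:
    """Resolve the documented variables, falling back to the documented defaults."""
    return LocalBrainConfig(
        # Assumption: only the exact value "1" switches to the local stack.
        enabled=env.get("BITHUMAN_LOCAL") == "1",
        whisper_model=env.get("BITHUMAN_LOCAL_WHISPER", "tiny.en"),
        llm_repo=env.get("BITHUMAN_LOCAL_LLM", "Qwen/Qwen2.5-0.5B-Instruct-GGUF"),
        llm_file=env.get("BITHUMAN_LOCAL_LLM_FILE", "qwen2.5-0.5b-instruct-q4_k_m.gguf"),
        voice=env.get("BITHUMAN_LOCAL_VOICE", "M1"),
        language=env.get("BITHUMAN_LOCAL_LANG", "en"),
        instructions=env.get("BITHUMAN_INSTRUCTIONS"),
    )
```

Passing a plain dict instead of os.environ makes it easy to check a deployment's settings in a test before launching bithuman run.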

All three local backends have first-party iOS/Android C++ builds, so the same .gguf / .bin / .onnx model files are reusable when porting to mobile — see cpp/bindings/python/livekit_local_plugins/README.md.

CLI

An essence-render console script ships with the wheel:

pip install 'bithuman[cli]'

essence-render \
  --model ~/.cache/bithuman/models/sample-avatar.imx \
  --audio speech.wav \
  --output out.mp4

Pass --output - to stream raw BGR24 frames to stdout (handy for piping into a separate ffmpeg pipeline or a custom encoder). Other flags:

| Flag | Default | Description |
| --- | --- | --- |
| --fps | 25 | Output FPS for the MP4 container. |
| --quality | 80 | libx264 quality 1..100 (higher = better). |
| --ep | cpu | Execution provider hint (cpu/auto/coreml/…). |
| --threads | 1 | ORT intra-op thread count. |
| --no-audio | | Skip audio muxing; produce a silent video. |

Example end-to-end run (5 s sine sweep):

essence-render 0.1.0: model=sample-avatar.imx audio=sine_sweep_5s.wav ep=cpu threads=1
essence-render: loaded fixture in 14.9 ms — 1248x704 @ 25 fps, 183 clusters, 202 src frames
essence-render: composed 122 frames in 1.83s (14.96 ms/frame, 66.8 fps)
essence-render: wrote /tmp/sine_sweep_5s.mp4

(Throughput here is bounded by H.264 encode, not Essence inference. Use --output - if you want to measure raw compose speed.)
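
For consuming the --output - stream from Python, a reader along the following lines may be a useful starting point. It assumes the stream is nothing but tightly packed BGR24 frames (width × height × 3 bytes each, no header or padding), which the text above implies but does not spell out. iter_bgr24_frames is a hypothetical helper, not part of the wheel.

```python
from typing import BinaryIO, Iterator


def iter_bgr24_frames(stream: BinaryIO, width: int, height: int) -> Iterator[bytes]:
    """Yield one raw BGR24 frame (width * height * 3 bytes) at a time.

    Assumes tightly packed frames with no header; raises EOFError if the
    stream ends in the middle of a frame.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    frame_size = width * height * 3  # 3 bytes per pixel: B, G, R
    while True:
        buf = bytearray()
        # Pipes may return short reads, so keep reading until the frame is full.
        while len(buf) < frame_size:
            chunk = stream.read(frame_size - len(buf))
            if not chunk:
                break
            buf.extend(chunk)
        if not buf:
            return  # clean end of stream on a frame boundary
        if len(buf) < frame_size:
            raise EOFError(f"truncated frame: got {len(buf)} of {frame_size} bytes")
        yield bytes(buf)
```

When piping from the CLI, pass sys.stdin.buffer as the stream, with the width and height the fixture reports on load (1248x704 in the example run above).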

Low-level API

If you need finer control or want to swap in a custom audio pipeline, the C ABI is exposed directly:

import numpy as np
from bithuman import Fixture, Runtime, EP_CPU

fx = Fixture("model.imx", preferred_ep=EP_CPU, intra_op_threads=1)
rt = Runtime(fx)
pcm = np.fromfile("speech.f32", dtype=np.float32)  # 16 kHz mono float32
cluster_idx, bgr = rt.tick_compose(pcm, frame_idx_hint=-1)
# bgr.shape == (fx.frame_height, fx.frame_width, 3), dtype uint8

Pass the entire pcm buffer to each tick_compose call; the runtime maintains an internal cursor and advances one tick per call until the audio is exhausted.
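
To size a render loop, the number of ticks can be estimated from the audio length. The sketch below assumes one tick per output frame at the 25 fps model rate and 16 kHz input; estimate_tick_count is a hypothetical helper, not a bithuman API. Treat the result as approximate: the 5 s CLI run shown earlier reports 122 composed frames, slightly fewer than the 125 this arithmetic gives.

```python
def estimate_tick_count(num_samples: int, sample_rate: int = 16000, fps: int = 25) -> int:
    """Estimate how many frames a PCM buffer covers (rounded up).

    Assumption: one tick per output frame, so 16000 / 25 = 640 samples per tick.
    The runtime's actual count may differ slightly.
    """
    if num_samples < 0 or sample_rate <= 0 or fps <= 0:
        raise ValueError("num_samples must be >= 0; sample_rate and fps must be > 0")
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-num_samples * fps // sample_rate)
```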

Zero-alloc hot path (since 1.12.4)

For tight render loops, pre-allocate the BGR buffer once and pass it via out=. The runtime writes into it in place and returns just the cluster_idx. This drops wrapper overhead to within ~3 % of raw libessence (vs ~8 % for the alloc-per-tick path):

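# fx, rt and pcm come from the previous snippet; num_ticks is the number of
# frames you want to render.
out = np.empty((fx.frame_height, fx.frame_width, 3), dtype=np.uint8)
for _ in range(num_ticks):
    cluster_idx = rt.tick_compose(pcm, -1, out=out)
    # `out` now holds this tick's frame; read it before the next call.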

The same out= keyword works on tick_compose_to_size. See docs/ARCHITECTURE.md §9 for the cross-wrapper perf table.

Build from source

You need the prebuilt parent C++ archive at cpp/build/libessence.a (run the parent CMake build first), plus the runtime deps from Homebrew (onnxruntime, webp, ffmpeg, hdf5, jpeg-turbo).

cd cpp/bindings/python
uv pip install -e '.[cli,test]' --no-build-isolation

The CMake glue links the prebuilt static archive directly — it does NOT re-run the parent build, so iterate on bindings without paying the C++ rebuild cost.

Performance

Measured with tests/bench.py against the v1 compose path (audio → composited BGR frame) on Apple M5 24 GB, libessence 1.16.0:

| Metric | Alloc per tick | out= reuse buffer |
| --- | --- | --- |
| Steady-state mean | 1.53 ms / frame | 1.45 ms / frame |
| p99 | 1.66 ms | 1.53 ms |
| Sustained throughput | 655 FPS | 692 FPS |
| Overhead vs raw libessence | +8.3 % | +2.6 % |
| Peak RSS (proc) | 192 MB | 182 MB |

Wrapper overhead is within 5 % of raw libessence on the out= path; see docs/ARCHITECTURE.md §9 for the apples-to-apples methodology and the cross-wrapper comparison. Reproduce with:

scripts/bench-wrappers.sh
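
If you collect your own per-frame timings, the same kind of summary (mean, p99, throughput) can be computed as sketched below. This is not the code in tests/bench.py; summarize_latencies is a hypothetical helper, and both the nearest-rank percentile and the FPS-from-mean formula are assumptions about how such figures are typically derived.

```python
from typing import Dict, Sequence


def summarize_latencies(latencies_ms: Sequence[float]) -> Dict[str, float]:
    """Summarize per-frame latencies (milliseconds) as mean, p99 and FPS."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    n = len(ordered)
    mean_ms = sum(ordered) / n
    # Nearest-rank p99: the smallest sample with at least 99 % of samples at or
    # below it. Integer ceiling of 0.99 * n avoids float rounding issues.
    rank = -(-99 * n // 100)
    p99_ms = ordered[rank - 1]
    # Sustained throughput implied by the mean frame time.
    return {"mean_ms": mean_ms, "p99_ms": p99_ms, "fps": 1000.0 / mean_ms}
```

Discard warm-up ticks before summarizing, since the first compose tick includes lazy ONNX initialization.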

Linux wheels

Pre-built manylinux_2_28 wheels ship for x86_64 + aarch64 across cp310 through cp313 — 8 wheels in total, all auditwheel-repaired with the full dep tree bundled (ORT, FFmpeg, HDF5, libjpeg-turbo, libwebp, libcurl, OpenSSL).
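
The eight-wheel matrix is simply every Python tag crossed with every architecture. The sketch below enumerates the tag part of each wheel file name; linux_wheel_tags is a hypothetical helper meant only to illustrate the matrix (for example, to drive a loop over the per-wheel build described next).

```python
from itertools import product
from typing import List

PYTHON_TAGS = ["cp310", "cp311", "cp312", "cp313"]
ARCHES = ["x86_64", "aarch64"]


def linux_wheel_tags() -> List[str]:
    """Return the '<python>-<abi>-<platform>' tag for each Linux wheel."""
    return [
        # For CPython wheels the ABI tag matches the Python tag.
        f"{py}-{py}-manylinux_2_28_{arch}"
        for py, arch in product(PYTHON_TAGS, ARCHES)
    ]
```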

To rebuild them locally:

# One-time: build the dep-baked Docker images (~10 min each).
docker build --platform linux/amd64 -t libessence/manylinux-x86_64:0.1 \
    -f scripts/Dockerfile.manylinux-x86_64 scripts/
docker build --platform linux/arm64/v8 -t libessence/manylinux-aarch64:0.1 \
    -f scripts/Dockerfile.manylinux-aarch64 scripts/

# Per wheel build (~2 min):
docker run --rm --platform linux/amd64 -v "$REPO":/src \
    -e PYTAG=cp311 -e ARCH_INSIDE=x86_64 \
    libessence/manylinux-x86_64:0.1 \
    bash /src/cpp/bindings/python/scripts/build-wheel-in-container.sh

Limitations

  • Windows wheels not yet built — tracked for v0.2.
  • The CLI's output framerate is fixed at 25 fps to match the model's internal rate. Pass --output - and pipe to your own encoder if you need temporal resampling.
  • preferred_ep=COREML/NNAPI/QNN is accepted but currently falls back to CPU in the v0.1 build.

License

Commercial. Contact hello@bithuman.ai.

Built Distributions

  • bithuman-2.2.6-cp314-cp314-manylinux_2_28_x86_64.whl (74.9 MB): CPython 3.14, manylinux (glibc 2.28+), x86-64
  • bithuman-2.2.6-cp314-cp314-manylinux_2_28_aarch64.whl (73.7 MB): CPython 3.14, manylinux (glibc 2.28+), ARM64
  • bithuman-2.2.6-cp314-cp314-macosx_26_0_arm64.whl (76.6 MB): CPython 3.14, macOS 26.0+, ARM64
  • bithuman-2.2.6-cp313-cp313-manylinux_2_28_x86_64.whl (74.9 MB): CPython 3.13, manylinux (glibc 2.28+), x86-64
  • bithuman-2.2.6-cp313-cp313-manylinux_2_28_aarch64.whl (73.7 MB): CPython 3.13, manylinux (glibc 2.28+), ARM64
  • bithuman-2.2.6-cp313-cp313-macosx_26_0_arm64.whl (76.6 MB): CPython 3.13, macOS 26.0+, ARM64
  • bithuman-2.2.6-cp312-cp312-manylinux_2_28_x86_64.whl (74.9 MB): CPython 3.12, manylinux (glibc 2.28+), x86-64
  • bithuman-2.2.6-cp312-cp312-manylinux_2_28_aarch64.whl (73.7 MB): CPython 3.12, manylinux (glibc 2.28+), ARM64
  • bithuman-2.2.6-cp312-cp312-macosx_26_0_arm64.whl (76.6 MB): CPython 3.12, macOS 26.0+, ARM64
  • bithuman-2.2.6-cp311-cp311-manylinux_2_28_aarch64.whl (73.7 MB): CPython 3.11, manylinux (glibc 2.28+), ARM64
  • bithuman-2.2.6-cp311-cp311-macosx_26_0_arm64.whl (76.6 MB): CPython 3.11, macOS 26.0+, ARM64
  • bithuman-2.2.6-cp310-cp310-manylinux_2_28_x86_64.whl (74.9 MB): CPython 3.10, manylinux (glibc 2.28+), x86-64
  • bithuman-2.2.6-cp310-cp310-manylinux_2_28_aarch64.whl (73.7 MB): CPython 3.10, manylinux (glibc 2.28+), ARM64
  • bithuman-2.2.6-cp310-cp310-macosx_26_0_arm64.whl (76.6 MB): CPython 3.10, macOS 26.0+, ARM64
