bitHuman Python SDK — libessence-backed avatar runtime. `from bithuman import AsyncBithuman`.

bithuman

This is the Python flavor of Layer 3: a platform-specific library for app developers. It wraps the Layer 1 libessence engine. For the CLI tool see docs/CLI.md.

┌─────────────────────────────────────────────────────────────┐
│ Layer 3: Platform-specific libraries (app developers)       │
│   - Python wheel       pip install bithuman    ◄──── you are here
│   - Swift package      SwiftPM Bithuman                     │
│   - Kotlin AAR         ai.bithuman:sdk                      │
│   - (future) Rust crate, JS/TS, Go, ...                     │
└─────────────────────────────────────────────────────────────┘
                          ▼ embeds
┌─────────────────────────────────────────────────────────────┐
│ Layer 2: bithuman CLI (end-user tool)                       │
│   - one cross-platform binary on macOS / Linux / Windows    │
│   - brew install bithuman · curl-pipe installer             │
└─────────────────────────────────────────────────────────────┘
                          ▼ links
┌─────────────────────────────────────────────────────────────┐
│ Layer 1: libessence engine (cross-platform C++ core)        │
│   - portable C ABI, same source on every target             │
│   - macOS · iOS · Android · Linux · Windows                 │
│   - never imported directly by app developers               │
└─────────────────────────────────────────────────────────────┘

Two surfaces

pip install bithuman is both a CLI and a Python library. They share the same libessence native engine but otherwise operate independently — the CLI is a bundled Rust binary at <pkg>/_bin/bithuman and does not import the Python lib API; lib users never invoke the CLI.

# CLI: talk to an avatar in your browser
bithuman run model.imx

# Library: embed the runtime in your own app (call from inside an async function)
from bithuman import AsyncBithuman
runtime = await AsyncBithuman(model_path="model.imx", api_secret="...").start()

Python bindings for the bitHuman SDK — the portable C++ avatar engine (libessence) that powers our cross-platform lipsync pipeline. The wheel ships a native pybind11 module that talks directly to libessence, so you get the same per-frame cost as our Swift and Kotlin clients with none of the GIL noise.

On an Apple M5 with 24 GB unified memory we measure ~640 FPS sustained compose (1.56 ms/frame mean, 2.03 ms p99) for a 1248×704 avatar, with ~206 MB peak RSS end-to-end. Cold load is ~14 ms for the fixture and ~400 ms for the first compose tick (lazy ONNX init).

This package is namespace-isolated from the v0 bithuman SDK; you can install both side-by-side.

Install

pip install bithuman

Status. The PyPI bithuman wheel is at v2.2.6 (2026-05-27) shipping the bundled Rust CLI + conversation brain. bithuman run is the fast-path live-avatar command; bithuman[local] adds a fully on-device brain (whisper.cpp + llama.cpp + Supertonic). The legacy low-level streaming API (AsyncBithuman.push_audio + async for ... in runtime.run()) is still exported for library users — see the legacy quickstart.

Brain modes

bithuman run avatar.imx needs a conversational brain. There are two paths:

| Mode | Install | Env | When |
| --- | --- | --- | --- |
| Cloud (default) | pip install bithuman | OPENAI_API_KEY=sk-... | Fastest setup, best quality (OpenAI Realtime). |
| On-device | pip install 'bithuman[local]' | BITHUMAN_LOCAL=1 | Zero outbound network, no API keys (whisper.cpp + llama.cpp + Supertonic). |

Setting BITHUMAN_LOCAL=1 takes precedence — the cloud key is ignored when local mode is active. Run bithuman doctor to see which modes are available on your machine.
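The precedence rule can be sketched in a few lines of Python. This is a hypothetical helper for illustration, not part of the bithuman package; it assumes only the two environment variables named in the table, and the returned mode names are made up for the example.

```python
import os
from typing import Mapping, Optional


def select_brain_mode(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "local", "cloud", or "none" following the documented precedence.

    Hypothetical helper: the real CLI may apply additional checks.
    """
    env = os.environ if env is None else env
    # BITHUMAN_LOCAL=1 wins even when a cloud key is also present.
    if env.get("BITHUMAN_LOCAL") == "1":
        return "local"
    # Otherwise the cloud brain needs an OpenAI key.
    if env.get("OPENAI_API_KEY"):
        return "cloud"
    return "none"


print(select_brain_mode({"BITHUMAN_LOCAL": "1", "OPENAI_API_KEY": "sk-example"}))
```

For an authoritative answer on a given machine, bithuman doctor remains the tool to use.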

Compatibility

  • Platforms: macOS arm64, Linux x86_64, Linux arm64 — all ship as wheels. Windows is tracked for a follow-up.
  • Python: 3.10 – 3.13 (cp310, cp311, cp312, cp313). CPython only.
  • ABI: the published wheel wraps libessence ABI v4. The libessence engine itself is on ABI v6, but that surface is currently exposed only via the Swift / Kotlin / Rust bindings. A PyO3 wheel migration is in flight.
  • Auth: ships with live heartbeat against api.bithuman.ai baked into libessence. Avatar.load(api_secret=...) is the entry point; BITHUMAN_API_SECRET env var works too. Set BITHUMAN_UNMETERED=1 for dev / parity-test runs.
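As a rough illustration of the two ways to supply the secret, the sketch below resolves it from an explicit argument or from BITHUMAN_API_SECRET. The helper name is hypothetical, and the choice to let the explicit argument win over the environment variable is an assumption; the package's actual precedence is not stated here.

```python
import os
from typing import Mapping, Optional


def resolve_api_secret(explicit: Optional[str] = None,
                       env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the API secret from an argument or the environment.

    Hypothetical helper. ASSUMPTION: an explicit argument takes priority
    over the BITHUMAN_API_SECRET environment variable.
    """
    env = os.environ if env is None else env
    secret = explicit or env.get("BITHUMAN_API_SECRET")
    if not secret:
        raise ValueError(
            "no API secret: pass api_secret=... or set BITHUMAN_API_SECRET"
        )
    return secret
```

Failing early with a clear message keeps a missing secret from surfacing later as an authentication error at load time.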

What you get

The package exposes three API tiers (all importable from bithuman):

| Tier | Types | Use when… |
| --- | --- | --- |
| Async | AsyncAvatar, AudioChunk, VideoControl, VideoFrame | Hosting a service / parity with legacy AsyncBithuman |
| Sync facade | Avatar, ComposedFrame, EP | Offline / batch / CLI rendering |
| Low-level | Fixture, Runtime, EP_CPU/EP_AUTO/EP_COREML/EP_NNAPI/EP_QNN | Direct C ABI access, custom audio pipeline |

Error types: BithumanError (base), TokenError / TokenExpiredError / TokenValidationError / TokenRequestError / AccountStatusError (auth), ModelError / ModelNotFoundError / ModelLoadError / ModelSecurityError / ExpressionModelNotSupported (fixture), RuntimeNotReadyError.

Version info: bithuman.__version__ (Python package), bithuman.__core_version__ (linked libessence), bithuman.__abi_version__.

Quickstart (legacy AsyncBithuman — PyPI)

This is the shape of the current published wheel. It ports directly from the v0 bithuman SDK: feed PCM with push_audio, drain frames from the runtime.run() async generator.

import asyncio
from bithuman import AsyncBithuman

async def main():
    runtime = await AsyncBithuman(
        model_path="model.imx",
        api_secret="...",  # or BITHUMAN_API_SECRET env var
    ).start()

    await runtime.push_audio(pcm_16k_mono_int16_bytes,
                             sample_rate=16000, last_chunk=True)

    async for frame in runtime.run():
        # frame.bgr_image is (H, W, 3) uint8 in BGR order
        ...

asyncio.run(main())

Audio must be supplied as int16 little-endian PCM bytes at 16 kHz mono. Decoding WAV / MP3 / FLAC / OGG is the caller's responsibility (for example, with soundfile).
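For WAV input that is already 16 kHz, mono, and 16-bit, the standard library is enough to get the raw bytes. The sketch below is a minimal example under that assumption; the function name is hypothetical, and it deliberately rejects other layouts rather than resampling or downmixing.

```python
import io
import wave


def load_pcm16_mono_16k(wav_file) -> bytes:
    """Return raw int16 little-endian PCM from a 16 kHz mono 16-bit WAV.

    `wav_file` may be a path or a binary file-like object.
    Hypothetical helper: no resampling or channel mixing is attempted.
    """
    with wave.open(wav_file, "rb") as wf:
        if wf.getnchannels() != 1:
            raise ValueError("expected mono audio")
        if wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        if wf.getframerate() != 16000:
            raise ValueError("expected a 16 kHz sample rate")
        # WAV stores PCM samples little-endian, so the frames can be used as-is.
        return wf.readframes(wf.getnframes())


# Usage: build a tiny in-memory WAV and read it back.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(16000)
    w.writeframes(b"\x01\x00\x02\x00")  # two int16 samples: 1 and 2
buf.seek(0)
pcm = load_pcm16_mono_16k(buf)
```

The resulting bytes have the layout that push_audio is documented to accept with sample_rate=16000.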

Quickstart (low-level, C-level streaming surface)

The Rust PyO3 wheel will expose the ABI-v6 streaming pair (runtime.push_audio + runtime.pull_frame) with the same shape as the Swift / Kotlin bindings. Until it ships to PyPI, use the legacy Fixture / Runtime types in the published wheel, shown in the Low-level API section below.

Live avatar — bithuman run (browser + voice chat)

The wheel also ships the bithuman Rust CLI plus an embedded conversation brain (a livekit-agents worker). bithuman run avatar.imx stands up the whole stack — embedded livekit-server, libessence runtime, brain, browser player — and prints a URL you open to talk to the avatar.

pip install bithuman
export BITHUMAN_API_KEY=...     # avatar-runtime auth (https://www.bithuman.ai/#developer)
export OPENAI_API_KEY=sk-...    # the conversational brain (OpenAI Realtime)

bithuman run ~/.cache/bithuman/models/sample-avatar.imx
# → open the printed http://127.0.0.1:8088/<CODE> URL in a browser

Fully on-device — bithuman[local]

For zero-cloud operation, install the [local] extra and set BITHUMAN_LOCAL=1. No OpenAI key, no outbound network — the brain swaps to whisper.cpp (STT) + llama.cpp (LLM) + Supertonic (TTS), all in-process, all auto-downloaded from HuggingFace on first run.

pip install 'bithuman[local]'

export BITHUMAN_API_KEY=...
BITHUMAN_LOCAL=1 bithuman run ~/.cache/bithuman/models/sample-avatar.imx

| Slot | Library (mobile-portable C++ core) | Default model | Disk | RAM |
| --- | --- | --- | --- | --- |
| STT | pywhispercpp → whisper.cpp | tiny.en | 77 MB | ~150 MB |
| LLM | llama-cpp-python → llama.cpp | Qwen 2.5 0.5B-Instruct Q4_K_M | 400 MB | ~600 MB |
| TTS | supertonic → ONNX Runtime | Supertonic 3 (voice M1, 31 languages) | 380 MB | ~600 MB |
| VAD | livekit-plugins-silero | Silero | 5 MB | ~50 MB |

Total ~860 MB on disk, ~1.5 GB RAM, ~717 ms warm load, ~1.4 s warm end-to-end (STT + LLM + TTS) on Apple Silicon. Cold start adds ~90 s once for first-run model downloads.
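The totals can be checked against the per-slot figures with a few lines of arithmetic. The numbers below are copied from the table; the per-slot RAM values sum to about 1.4 GB, slightly under the quoted ~1.5 GB, and the table does not break out where the remainder goes.

```python
# (disk MB, approximate RAM MB) per slot, copied from the table above.
SLOTS = {
    "STT": (77, 150),
    "LLM": (400, 600),
    "TTS": (380, 600),
    "VAD": (5, 50),
}

disk_mb = sum(disk for disk, _ in SLOTS.values())
ram_mb = sum(ram for _, ram in SLOTS.values())

print(f"disk: {disk_mb} MB, RAM: ~{ram_mb} MB")
# prints "disk: 862 MB, RAM: ~1400 MB"
```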

Optional knobs (env vars)

| Var | Default | What |
| --- | --- | --- |
| BITHUMAN_LOCAL | unset | =1 flips the brain to the local stack. |
| BITHUMAN_LOCAL_WHISPER | tiny.en | whisper.cpp model size (tiny.en / base.en / small / large-v3-turbo). |
| BITHUMAN_LOCAL_LLM | Qwen/Qwen2.5-0.5B-Instruct-GGUF | HuggingFace repo id of a GGUF LLM. |
| BITHUMAN_LOCAL_LLM_FILE | qwen2.5-0.5b-instruct-q4_k_m.gguf | GGUF file within the repo. |
| BITHUMAN_LOCAL_VOICE | M1 | Supertonic voice preset (M1–M5 / F1–F5). |
| BITHUMAN_LOCAL_LANG | en | Supertonic language (31 supported: en, ko, ja, es, de, …). |
| BITHUMAN_INSTRUCTIONS | short default | Override the system prompt. |
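
To make the defaults concrete, here is a minimal sketch that reads these variables into a small config object. The class and field names are hypothetical and not part of the bithuman package; only the variable names and default values come from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class LocalBrainConfig:
    """Hypothetical container for the documented env-var knobs."""
    enabled: bool
    whisper: str
    llm_repo: str
    llm_file: str
    voice: str
    lang: str
    instructions: Optional[str]  # None means "use the built-in short default"

    @classmethod
    def from_env(cls, env: Optional[Mapping[str, str]] = None) -> "LocalBrainConfig":
        env = os.environ if env is None else env
        return cls(
            enabled=env.get("BITHUMAN_LOCAL") == "1",
            whisper=env.get("BITHUMAN_LOCAL_WHISPER", "tiny.en"),
            llm_repo=env.get("BITHUMAN_LOCAL_LLM", "Qwen/Qwen2.5-0.5B-Instruct-GGUF"),
            llm_file=env.get("BITHUMAN_LOCAL_LLM_FILE",
                             "qwen2.5-0.5b-instruct-q4_k_m.gguf"),
            voice=env.get("BITHUMAN_LOCAL_VOICE", "M1"),
            lang=env.get("BITHUMAN_LOCAL_LANG", "en"),
            instructions=env.get("BITHUMAN_INSTRUCTIONS"),
        )


cfg = LocalBrainConfig.from_env({"BITHUMAN_LOCAL": "1", "BITHUMAN_LOCAL_VOICE": "F2"})
print(cfg)
```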

All three local backends have first-party iOS/Android C++ builds, so the same .gguf / .bin / .onnx model files are reusable when porting to mobile — see sdks/python/src/bithuman/local_plugins/.

CLI

pip install bithuman ships a bithuman console script — the Rust CLI (bundled at <wheel>/bithuman/_bin/bithuman) is the supported surface.

pip install bithuman

bithuman run avatar.imx        # live avatar (browser-to-talk)
bithuman render avatar.imx -a speech.wav -o out.mp4
bithuman info  avatar.imx
# Other subcommands: bithuman list, bithuman pull, bithuman doctor

See bithuman --help for full flags. Same Essence engine the lib uses; two surfaces, one libessence.
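
If you need to find the bundled binary from Python, for example to check that an install is intact, the sketch below resolves the documented <wheel>/bithuman/_bin/bithuman location. The helper name is hypothetical; it assumes the binary sits in a _bin directory next to the package's __init__.py and returns None when the package or the file is absent.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def bundled_cli_path(package: str = "bithuman") -> Optional[Path]:
    """Return the path of the CLI bundled inside `package`, or None.

    Hypothetical helper based on the documented <pkg>/_bin/bithuman layout.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        return None  # package not installed, or a namespace package
    candidate = Path(spec.origin).parent / "_bin" / "bithuman"
    return candidate if candidate.is_file() else None


print(bundled_cli_path())
```

In normal use the bithuman console script on PATH is the supported entry point; this lookup is only a diagnostic.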

Low-level API

If you need finer control or want to swap in a custom audio pipeline, the C ABI is exposed directly:

import numpy as np
from bithuman import Fixture, Runtime, EP_CPU

fx = Fixture("model.imx", preferred_ep=EP_CPU, intra_op_threads=1)
rt = Runtime(fx)
pcm = np.fromfile("speech.f32", dtype=np.float32)  # 16 kHz mono float32
cluster_idx, bgr = rt.tick_compose(pcm, frame_idx_hint=-1)
# bgr.shape == (fx.frame_height, fx.frame_width, 3), dtype uint8

Pass the entire pcm buffer to each tick_compose call; the runtime maintains an internal cursor and advances one tick per call until the audio is exhausted.

Zero-alloc hot path (since 1.12.4)

For tight render loops, pre-allocate the BGR buffer once and pass it via out=. The runtime writes into it in place and returns just the cluster_idx. This drops wrapper overhead to within ~3 % of raw libessence (vs ~8 % for the alloc-per-tick path):

out = np.empty((fx.frame_height, fx.frame_width, 3), dtype=np.uint8)
for _ in range(num_ticks):
    cluster_idx = rt.tick_compose(pcm, -1, out=out)
    # `out` now holds this tick's frame; read it before the next call.

The same out= keyword works on tick_compose_to_size. See docs/ARCHITECTURE.md §9 for the cross-wrapper perf table.
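
The loop above leaves num_ticks undefined. One way to estimate it, sketched below, is to divide the audio length by the samples consumed per tick. This rests on an assumption that is not stated in this section: that one tick corresponds to one output frame at the model's 25 fps internal rate, so 16 kHz audio advances 640 samples per tick. The helper name is hypothetical.

```python
import math

SAMPLE_RATE_HZ = 16_000  # input PCM rate used throughout this README
FRAME_RATE_FPS = 25      # model's internal rate, per the Limitations section
# ASSUMPTION: one tick produces exactly one frame, so each tick covers 640 samples.
SAMPLES_PER_TICK = SAMPLE_RATE_HZ // FRAME_RATE_FPS


def estimate_num_ticks(num_samples: int) -> int:
    """Estimate how many tick_compose calls are needed to exhaust the audio."""
    if num_samples < 0:
        raise ValueError("num_samples must be non-negative")
    # Round up so a trailing partial window still gets a tick.
    return math.ceil(num_samples / SAMPLES_PER_TICK)


print(estimate_num_ticks(16_000))  # one second of audio
```

With a NumPy buffer this would be called as estimate_num_ticks(len(pcm)); verify the count against the runtime's actual behavior before relying on it.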

Build from source

You need the prebuilt parent C++ archive at engine/build/libessence.a (run the parent CMake build first), plus the runtime deps from Homebrew (onnxruntime, webp, ffmpeg, hdf5, jpeg-turbo).

cd sdks/python
uv pip install -e '.[cli,test]' --no-build-isolation

The CMake glue links the prebuilt static archive directly — it does NOT re-run the parent build, so iterate on bindings without paying the C++ rebuild cost.

Performance

Measured with tests/bench.py against the v1 compose path (audio → composited BGR frame) on Apple M5 24 GB, libessence 1.16.0:

| Metric | Alloc per tick | out= reuse buffer |
| --- | --- | --- |
| Steady-state mean | 1.53 ms / frame | 1.45 ms / frame |
| p99 | 1.66 ms | 1.53 ms |
| Sustained throughput | 655 FPS | 692 FPS |
| Overhead vs raw libessence | +8.3 % | +2.6 % |
| Peak RSS (proc) | 192 MB | 182 MB |

Wrapper overhead is within 5 % of raw libessence on the out= path; see docs/ARCHITECTURE.md §9 for the apples-to-apples methodology and the cross-wrapper comparison. Reproduce with:

scripts/bench-wrappers.sh
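
As a sanity check on the table, the short calculation below derives throughput from the mean frame time and backs out the raw libessence frame time implied by each overhead figure. It uses only the published numbers; the reciprocal of the mean gives roughly 654 and 690 FPS, a little under the measured 655 and 692, which is plausible if sustained throughput was measured directly rather than derived from the mean.

```python
# (steady-state mean ms/frame, overhead vs raw libessence in percent)
ROWS = {
    "alloc per tick": (1.53, 8.3),
    "out= reuse buffer": (1.45, 2.6),
}


def fps_from_mean(mean_ms: float) -> float:
    """Frames per second implied by a mean frame time in milliseconds."""
    return 1000.0 / mean_ms


def implied_raw_ms(mean_ms: float, overhead_pct: float) -> float:
    """Raw engine frame time implied by a wrapper mean and its overhead."""
    return mean_ms / (1.0 + overhead_pct / 100.0)


for name, (mean_ms, overhead_pct) in ROWS.items():
    print(f"{name}: {fps_from_mean(mean_ms):.0f} FPS from the mean, "
          f"raw ~ {implied_raw_ms(mean_ms, overhead_pct):.2f} ms/frame")
```

Both rows imply the same raw baseline of about 1.41 ms per frame, so the two overhead percentages are mutually consistent.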

Linux wheels

Pre-built manylinux_2_28 wheels ship for x86_64 + aarch64 across cp310 through cp313 — 8 wheels in total, all auditwheel-repaired with the full dep tree bundled (ORT, FFmpeg, HDF5, libjpeg-turbo, libwebp, libcurl, OpenSSL).

To rebuild them locally:

# One-time: build the dep-baked Docker images (~10 min each).
docker build --platform linux/amd64 -t libessence/manylinux-x86_64:0.1 \
    -f scripts/Dockerfile.manylinux-x86_64 scripts/
docker build --platform linux/arm64/v8 -t libessence/manylinux-aarch64:0.1 \
    -f scripts/Dockerfile.manylinux-aarch64 scripts/

# Per wheel build (~2 min):
docker run --rm --platform linux/amd64 -v "$REPO":/src \
    -e PYTAG=cp311 -e ARCH_INSIDE=x86_64 \
    libessence/manylinux-x86_64:0.1 \
    bash /src/sdks/python/scripts/build-wheel-in-container.sh
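
To cover the full matrix of 2 architectures by 4 Python tags, the per-wheel command can be generated rather than typed eight times. The sketch below only builds and prints the command lines. It assumes the aarch64 build mirrors the x86_64 one with ARCH_INSIDE=aarch64 and the aarch64 image, which is inferred from the image names above rather than shown in this README; the function name is hypothetical.

```python
import shlex
from itertools import product
from typing import Dict, List, Tuple

PYTAGS = ["cp310", "cp311", "cp312", "cp313"]

# arch -> (docker --platform value, image tag), taken from the build commands above.
# ASSUMPTION: the aarch64 run uses ARCH_INSIDE=aarch64 with the aarch64 image.
ARCHES: Dict[str, Tuple[str, str]] = {
    "x86_64": ("linux/amd64", "libessence/manylinux-x86_64:0.1"),
    "aarch64": ("linux/arm64/v8", "libessence/manylinux-aarch64:0.1"),
}


def wheel_build_commands(repo: str) -> List[List[str]]:
    """Return one `docker run` argv per (arch, Python tag) combination."""
    commands = []
    for arch, pytag in product(ARCHES, PYTAGS):
        platform, image = ARCHES[arch]
        commands.append([
            "docker", "run", "--rm", "--platform", platform,
            "-v", f"{repo}:/src",
            "-e", f"PYTAG={pytag}", "-e", f"ARCH_INSIDE={arch}",
            image,
            "bash", "/src/sdks/python/scripts/build-wheel-in-container.sh",
        ])
    return commands


for cmd in wheel_build_commands("/path/to/repo"):
    print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or the argv lists can be passed to subprocess.run one at a time.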

Limitations

  • Windows wheels not yet built — tracked for v0.2.
  • The CLI's output framerate is fixed at 25 fps to match the model's internal rate. Pass --output - and pipe to your own encoder if you need temporal resampling.
  • preferred_ep=COREML/NNAPI/QNN is accepted but currently falls back to CPU in the v0.1 build.

License

Commercial. Contact hello@bithuman.ai.
