Python SDK for the Arbez open-source barcode + QR detector.
arbez — multi-engine barcode & QR scanner for Python
High-yield barcode & QR detection in Python that stays simple — one pip install, one Scanner(), every platform.
arbez reads barcodes and QR codes from real-world images and, in our benchmark, returns more of them than any single engine it runs. It pairs a bundled AI detector with zxing-cpp and, on macOS, Apple's on-device Vision framework, then merges the results — no tuning, no system libraries, one Scanner() call. Jump to the quick start.
What it does
- Highest yield, out of the box. Scanner() merges every engine's reads, surfacing codes no single engine finds on its own. On a 4,290-image real-world corpus it decoded 5,014 distinct codes and found at least one in 98% of images, more than any single engine (see the numbers ↓).
- One install, every platform. pip install arbez works on macOS, Linux, and Windows (Python 3.10 – 3.14) with no system libraries to set up: the engines, the bundled model, and the common image formats come in the wheels (HEIC / AVIF and WeChat are optional extras). Then it's one Scanner() call.
- Broad symbology coverage. QR, Micro QR, Data Matrix, Aztec, PDF417, Code 128, Code 39, Code 93, EAN-13/8, UPC-A/E, GS1 DataBar, and a 1D catch-all out of the box, with ITF, Codabar and more surfaced by the zxing engine (a core dependency).
- Decoder-accurate symbology labels. When a code decodes, its label is the decoder's ECC-validated format, not the detector's guess, so a decoded Data Matrix reads as Data Matrix, never "QR". (When nothing decodes, the detector's class is kept.)
- Reads almost any input. File paths, raw bytes, PIL.Image, NumPy arrays, or file-like streams; JPEG / PNG / WebP / TIFF / BMP / GIF built in, HEIC / AVIF with an extra.
- Precision when you need it. Scanner(consensus=2) keeps only codes that ≥ 2 engines agree on, evaluated per detected code, not per image.
- Bring your own model. Swap the bundled detector for your own YOLOX-s / RT-DETR / YOLO11 ONNX, or run several at once. See bring-your-own-weights.
- Typed and tested. Ships type hints (py.typed, mypy-checked), 500+ tests in CI, Apache-2.0, Python 3.10 – 3.14.
| Engine | Platform | Strengths | Install |
|---|---|---|---|
| arbez | all | neural detector: QR / Data Matrix / 2D codes | default (bundled model) |
| zxing | all | broad classical 1D + 2D decode | default (core dependency) |
| apple_vision | macOS | fast; leads 1D linear codes in our macOS benchmark | default on macOS |
| wechat | all | independent QR detector (corroboration) | pip install 'arbez[wechat]' |
How the engines combine: docs/concepts.md · full API: docs/api-reference.md.
Built on proven engines
arbez builds on zxing-cpp for classical 1D/2D decode, with libdmtx (via the arbez-dmtx package) as a Data Matrix fallback; on macOS it adds Apple's on-device Vision framework — a real lift to accuracy and speed — and the optional WeChat detector contributes an independent QR read. Its own bundled AI detector ties them together; the Roadmap is for that detector to carry more of the load over time. Prefer a single engine? Scanner(engine="zxing").
Install
pip install arbez # core: arbez + zxing engines; JPEG/PNG/WebP/TIFF/BMP/GIF
pip install 'arbez[wechat]' # + WeChat QR engine
pip install 'arbez[heic]' # + HEIC (iPhone photos)
pip install 'arbez[avif]' # + AVIF
pip install 'arbez[all]' # everything above
On macOS, pip install arbez auto-pulls the Apple Vision dependencies — the apple_vision engine works with no extra. On Linux / Windows those deps are excluded by platform marker. Full matrix: docs/installation.md.
Quick start
from arbez import Scanner

with Scanner() as scanner:  # union of all installed engines
    result = scanner.scan("photo.jpg")  # path, bytes, PIL.Image, ndarray, or stream
    for d in result.detections:
        print(d.symbology, d.payload, d.bbox_xyxy)
        # -> Symbology.QR https://arbez.org (40.0, 40.0, 290.0, 290.0)
Narrow the engine set, or require agreement:
Scanner() # union of all installed engines (default, max yield)
Scanner(engine="zxing") # a single engine
Scanner(engines=["arbez", "zxing"]) # union over a chosen subset
Scanner(consensus=2) # keep only codes >= 2 engines agree on (precision)
res = Scanner().scan(image_bytes)
res.detections # merged, per-code union — each detection carries extras["voted_by"]
res.per_engine["zxing"] # any engine's own raw detections, for inspection
consensus is an integer: the default 1 is "union" (keep anything any engine saw); consensus=N keeps only codes at least N engines agree on, clustered per detected code. More: docs/getting-started.md · docs/how-to.md · docs/consensus-rules.md.
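To make "per detected code" concrete, here is a minimal sketch of consensus voting. It is not arbez's implementation: the `Read` type, the greedy clustering, and the 0.5 IoU threshold are all assumptions chosen for illustration, and docs/consensus-rules.md remains the authority on the real rules.

```python
from dataclasses import dataclass

Box = tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass(frozen=True)
class Read:
    """One engine's read of one code (hypothetical type, not part of arbez)."""
    engine: str
    payload: str
    bbox_xyxy: Box


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def vote(reads: list[Read], consensus: int = 1, min_iou: float = 0.5) -> list[dict]:
    """Greedily cluster reads by box overlap (assumed rule), then keep
    clusters that at least `consensus` distinct engines contributed to."""
    clusters: list[list[Read]] = []
    for read in reads:
        for cluster in clusters:
            if iou(cluster[0].bbox_xyxy, read.bbox_xyxy) >= min_iou:
                cluster.append(read)
                break
        else:
            clusters.append([read])
    kept = []
    for cluster in clusters:
        engines = tuple(sorted({r.engine for r in cluster}))
        if len(engines) >= consensus:
            kept.append({"payload": cluster[0].payload, "voted_by": engines})
    return kept


reads = [
    Read("zxing", "https://arbez.org", (40, 40, 290, 290)),
    Read("arbez", "https://arbez.org", (42, 41, 288, 291)),
    Read("zxing", "0123456789012", (300, 50, 500, 120)),
]
union_result = vote(reads, consensus=1)   # keeps both codes
strict_result = vote(reads, consensus=2)  # keeps only the code two engines saw
```

The point of voting per cluster rather than per image is that one well-corroborated QR code does not vouch for an unrelated 1D code that only a single engine read in the same photo.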
Speed vs. yield. The default runs every engine, trading latency for coverage; per-stage wall-clock is on every result.timings_ms. For a lighter, lower-latency path, pin a single engine — Scanner(engine="zxing") anywhere, or Scanner(engine="apple_vision") on macOS (hardware-accelerated). See docs/profiling.md.
Benchmarks
Across 4,290 images — 4,276 real-world natural-scene photographs plus 14 synthesized format probes — the default Scanner() (every installed engine, results unioned) decoded 5,014 distinct codes and found at least one in 98% of images: more than any single engine. A snapshot on one private corpus (macOS, all four engines) with default settings; these are decode-yield counts, not a human-labeled ground truth (see Limitations), and not a universal ranking.
Yield by configuration
Distinct codes decoded over 4,290 images (4,276 corpus + 14 synthesized exotic-format). Scanner() is the 0.2.0 default — the union of all installed engines; consensus=N keeps only codes that ≥ N engines agree on (per detected code).
| Configuration | Images with ≥1 code | Distinct codes decoded |
|---|---|---|
| Scanner() (all engines, union) | 4,224 (98%) | 5,014 |
| engine="apple_vision" (macOS-only) | 4,188 | 4,932 |
| engine="zxing" | 3,661 | 3,956 |
| engine="arbez" (bundled detector) | 3,284 | 3,480 |
| engine="wechat" (QR-only) | 2,226 | 2,084 |
| consensus=2 (≥2 engines agree) | 3,746 | 4,197 |
| consensus=3 (≥3 engines agree) | 3,043 | 3,093 |
Scanner() leads every configuration here — the union recovers codes any single engine alone misses (+82 distinct beyond the strongest macOS engine; likely more on Linux/Windows, where Apple Vision isn't available, though that isn't benchmarked here). consensus=2/=3 trade yield for cross-engine agreement (precision).
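The derived figures quoted here (the 98% hit rate, the +82 union gain, and how much yield each consensus level gives up) follow directly from the table. A short worked calculation, using only numbers copied from the table; the variable names are ours:

```python
# Figures copied from the "Yield by configuration" table.
TOTAL_IMAGES = 4290

images_with_code = {
    "Scanner()": 4224, "apple_vision": 4188, "zxing": 3661,
    "arbez": 3284, "wechat": 2226,
}
distinct_codes = {
    "Scanner()": 5014, "apple_vision": 4932, "zxing": 3956,
    "arbez": 3480, "wechat": 2084,
    "consensus=2": 4197, "consensus=3": 3093,
}
single_engines = ("apple_vision", "zxing", "arbez", "wechat")

# Share of images in which each configuration decoded at least one code.
hit_rate = {name: n / TOTAL_IMAGES for name, n in images_with_code.items()}

# Codes the union adds beyond the strongest single engine.
best_single = max(distinct_codes[e] for e in single_engines)
union_gain = distinct_codes["Scanner()"] - best_single

# Fraction of the union's codes that survive each consensus threshold.
retained = {
    key: distinct_codes[key] / distinct_codes["Scanner()"]
    for key in ("consensus=2", "consensus=3")
}

print(f"Scanner() hit rate: {hit_rate['Scanner()']:.1%}")
print(f"union gain over best single engine: +{union_gain}")
for key, share in retained.items():
    print(f"{key} retains {share:.0%} of the union's codes")
```

On these numbers, consensus=2 keeps roughly 84% of the union's distinct codes and consensus=3 roughly 62%.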
By symbology
Distinct codes decoded by symbology — each engine's own decoded codes, by their (decoder-accurate) symbology. Scanner() ≥ every engine on each row. Different engines lead different symbologies — which is exactly why Scanner() unions them:
| Symbology | arbez | Apple Vision | ZXing | WeChat | Scanner() |
|---|---|---|---|---|---|
| QR | 2,355 | **2,357** | **2,357** | 2,084 | 2,385 |
| Code 128 | 635 | **1,564** | 996 | – | 1,583 |
| Data Matrix | 322 | **505** | 254 | – | 517 |
| Code 39 | 61 | **156** | 121 | – | 163 |
| ITF | 17 | **154** | 100 | – | 156 |
| PDF417 | 44 | **85** | 54 | – | 92 |
| EAN-13 | 33 | **81** | 50 | – | 81 |
| Aztec | 10 | **14** | 10 | – | 14 |
| Exclusive to engine ¹ | 13 | 514 | 30 | 0 | – |
Bold (engine columns) = best single engine for that symbology. Symbology is decoder-accurate: arbez is both a detector and a decoder, and since v0.2.0 (S-094) it adopts the decoder's ECC-validated format as the label — so codes its detector had filed as "QR" but are really Data Matrix / ITF / Aztec are now labeled correctly. On this macOS host Apple Vision leads or ties most symbologies, but it is macOS-only; arbez and ZXing are the always-present cross-platform pair (bare Scanner() adds Apple Vision automatically on macOS). The headline is the union: Scanner() meets or beats every single engine on every symbology.
¹ Exclusive to engine = distinct codes whose merged cluster in the Scanner() result was agreed by only that engine (its extras["voted_by"] tuple names just that one engine) — i.e. what the union would lose if you dropped it. Of the 5,014 union codes, 557 are single-engine and 4,457 are corroborated by ≥2 engines. WeChat's 0 is honest: every QR it read, another engine read too (it is QR-only, and the others are already strong on QR), so it earns its slot on agreement/precision rather than unique yield. Apple Vision's 514 is mostly the 1D linear family on this macOS host; on Linux/Windows, where it is unavailable, the always-present cross-platform pair (arbez + ZXing) carries the union. (This counts a physical code once via spatial clustering, so it does not double-count the same code read with slightly different bytes by two engines — a raw payload-hash basis would inflate every engine's "exclusive" count roughly 2×.)
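The footnote's counting rule can be sketched in a few lines. This assumes each merged detection exposes `extras["voted_by"]` as a tuple of engine-name strings, as described above; the sample tuples are invented for illustration and are not benchmark data.

```python
from collections import Counter
from typing import Iterable


def exclusive_counts(voted_by_per_code: Iterable[tuple[str, ...]]) -> Counter:
    """Count merged codes whose voted_by tuple names exactly one engine."""
    counts: Counter = Counter()
    for voted_by in voted_by_per_code:
        engines = set(voted_by)
        if len(engines) == 1:
            counts[next(iter(engines))] += 1
    return counts


# Illustrative voted_by tuples, one per merged code (made-up sample).
sample = [
    ("apple_vision",),
    ("arbez", "zxing"),
    ("zxing",),
    ("apple_vision",),
    ("arbez", "apple_vision", "wechat", "zxing"),
]

counts = exclusive_counts(sample)
single_engine = sum(counts.values())
corroborated = len(sample) - single_engine
print(dict(counts), single_engine, corroborated)
```

With a real result you would pass something like `(d.extras["voted_by"] for d in res.detections)`. An engine that never appears alone, as WeChat does in the table, simply has a count of zero.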
Methodology
- Corpus: 4,276 real-world natural-scene photographs spanning 1D and 2D symbologies under varied lighting, angle, focus, and clutter — plus 14 synthesized images that exercise the exotic input formats (HEIC, AVIF, WebP, BMP, TIFF, GIF) end to end.
- Configurations: each engine alone, Scanner() (all installed engines, union), and consensus=2 / consensus=3. All seven derive from a single scan per image: Scanner() runs every engine once, Result.per_engine exposes each engine's own detections, and the consensus thresholds re-vote those cached detections, so the configurations are exactly comparable.
- Metric: distinct codes decoded = distinct decoded payloads (deduplicated by hash). "Images with ≥1 code" = images where the configuration decoded at least one payload. These are decode-yield counts.
- Environment: a fresh pip install arbez[all], Python 3.12, Apple Silicon (macOS arm64), arbez 0.2.0. Date: 2026-06-17. Every number above comes from one consistent corpus pass.
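The two metrics in the list above can be sketched as follows. The hash function (SHA-256 here) and deduplication across the whole corpus rather than per image are assumptions; the methodology only says payloads are "deduplicated by hash". The sample data is invented.

```python
import hashlib


def payload_key(payload: str) -> str:
    """Hash a decoded payload so equal payloads collapse to one key.
    SHA-256 is an assumption; the source does not name the hash."""
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def yield_counts(payloads_per_image: dict[str, list[str]]) -> tuple[int, int]:
    """Return (images with >= 1 code, distinct codes) for one configuration."""
    seen: set[str] = set()
    images_hit = 0
    for payloads in payloads_per_image.values():
        if payloads:
            images_hit += 1
        seen.update(payload_key(p) for p in payloads)
    return images_hit, len(seen)


# Made-up sample: one payload appears in two different images.
sample = {
    "a.jpg": ["https://arbez.org", "0123456789012"],
    "b.jpg": ["https://arbez.org"],
    "c.jpg": [],
}
images_hit, distinct = yield_counts(sample)
print(images_hit, distinct)
```

Under corpus-wide deduplication a payload that appears in several images counts once, which is one way a configuration can report fewer distinct codes than images with a hit.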
Reproduce
The private corpus isn't shipped, but the pipeline is one Scanner() pass per image. Generate a runnable, self-contained synthetic corpus with arbez.testing.clean_corpus() (needs the [dev] extra, or pip install qrcode python-barcode):
from pathlib import Path

from arbez import Scanner
from arbez.testing import clean_corpus

out = Path("synthetic-corpus")
out.mkdir(exist_ok=True)
for spec in clean_corpus():
    spec.image.save(out / f"{spec.spec_id}.png")

scanner = Scanner()
for img in out.iterdir():
    res = scanner.scan(img)
    res.detections           # merged union (each with extras["voted_by"])
    res.per_engine["zxing"]  # that engine's own detections
Limitations
- A capability snapshot, not a competitor ranking. Results are specific to this corpus, hardware, and default configuration; your mileage will vary with image mix, resolution, and tuning.
- Decode yield, not ground truth. "Distinct codes decoded" counts what each configuration read; it isn't checked against a human-labeled key, so a misread inflates a count. The consensus=2 / =3 rows are the precision view.
- Apple Vision is macOS-only; on Linux/Windows Scanner() unions arbez + ZXing (+ WeChat if installed), where the relative gain from unioning is likely larger (not benchmarked here).
- Engine independence is partial. arbez and ZXing share the zxing-cpp decoder, so agreement between only those two corroborates detection but not decoder independence (Apple Vision and WeChat are independent implementations).
- WeChat is QR-only and heavy-tailed in latency, so it's included for QR corroboration, not general coverage.
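If decoder independence matters for your use case, one option is to count decoder families rather than engines when reading `voted_by`. The sketch below is a hypothetical post-filter, not an arbez feature. The family mapping follows the list above (arbez and zxing share zxing-cpp; apple_vision and wechat are independent) and is a simplification, since it ignores the libdmtx Data Matrix fallback.

```python
# Engine -> decoder family, per the Limitations list (simplified assumption).
DECODER_FAMILY = {
    "arbez": "zxing-cpp",
    "zxing": "zxing-cpp",
    "apple_vision": "apple_vision",
    "wechat": "wechat",
}


def independent_votes(voted_by: tuple[str, ...]) -> int:
    """Number of distinct decoder families behind a merged detection.
    Unknown engine names are treated as their own family."""
    return len({DECODER_FAMILY.get(engine, engine) for engine in voted_by})


def independently_corroborated(voted_by: tuple[str, ...], minimum: int = 2) -> bool:
    """True when at least `minimum` independent decoders agree on the code."""
    return independent_votes(voted_by) >= minimum


print(independent_votes(("arbez", "zxing")))                 # same decoder family
print(independently_corroborated(("arbez", "apple_vision")))  # two families
```

A detection with `voted_by == ("arbez", "zxing")` passes consensus=2 but counts as a single independent decoder here, which is exactly the caveat the list describes.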
Documentation
Getting started · How-to · Concepts · API reference · Consensus rules · Bring your own weights · Installation · Profiling · Troubleshooting
Roadmap
Today arbez reaches its yield by combining engines. The direction is to grow the bundled AI detector until it clears that bar on its own — so that a single pip install arbez, with no platform-specific helpers, is all anyone needs for reliable detection on every platform. Each release trains it further; the multi-engine union is both today's product and the target the detector is closing in on. Apple Vision and the classical engines stay welcome boosts where available — the goal is simply that you never have to depend on them.
The aim is plain: make arbez the easiest, most reliable way to read QR codes and barcodes in Python — no expertise, no setup, just results.
Versioning
arbez follows semantic versioning's 0.x convention: while it is pre-1.0, the public API may change between minor releases, and breaking changes are documented in the CHANGELOG. 1.0.0 will mark the point at which the API surface is stable.
License
Apache License, Version 2.0 — see LICENSE. The bundled object-detection model (src/arbez/_assets/arbez_yolox_s.onnx) is also Apache-2.0; see src/arbez/_assets/NOTICE for full attribution.
Data Matrix fallback decoding uses arbez-dmtx (Apache-2.0), a core dependency that bundles the native libdmtx library (BSD-2-Clause, © 2005–2016 Mike Laughton, Vadim A. Misbakh-Soloviov and others) and a vendored pylibdmtx ctypes wrapper (MIT, © 2016–2020 Lawrence Hudson); their full license texts ship inside the arbez-dmtx wheel.