Python bindings for the ARA-2 neural accelerator client library

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

au-zone

These details have not been verified by PyPI

Project links

Homepage

Project description

Python Bindings for ARA-2

Python bindings for the ARA-2 neural network accelerator client library, providing efficient NPU inference from Python via a proxy service running on NXP i.MX platforms with Kinara ARA-2 hardware.

Published to PyPI as edgefirst-ara2.

Architecture

Python Application ──(UNIX/TCP socket)──▶ ara2-proxy ──(PCIe)──▶ ARA-2 NPU
       │                                (system service)        (Kinara hardware)
       │
edgefirst-hal ──(DMA-BUF fd)──▶ GPU preprocessing (zero-copy)

Your Python code connects to the proxy system service (shown as ara2-proxy above), not directly to the hardware. The proxy manages device access and must be running before your application starts. The systemd unit name is platform-dependent: ara2.service on EdgeFirst Yocto images, dvproxy.service on other platforms.
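Because the proxy must be running first, an application may want to check for the socket before trying to connect. The following is a minimal sketch using only the standard library. The helper name proxy_socket_ready is hypothetical, and the path is the one used in the Quick Start. A socket file that exists does not prove the service is accepting connections, so treat this as a quick pre-check only.

```python
import os
import stat


def proxy_socket_ready(path: str) -> bool:
    """Return True if `path` exists and is a UNIX domain socket."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing path, permission problem, or similar: treat as not ready.
        return False
    return stat.S_ISSOCK(mode)


if __name__ == "__main__":
    # Path taken from the Quick Start; adjust it for your platform.
    path = "/var/run/ara2.sock"
    status = "present" if proxy_socket_ready(path) else "not available"
    print(f"{path}: {status}")
```

If the check fails, starting the platform's systemd unit is the likely next step.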

Installation

From PyPI

pip install edgefirst-ara2

For zero-copy preprocessing with edgefirst-hal:

pip install edgefirst-ara2[hal]

Prerequisites for Development

Python 3.11 or higher
Rust stable toolchain (edition 2024)
maturin (pip install maturin)
ARA-2 client library (libaraclient.so.1)

Development Install

cd crates/ara2-py
maturin develop --release --features abi3

Quick Start

import edgefirst_ara2

# Connect to ARA-2 proxy
session = edgefirst_ara2.Session.create_via_unix_socket("/var/run/ara2.sock")

# Get version information
versions = session.versions()
print(f"Proxy version: {versions['proxy']}")

# List endpoints
endpoints = session.list_endpoints()
print(f"Found {len(endpoints)} endpoints")

# Check endpoint status
for endpoint in endpoints:
    state = endpoint.check_status()
    stats = endpoint.dram_statistics()
    print(f"State: {state}, Free DRAM: {stats.free_size / stats.dram_size * 100:.1f}%")

Inference with numpy

import numpy as np
import edgefirst_ara2

session = edgefirst_ara2.Session.create_via_unix_socket("/var/run/ara2.sock")
endpoints = session.list_endpoints()
model = endpoints[0].load_model("model.dvm")

# Allocate tensors and run inference
model.allocate_tensors()
input_data = np.zeros(model.input_size(0), dtype=np.uint8)
model.set_input_tensor(0, input_data)
timing = model.run()

print(f"Inference: {timing.run_time_us} us")
output = model.get_output_tensor(0)
dequantized = model.dequantize(0)

Zero-Copy DMA-BUF Pipeline

For maximum throughput, use DMA-BUF tensors with edgefirst-hal for GPU-accelerated preprocessing. This eliminates CPU memory copies between preprocessing and inference:

Path	CPU copies	Flow
Standard (numpy)	2	numpy → shared memory → NPU
DMA-BUF	0	GPU writes directly to NPU input buffer

How it works: allocate_tensors("dma") allocates the model's input tensor in a DMA-BUF — a Linux kernel buffer accessible by multiple hardware devices. input_tensor_fd(0) returns a file descriptor to that buffer. You pass this FD to edgefirst_hal.import_image(), which maps it as a GPU image surface. The GPU writes the preprocessed frame directly into the NPU's input buffer — no CPU copies involved.

import os
import edgefirst_ara2 as ara2
import edgefirst_hal as hal

session = ara2.Session.create_via_unix_socket(ara2.DEFAULT_SOCKET)
endpoint = session.list_endpoints()[0]

with endpoint.load_model("yolov8s.dvm") as model:
    model.allocate_tensors("dma")  # Must use "dma" for tensor FD access

    # Get DMA-BUF FD for the model's input tensor
    input_fd = model.input_tensor_fd(0)
    c, h, w = model.input_shape(0)
    try:
        # Import as PlanarRgb (CHW layout) to match ARA-2 tensor format
        dst = hal.import_image(input_fd, w, h, hal.PixelFormat.PlanarRgb)
    finally:
        os.close(input_fd)  # FD duplicated by import_image; close original

    # GPU-accelerated convert: source image -> model input (zero CPU copies)
    processor = hal.ImageProcessor()
    src = hal.load_image("image.jpg", format=hal.PixelFormat.Rgba, mem=hal.TensorMemory.DMA)
    processor.convert(src, dst)

    # Run inference — NPU reads from the same DMA-BUF
    timing = model.run()
    print(f"Inference: {timing.run_time_us} us")
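The close-after-import step in the example relies on ordinary file descriptor duplication: once a descriptor has been duplicated, closing the original does not invalidate the copy. The sketch below illustrates that behaviour with the standard library only. A pipe stands in for the DMA-BUF, and os.dup stands in for whatever import_image does internally, which is an assumption based on the comment in the example.

```python
import os

# A pipe stands in for the DMA-BUF; any kernel object behind an fd behaves alike here.
read_fd, write_fd = os.pipe()

# Stand-in for the duplication that the importer is described as performing.
imported_fd = os.dup(write_fd)

# The caller closes its original descriptor, as in the `finally` block.
os.close(write_fd)

# The duplicate still refers to the same open kernel object and remains usable.
os.write(imported_fd, b"frame")
data = os.read(read_fd, 5)
print(data)

os.close(imported_fd)
os.close(read_fd)
```

The same reasoning explains why a descriptor whose ownership is transferred, rather than duplicated, must not be closed by the caller.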

Performance

Benchmarked on NXP FRDM i.MX 95 + ARA-2 with YOLOv8m-seg (640×640). The Python API adds less than 1 ms per frame over native Rust (0.88 ms on a 52.64 ms pipeline, about 1.7%), because the DMA-BUF zero-copy path lets the GPU and NPU operate on the same physical memory buffers.

Stage	Rust	Python	Overhead
GPU preprocess (letterbox + RGBA→CHW)	2.85 ms	2.88 ms	+0.03 ms
NPU inference (wall clock)	34.53 ms	34.63 ms	+0.10 ms
NPU execution	26.04 ms	26.04 ms	—
DMA input upload	2.02 ms	2.05 ms	—
DMA output download	3.68 ms	3.68 ms	—
Decode (NMS + dequant)	4.05 ms	4.31 ms	+0.26 ms
Materialize (CPU coeff × proto → bitmaps)	5.67 ms	5.98 ms	+0.31 ms
Draw (GL mask overlay)	5.54 ms	5.71 ms	+0.17 ms
Total pipeline	52.64 ms	53.52 ms	+0.88 ms
Throughput	19.0 FPS	18.7 FPS

Steady-state mean over 30 iterations after warmup. Python overhead is under 1 ms across the entire pipeline.
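The throughput and overhead figures follow directly from the totals in the table. The short calculation below reproduces them. It assumes one frame is processed at a time, and that the total is the sum of the five top-level stages, with the three NPU sub-rows already included in the wall-clock figure (this matches the Rust column).

```python
# Top-level Rust stage times in milliseconds, copied from the table.
rust_stages_ms = [2.85, 34.53, 4.05, 5.67, 5.54]

# Totals in milliseconds, copied from the table.
rust_total_ms = 52.64
python_total_ms = 53.52


def fps(total_ms: float) -> float:
    """Frames per second when frames are processed strictly one after another."""
    return 1000.0 / total_ms


overhead_ms = python_total_ms - rust_total_ms
overhead_pct = overhead_ms / rust_total_ms * 100.0

print(f"Rust stage sum: {sum(rust_stages_ms):.2f} ms")                 # 52.64 ms
print(f"Rust:   {fps(rust_total_ms):.1f} FPS")                         # 19.0 FPS
print(f"Python: {fps(python_total_ms):.1f} FPS")                       # 18.7 FPS
print(f"Overhead: {overhead_ms:.2f} ms ({overhead_pct:.1f}%)")         # 0.88 ms (1.7%)
```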

Run the benchmark yourself. Create a virtual environment on the target and install the packages from PyPI:

python3 -m venv ~/venv
~/venv/bin/pip install edgefirst-ara2 edgefirst-hal
~/venv/bin/python3 yolov8.py model.dvm image.jpg --benchmark 30 --save

DVM Metadata

Read model metadata without loading the model onto the NPU:

import edgefirst_ara2

metadata = edgefirst_ara2.read_metadata("model.dvm")
if metadata:
    print(f"Task: {metadata.task}")
    print(f"Classes: {metadata.classes}")
    if metadata.compilation and metadata.compilation.ppa:
        print(f"IPS: {metadata.compilation.ppa.ips}")

labels = edgefirst_ara2.read_labels("model.dvm")

Async Inference

The submit() / wait() API enables overlapping CPU work with NPU execution. This is the building block for pipeline parallelism — while the NPU runs inference on one frame, the CPU can preprocess the next.

import edgefirst_ara2 as ara2

session = ara2.Session.create_via_unix_socket(ara2.DEFAULT_SOCKET)
endpoint = session.list_endpoints()[0]
model = endpoint.load_model("model.dvm")
model.allocate_tensors()

# Submit — returns immediately while NPU works
request = model.submit()
print(f"Request #{request.request_id} submitted")

# CPU is free to do other work here...
# The GIL is NOT held during wait(), so other Python threads can run

timing = request.wait()  # blocks until NPU finishes
print(f"NPU inference: {timing.run_time_us} µs")

# Monitor pipeline depth
print(f"In-flight: {session.inflight_count()}")

Warning: Do not call model.allocate_tensors() while an InferRequest is still pending — the NPU is actively reading/writing the tensor buffers.
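The pipeline parallelism described above can be sketched without hardware. In this illustration a single-worker ThreadPoolExecutor stands in for the NPU: executor.submit() plays the role of model.submit() and Future.result() plays the role of request.wait(). The functions fake_preprocess and fake_npu_infer are hypothetical placeholders and are not part of edgefirst_ara2. Each preprocessed frame is a separate object here; whether a real input buffer may be rewritten while a request is pending is not covered by this sketch.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_preprocess(raw: int) -> int:
    """Hypothetical CPU-side preprocessing step."""
    time.sleep(0.01)
    return raw + 1


def fake_npu_infer(frame: int) -> int:
    """Hypothetical stand-in for NPU execution."""
    time.sleep(0.02)
    return frame * 2


def run_pipelined(raw_frames: list[int]) -> list[int]:
    results = []
    with ThreadPoolExecutor(max_workers=1) as npu:
        pending = None
        for raw in raw_frames:
            # CPU work for the next frame overlaps with the pending inference.
            frame = fake_preprocess(raw)
            if pending is not None:
                results.append(pending.result())  # analogous to request.wait()
            pending = npu.submit(fake_npu_infer, frame)  # analogous to model.submit()
        if pending is not None:
            results.append(pending.result())
    return results


print(run_pipelined([1, 2, 3]))  # [4, 6, 8]
```

With this structure at most one request is in flight at a time, which keeps the ordering of results identical to the ordering of input frames.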

API Reference

Session

Connection to the ARA-2 proxy service.

Static Methods:

create_via_unix_socket(socket_path: str) -> Session
create_via_tcp_ipv4_socket(ip: str, port: int) -> Session

Methods:

versions() -> dict[str, str] - Get component versions
list_endpoints() -> list[Endpoint] - List available endpoints
inflight_count() -> int - Number of pending async inference requests

Properties:

socket_type: str - "unix" or "tcp"

Endpoint

Represents an ARA-2 accelerator device.

Methods:

check_status() -> State - Get device state
dram_statistics() -> DramStatistics - Get memory usage
load_model(model_path: str) -> Model - Load a .dvm model

Model

Loaded neural network model.

Lifecycle:

allocate_tensors(memory: str | None = None) - Allocate tensors ("dma", "shm", "mem", or None)
set_timeout_ms(timeout_ms: int) - Set inference timeout
run() -> ModelTiming - Execute inference synchronously
submit() -> InferRequest - Submit inference asynchronously (returns immediately)

Tensor I/O (numpy):

set_input_tensor(index: int, data: np.ndarray) - Copy data into input
get_output_tensor(index: int) -> np.ndarray - Copy output data out
dequantize(index: int) -> np.ndarray - Dequantize output to float32

DMA-BUF Zero-Copy:

input_tensor_fd(index: int) -> int - Get input tensor FD (pass to hal.ImageProcessor.import_image, which dups it — close after)
output_tensor_fd(index: int) -> int - Get output tensor FD (pass to hal.Tensor.from_fd, which takes ownership — do not close after)
input_tensor_memory(index: int) -> str - Input memory type
output_tensor_memory(index: int) -> str - Output memory type

Introspection:

n_inputs: int, n_outputs: int - Tensor counts
input_shape(i) -> (C, H, W), output_shape(i) -> (C, H, W)
input_size(i) -> int, output_size(i) -> int - Size in bytes
input_bpp(i) -> int, output_bpp(i) -> int - Bytes per element
input_info(i) -> InputTensorInfo, output_info(i) -> OutputTensorInfo
input_quants(i) -> InputQuantization, output_quants(i) -> OutputQuantization
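The introspection values relate to each other in a simple way for a densely packed planar tensor. The sketch below assumes that the size in bytes equals C × H × W × bytes per element, with no padding, which the reference above does not state explicitly. Both helper names are hypothetical, and the 3 × 640 × 640 shape is only an example.

```python
def tensor_nbytes(shape: tuple[int, int, int], bytes_per_element: int) -> int:
    """Byte size of a densely packed (C, H, W) tensor (assumes no padding)."""
    c, h, w = shape
    return c * h * w * bytes_per_element


def chw_offset(shape: tuple[int, int, int], c: int, y: int, x: int) -> int:
    """Element index of channel c, row y, column x in a planar (CHW) buffer."""
    _, h, w = shape
    return (c * h + y) * w + x


shape = (3, 640, 640)               # example shape: three 640x640 planes
print(tensor_nbytes(shape, 1))      # 1228800
print(chw_offset(shape, 1, 0, 0))   # 409600, the first element of the second plane
```

Comparing tensor_nbytes(model.input_shape(i), model.input_bpp(i)) with model.input_size(i) on a real model would show whether the no-padding assumption holds.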

InferRequest

Pending asynchronous inference request, created by Model.submit().

Methods:

wait(timeout_ms: int = 1000) -> ModelTiming - Block until complete (GIL released)

Properties:

request_id: int - Proxy-assigned ID for log correlation

Metadata Functions

read_metadata(path: str) -> DvmMetadata | None
read_labels(path: str) -> list[str]
has_metadata(path: str) -> bool

Supporting Types

State (enum): Init, Idle, Active, ActiveSlow, ActiveBoosted, ThermalInactive, ThermalUnknown, Inactive, Fault
ModelOutputType (enum): Classification, Detection, SemanticSegmentation, Raw
DramStatistics: dram_size, free_size, model_occupancy_size, ...
ModelTiming: run_time_us, input_time_us, output_time_us
InputQuantization: qn, scale, mean, is_signed
OutputQuantization: qn, scale, offset, is_signed

Exceptions

Ara2Error (RuntimeError)
 +-- LibraryError       - libaraclient.so loading failures
 +-- HardwareError      - NPU faults, endpoint errors
 +-- ProxyError         - Proxy connection failures
 +-- ModelError         - Model load/inference failures
 +-- TensorError        - Tensor allocation, DMA-BUF errors
 +-- MetadataError      - DVM metadata parsing errors
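Since every exception derives from Ara2Error, which in turn derives from RuntimeError, handlers can be ordered from most specific to most general. The sketch below shows that ordering with locally defined stand-in classes that mirror part of the hierarchy. They are hypothetical placeholders so the example runs without the package; real code would import the classes from edgefirst_ara2 instead.

```python
# Local stand-ins mirroring part of the documented hierarchy (not the real classes).
class Ara2Error(RuntimeError):
    pass


class ProxyError(Ara2Error):
    pass


class ModelError(Ara2Error):
    pass


def classify(exc: Exception) -> str:
    """Show which handler catches a given exception."""
    try:
        raise exc
    except ProxyError:
        return "proxy"      # most specific handler first
    except Ara2Error:
        return "ara2"       # any other library error
    except RuntimeError:
        return "runtime"    # unrelated runtime errors


print(classify(ProxyError("connection refused")))  # proxy
print(classify(ModelError("load failed")))         # ara2
print(classify(RuntimeError("something else")))    # runtime
```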

Building Wheels

cd crates/ara2-py
maturin build --release --features abi3

Wheels are created in target/wheels/.

Stable ABI

The bindings use PyO3's stable ABI (abi3-py311):

A single wheel works across Python 3.11, 3.12, 3.13, and future versions
Minimum supported Python version is 3.11
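The abi3 floor is visible in the wheel file name itself. The sketch below splits a wheel file name into its tags and checks an interpreter version against the Python tag. Both helper names are hypothetical, and the parser assumes a file name without the optional build tag.

```python
import sys


def parse_wheel_tags(filename: str) -> dict[str, str]:
    """Split 'name-version-python-abi-platform.whl' into its five parts."""
    stem = filename.removesuffix(".whl")
    name, version, python_tag, abi_tag, platform_tag = stem.split("-")
    return {
        "name": name,
        "version": version,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def interpreter_meets_floor(python_tag: str, version_info=sys.version_info) -> bool:
    """For an abi3 wheel, the Python tag (e.g. 'cp311') is the minimum version."""
    digits = python_tag.removeprefix("cp")
    floor = (int(digits[0]), int(digits[1:]))
    return tuple(version_info[:2]) >= floor


tags = parse_wheel_tags(
    "edgefirst_ara2-0.11.2-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
)
print(tags["python"], tags["abi"])                       # cp311 abi3
print(interpreter_meets_floor(tags["python"], (3, 13)))  # True
print(interpreter_meets_floor(tags["python"], (3, 10)))  # False
```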

Troubleshooting

"libaraclient.so.1 not found"

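If importing the package fails with this error, the dynamic loader cannot find the ARA-2 client library. Add the directory that contains libaraclient.so.1 to the loader search path before starting Python:

export LD_LIBRARY_PATH=/path/to/ara2/lib:$LD_LIBRARY_PATH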
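To check whether the loader can find the library independently of the bindings, a small standard-library probe may help. This is a minimal sketch; the helper name can_load is hypothetical. On Linux the loader reads LD_LIBRARY_PATH when the process starts, so the variable needs to be exported in the shell before Python is launched rather than set from inside the script.

```python
import ctypes


def can_load(library: str) -> bool:
    """Return True if the dynamic loader can open `library`."""
    try:
        ctypes.CDLL(library)
    except OSError:
        # Raised when the library, or one of its dependencies, cannot be loaded.
        return False
    return True


if __name__ == "__main__":
    name = "libaraclient.so.1"
    print(f"{name}: {'found' if can_load(name) else 'not found'}")
```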

Verify Installation

python -c "import edgefirst_ara2; print(edgefirst_ara2.__version__)"

License

Licensed under the Apache License 2.0.

Release history

0.12.0

Jun 16, 2026

This version

0.11.2

May 29, 2026

0.11.0

May 26, 2026

0.10.0

May 18, 2026

0.9.0

May 12, 2026

0.8.0

May 8, 2026

0.7.0

May 7, 2026

0.6.0

May 6, 2026

0.5.0

Apr 26, 2026

0.4.0

Apr 13, 2026

0.3.0

Apr 11, 2026

0.2.0

Mar 26, 2026

0.1.3

Mar 26, 2026

Download files

Source Distribution

edgefirst_ara2-0.11.2.tar.gz (109.0 kB)

Uploaded May 29, 2026 Source

Built Distributions

edgefirst_ara2-0.11.2-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (561.9 kB)

Uploaded May 29, 2026 CPython 3.11+manylinux: glibc 2.17+ x86-64

edgefirst_ara2-0.11.2-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (546.5 kB)

Uploaded May 29, 2026 CPython 3.11+manylinux: glibc 2.17+ ARM64

File details

Details for the file edgefirst_ara2-0.11.2.tar.gz.

File metadata

Download URL: edgefirst_ara2-0.11.2.tar.gz
Upload date: May 29, 2026
Size: 109.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for edgefirst_ara2-0.11.2.tar.gz
Algorithm	Hash digest
SHA256	`793e70a033e5fbf8ac333fffb307d10464bb591d5608fc1c9958414174dc0a38`
MD5	`9355206007beb5a53b3396fbf7d425ec`
BLAKE2b-256	`3eee420b3531d3095fbac9e4ae6b8ab364c0621e5bb0ee63e4a8a046e41cd96b`
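A downloaded file can be checked against the SHA256 digest listed above with the standard library. The sketch below is one way to do that; the helper names sha256_of and matches_digest are hypothetical, and the file is read in chunks so large files are not loaded into memory at once.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the source distribution, the expected value would be the SHA256 entry from the table, passed together with the path of the downloaded edgefirst_ara2-0.11.2.tar.gz.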

Provenance

The following attestation bundles were made for edgefirst_ara2-0.11.2.tar.gz:

Publisher: release.yml on EdgeFirstAI/ara2-rs

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: edgefirst_ara2-0.11.2.tar.gz
- Subject digest: 793e70a033e5fbf8ac333fffb307d10464bb591d5608fc1c9958414174dc0a38
- Sigstore transparency entry: 1671647254
- Sigstore integration time: May 29, 2026
Source repository:
- Permalink: EdgeFirstAI/ara2-rs@028266dfa17c59dfd199f501adf21b6cb5238ecd
- Branch / Tag: refs/tags/v0.11.2
- Owner: https://github.com/EdgeFirstAI
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@028266dfa17c59dfd199f501adf21b6cb5238ecd
- Trigger Event: push

File details

Details for the file edgefirst_ara2-0.11.2-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

Download URL: edgefirst_ara2-0.11.2-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Upload date: May 29, 2026
Size: 561.9 kB
Tags: CPython 3.11+, manylinux: glibc 2.17+ x86-64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for edgefirst_ara2-0.11.2-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm	Hash digest
SHA256	`305077b1384fae8e1b50d0ec2096b08041070004f8268527f16ae2f7c49e56d4`
MD5	`28f38181382c66c126e587ebb047227c`
BLAKE2b-256	`135fbf5534d9a4abf30a55ad55f609dbb2dc5344498dc5208d8ed78bd1e493aa`

Provenance

The following attestation bundles were made for edgefirst_ara2-0.11.2-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: release.yml on EdgeFirstAI/ara2-rs

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: edgefirst_ara2-0.11.2-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Subject digest: 305077b1384fae8e1b50d0ec2096b08041070004f8268527f16ae2f7c49e56d4
- Sigstore transparency entry: 1671647259
- Sigstore integration time: May 29, 2026
Source repository:
- Permalink: EdgeFirstAI/ara2-rs@028266dfa17c59dfd199f501adf21b6cb5238ecd
- Branch / Tag: refs/tags/v0.11.2
- Owner: https://github.com/EdgeFirstAI
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@028266dfa17c59dfd199f501adf21b6cb5238ecd
- Trigger Event: push

File details

Details for the file edgefirst_ara2-0.11.2-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File metadata

Download URL: edgefirst_ara2-0.11.2-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Upload date: May 29, 2026
Size: 546.5 kB
Tags: CPython 3.11+, manylinux: glibc 2.17+ ARM64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for edgefirst_ara2-0.11.2-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm	Hash digest
SHA256	`d633d45de9e7d7748b5069ad36b88aa42a137a92438827057755b8f6d92e213c`
MD5	`67ae1dc38a6c79a07c9be8b249a996b8`
BLAKE2b-256	`121d5ad2483f3387e1234a9d40505f9f091565401f870444588a632565b4c44c`

Provenance

The following attestation bundles were made for edgefirst_ara2-0.11.2-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: release.yml on EdgeFirstAI/ara2-rs

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: edgefirst_ara2-0.11.2-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
- Subject digest: d633d45de9e7d7748b5069ad36b88aa42a137a92438827057755b8f6d92e213c
- Sigstore transparency entry: 1671647272
- Sigstore integration time: May 29, 2026
Source repository:
- Permalink: EdgeFirstAI/ara2-rs@028266dfa17c59dfd199f501adf21b6cb5238ecd
- Branch / Tag: refs/tags/v0.11.2
- Owner: https://github.com/EdgeFirstAI
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@028266dfa17c59dfd199f501adf21b6cb5238ecd
- Trigger Event: push

edgefirst-ara2 0.11.2
