Python binding for the obj embedded document database.
obj — Python binding
Python bindings for obj, the
embedded document database.
The wheel exposes a single extension module named obj. The Rust
crate name is obj-py; the import name is obj.
import obj

with obj.Db("app.obj") as db:
    with db.transaction() as tx:
        doc_id = tx.insert("orders", b"<your payload bytes>")
    with db.read_transaction() as tx:
        payload = tx.get("orders", doc_id)
        for (id_, bytes_) in tx.iter_all("orders"):
            ...
Payload contract
obj-py ships two Python surfaces side by side:
- Bytes API on WriteTxn/ReadTxn. Payloads cross the boundary as bytes/bytearray in and bytes out. The library does NOT serialise dicts, dataclasses, or JSON for you on this path; encode your payloads however you like (json, msgpack, postcard, pickle, ...) and pass the resulting bytes through. This mirrors the obj C ABI's contract.
- Typed-document API on Db and WriteTxn (Phase 6.5 + issue #1). Wrap a @dataclass with @obj.document(collection="orders", version=1) and the ergonomic methods db.insert(order) / db.get(Order, id) / db.update(Order, id, fn) / db.all(Order) route through a schema-driven Dynamic codec that produces postcard bytes byte-identical to Rust's #[derive(Document)] writer for the same logical schema. db.update(...) is an atomic read-modify-write: the read and the write-back happen inside one write transaction (no lost-update window), and a raising fn rolls the change back.
from dataclasses import dataclass

import obj

@obj.document(collection="orders", version=1)
@dataclass
class Order:
    customer_id: int
    total: float
    status: str

with obj.Db("app.obj") as db:
    doc_id = db.insert(Order(customer_id=1, total=99.5, status="pending"))
    order = db.get(Order, doc_id)
    for (oid, o) in db.all(Order):
        ...
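On the bytes path described under the payload contract, encoding is the caller's job. The following is a minimal sketch, assuming JSON as the payload format (any codec that yields bytes would do). The helper names encode_order and decode_order are hypothetical and not part of obj; the obj calls appear only in comments so the snippet runs without the extension installed.

```python
import json

def encode_order(order: dict) -> bytes:
    """Serialise a dict to compact UTF-8 JSON bytes for the bytes API."""
    return json.dumps(order, sort_keys=True, separators=(",", ":")).encode("utf-8")

def decode_order(payload: bytes) -> dict:
    """Inverse of encode_order: bytes read from the database back to a dict."""
    return json.loads(payload.decode("utf-8"))

order = {"customer_id": 1, "total": 99.5, "status": "pending"}
payload = encode_order(order)
roundtrip = decode_order(payload)

# With obj installed, the bytes would pass through unchanged, e.g.:
#   doc_id = tx.insert("orders", payload)
#   restored = decode_order(tx.get("orders", doc_id))
```

Sorting keys and using compact separators keeps the encoded bytes deterministic for a given dict, which is convenient if payloads are ever compared or hashed.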
Typed docs inside an explicit transaction
WriteTxn overloads its CRUD methods by argument type, so typed
documents compose with explicit transactions. Pass a @obj.document
instance (or class) for the typed path, or a collection str plus
bytes for the raw path. This lets you batch many typed writes into
a single commit / single WAL fsync instead of one transaction per
db.insert:
with obj.Db("app.obj") as db:
    with db.transaction() as tx:
        for i in range(1000):
            tx.insert(Order(customer_id=i, total=float(i), status="new"))
        # one commit + one fsync for the whole batch on __exit__

        # reads inside the txn see its own uncommitted writes:
        first = tx.get(Order, 1)
        tx.update(Order, 1, lambda o: setattr(o, "status", "shipped"))
        tx.upsert(Order, 2, Order(customer_id=2, total=2.0, status="done"))
        tx.delete(Order, 3)

        # the raw-bytes overload still works on the same handle:
        tx.insert("audit_log", b"<raw bytes>")
The typed WriteTxn methods reuse the exact encode/decode pipeline
that Db uses, so on-disk bytes are identical regardless of which
surface wrote them. Passing a value that is not a @obj.document
to the typed path raises obj.InvalidArgumentError with a clear
message.
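The update callback in the example above is a lambda that mutates the document in place. For anything beyond a one-field change, a named function can be easier to test. The sketch below assumes the callback receives the decoded dataclass instance and may mutate it in place, as the lambda does, and that raising inside it aborts the update as described under the payload contract. ship_order is a hypothetical name, and the Order class here is a plain dataclass (no @obj.document) so the snippet runs without obj.

```python
from dataclasses import dataclass

@dataclass
class Order:
    customer_id: int
    total: float
    status: str

def ship_order(order: Order) -> None:
    """Update callback: flip a pending order to shipped, refuse anything else."""
    if order.status != "pending":
        # Raising here is assumed to roll the update back.
        raise ValueError(f"cannot ship order in status {order.status!r}")
    order.status = "shipped"

# The callback is plain Python, so it can be exercised without a database:
sample = Order(customer_id=1, total=99.5, status="pending")
ship_order(sample)

# With obj installed and Order decorated with @obj.document, usage would be:
#   tx.update(Order, doc_id, ship_order)
```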
For ad-hoc dict-shaped writes (no @document boilerplate), call
the same CRUD methods with a collection str as the first argument
(dict-native overload):
doc_id = db.insert("events", {"event": "click", "user_id": 42})
event = db.get("events", doc_id)
Per-document lazy migration mirrors Rust's Migrate trait via a
history=[...] arg and a cls.migrate(doc, from_version)
classmethod.
Checkpointing
Writes land in a write-ahead log (<db>.obj-wal) first; the main
<db>.obj file stays sparse until a checkpoint folds the committed
WAL pages into it and resets the WAL back to its 64-byte header. A
checkpoint fires automatically once the WAL reaches ~1000 frames, but
after a handful of writes the data lives entirely in the -wal file.
Call db.checkpoint() to fold it on demand:
with obj.Db("app.obj") as db:
    for note in notes:
        db.insert(note)
    db.checkpoint()  # fold the WAL into app.obj, reset app.obj-wal
checkpoint() is a harmless no-op when there is nothing to fold, and
is deferred (partial / no-op) if a concurrent reader has pinned a
snapshot below the end of the WAL — the frames that reader still needs
stay in place. Retry once the reader has finished. It raises
obj.ObjError on a read-only handle or on an I/O failure.
Checkpoint on clean close
You usually do not need an explicit checkpoint(): a clean
close() — including a with obj.Db(...) as db: block that exits
without raising — folds the WAL into the main file for you, so the
.obj file is self-contained after a normal shutdown.
with obj.Db("app.obj") as db:
    db.insert(note)
# block exited cleanly -> WAL folded into app.obj, app.obj-wal reset
The close-time checkpoint is best-effort and non-fatal: a failure
(reader-pinned deferral, I/O error during shutdown, read-only handle)
is swallowed and never turns a successful with block into a raised
error — the committed data is already durable in the WAL, so a failed
fold loses nothing.
If the block exits via an exception, the checkpoint is skipped and the exception is propagated unchanged — the close-time fold never masks your error.
Trade-off: every clean close ends in an fsync. If you open and
close many short-lived handles on a hot path, that is one fsync per
close; prefer a single long-lived handle (and an occasional explicit
checkpoint()) when the per-close fsync is a bottleneck.
Local development loop
# One-time setup: a fresh venv + maturin + pytest.
python3 -m venv .venv
source .venv/bin/activate
pip install maturin pytest
# From the workspace root:
cd crates/obj-py
maturin develop # builds the cdylib + installs it editable
# into the active venv.
pytest tests/ -v # run the Python test suite.
maturin develop rebuilds the extension module on every invocation;
the typical dev loop is "edit Rust → maturin develop → pytest".
For a release-style wheel:
maturin build --release # writes target/wheels/obj-*.whl
pip install target/wheels/obj-*.whl
Exception hierarchy
All obj operations raise instances of obj.ObjError. The
sub-exceptions narrow the diagnosis:
| Exception | When raised |
|---|---|
| obj.NotFoundError | document / collection / index / namespace absent |
| obj.BusyError | lock contention (pager mutex, writer lock, cross-process) |
| obj.CorruptionError | on-disk format / checksum / B-tree invariant violation |
| obj.IntegrityError | Db.integrity_check() found at least one failure |
| obj.InvalidArgumentError | caller-side argument problem (encoding, range, type) |
ObjError itself is the catch-all base class and subclasses Exception.
Use except obj.ObjError if you don't care which subclass fired;
use the narrow ones to recover.
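One way to recover from a narrow exception is to retry on lock contention. The sketch below is a generic retry helper with exponential backoff; retry_on and its parameters are hypothetical names, not part of obj, and treating obj.BusyError as retryable is an assumption about your workload. A stand-in exception is used so the snippet runs without the extension installed.

```python
import time
from typing import Callable, Type, TypeVar

T = TypeVar("T")

def retry_on(exc_type: Type[BaseException], fn: Callable[[], T],
             attempts: int = 5, base_delay: float = 0.01) -> T:
    """Call fn(); on exc_type, sleep with exponential backoff and try again."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(attempts):
        try:
            return fn()
        except exc_type:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error unchanged
            time.sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")

# Stand-in for obj.BusyError so the demo has no external dependency.
class FakeBusyError(Exception):
    pass

calls = {"n": 0}

def flaky() -> str:
    """Fails twice with the stand-in error, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise FakeBusyError("writer lock held")
    return "ok"

result = retry_on(FakeBusyError, flaky, base_delay=0.0)

# With obj installed, the same helper would be called as, for example:
#   retry_on(obj.BusyError, lambda: db.insert(order))
```

Any other exception type, including the other obj.ObjError subclasses, passes straight through the helper, so corruption or argument errors are never silently retried.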