Python bindings for Cabinet - Hierarchical Semantic Hashing memory retrieval
cabinet
Python bindings for Cabinet — a discrete semantic memory retrieval system for AI agents.
Replace 768-dim dense vectors with 20-bit structured integer codes and retrieve on pure CPU with O(log n) B-tree prefix matching.
What is Cabinet?
Cabinet is a memory retrieval engine designed for Agent scenarios where you need to:
- Remember large amounts of text on a laptop or edge device
- Recall relevant snippets fast, without GPU
- Explain why a snippet was retrieved (four-level matching: related → same category → same cluster → exact word)
- Update incrementally without rebuilding the whole index
The core idea is Hierarchical Semantic Hashing (HSH): each word is encoded as a 20-bit structured integer:
┌───────┬─────────┬─────────┐
│ feat  │   sim   │   abs   │
│ 4-bit │  8-bit  │  8-bit  │
└───────┴─────────┴─────────┘
    ↓        ↓         ↓
 POS tag  cluster   bucket
Retrieval becomes integer prefix matching on B-trees, which is tiny, fast, and fully auditable.
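The layout above can be sketched in a few lines of Python. This is an illustration only, not Cabinet's encoder: it assumes `feat` sits in the highest bits (following the left-to-right order of the diagram), and the helper names `pack_hsh`, `unpack_hsh`, and `shared_prefix_fields` are hypothetical.

```python
from typing import Tuple

# Field widths taken from the diagram: 4 + 8 + 8 = 20 bits.
FEAT_BITS, SIM_BITS, ABS_BITS = 4, 8, 8


def pack_hsh(feat: int, sim: int, abs_: int) -> int:
    """Pack three fields into one 20-bit integer (assumed order: feat | sim | abs)."""
    if not (0 <= feat < (1 << FEAT_BITS)
            and 0 <= sim < (1 << SIM_BITS)
            and 0 <= abs_ < (1 << ABS_BITS)):
        raise ValueError("field out of range")
    return (feat << (SIM_BITS + ABS_BITS)) | (sim << ABS_BITS) | abs_


def unpack_hsh(code: int) -> Tuple[int, int, int]:
    """Split a 20-bit code back into (feat, sim, abs)."""
    feat = code >> (SIM_BITS + ABS_BITS)
    sim = (code >> ABS_BITS) & ((1 << SIM_BITS) - 1)
    abs_ = code & ((1 << ABS_BITS) - 1)
    return feat, sim, abs_


def shared_prefix_fields(a: int, b: int) -> int:
    """Count how many leading fields two codes share (0 to 3)."""
    shared = 0
    for x, y in zip(unpack_hsh(a), unpack_hsh(b)):
        if x != y:
            break
        shared += 1
    return shared
```

With this ordering, two codes that share `feat` and `sim` fall into one contiguous integer range, which is what makes prefix matching on a sorted index possible. The count returned by `shared_prefix_fields` is not the library's `match_level`; it only shows how a shared prefix can be read off the integer.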
Installation
# Core package (pre-compiled wheels, no Rust needed)
pip install cabinet-hsh
# With optional GUI visualization (quotes keep shells such as zsh from expanding the brackets)
pip install "cabinet-hsh[gui]"
# With document parsing (PDF, DOCX, XLSX)
pip install "cabinet-hsh[docs]"
# With plotting utilities
pip install "cabinet-hsh[plot]"
# Development install from source (requires Rust 1.72+)
git clone https://github.com/Sauomore/Cabinet.git
cd Cabinet/cabinet
maturin develop
Quick Start
import cabinet
# Open a memory cabinet (~4MB RAM + single SQLite file)
mem = cabinet.Memory(
    path="./agent_memory.db",
    precision="light",   # light | hybrid | precise
    pos_threshold=50,    # common-word promotion threshold
    max_context=4096,    # working-memory window
)
# Insert snippets
mem.insert("The user has a meeting at 3 pm tomorrow and needs to prepare the slides.")
mem.insert("The user likes listening to orchestral music.")
mem.insert("The neighbor in Building 5 has a ladder, usually kept in the garage.")
# Query
results = mem.query("meeting preparation", top_k=5)
for r in results:
    # match_level runs from 1 (related) to 4 (exact)
    level = ["related", "same category", "same cluster", "exact"][r.match_level - 1]
    print(f"[{level}] score={r.score:.3f} doc_id={r.doc_id}")
    if r.match_level >= 3:
        print(f" → {mem.decode(r)}")
# Snapshot and close
mem.snapshot("./backup/agent_memory_2026-07-03.db")
mem.close()
API Overview
cabinet.Memory
Memory(
    path: str,           # SQLite database path
    precision: str,      # "light" | "hybrid" | "precise"
    pos_threshold: int,  # frequent-word promotion threshold
    max_context: int,    # working-memory capacity in tokens
)
Methods:
- insert(text: str) -> int: tokenize, encode, and store a document; returns doc_id
- query(text: str, top_k: int = 10) -> list[QueryResult]: retrieve top-k matches
- decode(result: QueryResult) -> str | None: decode the original text of a result
- snapshot(dst: str) -> None: copy the database to dst
- close() -> None: close the database
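Since `insert` returns a `doc_id` and `close` takes no arguments, a small batch-loading helper can be written against those two methods alone. The sketch below is an assumption-laden convenience wrapper, not part of the package: `insert_lines_and_close` is a hypothetical name, and `mem` is any object that exposes `insert(text)` and `close()` as listed above.

```python
from contextlib import closing
from typing import Iterable, List


def insert_lines_and_close(mem, lines: Iterable[str]) -> List[int]:
    """Insert each non-blank line as its own document, then close `mem`.

    `closing` calls mem.close() even if an insert raises, so the
    database handle is not left open on error.
    """
    with closing(mem):
        stripped = (line.strip() for line in lines)
        return [mem.insert(text) for text in stripped if text]
```

A call would look like `insert_lines_and_close(cabinet.Memory(...), open("notes.txt", encoding="utf-8"))`, where the file name is only an example.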
cabinet.QueryResult
A result object with the following fields:
| Field | Type | Meaning |
|---|---|---|
| doc_id | int | document ID |
| position | int | word position inside the document |
| score | float | relevance score |
| match_level | int | 1=related, 2=same category, 3=same cluster, 4=exact |
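Because a query returns word-level hits, one document can appear several times. The sketch below keeps only the strongest hit per document. It makes two assumptions that the table does not state: a higher `score` means a more relevant hit, and `match_level` should outrank `score`. `FakeResult` is a hypothetical stand-in with the same four fields, so the example runs without the package installed.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List

# Labels for the four documented match levels.
LEVEL_NAMES = {1: "related", 2: "same category", 3: "same cluster", 4: "exact"}


@dataclass
class FakeResult:
    """Hypothetical stand-in mirroring the QueryResult fields."""
    doc_id: int
    position: int
    score: float
    match_level: int


def best_hit_per_doc(results: Iterable) -> List:
    """Keep one hit per doc_id: highest match_level first, then highest score."""
    best: Dict[int, object] = {}
    for r in results:
        current = best.get(r.doc_id)
        if current is None or (r.match_level, r.score) > (current.match_level, current.score):
            best[r.doc_id] = r
    # Strongest documents first.
    return sorted(best.values(), key=lambda r: (r.match_level, r.score), reverse=True)
```

The function only reads attributes, so it should work on real `QueryResult` objects as well, provided they expose the fields listed in the table.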
Context decoding
from cabinet import decode_context
results = mem.query("borrow a ladder", top_k=3)
for r in results:
    text = decode_context(mem, r, mode="sentence")
    print(text)
Supported mode values: "paragraph", "sentence", "window", "before", "after", "window_sent".
Supported Platforms
Pre-compiled wheels are provided for:
- Linux: x86_64, aarch64 (manylinux)
- macOS: universal2 (Intel + Apple Silicon)
- Windows: x64, x86
Requires Python ≥ 3.8 (CPython).
Architecture
cabinet (Python API)
└── PyO3 bindings
    └── cabinet-core (Rust)
        ├── cabinet-hsh      # 20-bit HSH encoding
        ├── cabinet-index    # B-tree prefix index + LSM
        ├── cabinet-store    # SQLite backend
        └── cabinet-router   # relevance routing
Three-layer memory model:
- Token Store — raw HSH sequences, append-only WAL buffer
- Archive Index — 16 feature drawers with B-tree (sim, abs) indexes
- Working Memory — LRU hot cache for inference-time hits
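The Archive Index layer can be pictured with a toy model: one sorted list per `feat` value (16 drawers for a 4-bit field), each holding `(sim, abs, doc_id)` keys. This is a sketch under stated assumptions, not the Rust implementation. A sorted Python list stands in for the B-tree, and the class and method names (`DrawerIndex`, `same_cluster`, `exact`) are hypothetical.

```python
from bisect import bisect_left, insort
from typing import List, Tuple


class DrawerIndex:
    """Toy archive index: 16 drawers of sorted (sim, abs, doc_id) keys."""

    def __init__(self) -> None:
        self.drawers: List[List[Tuple[int, int, int]]] = [[] for _ in range(16)]

    def add(self, feat: int, sim: int, abs_: int, doc_id: int) -> None:
        # insort keeps the drawer sorted; a real B-tree avoids the O(n) list shift.
        insort(self.drawers[feat], (sim, abs_, doc_id))

    def same_cluster(self, feat: int, sim: int) -> List[int]:
        """doc_ids whose key starts with (feat, sim), located by binary search."""
        drawer = self.drawers[feat]
        # A shorter tuple sorts before any longer tuple with the same prefix,
        # so (sim,) and (sim + 1,) bracket exactly the keys in this cluster.
        lo = bisect_left(drawer, (sim,))
        hi = bisect_left(drawer, (sim + 1,))
        return [doc_id for _, _, doc_id in drawer[lo:hi]]

    def exact(self, feat: int, sim: int, abs_: int) -> List[int]:
        """doc_ids whose key starts with (feat, sim, abs)."""
        drawer = self.drawers[feat]
        lo = bisect_left(drawer, (sim, abs_))
        hi = bisect_left(drawer, (sim, abs_ + 1))
        return [doc_id for _, _, doc_id in drawer[lo:hi]]
```

Each lookup costs two binary searches plus the size of the matching range, which is where the O(log n) behaviour of prefix matching comes from.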
When to use Cabinet vs. vector databases
| Scenario | Cabinet | FAISS / Chroma |
|---|---|---|
| Laptop / edge device | ✅ Small footprint, CPU only | ⚠️ Runs on CPU, but dense vectors need far more RAM |
| Incremental updates | ✅ Append-only | ⚠️ Clustered (IVF) indexes may need retraining |
| Explainable retrieval | ✅ Auditable path | ❌ Black-box similarity |
| Semantic similarity | ⚠️ Discrete approximation | ✅ Dense vectors |
Use Cabinet when you need a small, fast, explainable, and incrementally-updatable memory for Agents.
GUI Visualization
If you installed with [gui]:
cabinet-gui
# or
cd cabinet-gui
streamlit run app.py
The GUI includes pages for encoding visualization, memory architecture, retrieval paths, index browser, and an interactive console.
License
MIT OR Apache-2.0
Cabinet — let AI remember, and explain why it remembers.