Asynchronous, embedded, modern DB based on SQLite.

beaver 🦫

A fast, single-file, multi-modal database for Python, built with the standard sqlite3 library.

beaver is the Backend for Embedded Asynchronous Vector & Event Retrieval. It's an industrious, all-in-one database designed to manage complex, modern data types without requiring a database server.

Design Philosophy

beaver is built with a minimalistic philosophy for small, local use cases where a full-blown database server would be overkill.

  • Minimalistic & Lightweight: Uses only Python's standard library (sqlite3, asyncio) plus a single third-party dependency, numpy.
  • Async-First (When It Matters): The pub/sub system is fully asynchronous for high-performance, real-time messaging. Other features like key-value, list, and vector operations are synchronous for ease of use.
  • Built for Local Applications: Perfect for local AI tools, RAG prototypes, chatbots, and desktop utilities that need persistent, structured data without network overhead.
  • Fast by Default: It's built on SQLite, which is famously fast and reliable for local applications.

Core Features

  • Asynchronous Pub/Sub: A fully asynchronous, Redis-like publish-subscribe system for real-time messaging.
  • Persistent Key-Value Store: A simple set/get interface for storing any JSON-serializable object.
  • Pythonic List Management: A fluent, Redis-like interface for managing persistent, ordered lists.
  • Vector Storage & Search: Store vector embeddings and perform simple, brute-force k-nearest neighbor searches, ideal for small-scale RAG.
  • Single-File & Portable: All data is stored in a single SQLite file, making it incredibly easy to move, back up, or embed in your application.

Installation

pip install beaver-db

Quickstart & API Guide

Initialization

All you need to do is import and instantiate the BeaverDB class with a file path.

from beaver import BeaverDB, Document

db = BeaverDB("my_application.db")

Key-Value Store

Use set() and get() for simple data storage. The value can be any JSON-encodable object.

# Set a value
db.set("app_config", {"theme": "dark", "user_id": 123})

# Get a value
config = db.get("app_config")
print(f"Theme: {config['theme']}") # Output: Theme: dark
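To make the set/get idea concrete, here is a minimal sketch of how a JSON key-value store can sit on top of sqlite3. The TinyKV class, its kv table, and the default argument are hypothetical names chosen for illustration; this is not beaver's actual implementation.

```python
import json
import sqlite3


class TinyKV:
    """Hypothetical stand-in for a set/get store; not beaver's actual code."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )

    def set(self, key, value):
        # json.dumps raises TypeError for values that are not JSON-encodable.
        payload = json.dumps(value)
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)",
                (key, payload),
            )

    def get(self, key, default=None):
        row = self.conn.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)
        ).fetchone()
        return default if row is None else json.loads(row[0])


kv = TinyKV()
kv.set("app_config", {"theme": "dark", "user_id": 123})
config = kv.get("app_config")
print(f"Theme: {config['theme']}")
```

Storing the value as JSON text keeps the table schema fixed regardless of what shape the stored objects take.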

List Management

Get a list wrapper with db.list() and use Pythonic methods to manage it.

tasks = db.list("daily_tasks")
tasks.push("Write the project report")
tasks.prepend("Plan the day's agenda")
print(f"The first task is: {tasks[0]}")
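One way a persistent ordered list could be modeled in SQLite is with an integer position column, so that push and prepend only touch the ends. The sketch below assumes that design; TinyList, the items table, and the _edge and _insert helpers are hypothetical and say nothing about how beaver stores lists internally.

```python
import json
import sqlite3


class TinyList:
    """Hypothetical persistent list; not beaver's actual implementation."""

    def __init__(self, conn, name):
        self.conn, self.name = conn, name
        conn.execute(
            "CREATE TABLE IF NOT EXISTS items ("
            "list TEXT, pos INTEGER, value TEXT, PRIMARY KEY (list, pos))"
        )

    def _edge(self, fn):
        # fn is "MIN" or "MAX"; the result is None when the list is empty.
        sql = f"SELECT {fn}(pos) FROM items WHERE list = ?"
        return self.conn.execute(sql, (self.name,)).fetchone()[0]

    def _insert(self, pos, value):
        with self.conn:  # one transaction per insert
            self.conn.execute(
                "INSERT INTO items (list, pos, value) VALUES (?, ?, ?)",
                (self.name, pos, json.dumps(value)),
            )

    def push(self, value):
        last = self._edge("MAX")
        self._insert(0 if last is None else last + 1, value)

    def prepend(self, value):
        first = self._edge("MIN")
        self._insert(0 if first is None else first - 1, value)

    def __len__(self):
        sql = "SELECT COUNT(*) FROM items WHERE list = ?"
        return self.conn.execute(sql, (self.name,)).fetchone()[0]

    def __getitem__(self, index):
        if index < 0:
            index += len(self)  # support negative indices like a Python list
        if index < 0:
            raise IndexError("list index out of range")
        row = self.conn.execute(
            "SELECT value FROM items WHERE list = ? ORDER BY pos LIMIT 1 OFFSET ?",
            (self.name, index),
        ).fetchone()
        if row is None:
            raise IndexError("list index out of range")
        return json.loads(row[0])


tasks = TinyList(sqlite3.connect(":memory:"), "daily_tasks")
tasks.push("Write the project report")
tasks.prepend("Plan the day's agenda")
print(f"The first task is: {tasks[0]}")
```

Positions may go negative after a prepend; only their relative order matters, which is why indexing goes through ORDER BY and OFFSET rather than the raw pos value.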

Vector Storage & Search

Store Document objects containing vector embeddings and metadata. The search is a linear scan, which is sufficient for small-to-medium collections.

# Get a handle to a collection
docs = db.collection("my_documents")

# Create and index a document (ID will be a UUID)
doc1 = Document(embedding=[0.1, 0.2, 0.7], text="A cat sat on the mat.")
docs.index(doc1)

# Create and index a document with a specific ID (for upserting)
doc2 = Document(id="article-42", embedding=[0.9, 0.1, 0.1], text="A dog chased a ball.")
docs.index(doc2)

# Search for the 2 most similar documents
query_vector = [0.15, 0.25, 0.65]
results = docs.search(vector=query_vector, top_k=2)

# Results are a list of (Document, distance) tuples
top_document, distance = results[0]
print(f"Closest document: {top_document.text} (distance: {distance:.4f})")
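For readers unfamiliar with linear-scan search, the sketch below shows the general technique: compute the distance from the query to every stored vector and keep the k smallest. It uses only the standard library to stay self-contained, and it assumes Euclidean distance; the metric beaver actually uses is not stated here, and brute_force_search is a hypothetical helper, not part of beaver's API.

```python
import heapq
import math


def brute_force_search(items, query, top_k=2):
    """Return the top_k (doc_id, distance) pairs, nearest first.

    items is an iterable of (doc_id, embedding) pairs. Every embedding is
    compared against the query, so the cost grows linearly with the collection.
    """
    scored = ((doc_id, math.dist(vec, query)) for doc_id, vec in items)
    return heapq.nsmallest(top_k, scored, key=lambda pair: pair[1])


# Same vectors as the example above (assumed Euclidean metric).
items = [
    ("doc1", [0.1, 0.2, 0.7]),
    ("article-42", [0.9, 0.1, 0.1]),
]
results = brute_force_search(items, [0.15, 0.25, 0.65], top_k=2)
closest_id, distance = results[0]
print(f"Closest document: {closest_id} (distance: {distance:.4f})")
```

Because each query touches every vector, this approach is simple and exact but becomes slow as the collection grows, which is the motivation for the indexing item in the roadmap.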

Asynchronous Pub/Sub

Publish events from one part of your app and listen in another using asyncio.

import asyncio

async def listener():
    async with db.subscribe("system_events") as sub:
        async for message in sub:
            print(f"LISTENER: Received event -> {message['event']}")

async def publisher():
    await asyncio.sleep(1)
    await db.publish("system_events", {"event": "user_login", "user": "alice"})

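# To run them concurrently, wrap gather() in a coroutine: asyncio.run()
# expects a coroutine, not the future returned by asyncio.gather().
# listener() keeps waiting for messages, so this runs until interrupted.
# async def main():
#     await asyncio.gather(listener(), publisher())
#
# asyncio.run(main())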
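The publish/subscribe pattern itself can be tried without a database. The following is a minimal in-process sketch built on asyncio.Queue; TinyBus and the None sentinel used to stop the listener are assumptions made for this illustration and do not reflect how beaver implements its channels.

```python
import asyncio


class TinyBus:
    """Hypothetical in-process pub/sub; not beaver's actual implementation."""

    def __init__(self):
        self._subscribers = {}  # channel name -> list of asyncio.Queue

    def subscribe(self, channel):
        queue = asyncio.Queue()
        self._subscribers.setdefault(channel, []).append(queue)
        return queue

    async def publish(self, channel, message):
        # Fan the message out to every queue subscribed to this channel.
        for queue in self._subscribers.get(channel, []):
            await queue.put(message)


async def listener(queue, received):
    while True:
        message = await queue.get()
        if message is None:  # sentinel: stop listening
            break
        received.append(message)


async def main():
    bus = TinyBus()
    received = []
    queue = bus.subscribe("system_events")
    task = asyncio.create_task(listener(queue, received))
    await bus.publish("system_events", {"event": "user_login", "user": "alice"})
    await bus.publish("system_events", None)  # ask the listener to stop
    await task
    return received


received = asyncio.run(main())
print(received)
```

Giving each subscriber its own queue means a slow listener does not block the others, and the sentinel gives the example a clean way to terminate.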

Roadmap

beaver aims to be a complete, self-contained data toolkit. The following features are planned:

  • More Efficient Vector Search: Integrate a spatial index such as scipy.spatial.cKDTree to improve search speed on larger datasets. (cKDTree is an exact k-d tree; its query method accepts an eps parameter for approximate results.)
  • JSON Document Store with Full-Text Search: Store flexible JSON documents and get powerful full-text search across all text fields, powered by SQLite's FTS5 extension.
  • Standard Relational Interface: While beaver provides high-level features, you can always use the same SQLite file for normal relational tasks with standard SQL.

License

This project is licensed under the MIT License.
