Skip to main content

Asynchronous, embedded, modern DB based on SQLite.

Project description

beaver 🦫

A fast, single-file, multi-modal database for Python, built with the standard sqlite3 library.

beaver is the Backend for Embedded Asynchronous Vector & Event Retrieval. It's an industrious, all-in-one database designed to manage complex, modern data types without requiring a database server.

Design Philosophy

beaver is built with a minimalistic philosophy for small, local use cases where a full-blown database server would be overkill.

  • Minimalistic & Zero-Dependency: Uses only Python's standard libraries (sqlite3, asyncio) and numpy.
  • Async-First (When It Matters): The pub/sub system is fully asynchronous for high-performance, real-time messaging. Other features like key-value, list, and vector operations are synchronous for ease of use.
  • Built for Local Applications: Perfect for local AI tools, RAG prototypes, chatbots, and desktop utilities that need persistent, structured data without network overhead.
  • Fast by Default: It's built on SQLite, which is famously fast and reliable for local applications.

Core Features

  • Asynchronous Pub/Sub: A fully asynchronous, Redis-like publish-subscribe system for real-time messaging.
  • Persistent Key-Value Store: A simple set/get interface for storing any JSON-serializable object.
  • Pythonic List Management: A fluent, Redis-like interface for managing persistent, ordered lists.
  • Vector Storage & Search: Store vector embeddings and perform simple, brute-force k-nearest neighbor searches, ideal for small-scale RAG.
  • Single-File & Portable: All data is stored in a single SQLite file, making it incredibly easy to move, back up, or embed in your application.

Installation

pip install beaver-db

Quickstart & API Guide

Initialization

All you need to do is import and instantiate the BeaverDB class with a file path.

from beaver import BeaverDB, Document

db = BeaverDB("my_application.db")

Key-Value Store

Use set() and get() for simple data storage. The value can be any JSON-encodable object.

# Set a value
db.set("app_config", {"theme": "dark", "user_id": 123})

# Get a value
config = db.get("app_config")
print(f"Theme: {config['theme']}") # Output: Theme: dark

List Management

Get a list wrapper with db.list() and use Pythonic methods to manage it.

tasks = db.list("daily_tasks")
tasks.push("Write the project report")
tasks.prepend("Plan the day's agenda")
print(f"The first task is: {tasks[0]}")

Vector Storage & Search

Store Document objects containing vector embeddings and metadata. The search is a linear scan, which is sufficient for small-to-medium collections.

# Get a handle to a collection
docs = db.collection("my_documents")

# Create and index a document (ID will be a UUID)
doc1 = Document(embedding=[0.1, 0.2, 0.7], text="A cat sat on the mat.")
docs.index(doc1)

# Create and index a document with a specific ID (for upserting)
doc2 = Document(id="article-42", embedding=[0.9, 0.1, 0.1], text="A dog chased a ball.")
docs.index(doc2)

# Search for the 2 most similar documents
query_vector = [0.15, 0.25, 0.65]
results = docs.search(vector=query_vector, top_k=2)

# Results are a list of (Document, distance) tuples
top_document, distance = results[0]
print(f"Closest document: {top_document.text} (distance: {distance:.4f})")

Asynchronous Pub/Sub

Publish events from one part of your app and listen in another using asyncio.

import asyncio

async def listener():
    async with db.subscribe("system_events") as sub:
        async for message in sub:
            print(f"LISTENER: Received event -> {message['event']}")

async def publisher():
    await asyncio.sleep(1)
    await db.publish("system_events", {"event": "user_login", "user": "alice"})

# To run them concurrently:
# asyncio.run(asyncio.gather(listener(), publisher()))

Roadmap

beaver aims to be a complete, self-contained data toolkit. The following features are planned:

  • More Efficient Vector Search: Integrate an approximate nearest neighbor (ANN) index like scipy.spatial.cKDTree to improve search speed on larger datasets.
  • JSON Document Store with Full-Text Search: Store flexible JSON documents and get powerful full-text search across all text fields, powered by SQLite's FTS5 extension.
  • Standard Relational Interface: While beaver provides high-level features, you can always use the same SQLite file for normal relational tasks with standard SQL.

License

This project is licensed under the MIT License.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

beaver_db-0.3.0.tar.gz (7.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

beaver_db-0.3.0-py3-none-any.whl (7.7 kB view details)

Uploaded Python 3

File details

Details for the file beaver_db-0.3.0.tar.gz.

File metadata

  • Download URL: beaver_db-0.3.0.tar.gz
  • Upload date:
  • Size: 7.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.5.13

File hashes

Hashes for beaver_db-0.3.0.tar.gz
Algorithm Hash digest
SHA256 f9cffbdb0b7af00f136eac33ed082a2f14d897bb91dee593f5fa87cf6aa20631
MD5 17a037ab6356f7365e2964652435bbc5
BLAKE2b-256 81d41aceccc14203f648f4b37a97a789e112b0ea58982ec0ad037edc20f0a783

See more details on using hashes here.

File details

Details for the file beaver_db-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: beaver_db-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 7.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.5.13

File hashes

Hashes for beaver_db-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 29a476990f0c74a8d19ea6ff105646dc8bfe9b6923017a5c956a7eea8fd1a877
MD5 4a683d9ce2b03adc78a58c4ee8870271
BLAKE2b-256 f564928d016c03a07a24167e31fd0e52af23cb7c990c0adc324f3601dbb6f2c4

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page