Embedded vector database + living context engine — multimodal pockets, context graph, adaptive decay, MCP server
Feather DB
Embedded vector database + living context engine
Part of Hawky.ai — AI-Native Digital Marketing OS
Feather DB is an embedded vector database and living context engine: file-based, with a built-in knowledge graph and adaptive memory decay. It runs in-process, so no separate database server is required.
What's Inside (v0.5.0)
| Capability | Description |
|---|---|
| ANN Search | Approximate nearest-neighbor search via HNSW; roughly 0.5–1.5 ms per query at k=10 (see Performance) |
| Multimodal Pockets | Text, image, audio vectors stored per entity under a single ID |
| Context Graph | Typed + weighted edges, reverse index, auto-link by similarity |
| Living Context | Recall-count-based sticky memory — frequently accessed items resist decay |
| Namespace / Entity / Attributes | Generic partition + subject + KV metadata for any domain |
| Graph Visualizer | Self-contained D3 force-graph HTML — fully offline, no CDN |
| Single-file persistence | .feather binary format (v5); v3/v4 files load transparently |
Installation
pip install feather-db
CLI (Rust):
cargo install feather-db-cli
Build from source:
git clone https://github.com/feather-store/feather
cd feather
python setup.py build_ext --inplace
Quick Start
import feather_db
import numpy as np
# Open or create a database
db = feather_db.DB.open("context.feather", dim=768)
# Add a vector with metadata
meta = feather_db.Metadata()
meta.content = "User prefers dark mode"
meta.importance = 0.9
db.add(id=1, vec=np.random.rand(768).astype(np.float32), meta=meta)
# Semantic search
results = db.search(np.random.rand(768).astype(np.float32), k=5)
for r in results:
    print(r.id, r.score, r.metadata.content)
db.save()
Core Features
Multimodal Pockets
Each named modality gets its own independent HNSW index with its own dimensionality. A single entity ID can hold text, visual, and audio vectors simultaneously.
db.add(id=42, vec=text_vec, modality="text") # 768-dim
db.add(id=42, vec=image_vec, modality="visual") # 512-dim
db.add(id=42, vec=audio_vec, modality="audio") # 256-dim
results = db.search(query_vec, k=10, modality="visual")
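To make the "one index per modality" idea concrete, here is a minimal pure-Python sketch. It is not Feather DB code: the `Pockets` class is hypothetical, it uses a linear scan with squared L2 distance instead of HNSW, and the default modality name `"text"` is an assumption. It only shows that the same ID can hold vectors of different dimensionality as long as each modality keeps its own store.

```python
class Pockets:
    """Toy stand-in: one independent vector store per modality (hypothetical)."""

    def __init__(self):
        self._stores = {}  # modality name -> (dim, {id: vector})

    def add(self, id, vec, modality="text"):
        # The first vector added to a modality fixes that modality's dimensionality.
        dim, store = self._stores.setdefault(modality, (len(vec), {}))
        if len(vec) != dim:
            raise ValueError(f"{modality!r} expects {dim} dims, got {len(vec)}")
        store[id] = list(vec)

    def search(self, query, k, modality="text"):
        dim, store = self._stores[modality]
        if len(query) != dim:
            raise ValueError(f"{modality!r} expects {dim} dims, got {len(query)}")

        def dist(v):
            # Squared L2 distance; a linear scan replaces the real ANN index.
            return sum((a - b) ** 2 for a, b in zip(query, v))

        return sorted(store, key=lambda i: dist(store[i]))[:k]


p = Pockets()
p.add(42, [0.1, 0.2, 0.3], modality="text")   # 3-dim text vector
p.add(42, [0.9, 0.1], modality="visual")      # 2-dim visual vector, same ID
p.add(7, [0.0, 1.0], modality="visual")
print(p.search([1.0, 0.0], k=1, modality="visual"))
```

A query only ever touches the store for the modality it names, which is why the dimensionalities never have to agree across modalities.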
Context Graph
Typed, weighted edges between records. Nine built-in relationship types plus free-form strings.
from feather_db import RelType
# Link records with typed relationships
db.link(from_id=1, to_id=2, rel_type=RelType.CAUSED_BY, weight=0.9)
db.link(from_id=1, to_id=3, rel_type=RelType.SUPPORTS, weight=0.7)
# Query graph structure
edges = db.get_edges(1) # outgoing edges
incoming = db.get_incoming(2) # reverse index
# Auto-create edges by vector similarity
db.auto_link(modality="text", threshold=0.85, rel_type=RelType.RELATED_TO)
Built-in relationship types: related_to, derived_from, caused_by, contradicts, supports, precedes, part_of, references, multimodal_of.
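As a rough illustration of what linking by similarity threshold means, the sketch below compares every pair of vectors and keeps the pairs at or above the threshold. Everything here is an assumption for illustration: the function names are hypothetical, cosine similarity is assumed as the metric, and the all-pairs loop stands in for whatever index-based lookup the library actually uses.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def auto_link_pairs(vectors, threshold):
    """vectors: {id: [floats]}. Returns (from_id, to_id, similarity) tuples."""
    pairs = []
    for i, j in combinations(sorted(vectors), 2):
        sim = cosine(vectors[i], vectors[j])
        if sim >= threshold:
            pairs.append((i, j, sim))
    return pairs


vectors = {1: [1.0, 0.0], 2: [0.9, 0.1], 3: [0.0, 1.0]}
for from_id, to_id, sim in auto_link_pairs(vectors, threshold=0.85):
    print(from_id, "->", to_id, round(sim, 3))
```

Raising the threshold produces fewer, stronger edges; lowering it makes the graph denser.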
Context Chain (Vector Search + Graph Expansion)
One call that combines semantic vector search with n-hop BFS graph traversal:
result = db.context_chain(
    query=query_vec,
    k=5,              # seed nodes from vector search
    hops=2,           # BFS graph expansion depth
    modality="text"
)
for node in result.nodes:
    print(node.id, node.score, node.hop_distance)
for edge in result.edges:
    print(edge.source_id, "->", edge.target_id, edge.rel_type)
Score = similarity × hop_decay × importance × stickiness
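The expansion step can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: `expand` is a hypothetical helper, `hop_decay` is assumed to be a constant factor raised to the hop distance (the source does not define it), and importance and stickiness are treated as 1 so that only the similarity and hop terms are visible.

```python
from collections import deque


def expand(seeds, edges, hops, hop_decay=0.5):
    """seeds: {id: similarity}; edges: {id: [neighbor ids]}.

    Returns {id: (score, hop_distance)}, keeping the best-scoring path per node.
    """
    best = {node: (sim, 0) for node, sim in seeds.items()}
    queue = deque((node, sim, 0) for node, sim in seeds.items())
    while queue:
        node, seed_sim, dist = queue.popleft()
        if dist == hops:
            continue  # depth limit reached, do not expand further
        for nxt in edges.get(node, []):
            # Assumed decay: the seed's similarity shrinks by hop_decay per hop.
            score = seed_sim * hop_decay ** (dist + 1)
            if nxt not in best or score > best[nxt][0]:
                best[nxt] = (score, dist + 1)
                queue.append((nxt, seed_sim, dist + 1))
    return best


chain = expand(seeds={1: 0.9}, edges={1: [2], 2: [3], 3: [4]}, hops=2)
for node_id, (score, hop) in sorted(chain.items()):
    print(node_id, round(score, 3), hop)
```

With `hops=2`, node 4 is three hops from the seed and is never reached, while nodes 2 and 3 are included with progressively smaller scores.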
Namespace / Entity / Attributes
Generic partitioning for multi-tenant, multi-domain use:
from feather_db import FilterBuilder, MarketingProfile
# Build metadata with domain profile
profile = MarketingProfile()
profile.set_brand("nike")
profile.set_user("user_8821")
profile.set_channel("instagram")
profile.set_ctr(0.045)
meta = profile.to_metadata()
db.add(id=100, vec=vec, meta=meta)
# Filter by namespace + entity + attribute
f = FilterBuilder().namespace("nike").entity("user_8821").attribute("channel", "instagram").build()
results = db.search(query_vec, k=10, filter=f)
Works for any domain — healthcare, e-commerce, finance — by subclassing DomainProfile.
Living Context / Adaptive Decay
Records accessed more frequently resist temporal decay:
from feather_db import ScoringConfig
cfg = ScoringConfig(half_life=30.0, weight=0.3, min=0.0)
results = db.search(query_vec, k=10, scoring=cfg)
Formula:
stickiness = 1 + log(1 + recall_count)
effective_age = age_in_days / stickiness
recency = 0.5 ^ (effective_age / half_life_days)
final_score = ((1 - time_weight) * similarity + time_weight * recency) * importance
touch() is called automatically on every search hit. Call db.touch(id) manually to boost salience.
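The formula above can be checked by hand with a few lines of standard-library Python. This is a direct transcription rather than library code: the function names are hypothetical, and the natural logarithm is assumed because the formula does not name a base.

```python
import math


def stickiness(recall_count):
    # Natural log is an assumption; the formula does not state the base.
    return 1.0 + math.log(1.0 + recall_count)


def final_score(similarity, age_in_days, recall_count, importance,
                half_life_days=30.0, time_weight=0.3):
    effective_age = age_in_days / stickiness(recall_count)
    recency = 0.5 ** (effective_age / half_life_days)
    return ((1.0 - time_weight) * similarity + time_weight * recency) * importance


# Same record, same age; only the recall count differs.
fresh = final_score(0.8, age_in_days=60, recall_count=0, importance=1.0)
sticky = final_score(0.8, age_in_days=60, recall_count=20, importance=1.0)
print(f"never recalled: {fresh:.3f}  recalled 20 times: {sticky:.3f}")
```

A record that is recalled often has a smaller effective age, so its recency term decays more slowly and it scores higher than an untouched record of the same age.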
Graph Visualization
Exports a self-contained, offline D3 force-graph HTML — no CDN, no server:
from feather_db.graph import visualize, export_graph
# Interactive HTML force graph
visualize(db, output_path="/tmp/graph.html")
# JSON for D3 / Cytoscape (namespace-filtered)
data = export_graph(db, namespace_filter="nike")
Import / Export
# D3 / Cytoscape-compatible JSON
json_str = db.export_graph_json(namespace_filter="nike", entity_filter="user_8821")
# Raw vector retrieval
vec = db.get_vector(id=42, modality="text")
ids = db.get_all_ids(modality="visual")
# Metadata update without touching HNSW index
db.update_metadata(id=42, meta=new_meta)
db.update_importance(id=42, importance=0.95)
Filtered Search
from feather_db import FilterBuilder
results = db.search(
    query_vec, k=10,
    filter=FilterBuilder()
        .namespace("nike")
        .entity("user_8821")
        .attribute("channel", "instagram")
        .source("pipeline-v1")
        .importance_gte(0.5)
        .build()
)
Metadata Fields
import time

meta = feather_db.Metadata()
meta.timestamp = int(time.time()) # Unix timestamp
meta.importance = 0.9 # [0.0–1.0]
meta.type = feather_db.ContextType.FACT # FACT | PREFERENCE | EVENT | CONVERSATION
meta.source = "pipeline-v1"
meta.content = "Human-readable content"
meta.tags_json = '["tag1","tag2"]'
meta.namespace_id = "nike" # partition key
meta.entity_id = "user_8821" # subject key
meta.set_attribute("channel", "instagram") # safe KV setter (use this, not meta.attributes['k']=v)
val = meta.get_attribute("channel")
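Why direct assignment to `meta.attributes` is unsafe can be shown with a pure-Python analogy. The `MetadataModel` class below is hypothetical and is not the binding code; it assumes the attribute map is handed out as a copy, and that a missing key reads back as an empty string.

```python
class MetadataModel:
    """Hypothetical stand-in for a bound object whose map is returned by copy."""

    def __init__(self):
        self._attributes = {}

    @property
    def attributes(self):
        # Each access returns a fresh copy, so mutating it changes nothing here.
        return dict(self._attributes)

    def set_attribute(self, key, value):
        self._attributes[key] = value

    def get_attribute(self, key):
        # Returning "" for a missing key is an assumption for this sketch.
        return self._attributes.get(key, "")


meta = MetadataModel()
meta.attributes["channel"] = "instagram"   # writes to a temporary copy
lost = meta.get_attribute("channel")
meta.set_attribute("channel", "instagram") # writes to the real map
kept = meta.get_attribute("channel")
print(repr(lost), repr(kept))
```

The first write raises no error, which is what makes the mistake easy to miss.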
Rust CLI
# Add a record
feather add --db my.feather --id 1 --vec "0.1,0.2,0.3" --modality text
# Search
feather search --db my.feather --vec "0.1,0.2,0.3" --k 5
# Link two records
feather link --db my.feather --from 1 --to 2
Performance
| Metric | Value |
|---|---|
| Add rate | 2,000–5,000 vectors/sec |
| Search latency (k=10) | 0.5–1.5 ms |
| Max vectors per modality | 1,000,000 (configurable) |
| HNSW params | M=16, ef_construction=200 |
| File format | Binary .feather v5 |
SIMD (AVX2/AVX512) optimizations are available in space_l2.h. Enable with -DUSE_AVX -march=native in setup.py.
File Format
[magic: 4B = "FEAT"] [version: 4B = 5]
--- Metadata Section ---
[meta_count: 4B]
for each record:
  [id: 8B] [serialized Metadata including namespace/entity/attributes/edges]
--- Modality Indices Section ---
[modal_count: 4B]
for each modality:
  [name_len: 2B] [name: N bytes]
  [dim: 4B] [element_count: 4B]
  for each element:
    [id: 8B] [float32 vector: dim * 4 bytes]
v3 and v4 files load transparently — missing fields default to empty.
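As a small sketch of reading the start of this layout, the snippet below parses the magic, version, and record count from a byte stream. The `read_header` helper is hypothetical, and little-endian unsigned integers are an assumption, since the layout above gives field sizes but not byte order. It runs against a synthetic in-memory header rather than a real `.feather` file.

```python
import io
import struct


def read_header(stream):
    """Return (version, meta_count) from the first 12 bytes of the layout."""
    magic = stream.read(4)
    if magic != b"FEAT":
        raise ValueError(f"not a .feather file: magic={magic!r}")
    # "<I" = little-endian unsigned 32-bit; the byte order is assumed.
    (version,) = struct.unpack("<I", stream.read(4))
    (meta_count,) = struct.unpack("<I", stream.read(4))
    return version, meta_count


# Synthetic header: magic, version 5, zero metadata records.
sample = io.BytesIO(b"FEAT" + struct.pack("<II", 5, 0))
print(read_header(sample))
```

Checking the magic bytes first lets a reader reject unrelated files before interpreting any length fields.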
Examples
| File | Description |
|---|---|
| examples/context_graph_demo.py | Full context graph demo: auto-link, context_chain, D3 HTML export |
| examples/marketing_living_context.py | Multi-brand namespace/entity/attribute filtering + importance feedback |
| examples/feather_inspector.py | Local HTTP inspector: force graph, PCA scatter, edit, delete |
Run any example:
python setup.py build_ext --inplace
python3 examples/context_graph_demo.py
Architecture
[Generic Core — C++17]
feather::DB
├── modality_indices_ (unordered_map<string, ModalityIndex>) — one HNSW per modality
├── metadata_store_ (unordered_map<uint64_t, Metadata>) — shared metadata by ID
└── Methods: add, search, link, context_chain, auto_link, export_graph_json ...
[Python Layer]
feather_db (pybind11)
├── DB, Metadata, ContextType, ScoringConfig
├── Edge, IncomingEdge, ContextNode, ContextEdge, ContextChainResult
├── FilterBuilder — fluent search filter helper
├── DomainProfile — generic namespace/entity/attributes base class
├── MarketingProfile — digital marketing typed adapter
├── RelType — standard relationship type constants
└── graph.visualize() — D3 force-graph HTML exporter
[Rust CLI]
feather-db-cli (FFI via extern "C" from src/feather_core.cpp)
Known Limitations
| Issue | Detail |
|---|---|
| No concurrent writes | HNSW is not thread-safe for simultaneous adds |
| No vector deletion | HNSW marks deletions; data stays until compaction |
| Max 1M vectors/modality | Hardcoded in get_or_create_index; increase max_elements to raise |
| meta.attributes['k'] = v is a silent no-op | pybind11 returns a copy of the map; use meta.set_attribute(k, v) |
| tags_json is raw string | Tag filtering uses substring search, not proper JSON parsing |
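The tags_json limitation is easiest to see with a concrete case. The sketch below does not reproduce the library's filter; both helper functions are hypothetical and simply contrast a substring test on the raw string with a membership test on the parsed list.

```python
import json

tags_json = '["smart-bidding","retargeting"]'


def substring_match(raw, tag):
    # Substring test on the raw JSON text: matches partial tag names too.
    return tag in raw


def exact_match(raw, tag):
    # Membership test on the parsed list: matches whole tags only.
    return tag in json.loads(raw)


print(substring_match(tags_json, "target"))  # True: "target" occurs inside "retargeting"
print(exact_match(tags_json, "target"))      # False: no tag is exactly "target"
```

If exact tag semantics matter, one workaround under this assumption is to over-fetch with the built-in filter and re-check the parsed tags on the client side.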
Contributing
- Fork the repository
- Create a feature branch
- Make your changes with tests
- Submit a pull request
See CONTRIBUTING.md for details.
License
MIT — see LICENSE
Acknowledgments