
Embedded vector database + living context engine — multimodal pockets, context graph, adaptive decay, MCP server


Feather DB

Embedded vector database + living context engine

Part of Hawky.ai — AI-Native Digital Marketing OS


Feather DB is an embedded vector database and living context engine: file-based, with a built-in knowledge graph and adaptive memory decay. It runs inside your application, so no separate database server is required.


What's Inside (v0.5.0)

| Capability | Description |
| --- | --- |
| ANN Search | Approximate nearest-neighbor search via HNSW, roughly 0.5–1.5 ms per query at k=10 (see Performance) |
| Multimodal Pockets | Text, image, and audio vectors stored per entity under a single ID |
| Context Graph | Typed and weighted edges, reverse index, auto-link by similarity |
| Living Context | Recall-count-based sticky memory; frequently accessed items resist decay |
| Namespace / Entity / Attributes | Generic partition + subject + KV metadata for any domain |
| Graph Visualizer | Self-contained D3 force-graph HTML; fully offline, no CDN |
| Single-file persistence | .feather binary format (v5); v3/v4 files load transparently |

Installation

pip install feather-db

CLI (Rust):

cargo install feather-db-cli

Build from source:

git clone https://github.com/feather-store/feather
cd feather
python setup.py build_ext --inplace

Quick Start

import feather_db
import numpy as np

# Open or create a database
db = feather_db.DB.open("context.feather", dim=768)

# Add a vector with metadata
meta = feather_db.Metadata()
meta.content = "User prefers dark mode"
meta.importance = 0.9
db.add(id=1, vec=np.random.rand(768).astype(np.float32), meta=meta)

# Semantic search
results = db.search(np.random.rand(768).astype(np.float32), k=5)
for r in results:
    print(r.id, r.score, r.metadata.content)

db.save()

Core Features

Multimodal Pockets

Each named modality gets its own independent HNSW index with its own dimensionality. A single entity ID can hold text, visual, and audio vectors simultaneously.

db.add(id=42, vec=text_vec,   modality="text")    # 768-dim
db.add(id=42, vec=image_vec,  modality="visual")  # 512-dim
db.add(id=42, vec=audio_vec,  modality="audio")   # 256-dim

results = db.search(query_vec, k=10, modality="visual")
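
To make the "one index per modality" idea concrete, here is a minimal pure-Python sketch. It is not Feather DB's implementation: `PocketStore` is a hypothetical stand-in that uses brute-force L2 search instead of HNSW, and it assumes a pocket's dimensionality is fixed by the first vector added to it.

```python
import math


class PocketStore:
    """Toy stand-in for per-modality indices (brute force, not HNSW)."""

    def __init__(self):
        # modality name -> {"dim": int, "vectors": {id: [float, ...]}}
        self.pockets = {}

    def add(self, id, vec, modality="text"):
        # Assumption: the first vector added to a modality fixes its dimension.
        pocket = self.pockets.setdefault(modality, {"dim": len(vec), "vectors": {}})
        if len(vec) != pocket["dim"]:
            raise ValueError(f"{modality!r} expects dim {pocket['dim']}, got {len(vec)}")
        pocket["vectors"][id] = list(vec)

    def search(self, query, k=10, modality="text"):
        # Only the requested pocket is scanned; other modalities are ignored.
        pocket = self.pockets[modality]
        if len(query) != pocket["dim"]:
            raise ValueError(f"{modality!r} expects dim {pocket['dim']}, got {len(query)}")
        scored = sorted((math.dist(query, v), id_) for id_, v in pocket["vectors"].items())
        return [id_ for _, id_ in scored[:k]]


store = PocketStore()
store.add(42, [1.0, 0.0, 0.0], modality="text")   # 3-dim text pocket
store.add(42, [0.0, 1.0], modality="visual")      # same ID, 2-dim visual pocket
store.add(7, [1.0, 1.0], modality="visual")
nearest = store.search([0.0, 1.0], k=1, modality="visual")
```

Because each pocket keeps its own dimension, a query vector has to match the dimensionality of the modality it targets.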

Context Graph

Typed, weighted edges between records. Nine built-in relationship types plus free-form strings.

from feather_db import RelType

# Link records with typed relationships
db.link(from_id=1, to_id=2, rel_type=RelType.CAUSED_BY, weight=0.9)
db.link(from_id=1, to_id=3, rel_type=RelType.SUPPORTS,  weight=0.7)

# Query graph structure
edges    = db.get_edges(1)          # outgoing edges
incoming = db.get_incoming(2)       # reverse index

# Auto-create edges by vector similarity
db.auto_link(modality="text", threshold=0.85, rel_type=RelType.RELATED_TO)

Built-in relationship types: related_to, derived_from, caused_by, contradicts, supports, precedes, part_of, references, multimodal_of.
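
As a rough illustration of what auto-linking by similarity means, the sketch below compares every pair of vectors and emits an edge when the similarity clears the threshold. The helper names, the use of cosine similarity, and the all-pairs loop are assumptions made for clarity; the library may use a different metric and most likely relies on its HNSW index rather than an exhaustive scan.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def auto_link_sketch(vectors, threshold=0.85, rel_type="related_to"):
    """Return (from_id, to_id, rel_type, weight) for each pair at or above threshold.

    `vectors` maps record ID -> vector. This is a hypothetical helper, not part
    of feather_db.
    """
    edges = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(sorted(vectors.items()), 2):
        sim = cosine(vec_a, vec_b)
        if sim >= threshold:
            edges.append((id_a, id_b, rel_type, sim))
    return edges


edges = auto_link_sketch({1: [1.0, 0.0], 2: [0.9, 0.1], 3: [0.0, 1.0]})
```

In this toy data only records 1 and 2 point in nearly the same direction, so they are the only pair linked.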

Context Chain (Vector Search + Graph Expansion)

One call that combines semantic vector search with n-hop BFS graph traversal:

result = db.context_chain(
    query=query_vec,
    k=5,           # seed nodes from vector search
    hops=2,        # BFS graph expansion depth
    modality="text"
)

for node in result.nodes:
    print(node.id, node.score, node.hop_distance)

for edge in result.edges:
    print(edge.source_id, "->", edge.target_id, edge.rel_type)

Score = similarity × hop_decay × importance × stickiness
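
The expansion step can be sketched as a breadth-first walk from the seed nodes. Several things here are assumptions, since the text above does not define them: `hop_decay` is modeled as a constant factor applied once per hop, expanded nodes inherit the similarity of the seed they were reached from, and importance and stickiness are both taken as 1.0 so they drop out of the product.

```python
from collections import deque


def context_chain_sketch(seeds, edges, hops=2, hop_decay=0.5):
    """Expand seed nodes over a graph with breadth-first search.

    seeds: {node_id: similarity} from the vector search step.
    edges: {node_id: [neighbor_id, ...]} outgoing adjacency lists.
    Returns {node_id: (score, hop_distance)}.
    """
    result = {node_id: (sim, 0) for node_id, sim in seeds.items()}
    queue = deque((node_id, sim, 0) for node_id, sim in seeds.items())
    while queue:
        node_id, sim, dist = queue.popleft()
        if dist == hops:
            continue  # do not expand beyond the requested depth
        for neighbor in edges.get(node_id, []):
            if neighbor in result:
                continue  # first (shortest) path to a node wins
            # Assumed scoring: seed similarity decayed once per hop.
            result[neighbor] = (sim * hop_decay ** (dist + 1), dist + 1)
            queue.append((neighbor, sim, dist + 1))
    return result


chain = context_chain_sketch({1: 0.8}, {1: [2], 2: [3], 3: [4]}, hops=2)
```

With `hops=2`, node 4 sits three hops from the seed and is therefore left out of the result.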

Namespace / Entity / Attributes

Generic partitioning for multi-tenant, multi-domain use:

from feather_db import FilterBuilder, MarketingProfile

# Build metadata with domain profile
profile = MarketingProfile()
profile.set_brand("nike")
profile.set_user("user_8821")
profile.set_channel("instagram")
profile.set_ctr(0.045)
meta = profile.to_metadata()

db.add(id=100, vec=vec, meta=meta)

# Filter by namespace + entity + attribute
f = FilterBuilder().namespace("nike").entity("user_8821").attribute("channel", "instagram").build()
results = db.search(query_vec, k=10, filter=f)

Works for any domain — healthcare, e-commerce, finance — by subclassing DomainProfile.

Living Context / Adaptive Decay

Records accessed more frequently resist temporal decay:

from feather_db import ScoringConfig

cfg = ScoringConfig(half_life=30.0, weight=0.3, min=0.0)
results = db.search(query_vec, k=10, scoring=cfg)

Formula:

stickiness    = 1 + log(1 + recall_count)
effective_age = age_in_days / stickiness
recency       = 0.5 ^ (effective_age / half_life_days)
final_score   = ((1 - time_weight) * similarity + time_weight * recency) * importance

touch() is called automatically on every search hit. Call db.touch(id) manually to boost salience.
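
The formula above translates directly into a few lines of Python. This is a sketch for experimenting with the numbers, not the library's code: `living_score` is a hypothetical name, and `log` is assumed to be the natural logarithm.

```python
import math


def living_score(similarity, age_in_days, recall_count, importance=1.0,
                 half_life_days=30.0, time_weight=0.3):
    """Blend similarity with a recency term that decays more slowly for
    frequently recalled records."""
    stickiness = 1.0 + math.log(1.0 + recall_count)   # natural log (assumed)
    effective_age = age_in_days / stickiness
    recency = 0.5 ** (effective_age / half_life_days)
    return ((1.0 - time_weight) * similarity + time_weight * recency) * importance


never_recalled = living_score(0.8, age_in_days=30, recall_count=0)
often_recalled = living_score(0.8, age_in_days=30, recall_count=10)
```

For a 30-day-old record with similarity 0.8, zero recalls give recency 0.5 and a score of 0.71. Ten recalls raise stickiness to about 3.4, which shrinks the effective age to under 9 days and lifts the score to roughly 0.80.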

Graph Visualization

Exports a self-contained, offline D3 force-graph HTML — no CDN, no server:

from feather_db.graph import visualize, export_graph

# Interactive HTML force graph
visualize(db, output_path="/tmp/graph.html")

# JSON for D3 / Cytoscape (namespace-filtered)
data = export_graph(db, namespace_filter="nike")

Import / Export

# D3 / Cytoscape-compatible JSON
json_str = db.export_graph_json(namespace_filter="nike", entity_filter="user_8821")

# Raw vector retrieval
vec   = db.get_vector(id=42, modality="text")
ids   = db.get_all_ids(modality="visual")

# Metadata update without touching HNSW index
db.update_metadata(id=42, meta=new_meta)
db.update_importance(id=42, importance=0.95)

Filtered Search

from feather_db import FilterBuilder

results = db.search(
    query_vec, k=10,
    filter=FilterBuilder()
        .namespace("nike")
        .entity("user_8821")
        .attribute("channel", "instagram")
        .source("pipeline-v1")
        .importance_gte(0.5)
        .build()
)
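
One way to read a chained filter like the one above is as a conjunction: a record qualifies only if it passes every condition that was set. The sketch below models that reading on plain dictionaries. The AND semantics and the `matches` helper are assumptions for illustration; the field names are borrowed from the Metadata Fields section.

```python
def matches(record, namespace=None, entity=None, attributes=None,
            source=None, importance_gte=None):
    """Return True only if the record passes every condition that was set."""
    checks = [
        namespace is None or record.get("namespace_id") == namespace,
        entity is None or record.get("entity_id") == entity,
        source is None or record.get("source") == source,
        importance_gte is None or record.get("importance", 0.0) >= importance_gte,
    ]
    stored = record.get("attributes", {})
    # Every requested attribute must be present with the exact value.
    checks.extend(stored.get(key) == value for key, value in (attributes or {}).items())
    return all(checks)


record = {
    "namespace_id": "nike",
    "entity_id": "user_8821",
    "source": "pipeline-v1",
    "importance": 0.9,
    "attributes": {"channel": "instagram"},
}
passes = matches(record, namespace="nike", attributes={"channel": "instagram"},
                 importance_gte=0.5)
rejected = matches(record, namespace="nike", attributes={"channel": "email"})
```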

Metadata Fields

import time

import feather_db

meta = feather_db.Metadata()
meta.timestamp      = int(time.time())    # Unix timestamp
meta.importance     = 0.9                 # [0.0–1.0]
meta.type           = feather_db.ContextType.FACT  # FACT | PREFERENCE | EVENT | CONVERSATION
meta.source         = "pipeline-v1"
meta.content        = "Human-readable content"
meta.tags_json      = '["tag1","tag2"]'
meta.namespace_id   = "nike"             # partition key
meta.entity_id      = "user_8821"        # subject key
meta.set_attribute("channel", "instagram")   # safe KV setter (use this, not meta.attributes['k']=v)
val = meta.get_attribute("channel")
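
The warning against `meta.attributes['k'] = v` is easier to remember with a small model of the behavior. The class below is a hypothetical pure-Python imitation, not the real binding: it assumes the `attributes` getter hands back a fresh copy of the underlying map on each access, which is how pybind11 converts a C++ map into a Python dict by default.

```python
class CopyingMeta:
    """Toy model of a binding whose `attributes` getter returns a fresh copy."""

    def __init__(self):
        self._attributes = {}

    @property
    def attributes(self):
        # Each access builds a new dict, so callers never see the real store.
        return dict(self._attributes)

    def set_attribute(self, key, value):
        self._attributes[key] = value

    def get_attribute(self, key, default=""):
        return self._attributes.get(key, default)


meta = CopyingMeta()
meta.attributes["channel"] = "instagram"    # writes to a temporary copy, then discarded
lost = meta.get_attribute("channel")        # still the default value
meta.set_attribute("channel", "instagram")  # writes to the real store
kept = meta.get_attribute("channel")
```

The item assignment raises no error, which is what makes the mistake easy to miss.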

Rust CLI

# Add a record
feather add --db my.feather --id 1 --vec "0.1,0.2,0.3" --modality text

# Search
feather search --db my.feather --vec "0.1,0.2,0.3" --k 5

# Link two records
feather link --db my.feather --from 1 --to 2

Performance

| Metric | Value |
| --- | --- |
| Add rate | 2,000–5,000 vectors/sec |
| Search latency (k=10) | 0.5–1.5 ms |
| Max vectors per modality | 1,000,000 (configurable) |
| HNSW params | M=16, ef_construction=200 |
| File format | Binary .feather v5 |

SIMD (AVX2/AVX512) optimizations are available in space_l2.h. Enable with -DUSE_AVX -march=native in setup.py.


File Format

[magic: 4B = "FEAT"] [version: 4B = 5]
--- Metadata Section ---
[meta_count: 4B]
  for each record:
    [id: 8B] [serialized Metadata including namespace/entity/attributes/edges]
--- Modality Indices Section ---
[modal_count: 4B]
  for each modality:
    [name_len: 2B] [name: N bytes]
    [dim: 4B] [element_count: 4B]
    for each element:
      [id: 8B] [float32 vector: dim * 4 bytes]

v3 and v4 files load transparently — missing fields default to empty.
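
The layout above can be exercised with Python's `struct` module. The sketch below round-trips the header and one modality block through an in-memory buffer. It makes several assumptions the layout does not state: little-endian byte order, unsigned integer fields, the magic stored as the ASCII bytes `FEAT`, and UTF-8 modality names. It skips the metadata section because that serialization is not specified here, so it is not a reader for real .feather files.

```python
import io
import struct

MAGIC = b"FEAT"


def write_header(buf, version=5):
    buf.write(MAGIC)
    buf.write(struct.pack("<I", version))         # version: 4 bytes


def read_header(buf):
    if buf.read(4) != MAGIC:
        raise ValueError("not a .feather file")
    (version,) = struct.unpack("<I", buf.read(4))
    return version


def write_modality(buf, name, dim, vectors):
    encoded = name.encode("utf-8")
    buf.write(struct.pack("<H", len(encoded)))     # name_len: 2 bytes
    buf.write(encoded)
    buf.write(struct.pack("<II", dim, len(vectors)))  # dim, element_count
    for id_, vec in vectors.items():
        buf.write(struct.pack("<Q", id_))          # id: 8 bytes
        buf.write(struct.pack(f"<{dim}f", *vec))   # dim float32 values


def read_modality(buf):
    (name_len,) = struct.unpack("<H", buf.read(2))
    name = buf.read(name_len).decode("utf-8")
    dim, count = struct.unpack("<II", buf.read(8))
    vectors = {}
    for _ in range(count):
        (id_,) = struct.unpack("<Q", buf.read(8))
        vectors[id_] = list(struct.unpack(f"<{dim}f", buf.read(dim * 4)))
    return name, dim, vectors


buf = io.BytesIO()
write_header(buf)
write_modality(buf, "text", 2, {42: [0.5, 0.25]})
buf.seek(0)
version = read_header(buf)
name, dim, vectors = read_modality(buf)
```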


Examples

| File | Description |
| --- | --- |
| examples/context_graph_demo.py | Full context graph demo: auto-link, context_chain, D3 HTML export |
| examples/marketing_living_context.py | Multi-brand namespace/entity/attribute filtering + importance feedback |
| examples/feather_inspector.py | Local HTTP inspector: force graph, PCA scatter, edit, delete |

Run any example:

python setup.py build_ext --inplace
python3 examples/context_graph_demo.py

Architecture

[Generic Core — C++17]
feather::DB
  ├── modality_indices_  (unordered_map<string, ModalityIndex>)  — one HNSW per modality
  ├── metadata_store_    (unordered_map<uint64_t, Metadata>)     — shared metadata by ID
  └── Methods: add, search, link, context_chain, auto_link, export_graph_json ...

[Python Layer]
feather_db (pybind11)
  ├── DB, Metadata, ContextType, ScoringConfig
  ├── Edge, IncomingEdge, ContextNode, ContextEdge, ContextChainResult
  ├── FilterBuilder       — fluent search filter helper
  ├── DomainProfile       — generic namespace/entity/attributes base class
  ├── MarketingProfile    — digital marketing typed adapter
  ├── RelType             — standard relationship type constants
  └── graph.visualize()   — D3 force-graph HTML exporter

[Rust CLI]
feather-db-cli (FFI via extern "C" from src/feather_core.cpp)

Known Limitations

| Issue | Detail |
| --- | --- |
| No concurrent writes | HNSW is not thread-safe for simultaneous adds |
| No vector deletion | HNSW marks deletions; data stays until compaction |
| Max 1M vectors/modality | Hardcoded in get_or_create_index; increase max_elements to raise |
| meta.attributes['k'] = v is a silent no-op | pybind11 map copy; use meta.set_attribute(k, v) |
| tags_json is a raw string | Tag filtering uses substring search, not proper JSON parsing |
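
The tags_json limitation can produce false positives, which the following sketch illustrates. Both helpers are hypothetical and exist only to contrast the two approaches; how the library actually performs its substring check is not documented here.

```python
import json


def has_tag_substring(tags_json, tag):
    """Substring check on the raw string: cheap, but matches partial tags."""
    return tag in tags_json


def has_tag_parsed(tags_json, tag):
    """Parse the JSON array first, then test exact membership."""
    return tag in json.loads(tags_json)


tags_json = '["cart","checkout"]'
false_positive = has_tag_substring(tags_json, "art")   # "art" appears inside "cart"
exact_only = has_tag_parsed(tags_json, "art")
```

If tag names can overlap in this way, one workaround is to post-filter search results on the client side by parsing `tags_json`.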

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes with tests
  4. Submit a pull request

See CONTRIBUTING.md for details.


License

MIT — see LICENSE


Acknowledgments
