Decentralized vector database for AI agents. Rust HNSW + Solana on-chain provenance.

solvec

Python SDK for VecLabs — decentralized vector memory for AI agents.

Rust HNSW search engine. Solana on-chain Merkle proofs. Pinecone-compatible API.

pip install solvec

What this is

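A vector database SDK designed to store your embeddings on decentralized storage, post a cryptographic Merkle root to Solana after every write, and serve queries through a Rust HNSW engine at sub-5 ms p99 latency. In the current alpha release, vectors are held in memory and on-chain Merkle updates are still in progress; see "Current status" for details.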

If you are currently using Pinecone, the upsert/query API intentionally mirrors Pinecone's. Migrating takes three line changes.


Quick start

from solvec import SolVec

sv = SolVec(network="devnet")
collection = sv.collection("agent-memory", dimensions=768)

# Store vectors
collection.upsert([
    {
        "id": "mem_001",
        "values": [...],  # your embedding; its length should match the collection's dimensions (768 here)
        "metadata": {"text": "User is Alex, building a fintech startup"}
    }
])

# Search by similarity
results = collection.query(vector=[...], top_k=5)

for match in results.matches:
    print(match.id, match.score, match.metadata)

# Verify collection integrity against on-chain Merkle root
proof = collection.verify()
print(proof.solana_explorer_url)

Migrating from Pinecone

# Before
from pinecone import Pinecone
pc = Pinecone(api_key="YOUR_KEY")
index = pc.Index("my-index")

# After — change 3 lines
from solvec import SolVec
sv = SolVec(wallet="~/.config/solana/id.json")
index = sv.collection("my-index")

# Everything below is identical
index.upsert([{"id": "vec_001", "values": [...], "metadata": {}}])
results = index.query(vector=[...], top_k=10)

# New — Pinecone has no equivalent
proof = index.verify()
print(proof.solana_explorer_url)

API Reference

SolVec(network, wallet, rpc_url)

Creates a new SolVec client.

sv = SolVec(
    network="devnet",                          # "mainnet-beta" | "devnet" | "localnet"
    wallet="~/.config/solana/id.json",         # optional — required for on-chain writes
    rpc_url="https://...",                     # optional — custom RPC endpoint
)

sv.collection(name, dimensions, metric)

Returns a SolVecCollection instance. Equivalent to Pinecone's Index().

collection = sv.collection(
    "my-collection",
    dimensions=768,      # default: 1536
    metric="cosine",     # "cosine" | "euclidean" | "dot" — default: "cosine"
)
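
To make the three metric names concrete, here is a minimal pure-Python sketch of the underlying math. It is an illustration only: the function names are hypothetical, and how solvec turns these quantities into the score returned by query() (for example, whether a Euclidean distance is inverted so that higher means closer) is not stated in this README.

```python
import math


def dot(a, b):
    """Dot product: sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot(a, b) / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between a and b (smaller means closer)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Cosine ignores vector length, dot product rewards both direction and magnitude, and Euclidean distance measures absolute separation, so the right choice depends on whether your embeddings are normalized.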

collection.upsert(vectors)

Insert or update vectors. If a record with the same id already exists, it is overwritten.

Accepts either a list of dicts or a list of UpsertRecord dataclasses.

from solvec import UpsertRecord

# Dict style (matches Pinecone exactly)
collection.upsert([
    {
        "id": "vec_001",
        "values": [0.1, 0.2, ...],
        "metadata": {
            "text": "source text",
            "timestamp": 1709123456,
            "category": "memory"
        }
    }
])

# Dataclass style
collection.upsert([
    UpsertRecord(
        id="vec_001",
        values=[0.1, 0.2, ...],
        metadata={"text": "source text"}
    )
])

# Returns: UpsertResponse(upserted_count=1)
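
The overwrite-on-same-id behavior described above can be pictured with a plain dictionary. This is a conceptual sketch, not solvec's implementation; the `upsert` function and `store` dict below are hypothetical stand-ins.

```python
def upsert(store: dict, records: list[dict]) -> int:
    """Insert or overwrite records keyed by id; return how many were processed."""
    for record in records:
        store[record["id"]] = {
            "values": list(record["values"]),
            "metadata": dict(record.get("metadata", {})),
        }
    return len(records)


store: dict = {}
upsert(store, [{"id": "vec_001", "values": [0.1, 0.2], "metadata": {"text": "old"}}])
# Same id again: the earlier record is replaced, not duplicated.
count = upsert(store, [{"id": "vec_001", "values": [0.3, 0.4], "metadata": {"text": "new"}}])
```

After both calls the store still holds a single entry for "vec_001", carrying the second set of values and metadata.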

collection.query(vector, top_k, filter, include_metadata, include_values)

Search for nearest neighbors by vector similarity.

results = collection.query(
    vector=[0.1, 0.2, ...],      # required — query embedding
    top_k=10,                    # required — number of results
    filter={"category": "memory"},  # optional — metadata filter
    include_metadata=True,       # optional — default: True
    include_values=False,        # optional — default: False
)

# results.matches is a list of QueryMatch, sorted by score descending
for match in results.matches:
    print(match.id, match.score, match.metadata)
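
For intuition about what "top_k results sorted by score descending" means, the sketch below runs an exact brute-force search with cosine similarity. It is a reference baseline under assumed names (`brute_force_query`, `cosine`), not the Rust HNSW engine, which is an approximate index built to avoid scanning every vector.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity; 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def brute_force_query(records, vector, top_k):
    """Score every record against the query and keep the top_k highest scores."""
    scored = ((cosine(r["values"], vector), r["id"]) for r in records)
    # nlargest returns its results ordered from highest to lowest score.
    return [(rid, score) for score, rid in heapq.nlargest(top_k, scored)]


records = [
    {"id": "a", "values": [1.0, 0.0]},
    {"id": "b", "values": [0.0, 1.0]},
    {"id": "c", "values": [1.0, 1.0]},
]
matches = brute_force_query(records, [1.0, 0.1], top_k=2)
```

A baseline like this is also handy for spot-checking recall: on a small sample, the ids returned by an approximate index should largely agree with the exact result.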

collection.delete(ids)

Delete vectors by ID.

collection.delete(["vec_001", "vec_002"])

collection.fetch(ids)

Fetch specific vectors by ID.

result = collection.fetch(["vec_001"])
print(result["vectors"]["vec_001"]["values"])

collection.describe_index_stats()

Get collection statistics.

stats = collection.describe_index_stats()
# CollectionStats(
#   vector_count=1000,
#   dimension=768,
#   metric=<DistanceMetric.COSINE: 'cosine'>,
#   name='my-collection',
#   merkle_root='a3f9b2...',
#   last_updated=1709123456,
#   is_frozen=False
# )

collection.verify()

Verify collection integrity against the on-chain Merkle root.

proof = collection.verify()
# VerificationResult(
#   verified=True,
#   on_chain_root='a3f9b2...',
#   local_root='a3f9b2...',
#   match=True,
#   vector_count=1000,
#   solana_explorer_url='https://explorer.solana.com/...',
#   timestamp=1709123456000
# )
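
The idea behind comparing local_root with on_chain_root can be sketched with a small Merkle tree built from SHA-256. Everything here is an assumption made for illustration: the README does not specify how VecLabs serializes records, orders leaves, or handles odd-sized levels, so `leaf_hash` and `merkle_root` are hypothetical helpers and will not reproduce the roots solvec computes.

```python
import hashlib
import json


def leaf_hash(record):
    """Hash one record using a canonical JSON encoding (assumed format)."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).digest()


def merkle_root(records):
    """Return the hex Merkle root of the records, ordered by id."""
    level = [leaf_hash(r) for r in sorted(records, key=lambda r: r["id"])]
    if not level:
        return hashlib.sha256(b"").hexdigest()
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()


records = [
    {"id": "a", "values": [1.0, 0.0], "metadata": {}},
    {"id": "b", "values": [0.0, 1.0], "metadata": {}},
    {"id": "c", "values": [1.0, 1.0], "metadata": {}},
]
local_root = merkle_root(records)
```

Because every record feeds into the root, changing a single value changes the root, so a published root that equals the locally recomputed one indicates the collection has not been altered since that root was posted.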

Integration examples

LangChain

from solvec import SolVec
from langchain_openai import OpenAIEmbeddings

sv = SolVec(network="mainnet-beta")
collection = sv.collection("langchain-docs", dimensions=1536)

embeddings = OpenAIEmbeddings()

# Store document embeddings
texts = ["VecLabs is a decentralized vector DB", "Built on Solana", "Rust HNSW core"]
vectors = embeddings.embed_documents(texts)

collection.upsert([
    {"id": f"doc_{i}", "values": v, "metadata": {"text": t}}
    for i, (v, t) in enumerate(zip(vectors, texts))
])

# Query
query_vector = embeddings.embed_query("What is VecLabs?")
results = collection.query(vector=query_vector, top_k=3)

for match in results.matches:
    print(f"{match.score:.3f}  {match.metadata['text']}")

AI agent persistent memory

import time

from solvec import SolVec

sv = SolVec(network="mainnet-beta", wallet="~/.config/solana/id.json")
memory = sv.collection("agent-memory", dimensions=768)

def remember(text: str, embedding: list[float]) -> None:
    memory.upsert([{
        "id": f"mem_{int(time.time() * 1000)}",
        "values": embedding,
        "metadata": {"text": text, "timestamp": int(time.time())}
    }])

def recall(query_embedding: list[float], limit: int = 5) -> list[str]:
    results = memory.query(vector=query_embedding, top_k=limit)
    return [m.metadata.get("text", "") for m in results.matches]

def audit() -> None:
    proof = memory.verify()
    print(f"Memory verified: {proof.match}")
    print(f"On-chain proof: {proof.solana_explorer_url}")
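
One caveat with the millisecond-timestamp ids used in remember(): two memories written within the same millisecond would share an id, and since upsert overwrites on a matching id, the first would be lost. A possible workaround, sketched here with a hypothetical helper name, is to append a random suffix.

```python
import time
import uuid


def new_memory_id(prefix: str = "mem") -> str:
    """Build an id from the current time in ms plus a random UUID4 hex suffix."""
    return f"{prefix}_{int(time.time() * 1000)}_{uuid.uuid4().hex}"


# Ids generated in a tight loop stay distinct even within one millisecond.
ids = [new_memory_id() for _ in range(1000)]
```

The timestamp prefix keeps ids roughly sortable by creation time, while the suffix removes the dependence on clock resolution.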

Metadata filtering

collection.upsert([
    {"id": "a", "values": [...], "metadata": {"type": "fact", "source": "user"}},
    {"id": "b", "values": [...], "metadata": {"type": "memory", "source": "agent"}},
])

# Only return facts
results = collection.query(
    vector=[...],
    top_k=5,
    filter={"type": "fact"}
)
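
The example above suggests exact-match semantics: a record is returned only if its metadata equals the filter on every listed key. A minimal sketch of that rule follows. This is an assumption about behavior, `matches_filter` is a hypothetical helper, and whether solvec also supports Pinecone-style operators such as $in or $gt is not stated here.

```python
def matches_filter(metadata: dict, flt: dict) -> bool:
    """True if every key in the filter is present in metadata with an equal value."""
    return all(key in metadata and metadata[key] == value for key, value in flt.items())


records = [
    {"id": "a", "metadata": {"type": "fact", "source": "user"}},
    {"id": "b", "metadata": {"type": "memory", "source": "agent"}},
]
facts = [r["id"] for r in records if matches_filter(r["metadata"], {"type": "fact"})]
```

Under this rule an empty filter matches every record, and a filter with several keys requires all of them to match.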

Using dataclasses

from solvec import SolVec
from solvec.types import DistanceMetric, UpsertRecord

sv = SolVec(network="devnet")
collection = sv.collection("typed-collection", dimensions=3, metric=DistanceMetric.EUCLIDEAN)

collection.upsert([
    UpsertRecord(id="a", values=[1.0, 0.0, 0.0], metadata={"label": "x-axis"}),
    UpsertRecord(id="b", values=[0.0, 1.0, 0.0], metadata={"label": "y-axis"}),
])

results = collection.query(vector=[1.0, 0.1, 0.0], top_k=1)
assert results.matches[0].id == "a"

Current status

This is alpha software. The API surface is stable.

Feature                              Status
upsert / query / delete / fetch      Working
Cosine, euclidean, dot product       Working
Metadata filtering                   Working
Merkle root computation              Working
verify()                             Working (local computation)
Solana on-chain Merkle updates       In progress
Shadow Drive persistence             In progress (in-memory for now)

Vectors are currently stored in-memory. Persistent decentralized storage via Shadow Drive ships in v0.2.0.


Requirements

  • Python 3.10+
  • No Solana wallet required for basic upsert/query
  • Solana wallet required for on-chain verify() (coming in v0.2.0)

License

MIT
