
Remote vector database agent skills (Pinecone + Chroma + Weaviate) for Concinno — thin adapters for agent upsert/query. Concinno core's ZIQRetrieval + FTRL remains the full retrieval stack; this is the outbound-to-vendor slot.

concinno-skills-vector

Remote vector database skills for Concinno. Pinecone + Chroma + Weaviate via their official Python SDKs — thin call(**kwargs) adapters for agents that need to upsert/query a vendor-hosted vector store.

Status

0.2.0 — six tools covering the canonical "talk to a vector DB" agent need. Concinno core already ships ZIQRetrieval — a full BM25 + dense + FTRL online-learning retrieval engine for local knowledge; this sub-package is the orthogonal "read/write external vendor index" slot.

New in 0.2.0: WeaviateQuery accepts MongoDB-style dict filters ($eq / $ne / $gt / $gte / $lt / $lte / $in / $and / $or) and translates them to Weaviate v4's typed Filter DSL server-side:

WeaviateQuery().call(
    action="query",
    collection="Articles",
    vector=[...],
    filter={"category": "ml", "year": {"$gte": 2024}},
    # → Filter.all_of([by("category").equal("ml"),
    #                  by("year").greater_or_equal(2024)])
    weaviate_url="https://...",
    weaviate_api_key="wk-...",
)
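The translation step can be pictured with a small, dependency-free sketch. It is not the package's actual translator: the function name `translate` and the nested-tuple output are hypothetical, chosen so the example runs without a Weaviate install. The operation names mirror the comparison methods of the Weaviate v4 client's property filters, and mapping `$in` to `contains_any` is an assumption.

```python
from typing import Any

# Mongo-style operators mapped to Weaviate v4 filter method names.
# The "$in" -> "contains_any" mapping is an assumption for illustration.
_COMPARISONS = {
    "$eq": "equal",
    "$ne": "not_equal",
    "$gt": "greater_than",
    "$gte": "greater_or_equal",
    "$lt": "less_than",
    "$lte": "less_or_equal",
    "$in": "contains_any",
}


def translate(filter_: dict[str, Any]) -> tuple:
    """Turn a Mongo-style filter dict into a nested tuple tree (hypothetical form)."""
    clauses = []
    for key, value in filter_.items():
        if key in ("$and", "$or"):
            combinator = "all_of" if key == "$and" else "any_of"
            clauses.append((combinator, [translate(sub) for sub in value]))
        elif key.startswith("$"):
            # Top-level operators other than $and / $or are rejected loudly.
            raise ValueError(f"unsupported operator: {key}")
        elif isinstance(value, dict):
            for op, operand in value.items():
                if op not in _COMPARISONS:
                    raise ValueError(f"unsupported operator: {op}")
                clauses.append((_COMPARISONS[op], key, operand))
        else:
            # A bare value is shorthand for $eq.
            clauses.append(("equal", key, value))
    if not clauses:
        raise ValueError("empty filter")
    # One clause needs no wrapper; several are AND-ed together.
    return clauses[0] if len(clauses) == 1 else ("all_of", clauses)


tree = translate({"category": "ml", "year": {"$gte": 2024}})
```

Raising on unknown operators, rather than skipping them, matches the stated behaviour that unsupported operators surface as an error instead of being silently dropped.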
Tool            Library                  License     Purpose
PineconeUpsert  pinecone (v5+)           Apache-2.0  Upsert vectors into a Pinecone index
PineconeQuery   pinecone (v5+)           Apache-2.0  Query a Pinecone index by vector
ChromaAdd       chromadb (v0.5+)         Apache-2.0  Add vectors to a Chroma collection (local or remote)
ChromaQuery     chromadb (v0.5+)         Apache-2.0  Query a Chroma collection by vector
WeaviateAdd     weaviate-client (v4.9+)  BSD-3       Insert vectors into a Weaviate v4 collection
WeaviateQuery   weaviate-client (v4.9+)  BSD-3       Query a Weaviate v4 collection by vector

Install

pip install concinno-skills-vector

All three vendor SDKs come in as hard dependencies. Consumers who only need one vendor path can install with --no-deps and pick the SDK they want.

Scope vs Concinno core

Concinno core's concinno.rag.ZIQRetrieval is a full retrieval stack — BM25 + dense embedding + FTRL online-learning routing between the two arms, with persistence, reranking, and per-namespace SPS (Structural Prior SOTA) tuning. It targets the "I want an intelligent retriever" slot.

This sub-package is an outbound adapter — the "my agent already uses Pinecone / Chroma / Weaviate as its vector store, let the agent upsert / query it directly" slot. No embedding model, no scoring opinion, no router.

The two can coexist in the same ToolRegistry; they do not overlap.

Why this sub-package vs LangChain retrievers

LangChain retrievers are read-only: they have no online learning, and per-query routing is pluggable but not adaptive out of the box. Concinno's own ZIQRetrieval in core has the FTRL online-learning router; this sub-package just provides clean agent-callable access to the three most common vendor stores without reinventing the router.

Safety

Every tool routes through a shared _safety module before touching the vendor:

  1. top_k cap: query defaults to 10 and hard-caps at 1000. Larger pagination must be done caller-side.
  2. Batch cap: upsert / add / insert caps at 10_000 rows per call. Split larger jobs into multiple calls.
  3. Vector dimension consistency: the first row of a batch defines the dimension; all subsequent rows must match. Mixed-dim batches corrupt vendor indexes with no rollback.
  4. Filter type: filters must be a JSON-serialisable dict (or None). String / script / callable filter bodies are rejected at the tool layer. WeaviateQuery now translates MongoDB-style dict filters to Weaviate's typed Filter DSL ($eq / $ne / $gt / $gte / $lt / $lte / $in / $and / $or). Unsupported operators ($not / $nin / $regex, etc.) surface as {"error": "..."} rather than being silently dropped.
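The four checks above can be approximated in plain Python. This is a minimal sketch of the idea, not the real _safety module: the function and constant names are hypothetical, and it assumes an over-limit top_k is rejected rather than clamped, which the description does not specify.

```python
import json
from typing import Any, Optional, Sequence

TOP_K_DEFAULT = 10   # default stated above
TOP_K_MAX = 1000     # hard cap stated above
BATCH_MAX = 10_000   # per-call row cap stated above


def check_top_k(top_k: Optional[int] = None) -> int:
    """Return a usable top_k. Assumption: values over the cap are rejected."""
    if top_k is None:
        return TOP_K_DEFAULT
    if isinstance(top_k, bool) or not isinstance(top_k, int) or top_k < 1:
        raise ValueError("top_k must be a positive integer")
    if top_k > TOP_K_MAX:
        raise ValueError(f"top_k {top_k} exceeds the cap of {TOP_K_MAX}")
    return top_k


def check_batch(vectors: Sequence[Sequence[float]]) -> int:
    """Return the shared dimension of a batch, or raise ValueError."""
    if not vectors:
        raise ValueError("batch is empty")
    if len(vectors) > BATCH_MAX:
        raise ValueError(f"batch of {len(vectors)} rows exceeds {BATCH_MAX}")
    dim = len(vectors[0])  # the first row defines the dimension
    for i, row in enumerate(vectors):
        if len(row) != dim:
            raise ValueError(f"row {i} has dimension {len(row)}, expected {dim}")
    return dim


def check_filter(filter_: Any) -> Optional[dict]:
    """Accept None or a JSON-serialisable dict; reject everything else."""
    if filter_ is None:
        return None
    if not isinstance(filter_, dict):
        raise ValueError("filter must be a dict or None")
    try:
        json.dumps(filter_)
    except (TypeError, ValueError) as exc:
        raise ValueError("filter is not JSON-serialisable") from exc
    return filter_
```

In a tool's call method, a ValueError raised by such checks would presumably be caught and returned as {"error": "..."} before any vendor SDK call is made.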

Credentials

No credentials are stored. Each call takes the vendor-specific credential kwargs (api_key, weaviate_url + weaviate_api_key, chroma_host + chroma_port or chroma_persist_dir). Callers who want indirection resolve upstream:

import os
from concinno_skills_vector import PineconeQuery

PineconeQuery().call(
    action="query",
    api_key=os.environ["PINECONE_API_KEY"],
    index="my-index",
    vector=[0.1] * 1536,
    top_k=10,
)

Or via Concinno's CredentialStore:

from concinno.core.credentials import CredentialStore
cs = CredentialStore()
key = cs.resolve({"$ref": "env:PINECONE_API_KEY"})

Usage via Concinno ToolRegistry

When the consumer sets CONCINNO_LOAD_PLUGINS=1, the default registry auto-mounts all six tools:

import os
os.environ["CONCINNO_LOAD_PLUGINS"] = "1"

from concinno.tools.registry import get_default_registry
reg = get_default_registry()
expected = {
    "PineconeUpsert", "PineconeQuery",
    "ChromaAdd", "ChromaQuery",
    "WeaviateAdd", "WeaviateQuery",
}
assert expected.issubset(set(reg.list_deferred()))

Direct Python usage

from concinno_skills_vector import (
    PineconeUpsert, PineconeQuery,
    ChromaAdd, ChromaQuery,
    WeaviateAdd, WeaviateQuery,
)

# ── Pinecone ──────────────────────────────────────────────────────
PineconeUpsert().call(
    action="upsert",
    api_key="pk-...",
    index="products",
    vectors=[[0.1, 0.2, ...], [0.3, 0.4, ...]],
    ids=["sku-1", "sku-2"],
    metadata=[{"category": "book"}, {"category": "book"}],
    namespace="prod",
)
# → {"ok": True, "upserted": 2}

PineconeQuery().call(
    action="query",
    api_key="pk-...",
    index="products",
    vector=[0.1, 0.2, ...],
    top_k=5,
    filter={"category": {"$eq": "book"}},
)
# → {"matches": [{"id": "sku-1", "score": 0.95, "metadata": {...}}, ...]}

# ── Chroma ────────────────────────────────────────────────────────
ChromaAdd().call(
    action="add",
    collection="docs",
    vectors=[[...]],
    ids=["doc-1"],
    metadata=[{"source": "manual"}],
    chroma_persist_dir="~/.myapp/chroma",
)
# → {"ok": True, "added": 1}

ChromaQuery().call(
    action="query",
    collection="docs",
    vector=[...],
    top_k=3,
    chroma_host="localhost",
    chroma_port=8000,
)
# → {"matches": [{"id": "doc-1", "score": 0.12, "metadata": {...}}]}

# ── Weaviate ──────────────────────────────────────────────────────
WeaviateAdd().call(
    action="add",
    collection="Article",
    vectors=[[...], [...]],
    ids=["550e8400-e29b-41d4-a716-446655440000", "..."],
    metadata=[{"title": "..."}, {"title": "..."}],
    weaviate_url="https://my-cluster.weaviate.network",
    weaviate_api_key="wk-...",
)
# → {"ok": True, "added": 2, "failed": 0}

WeaviateQuery().call(
    action="query",
    collection="Article",
    vector=[...],
    top_k=10,
    filter={"category": "ml", "year": {"$gte": 2024}},  # NEW in 0.2.0
    weaviate_url="https://my-cluster.weaviate.network",
    weaviate_api_key="wk-...",
)
# → {"matches": [{"id": "550e8400-...", "score": 0.08, "metadata": {...}}]}

All tools return either {"ok": True, ...} / {"matches": [...]} on success or {"error": "..."} on any validation or vendor-SDK failure — same shape as other Concinno built-in tools.
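Because every call returns a plain dict, a caller can split a job that exceeds the 10_000-row cap and stop at the first {"error": ...}. The helper below is a hedged sketch: `call_in_batches` and `fake_upsert` are hypothetical names, and the stand-in only mimics the success shape shown above so the example runs without vendor credentials.

```python
from typing import Any, Callable, Sequence

BATCH_MAX = 10_000  # per-call row cap from the Safety section


def call_in_batches(
    call: Callable[..., dict],
    vectors: Sequence[Sequence[float]],
    ids: Sequence[str],
    metadata: Sequence[dict],
    batch_size: int = BATCH_MAX,
    **kwargs: Any,
) -> dict:
    """Send rows in slices of at most batch_size; stop on the first error."""
    sent = 0
    for start in range(0, len(vectors), batch_size):
        stop = start + batch_size
        result = call(
            vectors=vectors[start:stop],
            ids=ids[start:stop],
            metadata=metadata[start:stop],
            **kwargs,
        )
        if "error" in result:
            # Earlier slices were already written; nothing is rolled back.
            return {"error": result["error"], "rows_sent": sent}
        sent += len(vectors[start:stop])
    return {"ok": True, "rows_sent": sent}


def fake_upsert(**kwargs: Any) -> dict:
    """Stand-in for a tool's call method, for illustration only."""
    return {"ok": True, "upserted": len(kwargs["vectors"])}


summary = call_in_batches(
    fake_upsert,
    vectors=[[0.0, 1.0]] * 5,
    ids=[f"row-{i}" for i in range(5)],
    metadata=[{} for _ in range(5)],
    batch_size=2,
    action="upsert",
)
```

With a real tool, the first argument would be something like PineconeUpsert().call, with api_key, index, and namespace passed through as keyword arguments.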

What this package is NOT

  • Not a retrieval engine. Use concinno.rag.ZIQRetrieval for BM25 + dense + FTRL routing.
  • Not a migration tool. Index creation / schema changes go through each vendor's admin API, not these Tool call methods.
  • Not an embedding service. Callers supply pre-computed vectors. Embedding models live in their own tools (e.g. Concinno core's ZIQRetrieval owns sentence-transformers).
  • Not a secrets store. Credentials are per-call.
  • Not a full Weaviate Filter DSL translator. WeaviateQuery covers the MongoDB-style operator subset listed above ($eq/$ne/$gt/$gte/$lt/$lte/$in/$and/$or). Weaviate-specific operators without a MongoDB analogue (contains_all, like, within_geo_range, cross-reference filters) require calling the Weaviate SDK directly.

License

Apache-2.0. See LICENSE in the Concinno monorepo.
