traceAI-chromadb

OpenTelemetry instrumentation for ChromaDB vector database.

Installation

pip install traceAI-chromadb

Quick Start

import os
from fi_instrumentation import register
from fi_instrumentation.fi_types import ProjectType
from traceai_chromadb import ChromaDBInstrumentor
import chromadb

# Set up environment
os.environ["FI_API_KEY"] = "<your-api-key>"
os.environ["FI_SECRET_KEY"] = "<your-secret-key>"

# Register tracer
trace_provider = register(
    project_type=ProjectType.OBSERVE,
    project_name="my-rag-app"
)

# Instrument ChromaDB
ChromaDBInstrumentor().instrument(tracer_provider=trace_provider)

# Use ChromaDB as normal - all operations are traced!
client = chromadb.Client()
collection = client.create_collection("my-collection")

# Add documents
collection.add(
    ids=["id1", "id2"],
    documents=["Hello world", "Goodbye world"],
    metadatas=[{"source": "doc1"}, {"source": "doc2"}]
)

# Query
results = collection.query(
    query_texts=["Hello"],
    n_results=5
)
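
A short sketch of reading the query output. Chroma's `query` returns one inner list per query text, so the hits for the single query above sit at index 0. The dictionary below is a hand-written stand-in with invented distances so the snippet runs without a database; `first_query_hits` is a hypothetical helper, and with a real collection you would pass it the `results` value from the call above.

```python
# Stand-in for the value returned by collection.query(); the distances
# are invented for illustration and are not real output.
results = {
    "ids": [["id1", "id2"]],
    "documents": [["Hello world", "Goodbye world"]],
    "distances": [[0.12, 0.87]],
}


def first_query_hits(results):
    """Pair up ids, documents, and distances for the first query text."""
    return list(
        zip(results["ids"][0], results["documents"][0], results["distances"][0])
    )


for doc_id, document, distance in first_query_hits(results):
    print(f"{doc_id}: {document!r} (distance={distance})")
```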

Instrumented Operations

| Operation | Span Name | Description |
|---|---|---|
| add | chroma add | Add documents/embeddings |
| query | chroma query | Semantic search |
| get | chroma get | Retrieve by ID or filter |
| update | chroma update | Update existing documents |
| upsert | chroma upsert | Insert or update |
| delete | chroma delete | Delete documents |
| count | chroma count | Get collection count |
| peek | chroma peek | Preview collection |
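
Going by the table, each span name is the prefix `chroma` followed by the operation name. A small sketch of that naming rule, which may help when filtering exported spans; `expected_span_name` and `INSTRUMENTED_OPERATIONS` are hypothetical names written for this example and are not part of traceAI-chromadb.

```python
# Operations listed in the table above.
INSTRUMENTED_OPERATIONS = (
    "add", "query", "get", "update", "upsert", "delete", "count", "peek",
)


def expected_span_name(operation: str) -> str:
    """Return the span name the table lists for a collection operation."""
    if operation not in INSTRUMENTED_OPERATIONS:
        raise ValueError(f"not an instrumented operation: {operation!r}")
    return f"chroma {operation}"


print(expected_span_name("query"))  # chroma query
```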

Span Attributes

Common Attributes

| Attribute | Type | Description |
|---|---|---|
| db.system | string | Always "chroma" |
| db.operation.name | string | Operation name |
| db.namespace | string | Collection name |
| db.vector.collection.name | string | Collection name |

Query Attributes

| Attribute | Type | Description |
|---|---|---|
| db.vector.query.top_k | int | n_results parameter |
| db.vector.query.filter | string | JSON where clause |
| db.vector.query.where_document | string | Document filter |
| db.vector.query.include | string | Included fields |
| db.vector.query.type | string | "embedding" or "text" |
| db.vector.results.count | int | Number of results |
| db.vector.results.ids | string | JSON result IDs |
| db.vector.results.scores | string | JSON distances |
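
To make the mapping concrete, here is a rough sketch of how `query()` arguments and a result dictionary could line up with the attribute names above. It is not the instrumentor's actual code: `query_attributes` is a hypothetical helper, and serializing `where_document` and `include` as JSON, as well as counting results across all query batches, are assumptions made for the example.

```python
import json


def query_attributes(n_results, where=None, where_document=None,
                     include=None, query_embeddings=None, result=None):
    """Map query() arguments and its result onto the attribute names above."""
    attrs = {
        "db.vector.query.top_k": n_results,
        # "embedding" when raw vectors are passed, otherwise "text".
        "db.vector.query.type": (
            "embedding" if query_embeddings is not None else "text"
        ),
    }
    if where is not None:
        attrs["db.vector.query.filter"] = json.dumps(where)
    if where_document is not None:
        # Assumption: serialized as JSON like the where clause.
        attrs["db.vector.query.where_document"] = json.dumps(where_document)
    if include is not None:
        # Assumption: serialized as a JSON list of field names.
        attrs["db.vector.query.include"] = json.dumps(include)
    if result is not None:
        ids = result.get("ids") or []
        # Assumption: count hits across every query batch.
        attrs["db.vector.results.count"] = sum(len(batch) for batch in ids)
        attrs["db.vector.results.ids"] = json.dumps(ids)
        if result.get("distances") is not None:
            attrs["db.vector.results.scores"] = json.dumps(result["distances"])
    return attrs


attrs = query_attributes(
    n_results=5,
    where={"source": "doc1"},
    result={"ids": [["id1"]], "distances": [[0.25]]},  # invented values
)
print(attrs["db.vector.query.filter"])  # {"source": "doc1"}
```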

Add/Upsert Attributes

| Attribute | Type | Description |
|---|---|---|
| db.vector.upsert.count | int | Number of items |
| db.vector.upsert.dimensions | int | Embedding dimensions |
| db.vector.documents.count | int | Number of documents |
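
As an illustration of what these three values describe, the sketch below derives them from the arguments of an `add` or `upsert` call. `upsert_attributes` is a hypothetical helper, and reading the dimension from the first embedding is an assumption; the instrumentor may compute it differently.

```python
def upsert_attributes(ids, embeddings=None, documents=None):
    """Derive the add/upsert attributes from the call's arguments."""
    attrs = {"db.vector.upsert.count": len(ids)}
    if embeddings:
        # Assumption: all embeddings share the first one's length.
        attrs["db.vector.upsert.dimensions"] = len(embeddings[0])
    if documents is not None:
        attrs["db.vector.documents.count"] = len(documents)
    return attrs


attrs = upsert_attributes(
    ids=["doc1"],
    embeddings=[[0.1] * 384],
    documents=["Important document content"],
)
print(attrs)
```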

Examples

Persistent Client

from fi_instrumentation import register
from traceai_chromadb import ChromaDBInstrumentor
import chromadb

# Register and instrument
trace_provider = register(project_name="chroma-persistent")
ChromaDBInstrumentor().instrument(tracer_provider=trace_provider)

# Use persistent client
client = chromadb.PersistentClient(path="/path/to/db")
collection = client.get_or_create_collection("documents")

# All operations are traced
collection.add(
    ids=["doc1"],
    documents=["Important document content"],
    embeddings=[[0.1] * 384]  # Your embeddings
)

With Embedding Function

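import chromadb
from chromadb.utils import embedding_functions

# Assumes ChromaDB has already been instrumented as in the examples above
client = chromadb.Client()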

# Use with OpenAI embeddings
openai_ef = embedding_functions.OpenAIEmbeddingFunction(
    api_key="sk-...",
    model_name="text-embedding-3-small"
)

collection = client.create_collection(
    name="openai-docs",
    embedding_function=openai_ef
)

# Add documents - embeddings generated automatically
collection.add(
    ids=["id1", "id2"],
    documents=["Document one", "Document two"]
)

# Query with text - embedding generated automatically
results = collection.query(
    query_texts=["search query"],
    n_results=5
)

RAG with LangChain

from fi_instrumentation import register
from traceai_chromadb import ChromaDBInstrumentor
from traceai_langchain import LangChainInstrumentor

# Instrument both
trace_provider = register(project_name="rag-langchain")
ChromaDBInstrumentor().instrument(tracer_provider=trace_provider)
LangChainInstrumentor().instrument(tracer_provider=trace_provider)

# Use LangChain with Chroma
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings

vectorstore = Chroma(
    collection_name="langchain-docs",
    embedding_function=OpenAIEmbeddings()
)

# Add documents
vectorstore.add_texts(
    texts=["Document 1", "Document 2"],
    metadatas=[{"source": "web"}, {"source": "pdf"}]
)

# Search - all operations traced
docs = vectorstore.similarity_search("query", k=5)

Testing

ChromaDB can run entirely in memory, so the tests need no external database.
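
A minimal, stdlib-only sketch of the kind of check a test can make on captured spans. It assumes span objects expose `name` and `attributes`, as OpenTelemetry SDK spans do, and uses `SimpleNamespace` stand-ins rather than real spans; `find_chroma_span` is a hypothetical helper and is not taken from the package's test suite.

```python
from types import SimpleNamespace


def find_chroma_span(spans, operation, collection_name):
    """Return the first span matching the operation and collection, else None."""
    for span in spans:
        attrs = span.attributes or {}
        if (
            span.name == f"chroma {operation}"
            and attrs.get("db.system") == "chroma"
            and attrs.get("db.vector.collection.name") == collection_name
        ):
            return span
    return None


# Stand-in for spans captured by an in-memory exporter during a test run.
captured = [
    SimpleNamespace(
        name="chroma add",
        attributes={
            "db.system": "chroma",
            "db.vector.collection.name": "my-collection",
        },
    ),
]

assert find_chroma_span(captured, "add", "my-collection") is not None
assert find_chroma_span(captured, "query", "my-collection") is None
```

In a real test, `captured` would come from whatever span exporter the test configures on the tracer provider.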

Unit Tests

cd python/frameworks/chromadb
pip install pytest chromadb
pytest tests/ -v

E2E Integration Tests

cd python/frameworks/tests_e2e
pytest test_chromadb_e2e.py -v

Example

Run the semantic search example:

cd examples
python semantic_search.py

License

Apache License 2.0

