sie-chroma

SIE integration for ChromaDB.

Installation

pip install sie-chroma

Features

  • SIEEmbeddingFunction: Custom embedding function for ChromaDB collections
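For orientation, a ChromaDB embedding function is a callable that receives a batch of documents and returns one vector per document. The sketch below shows only that shape. `ToyEmbeddingFunction` is a hypothetical stand-in invented for illustration; it does not call an SIE server and is not how SIEEmbeddingFunction is implemented.

```python
import hashlib
from typing import List


class ToyEmbeddingFunction:
    """Hypothetical stand-in that mimics the embedding-function call shape."""

    def __init__(self, dim: int = 8) -> None:
        # A SHA-256 digest has 32 bytes, so this toy supports at most 32 dimensions.
        if not 1 <= dim <= 32:
            raise ValueError("dim must be between 1 and 32")
        self.dim = dim

    def __call__(self, input: List[str]) -> List[List[float]]:
        # One vector per input document, in the same order as the input.
        vectors = []
        for text in input:
            digest = hashlib.sha256(text.encode("utf-8")).digest()
            # Map the first `dim` bytes to floats in [0, 1]; these carry no meaning.
            vectors.append([byte / 255.0 for byte in digest[: self.dim]])
        return vectors
```

The vectors are deterministic but semantically meaningless, so a stand-in like this is only suitable for checking plumbing, not retrieval quality.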

Quick Start

Basic Usage

import chromadb
from sie_chroma import SIEEmbeddingFunction

# Create SIE embedding function
embedding_function = SIEEmbeddingFunction(
    base_url="http://localhost:8080",
    model="BAAI/bge-m3",
)

# Create ChromaDB client and collection
client = chromadb.Client()
collection = client.create_collection(
    name="my_collection",
    embedding_function=embedding_function,
)

# Add documents (embeddings are generated automatically)
collection.add(
    documents=[
        "Machine learning enables pattern recognition.",
        "Deep learning uses neural networks.",
        "Natural language processing analyzes text.",
    ],
    ids=["doc1", "doc2", "doc3"],
)

# Query the collection
results = collection.query(
    query_texts=["What is deep learning?"],
    n_results=2,
)
print(results["documents"])
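The query result is a dictionary of parallel lists with one inner list per query text, so `results["documents"][0]` holds the matches for the first query. A minimal sketch for turning that structure into one record per match follows. `flatten_query_results` is a hypothetical helper, and `sample_results` is hand-written with placeholder distances to show the shape, not real output.

```python
from typing import Any, Dict, List

# Maps result keys to the per-row field name used in the flattened output.
_COLUMNS = {"documents": "document", "metadatas": "metadata", "distances": "distance"}


def flatten_query_results(results: Dict[str, Any], query_index: int = 0) -> List[Dict[str, Any]]:
    """Return one dict per match for the query at `query_index`."""
    rows = []
    for position, doc_id in enumerate(results["ids"][query_index]):
        row: Dict[str, Any] = {"id": doc_id}
        for key, field in _COLUMNS.items():
            column = results.get(key)
            # A column can be absent or None if it was not requested.
            if column is not None:
                row[field] = column[query_index][position]
        rows.append(row)
    return rows


# Hand-written sample in the shape of a query result (placeholder distances).
sample_results = {
    "ids": [["doc2", "doc3"]],
    "documents": [[
        "Deep learning uses neural networks.",
        "Natural language processing analyzes text.",
    ]],
    "metadatas": None,
    "distances": [[0.1, 0.2]],
}

rows = flatten_query_results(sample_results)
```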

With Persistent Storage

import chromadb
from sie_chroma import SIEEmbeddingFunction

# Persistent client
client = chromadb.PersistentClient(path="./chroma_data")

embedding_function = SIEEmbeddingFunction(
    base_url="http://localhost:8080",
    model="BAAI/bge-m3",
)

# Get or create collection
collection = client.get_or_create_collection(
    name="research_papers",
    embedding_function=embedding_function,
)

# Add documents with metadata
collection.add(
    documents=["Paper about transformers...", "Study on attention mechanisms..."],
    metadatas=[{"year": 2023}, {"year": 2024}],
    ids=["paper1", "paper2"],
)

# Query with metadata filtering
results = collection.query(
    query_texts=["attention in neural networks"],
    n_results=5,
    where={"year": {"$gte": 2023}},
)
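To make the `where` clause concrete, the sketch below evaluates a filter of the same shape against plain metadata dictionaries. It is a simplified local illustration that covers only a few comparison operators; `matches` is a hypothetical helper, not ChromaDB's implementation, and ChromaDB applies the filter itself.

```python
import operator
from typing import Any, Dict

# Subset of comparison operators, for illustration only.
_OPERATORS = {
    "$eq": operator.eq,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}


def matches(metadata: Dict[str, Any], where: Dict[str, Any]) -> bool:
    """Return True if `metadata` satisfies every condition in `where`."""
    for field, condition in where.items():
        if field not in metadata:
            return False
        value = metadata[field]
        if isinstance(condition, dict):
            # Operator form, e.g. {"$gte": 2023}; all operators must hold.
            if not all(_OPERATORS[op](value, operand) for op, operand in condition.items()):
                return False
        elif value != condition:
            # Bare value form, e.g. {"year": 2023}, treated as equality.
            return False
    return True


where = {"year": {"$gte": 2023}}
kept = [m for m in ({"year": 2022}, {"year": 2023}, {"year": 2024}) if matches(m, where)]
```

Under this filter both example papers above (2023 and 2024) qualify, while a hypothetical 2022 entry would be excluded.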

With LangChain or LlamaIndex

The SIEEmbeddingFunction works with ChromaDB's LangChain and LlamaIndex integrations:

# LangChain
from langchain_chroma import Chroma
from sie_chroma import SIEEmbeddingFunction

embedding_function = SIEEmbeddingFunction(model="BAAI/bge-m3")
vectorstore = Chroma(
    collection_name="docs",
    embedding_function=embedding_function,  # Works directly!
)

# LlamaIndex
from llama_index.vector_stores.chroma import ChromaVectorStore

# SIE can also be used via LlamaIndex's SIEEmbedding

SIE Server

Start the SIE server before using this integration. The examples above expect it at http://localhost:8080:

mise run serve -d cpu -p 8080
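Before creating collections, it can help to confirm that something is listening on the server port. The sketch below only checks that a TCP connection can be opened; it assumes nothing about SIE's HTTP endpoints, and `is_port_open` is a hypothetical helper that is not part of sie-chroma.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 8080, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    if is_port_open():
        print("Port 8080 is accepting connections.")
    else:
        print("Nothing is listening on localhost:8080; start the SIE server first.")
```

An open port does not prove that the listener is an SIE server or that the model is loaded, so treat this as a quick sanity check only.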

Testing

# Unit tests (no server required)
pytest

# Integration tests (requires running server)
pytest -m integration
