SIE integration for ChromaDB

sie-chroma

sie-chroma lets ChromaDB collections generate their embeddings through an SIE server.

Installation

pip install sie-chroma

Features

  • SIEEmbeddingFunction: a custom embedding function for ChromaDB collections that computes embeddings by calling an SIE server
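
To show what "custom embedding function" means in practice, here is a minimal sketch of the callable shape ChromaDB expects: an object whose __call__ takes a list of documents (the parameter is named input) and returns one vector per document. ToyEmbeddingFunction is a hypothetical stand-in written for this illustration; it derives vectors from a hash rather than a model and is not how SIEEmbeddingFunction is implemented.

```python
import hashlib
from typing import List


class ToyEmbeddingFunction:
    """Hypothetical stand-in that mimics the callable shape ChromaDB expects.

    NOT the SIE implementation: vectors come from a hash, not from a model.
    """

    def __init__(self, dim: int = 8) -> None:
        # SHA-256 yields 32 bytes, so this toy supports at most 32 dimensions.
        if not 1 <= dim <= 32:
            raise ValueError("dim must be between 1 and 32")
        self.dim = dim

    def __call__(self, input: List[str]) -> List[List[float]]:
        # One fixed-length vector per input document, in the same order.
        return [self._embed(text) for text in input]

    def _embed(self, text: str) -> List[float]:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        # Map the first `dim` bytes to floats in the range [0, 1].
        return [byte / 255.0 for byte in digest[: self.dim]]


embed = ToyEmbeddingFunction(dim=4)
vectors = embed([
    "Deep learning uses neural networks.",
    "Natural language processing analyzes text.",
])
print(len(vectors), len(vectors[0]))
```

A real embedding function replaces _embed with a call to a model or a server; the input and output shapes stay the same.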

Quick Start

Basic Usage

import chromadb
from sie_chroma import SIEEmbeddingFunction

# Create SIE embedding function
embedding_function = SIEEmbeddingFunction(
    base_url="http://localhost:8080",
    model="BAAI/bge-m3",
)

# Create ChromaDB client and collection
client = chromadb.Client()
collection = client.create_collection(
    name="my_collection",
    embedding_function=embedding_function,
)

# Add documents (embeddings are generated automatically)
collection.add(
    documents=[
        "Machine learning enables pattern recognition.",
        "Deep learning uses neural networks.",
        "Natural language processing analyzes text.",
    ],
    ids=["doc1", "doc2", "doc3"],
)

# Query the collection
results = collection.query(
    query_texts=["What is deep learning?"],
    n_results=2,
)
print(results["documents"])
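
Query results are nested: each key holds one inner list per query text, so results["documents"][0] belongs to the first query. The sketch below flattens that shape into one list of hits per query. flatten_query_results is a hypothetical helper, and the sample dictionary is hand-written for illustration (the distances are made up, not output from a real server).

```python
from typing import Any, Dict, List


def flatten_query_results(results: Dict[str, Any]) -> List[List[Dict[str, Any]]]:
    """Turn Chroma-style nested query results into one list of hits per query."""
    distances = results.get("distances")
    all_hits = []
    for query_index, ids in enumerate(results["ids"]):
        hits = []
        for rank, doc_id in enumerate(ids):
            hits.append({
                "id": doc_id,
                "document": results["documents"][query_index][rank],
                # Distances may be absent if they were not requested.
                "distance": distances[query_index][rank] if distances else None,
            })
        all_hits.append(hits)
    return all_hits


# Hand-written sample in the nested shape; the distance values are invented.
sample = {
    "ids": [["doc2", "doc1"]],
    "documents": [[
        "Deep learning uses neural networks.",
        "Machine learning enables pattern recognition.",
    ]],
    "distances": [[0.21, 0.48]],
}

for hit in flatten_query_results(sample)[0]:
    print(hit["id"], hit["distance"], hit["document"])
```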

With Persistent Storage

import chromadb
from sie_chroma import SIEEmbeddingFunction

# Persistent client
client = chromadb.PersistentClient(path="./chroma_data")

embedding_function = SIEEmbeddingFunction(
    base_url="http://localhost:8080",
    model="BAAI/bge-m3",
)

# Get or create collection
collection = client.get_or_create_collection(
    name="research_papers",
    embedding_function=embedding_function,
)

# Add documents with metadata
collection.add(
    documents=["Paper about transformers...", "Study on attention mechanisms..."],
    metadatas=[{"year": 2023}, {"year": 2024}],
    ids=["paper1", "paper2"],
)

# Query with metadata filtering
results = collection.query(
    query_texts=["attention in neural networks"],
    n_results=5,
    where={"year": {"$gte": 2023}},
)
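
The where argument is a plain dictionary, so filters can be built programmatically. The sketch below composes a year range from the same $gte operator used above plus $lte, combined with $and. year_range_filter is a hypothetical helper name; it returns a bare clause when only one bound is given, since $and is meant for combining two or more clauses.

```python
from typing import Any, Dict, Optional


def year_range_filter(
    min_year: Optional[int] = None,
    max_year: Optional[int] = None,
) -> Optional[Dict[str, Any]]:
    """Build a Chroma-style `where` filter on the `year` metadata field."""
    clauses = []
    if min_year is not None:
        clauses.append({"year": {"$gte": min_year}})
    if max_year is not None:
        clauses.append({"year": {"$lte": max_year}})
    if not clauses:
        return None  # No bounds given: no metadata filtering.
    if len(clauses) == 1:
        return clauses[0]  # A single condition needs no $and wrapper.
    return {"$and": clauses}


print(year_range_filter(min_year=2023, max_year=2024))
```

The returned value can be passed as where=year_range_filter(2023, 2024) in collection.query.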

With LangChain or LlamaIndex

The SIEEmbeddingFunction works with ChromaDB's LangChain and LlamaIndex integrations:

# LangChain
from langchain_chroma import Chroma
from sie_chroma import SIEEmbeddingFunction

embedding_function = SIEEmbeddingFunction(model="BAAI/bge-m3")
vectorstore = Chroma(
    collection_name="docs",
    embedding_function=embedding_function,  # Works directly!
)

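# LlamaIndex
import chromadb
from llama_index.vector_stores.chroma import ChromaVectorStore

# Wrap a Chroma collection as a LlamaIndex vector store
chroma_collection = chromadb.Client().get_or_create_collection(name="docs")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)

# SIE can also be used via LlamaIndex's SIEEmbedding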
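
For the LangChain case, the example above passes the embedding function directly. If a given setup instead expects LangChain's embeddings interface (embed_documents and embed_query), a thin adapter could bridge the two. This is only a sketch under that assumption: ChromaToLangChainEmbeddings and fake_embed are hypothetical names, and fake_embed stands in for a real embedding function so the example runs without a server.

```python
from typing import Callable, List, Sequence


class ChromaToLangChainEmbeddings:
    """Hypothetical adapter: exposes a Chroma-style callable through the
    embed_documents / embed_query methods that LangChain vector stores use."""

    def __init__(self, fn: Callable[[List[str]], Sequence[Sequence[float]]]) -> None:
        self._fn = fn

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # Convert to plain Python floats in case the callable returns arrays.
        return [[float(x) for x in vector] for vector in self._fn(list(texts))]

    def embed_query(self, text: str) -> List[float]:
        # A query is embedded as a one-document batch.
        return self.embed_documents([text])[0]


def fake_embed(input: List[str]) -> List[List[float]]:
    # Placeholder for a real embedding function; not a meaningful embedding.
    return [[float(len(text)), 1.0] for text in input]


adapter = ChromaToLangChainEmbeddings(fake_embed)
print(adapter.embed_query("attention"))
```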

SIE Server

Start the SIE server before using this integration:

mise run serve -d cpu -p 8080
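
Before creating collections, it can help to confirm that something is listening on the server port. The sketch below only attempts a TCP connection; it makes no assumption about SIE's HTTP endpoints and does not verify that the listener is actually an SIE server. is_port_open is a hypothetical helper, and the host and port mirror the base_url used in the examples above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    if is_port_open("localhost", 8080):
        print("Something is listening on localhost:8080")
    else:
        print("Nothing is listening on localhost:8080; start the SIE server first")
```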

Testing

# Unit tests (no server required)
pytest

# Integration tests (requires running server)
pytest -m integration
