SIE integration for ChromaDB

sie-chroma

SIE integration for ChromaDB.

Installation

pip install sie-chroma

Features

  • SIEEmbeddingFunction: Custom embedding function for ChromaDB collections
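To make the term concrete: ChromaDB hands an embedding function a batch of texts and expects one vector back per text. The sketch below shows only that calling shape. `ToyEmbeddingFunction` is a hypothetical stand-in that hashes text locally; it is not how SIEEmbeddingFunction works, and recent ChromaDB versions may expect additional methods beyond `__call__`.

```python
import hashlib
from typing import List


class ToyEmbeddingFunction:
    """Hypothetical stand-in, NOT the SIEEmbeddingFunction implementation."""

    def __init__(self, dim: int = 8) -> None:
        # SHA-256 yields 32 bytes, so this toy supports at most 32 dimensions.
        if not 1 <= dim <= 32:
            raise ValueError("dim must be between 1 and 32")
        self.dim = dim

    def __call__(self, input: List[str]) -> List[List[float]]:
        vectors = []
        for text in input:
            digest = hashlib.sha256(text.encode("utf-8")).digest()
            # Map the first `dim` bytes of the digest to floats in [0, 1].
            vectors.append([byte / 255.0 for byte in digest[: self.dim]])
        return vectors


toy = ToyEmbeddingFunction(dim=4)
vectors = toy([
    "Deep learning uses neural networks.",
    "Natural language processing analyzes text.",
])
```

The hashed vectors carry no semantic meaning; a real embedding function would call a model (here, the SIE server) instead.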

Quick Start

Basic Usage

import chromadb
from sie_chroma import SIEEmbeddingFunction

# Create SIE embedding function
embedding_function = SIEEmbeddingFunction(
    base_url="http://localhost:8080",
    model="BAAI/bge-m3",
)

# Create ChromaDB client and collection
client = chromadb.Client()
collection = client.create_collection(
    name="my_collection",
    embedding_function=embedding_function,
)

# Add documents (embeddings are generated automatically)
collection.add(
    documents=[
        "Machine learning enables pattern recognition.",
        "Deep learning uses neural networks.",
        "Natural language processing analyzes text.",
    ],
    ids=["doc1", "doc2", "doc3"],
)

# Query the collection
results = collection.query(
    query_texts=["What is deep learning?"],
    n_results=2,
)
print(results["documents"])
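ChromaDB returns query results as a dictionary of nested lists, with one inner list per query text. A small helper can pair each id with its document for the first query. This is a minimal sketch: `first_query_hits` is a hypothetical name, and the `sample` dictionary is a hand-written placeholder shaped like a query result, not output from a real run.

```python
from typing import Any, Dict, List, Tuple


def first_query_hits(results: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Pair ids with documents for the first query text."""
    ids = results["ids"][0]
    documents = results["documents"][0]
    return list(zip(ids, documents))


# Placeholder data shaped like a ChromaDB query result (illustration only).
sample = {
    "ids": [["doc2", "doc1"]],
    "documents": [[
        "Deep learning uses neural networks.",
        "Machine learning enables pattern recognition.",
    ]],
}

hits = first_query_hits(sample)
for doc_id, text in hits:
    print(doc_id, text)
```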

With Persistent Storage

import chromadb
from sie_chroma import SIEEmbeddingFunction

# Persistent client
client = chromadb.PersistentClient(path="./chroma_data")

embedding_function = SIEEmbeddingFunction(
    base_url="http://localhost:8080",
    model="BAAI/bge-m3",
)

# Get or create collection
collection = client.get_or_create_collection(
    name="research_papers",
    embedding_function=embedding_function,
)

# Add documents with metadata
collection.add(
    documents=["Paper about transformers...", "Study on attention mechanisms..."],
    metadatas=[{"year": 2023}, {"year": 2024}],
    ids=["paper1", "paper2"],
)

# Query with metadata filtering
results = collection.query(
    query_texts=["attention in neural networks"],
    n_results=5,
    where={"year": {"$gte": 2023}},
)
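To illustrate what a `where` clause such as `{"year": {"$gte": 2023}}` selects, the following toy matcher applies the same comparison to plain dictionaries. It is a sketch for intuition only: `matches` is a hypothetical helper, it covers just the single-level comparison operators, and it is not ChromaDB's filtering code.

```python
import operator
from typing import Any, Dict

# Comparison operators handled by this toy matcher.
_OPS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}


def matches(metadata: Dict[str, Any], where: Dict[str, Any]) -> bool:
    """Return True if `metadata` satisfies every condition in `where`."""
    for field, condition in where.items():
        if field not in metadata:
            return False
        value = metadata[field]
        if isinstance(condition, dict):
            # Operator form, e.g. {"$gte": 2023}.
            for op_name, operand in condition.items():
                if not _OPS[op_name](value, operand):
                    return False
        elif value != condition:
            # Shorthand form, e.g. {"year": 2023}, treated as equality.
            return False
    return True


metadatas = [{"year": 2022}, {"year": 2023}, {"year": 2024}]
kept = [m for m in metadatas if matches(m, {"year": {"$gte": 2023}})]
```

With the two documents added above (years 2023 and 2024), both satisfy the filter, so the query is limited only by `n_results`.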

With LangChain or LlamaIndex

The SIEEmbeddingFunction works with ChromaDB's LangChain and LlamaIndex integrations:

# LangChain
from langchain_chroma import Chroma
from sie_chroma import SIEEmbeddingFunction

embedding_function = SIEEmbeddingFunction(model="BAAI/bge-m3")
vectorstore = Chroma(
    collection_name="docs",
    embedding_function=embedding_function,  # Works directly!
)

# LlamaIndex
from llama_index.vector_stores.chroma import ChromaVectorStore

# SIE can also be used via LlamaIndex's SIEEmbedding

SIE Server

Start the SIE server before using this integration:

mise run serve -d cpu -p 8080
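Before creating the embedding function, it can help to confirm that something is listening on the server port. The sketch below only checks TCP reachability using the standard library; `is_port_open` is a hypothetical helper, and the host and port are assumed from the `base_url` used in the examples. It does not verify that the listener is actually an SIE server.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 8080, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    if is_port_open():
        print("Port 8080 is reachable.")
    else:
        print("Nothing is listening on port 8080; start the SIE server first.")
```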

Testing

# Unit tests (no server required)
pytest

# Integration tests (requires running server)
pytest -m integration
