sie-chroma

SIE integration for ChromaDB.

Installation

pip install sie-chroma

Features

  • SIEEmbeddingFunction: Custom embedding function for ChromaDB collections
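
For orientation, a ChromaDB embedding function is a callable that receives a list of documents and returns one vector per document. The sketch below is a hypothetical stand-in: `ToyEmbeddingFunction` is not part of sie-chroma, and it derives placeholder values from a hash instead of calling a model. It only illustrates the shape of the interface that SIEEmbeddingFunction fills.

```python
import hashlib


class ToyEmbeddingFunction:
    """Hypothetical stand-in showing the callable shape ChromaDB expects."""

    def __init__(self, dim: int = 8) -> None:
        # dim must not exceed 32, the length of a SHA-256 digest in bytes.
        self.dim = dim

    def __call__(self, input: list[str]) -> list[list[float]]:
        # One fixed-length vector per input document, in the same order.
        return [self._embed(text) for text in input]

    def _embed(self, text: str) -> list[float]:
        # Placeholder: derive deterministic values in [0, 1] from a hash.
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return [byte / 255.0 for byte in digest[: self.dim]]


toy = ToyEmbeddingFunction(dim=4)
vectors = toy(["first document", "second document"])
print(len(vectors), len(vectors[0]))  # prints: 2 4
```

Recent ChromaDB releases may expect additional methods on an embedding function beyond `__call__`; the sketch leaves those out.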

Quick Start

Basic Usage

import chromadb
from sie_chroma import SIEEmbeddingFunction

# Create SIE embedding function
embedding_function = SIEEmbeddingFunction(
    base_url="http://localhost:8080",
    model="BAAI/bge-m3",
)

# Create an in-memory ChromaDB client and a collection
# (data is not kept after the process exits)
client = chromadb.Client()
collection = client.create_collection(
    name="my_collection",
    embedding_function=embedding_function,
)

# Add documents (embeddings are generated automatically)
collection.add(
    documents=[
        "Machine learning enables pattern recognition.",
        "Deep learning uses neural networks.",
        "Natural language processing analyzes text.",
    ],
    ids=["doc1", "doc2", "doc3"],
)

# Query the collection
results = collection.query(
    query_texts=["What is deep learning?"],
    n_results=2,
)
print(results["documents"])
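
Query results are nested: each field holds one inner list per query text. The helper below is a small sketch (`top_hits` is a hypothetical name, not part of sie-chroma or ChromaDB) that pairs up ids, documents, and distances for a single query. The example dictionary is written by hand with made-up distances so the snippet runs without a server.

```python
def top_hits(results: dict, query_index: int = 0) -> list[tuple[str, str, float]]:
    """Return (id, document, distance) tuples for one query, closest first."""
    ids = results["ids"][query_index]
    documents = results["documents"][query_index]
    distances = results["distances"][query_index]
    return list(zip(ids, documents, distances))


# Hand-written example with the same nesting as a query result
# (the distance values are invented for illustration).
example = {
    "ids": [["doc2", "doc1"]],
    "documents": [[
        "Deep learning uses neural networks.",
        "Machine learning enables pattern recognition.",
    ]],
    "distances": [[0.21, 0.48]],
}

for doc_id, text, distance in top_hits(example):
    print(f"{doc_id}  {distance:.2f}  {text}")
```

With a live collection, `top_hits(results)` can be called on the value returned by `collection.query(...)`, provided documents and distances are included in the result (the default).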

With Persistent Storage

import chromadb
from sie_chroma import SIEEmbeddingFunction

# Persistent client
client = chromadb.PersistentClient(path="./chroma_data")

embedding_function = SIEEmbeddingFunction(
    base_url="http://localhost:8080",
    model="BAAI/bge-m3",
)

# Get or create collection
collection = client.get_or_create_collection(
    name="research_papers",
    embedding_function=embedding_function,
)

# Add documents with metadata
collection.add(
    documents=["Paper about transformers...", "Study on attention mechanisms..."],
    metadatas=[{"year": 2023}, {"year": 2024}],
    ids=["paper1", "paper2"],
)

# Query with metadata filtering
results = collection.query(
    query_texts=["attention in neural networks"],
    n_results=5,
    where={"year": {"$gte": 2023}},
)
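
To make the `where` filter above concrete, here is a toy evaluator for a subset of the filter syntax (`$eq`, `$ne`, `$gt`, `$gte`, `$lt`, `$lte`, `$and`, `$or`). It is an illustration only, not ChromaDB's implementation; the `matches` function and the choice to treat missing keys as non-matching are assumptions of this sketch.

```python
import operator

# Comparison operators used in ChromaDB `where` filters.
_OPS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}


def matches(metadata: dict, where: dict) -> bool:
    """Toy evaluator for a subset of the `where` syntax (illustration only)."""
    for key, condition in where.items():
        if key == "$and":
            if not all(matches(metadata, sub) for sub in condition):
                return False
        elif key == "$or":
            if not any(matches(metadata, sub) for sub in condition):
                return False
        else:
            # Assumption of this sketch: a missing key never matches.
            if key not in metadata:
                return False
            # A bare value such as {"year": 2023} is shorthand for $eq.
            if not isinstance(condition, dict):
                condition = {"$eq": condition}
            for op, expected in condition.items():
                if not _OPS[op](metadata[key], expected):
                    return False
    return True


papers = [{"year": 2022}, {"year": 2023}, {"year": 2024}]
recent = [m for m in papers if matches(m, {"year": {"$gte": 2023}})]
print(recent)  # prints: [{'year': 2023}, {'year': 2024}]
```

Conditions on several fields can be combined, for example `{"$and": [{"year": {"$gte": 2023}}, {"topic": "nlp"}]}`, where `topic` is a hypothetical metadata key.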

With LangChain or LlamaIndex

The SIEEmbeddingFunction works with ChromaDB's LangChain and LlamaIndex integrations:

# LangChain
from langchain_chroma import Chroma
from sie_chroma import SIEEmbeddingFunction

embedding_function = SIEEmbeddingFunction(model="BAAI/bge-m3")
vectorstore = Chroma(
    collection_name="docs",
    embedding_function=embedding_function,  # Works directly!
)

# LlamaIndex
from llama_index.vector_stores.chroma import ChromaVectorStore

# SIE can also be used via LlamaIndex's SIEEmbedding
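
LangChain vector stores call `embed_documents` and `embed_query` on the embedding object they receive, while a ChromaDB embedding function is a plain callable over a list of texts. If an installed combination of versions does not line up (this is not confirmed here either way), a thin adapter along these lines is one option. `ChromaToLangChainEmbeddings` and `fake_embed` are hypothetical names used for illustration and are not part of sie-chroma.

```python
from typing import Callable


class ChromaToLangChainEmbeddings:
    """Hypothetical adapter: exposes a Chroma-style callable through the
    embed_documents / embed_query methods LangChain expects."""

    def __init__(self, embedding_function: Callable[[list[str]], list]) -> None:
        self._fn = embedding_function

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        # Convert each vector to a plain list of floats (the callable may
        # return NumPy arrays or other sequences).
        return [[float(x) for x in vector] for vector in self._fn(texts)]

    def embed_query(self, text: str) -> list[float]:
        # A query is embedded as a single-item batch.
        return self.embed_documents([text])[0]


# Stand-in callable so the sketch runs without a server; in practice this
# would be an SIEEmbeddingFunction instance.
def fake_embed(texts: list[str]) -> list[list[float]]:
    return [[float(len(t)), 1.0] for t in texts]


adapter = ChromaToLangChainEmbeddings(fake_embed)
print(adapter.embed_query("hello"))  # prints: [5.0, 1.0]
```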

SIE Server

Start the SIE server before using this integration:

mise run serve -d cpu -p 8080
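
Before creating the embedding function, it can help to confirm that something is listening on the server port. The sketch below only opens a TCP connection to the host and port from the base URL; it does not call any SIE endpoint, and `server_reachable` is a hypothetical helper rather than part of sie-chroma.

```python
import socket
from urllib.parse import urlsplit


def server_reachable(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to base_url's host and port succeeds."""
    parts = urlsplit(base_url)
    host = parts.hostname or "localhost"
    # Fall back to the scheme's default port when the URL has none.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


if not server_reachable("http://localhost:8080"):
    print("SIE server is not reachable on port 8080; start it first.")
```

A successful connection shows only that the port is open, not that the model has finished loading.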

Testing

# Unit tests (no server required)
pytest

# Integration tests (requires running server)
pytest -m integration
