SIE integration for ChromaDB
sie-chroma
SIE integration for ChromaDB.
Installation
pip install sie-chroma
Features
- SIEEmbeddingFunction: Custom embedding function for ChromaDB collections
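To make the feature above concrete: ChromaDB treats an embedding function as a callable that receives a list of documents and returns one vector per document. The sketch below is a toy stand-in with that shape, using only the standard library. `ToyEmbeddingFunction` and its hashing scheme are hypothetical and say nothing about how `SIEEmbeddingFunction` computes embeddings, which come from the SIE server.

```python
import hashlib


class ToyEmbeddingFunction:
    """Hypothetical stand-in with the callable shape ChromaDB expects."""

    def __init__(self, dim: int = 8) -> None:
        # SHA-256 yields 32 bytes, so this toy supports dim <= 32.
        self.dim = dim

    def __call__(self, input: list[str]) -> list[list[float]]:
        # Produce one deterministic pseudo-vector per input document.
        vectors = []
        for text in input:
            digest = hashlib.sha256(text.encode("utf-8")).digest()
            # Map the first `dim` bytes to floats in [0, 1].
            vectors.append([byte / 255 for byte in digest[: self.dim]])
        return vectors


toy = ToyEmbeddingFunction(dim=4)
vectors = toy([
    "Deep learning uses neural networks.",
    "Natural language processing analyzes text.",
])
print(len(vectors), len(vectors[0]))
```

The vectors here carry no semantic meaning; the point is only the input and output shape that a collection relies on.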
Quick Start
Basic Usage
import chromadb
from sie_chroma import SIEEmbeddingFunction

# Create SIE embedding function
embedding_function = SIEEmbeddingFunction(
    base_url="http://localhost:8080",
    model="BAAI/bge-m3",
)

# Create ChromaDB client and collection
client = chromadb.Client()
collection = client.create_collection(
    name="my_collection",
    embedding_function=embedding_function,
)

# Add documents (embeddings are generated automatically)
collection.add(
    documents=[
        "Machine learning enables pattern recognition.",
        "Deep learning uses neural networks.",
        "Natural language processing analyzes text.",
    ],
    ids=["doc1", "doc2", "doc3"],
)

# Query the collection
results = collection.query(
    query_texts=["What is deep learning?"],
    n_results=2,
)
print(results["documents"])
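A ChromaDB query result is a dictionary of parallel lists, with one inner list per query text. The sketch below shows one way to pair ids, documents, and distances for the first query. The helper `first_query_hits` is hypothetical, and `sample_results` is hand-written to mimic the result shape; its distances are made up and are not real output from an SIE model.

```python
def first_query_hits(results: dict) -> list[tuple[str, str, float]]:
    """Pair up ids, documents, and distances for the first query text."""
    ids = results["ids"][0]
    documents = results["documents"][0]
    distances = results["distances"][0]
    return list(zip(ids, documents, distances))


# Hand-written example shaped like a query result; the distances are made up.
sample_results = {
    "ids": [["doc2", "doc1"]],
    "documents": [[
        "Deep learning uses neural networks.",
        "Machine learning enables pattern recognition.",
    ]],
    "distances": [[0.21, 0.48]],
}

for doc_id, text, distance in first_query_hits(sample_results):
    print(f"{doc_id}  {distance:.2f}  {text}")
```

With a real collection, the same helper could be called as `first_query_hits(results)` on the value returned by `collection.query(...)`, assuming distances are included in the result.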
With Persistent Storage
import chromadb
from sie_chroma import SIEEmbeddingFunction

# Persistent client
client = chromadb.PersistentClient(path="./chroma_data")
embedding_function = SIEEmbeddingFunction(
    base_url="http://localhost:8080",
    model="BAAI/bge-m3",
)

# Get or create collection
collection = client.get_or_create_collection(
    name="research_papers",
    embedding_function=embedding_function,
)

# Add documents with metadata
collection.add(
    documents=["Paper about transformers...", "Study on attention mechanisms..."],
    metadatas=[{"year": 2023}, {"year": 2024}],
    ids=["paper1", "paper2"],
)

# Query with metadata filtering
results = collection.query(
    query_texts=["attention in neural networks"],
    n_results=5,
    where={"year": {"$gte": 2023}},
)
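The `where` argument above restricts results to documents whose metadata satisfies the filter. As a rough illustration of what `{"year": {"$gte": 2023}}` means, here is a simplified pure-Python evaluator. The `matches` function is hypothetical, covers only a few comparison operators, and is not ChromaDB's implementation; edge cases such as missing fields may behave differently in ChromaDB.

```python
import operator

# Subset of comparison operators; the names follow ChromaDB's `where` syntax.
_OPERATORS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}


def matches(metadata: dict, where: dict) -> bool:
    """Return True if `metadata` satisfies every condition in `where`."""
    for field, condition in where.items():
        if field not in metadata:
            return False
        if not isinstance(condition, dict):
            # A bare value is treated as shorthand for equality.
            condition = {"$eq": condition}
        for op_name, expected in condition.items():
            if not _OPERATORS[op_name](metadata[field], expected):
                return False
    return True


metadatas = [{"year": 2022}, {"year": 2023}, {"year": 2024}]
kept = [m for m in metadatas if matches(m, {"year": {"$gte": 2023}})]
print(kept)
```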
With LangChain or LlamaIndex
The SIEEmbeddingFunction works with ChromaDB's LangChain and LlamaIndex integrations:
# LangChain
from langchain_chroma import Chroma
from sie_chroma import SIEEmbeddingFunction

embedding_function = SIEEmbeddingFunction(model="BAAI/bge-m3")
vectorstore = Chroma(
    collection_name="docs",
    embedding_function=embedding_function,  # Works directly!
)

# LlamaIndex
from llama_index.vector_stores.chroma import ChromaVectorStore

# SIE can also be used via LlamaIndex's SIEEmbedding
SIE Server
Start the SIE server before using this integration:
mise run serve -d cpu -p 8080
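Because the integration depends on a running server, a quick preflight check can give a clearer error than a failed embedding call. The sketch below only tests whether something accepts TCP connections at the `base_url` used in the examples; it does not verify that the listener is actually SIE. The helper name `server_reachable` is hypothetical and is not part of `sie-chroma`.

```python
import socket
from urllib.parse import urlparse


def server_reachable(base_url: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(base_url)
    # Fall back to the scheme's default port when the URL omits one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        return False


if not server_reachable("http://localhost:8080"):
    print("Nothing is listening on http://localhost:8080; start the SIE server first.")
```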
Testing
# Unit tests (no server required)
pytest
# Integration tests (requires running server)
pytest -m integration
File details
Details for the file sie_chroma-0.1.7.tar.gz.
File metadata
- Download URL: sie_chroma-0.1.7.tar.gz
- Upload date:
- Size: 10.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 117dfc37956a38cc99478e90ad1e32958cff6be2e8a8246be3a25bb025f858ed |
| MD5 | bd569c84c03f39d4bee740e3d3bd67ed |
| BLAKE2b-256 | 064d661f8f7d1298ffb9d724a3ce98b1ad890696bee8f52098d254d09de71f11 |
File details
Details for the file sie_chroma-0.1.7-py3-none-any.whl.
File metadata
- Download URL: sie_chroma-0.1.7-py3-none-any.whl
- Upload date:
- Size: 4.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bceaa41358f54553f06c9bd127f593bf0704e806795be80bb774f92b5ed7cb94 |
| MD5 | 1bdc12b2096b0aab1c62cc831ecc32bb |
| BLAKE2b-256 | 5413da37e78dbc83b7f9d82f12984b3d2909623fa7ca27e732e6b07f2bcda9ef |