A collection of experimental Chroma extensions.
ChromaX: An experimental utilities package for the Chroma vector database
Installation
pip install chromadbx
Features
- Query Builder - build queries using a builder pattern
- ID generation - generate IDs for documents
- Embeddings - generate embeddings for your documents:
  - OnnxRuntime embeddings
  - Llama.cpp embeddings
  - Google Vertex AI embeddings
  - Mistral AI embeddings
  - Cloudflare Workers AI embeddings
Usage
Queries
Supported filters:
- $eq - equal to (string, int, float)
- $ne - not equal to (string, int, float)
- $gt - greater than (int, float)
- $gte - greater than or equal to (int, float)
- $lt - less than (int, float)
- $lte - less than or equal to (int, float)
- $in - in (list of strings, ints, floats, bools)
- $nin - not in (list of strings, ints, floats, bools)
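As a rough illustration of the operand types listed above, the following sketch checks a value against the types each operator accepts. `ALLOWED_TYPES`, `LIST_OPERATORS`, and `check_operand` are hypothetical names made up for this example; they are not part of chromadbx or Chroma, and the library may validate operands differently.

```python
# Hypothetical helper for illustration only; not part of chromadbx or Chroma.
SCALARS = (str, int, float)
NUMBERS = (int, float)
LIST_ITEMS = (str, int, float, bool)

# Operand types per operator, taken from the list of supported filters.
ALLOWED_TYPES = {
    "$eq": SCALARS,
    "$ne": SCALARS,
    "$gt": NUMBERS,
    "$gte": NUMBERS,
    "$lt": NUMBERS,
    "$lte": NUMBERS,
}
LIST_OPERATORS = {"$in", "$nin"}


def check_operand(op, value):
    """Return True if `value` has a type the filter list allows for `op`."""
    if op in LIST_OPERATORS:
        # $in / $nin take a list whose items are strings, ints, floats or bools.
        return isinstance(value, list) and all(
            isinstance(item, LIST_ITEMS) for item in value
        )
    if op in ALLOWED_TYPES:
        # Note: bool is a subclass of int in Python, so bools pass this check too.
        return isinstance(value, ALLOWED_TYPES[op])
    raise ValueError(f"unsupported operator: {op}")


print(check_operand("$gt", 3))            # numeric operand for $gt
print(check_operand("$gt", "3"))          # string operand for $gt
print(check_operand("$in", ["a", 1, 2.5]))
```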
Where:
import chromadb
from chromadbx.core.queries import eq, where, ne, and_
client = chromadb.PersistentClient(path="path/to/db")
collection = client.get_collection("collection_name")
collection.query(where=where(and_(eq("a", 1), ne("b", "2"))))
# {'$and': [{'a': ['$eq', 1]}, {'b': ['$ne', '2']}]}
Where Document:
import chromadb
from chromadbx.core.queries import where_document, contains, not_contains, LogicalOperator
client = chromadb.PersistentClient(path="path/to/db")
collection = client.get_collection("collection_name")
collection.query(where_document=where_document(contains("this is a document", "this is another document")))
# {'$and': [{'$contains': 'this is a document'}, {'$contains': 'this is another document'}]}
collection.query(
    where_document=where_document(
        contains("this is a document", "this is another document", op=LogicalOperator.OR)
    )
)
# {'$or': [{'$contains': 'this is a document'}, {'$contains': 'this is another document'}]}
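The comments in the example above show the dictionary shape produced when several `contains` clauses are combined. A minimal sketch of how such a combination could be assembled in plain Python follows. `combine_contains` is a hypothetical stand-in written for illustration, not the chromadbx implementation, and returning a bare clause for a single text is an assumption.

```python
# Hypothetical re-creation of the dictionary shapes shown in the comments above.
# This is not the chromadbx code; it only mirrors the documented output.
def combine_contains(*texts, op="$and"):
    """Build a where_document-style filter from one or more search strings."""
    if not texts:
        raise ValueError("at least one search string is required")
    clauses = [{"$contains": text} for text in texts]
    if len(clauses) == 1:
        # Assumption: a single clause needs no logical operator around it.
        return clauses[0]
    return {op: clauses}


both = combine_contains("this is a document", "this is another document")
either = combine_contains("this is a document", "this is another document", op="$or")
print(both)
print(either)
```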
ID Generation
import chromadb
from chromadbx import IDGenerator
from functools import partial
from typing import Generator
def sequential_generator(start: int = 0) -> Generator[str, None, None]:
    _next = start
    while True:
        yield f"{_next}"
        _next += 1
client = chromadb.Client()
col = client.get_or_create_collection("test")
my_docs = [f"Document {_}" for _ in range(10)]
idgen = IDGenerator(len(my_docs), generator=partial(sequential_generator, start=10))
col.add(ids=idgen, documents=my_docs)
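To make the custom-generator pattern above more concrete, here is a sketch of how a fixed-length ID sequence could wrap an endless generator. `BoundedIDs` is a hypothetical class invented for this example; how `IDGenerator` works internally is not described in this document, so treat this only as one possible approach.

```python
from functools import partial
from itertools import islice
from typing import Callable, Generator, Iterator


def sequential_generator(start: int = 0) -> Generator[str, None, None]:
    """Yield consecutive integers as strings, starting at `start`."""
    _next = start
    while True:
        yield f"{_next}"
        _next += 1


class BoundedIDs:
    """Hypothetical wrapper: exposes the first `count` IDs of an endless generator."""

    def __init__(self, count: int, generator: Callable[[], Iterator[str]]):
        self._count = count
        self._generator = generator  # a factory, called once per iteration

    def __len__(self) -> int:
        return self._count

    def __iter__(self) -> Iterator[str]:
        # islice stops after `count` items, so the endless generator is safe to use.
        return islice(self._generator(), self._count)


ids = list(BoundedIDs(10, generator=partial(sequential_generator, start=10)))
print(ids)
```

Passing a factory (here built with `partial`) rather than a generator object means each iteration starts from a fresh generator, so iterating twice yields the same IDs.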
UUIDs (default)
import chromadb
from chromadbx import UUIDGenerator
client = chromadb.Client()
col = client.get_or_create_collection("test")
my_docs = [f"Document {_}" for _ in range(10)]
col.add(ids=UUIDGenerator(len(my_docs)), documents=my_docs)
ULIDs
import chromadb
from chromadbx import ULIDGenerator
client = chromadb.Client()
col = client.get_or_create_collection("test")
my_docs = [f"Document {_}" for _ in range(10)]
col.add(ids=ULIDGenerator(len(my_docs)), documents=my_docs)
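For context, a ULID is a 128-bit identifier made of a 48-bit millisecond timestamp followed by 80 random bits, written as 26 Crockford Base32 characters, so IDs sort lexicographically by creation time. The sketch below follows that layout using only the standard library. `make_ulid` is a hypothetical function for illustration and is not how chromadbx generates ULIDs.

```python
import os
import time

# Crockford Base32 alphabet (no I, L, O, U).
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def make_ulid(timestamp_ms=None, randomness=None):
    """Hypothetical ULID encoder: 48-bit timestamp + 80 random bits -> 26 chars."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    if randomness is None:
        randomness = os.urandom(10)
    if not 0 <= timestamp_ms < 2**48:
        raise ValueError("timestamp must fit in 48 bits")
    if len(randomness) != 10:
        raise ValueError("randomness must be exactly 10 bytes")

    # Timestamp occupies the high bits, randomness the low 80 bits.
    value = (timestamp_ms << 80) | int.from_bytes(randomness, "big")
    chars = []
    for _ in range(26):
        chars.append(CROCKFORD[value & 31])  # take 5 bits at a time
        value >>= 5
    return "".join(reversed(chars))


print(make_ulid())
```

Because the timestamp sits in the leading characters, IDs created in later milliseconds compare greater as plain strings.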
Hashes
Random SHA256:
import chromadb
from chromadbx import RandomSHA256Generator
client = chromadb.Client()
col = client.get_or_create_collection("test")
my_docs = [f"Document {_}" for _ in range(10)]
col.add(ids=RandomSHA256Generator(len(my_docs)), documents=my_docs)
Document-based SHA256:
import chromadb
from chromadbx import DocumentSHA256Generator
client = chromadb.Client()
col = client.get_or_create_collection("test")
my_docs = [f"Document {_}" for _ in range(10)]
col.add(ids=DocumentSHA256Generator(documents=my_docs), documents=my_docs)
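The practical difference between the two hash generators above is determinism: a document-based hash yields the same ID for the same text, while a random hash does not. The sketch below shows the idea with `hashlib`. The function names are hypothetical, and the UTF-8 encoding, hex digest format, and 32 random bytes are assumptions, so the IDs chromadbx actually produces may differ.

```python
import hashlib
import os


def document_sha256_ids(documents):
    """Hypothetical: derive each ID from the document text (deterministic)."""
    return [hashlib.sha256(doc.encode("utf-8")).hexdigest() for doc in documents]


def random_sha256_ids(count):
    """Hypothetical: hash fresh random bytes for each ID (non-deterministic)."""
    return [hashlib.sha256(os.urandom(32)).hexdigest() for _ in range(count)]


my_docs = [f"Document {i}" for i in range(3)]
print(document_sha256_ids(my_docs))
print(random_sha256_ids(len(my_docs)))
```

With document-based hashing, identical documents receive identical IDs, which may matter if the collection expects every ID to be unique.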
NanoID
import chromadb
from chromadbx import NanoIDGenerator
client = chromadb.Client()
col = client.get_or_create_collection("test")
my_docs = [f"Document {_}" for _ in range(10)]
col.add(ids=NanoIDGenerator(len(my_docs)), documents=my_docs)
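As background on what a NanoID looks like, the sketch below draws characters from a URL-safe alphabet using a cryptographically secure random source. It assumes the 21-character length and the 64-symbol alphabet (letters, digits, `_` and `-`) that are the defaults of the reference Nano ID implementation; `make_nanoid` is a hypothetical function, and the settings used by `NanoIDGenerator` are not stated in this document.

```python
import secrets
import string

# Assumption: same symbol set as the reference Nano ID default (64 URL-safe characters).
NANOID_ALPHABET = string.ascii_letters + string.digits + "_-"


def make_nanoid(size=21, alphabet=NANOID_ALPHABET):
    """Hypothetical NanoID-style generator built on the `secrets` module."""
    if size <= 0:
        raise ValueError("size must be positive")
    return "".join(secrets.choice(alphabet) for _ in range(size))


ids = [make_nanoid() for _ in range(10)]
print(ids)
```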