A collection of experimental Chroma extensions.

ChromaX: An experimental utilities package for the Chroma vector database

Installation

pip install chromadbx

Features

  • ID generation
  • Embeddings
    • OnnxRuntime embeddings
    • Llama.cpp embeddings

Usage

Queries

Supported filters:

  • $eq - equal to (string, int, float)
  • $ne - not equal to (string, int, float)
  • $gt - greater than (int, float)
  • $gte - greater than or equal to (int, float)
  • $lt - less than (int, float)
  • $lte - less than or equal to (int, float)
  • $in - in (list of strings, ints, floats, bools)
  • $nin - not in (list of strings, ints, floats, bools)
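
For reference, each operator above corresponds to a nested dictionary of the form {field: {operator: value}} in Chroma's filter syntax. The sketch below builds such dictionaries by hand and checks the value types listed above. It does not use chromadbx; `make_filter`, `SCALAR_TYPES`, and `LIST_ITEM_TYPES` are hypothetical names chosen for illustration only.

```python
from typing import Any, Dict

# Value types accepted per operator, following the list above.
SCALAR_TYPES = {
    "$eq": (str, int, float),
    "$ne": (str, int, float),
    "$gt": (int, float),
    "$gte": (int, float),
    "$lt": (int, float),
    "$lte": (int, float),
}
LIST_ITEM_TYPES = {"$in": (str, int, float, bool), "$nin": (str, int, float, bool)}


def make_filter(field: str, op: str, value: Any) -> Dict[str, Any]:
    """Return {field: {op: value}} after a simple type check (illustrative helper)."""
    if op in SCALAR_TYPES:
        if not isinstance(value, SCALAR_TYPES[op]):
            raise TypeError(f"{op} does not accept {type(value).__name__}")
    elif op in LIST_ITEM_TYPES:
        if not isinstance(value, list) or not all(
            isinstance(item, LIST_ITEM_TYPES[op]) for item in value
        ):
            raise TypeError(f"{op} expects a list of str, int, float, or bool")
    else:
        raise ValueError(f"unsupported operator: {op}")
    return {field: {op: value}}


price_filter = make_filter("price", "$gte", 10)
tag_filter = make_filter("tag", "$in", ["a", "b"])
# Several conditions are combined under a logical operator key.
combined = {"$and": [price_filter, tag_filter]}
```

The helpers shown in the next section presumably exist to spare you from writing these nested dictionaries by hand.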

Where:

import chromadb

from chromadbx.core.queries import eq, where, ne, and_

client = chromadb.PersistentClient(path="path/to/db")
collection = client.get_collection("collection_name")
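# query() also needs query_texts or query_embeddings; the text here is a placeholder
collection.query(query_texts=["example query"], where=where(and_(eq("a", 1), ne("b", "2"))))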
# {'$and': [{'a': {'$eq': 1}}, {'b': {'$ne': '2'}}]}

Where Document:

import chromadb

from chromadbx.core.queries import where_document, contains, not_contains, LogicalOperator

client = chromadb.PersistentClient(path="path/to/db")
collection = client.get_collection("collection_name")
# query() also needs query_texts or query_embeddings; the text here is a placeholder
collection.query(
    query_texts=["example query"],
    where_document=where_document(contains("this is a document", "this is another document")),
)
# {'$and': [{'$contains': 'this is a document'}, {'$contains': 'this is another document'}]}
collection.query(
    query_texts=["example query"],
    where_document=where_document(
        contains("this is a document", "this is another document", op=LogicalOperator.OR)
    ),
)
# {'$or': [{'$contains': 'this is a document'}, {'$contains': 'this is another document'}]}
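
To make the two commented results above concrete, here is a minimal sketch in plain Python of how several phrases can be folded into one document filter. It is not the chromadbx implementation; `combine_contains` is a hypothetical helper, and the single-phrase behavior is an assumption made for this example.

```python
from typing import Any, Dict


def combine_contains(*phrases: str, op: str = "$and") -> Dict[str, Any]:
    """Build a where_document-style filter from one or more phrases (illustrative)."""
    if op not in ("$and", "$or"):
        raise ValueError(f"unsupported logical operator: {op}")
    if not phrases:
        raise ValueError("at least one phrase is required")
    clauses = [{"$contains": phrase} for phrase in phrases]
    # Assumption: a single phrase needs no logical wrapper.
    if len(clauses) == 1:
        return clauses[0]
    return {op: clauses}


both = combine_contains("this is a document", "this is another document")
either = combine_contains("this is a document", "this is another document", op="$or")
```

With the default operator every phrase must appear in a document; with "$or" any one of them is enough.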

ID Generation

import chromadb
from chromadbx import IDGenerator
from functools import partial
from typing import Generator


def sequential_generator(start: int = 0) -> Generator[str, None, None]:
    # Yield consecutive integers as strings, beginning at `start`.
    _next = start
    while True:
        yield f"{_next}"
        _next += 1


client = chromadb.Client()
col = client.get_or_create_collection("test")
my_docs = [f"Document {_}" for _ in range(10)]
idgen = IDGenerator(len(my_docs), generator=partial(sequential_generator, start=10))
col.add(ids=idgen, documents=my_docs)
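
The `partial` call above turns the generator function into a zero-argument factory with `start` already bound. The standalone sketch below shows that mechanism using only the standard library; how `IDGenerator` consumes the factory internally is not documented here, so taking a fixed number of values with `islice` is an assumption for illustration.

```python
from functools import partial
from itertools import islice
from typing import Generator, List


def sequential_generator(start: int = 0) -> Generator[str, None, None]:
    # Same generator as in the example above.
    _next = start
    while True:
        yield f"{_next}"
        _next += 1


# partial() binds start=10; calling the factory creates a fresh generator.
factory = partial(sequential_generator, start=10)

# Take a bounded number of IDs from the otherwise infinite generator.
ids: List[str] = list(islice(factory(), 3))
print(ids)  # ['10', '11', '12']
```

Because each call to the factory starts a new generator, reusing it for a second batch would repeat the same IDs unless `start` is advanced.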

UUIDs (default)

import chromadb
from chromadbx import UUIDGenerator

client = chromadb.Client()
col = client.get_or_create_collection("test")
my_docs = [f"Document {_}" for _ in range(10)]
col.add(ids=UUIDGenerator(len(my_docs)), documents=my_docs)

ULIDs

import chromadb
from chromadbx import ULIDGenerator
client = chromadb.Client()
col = client.get_or_create_collection("test")
my_docs = [f"Document {_}" for _ in range(10)]
col.add(ids=ULIDGenerator(len(my_docs)), documents=my_docs)
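
For readers unfamiliar with the format, a ULID is a 128-bit value made of a 48-bit millisecond timestamp followed by 80 random bits, written as 26 Crockford Base32 characters. The sketch below builds one with the standard library only. It illustrates the layout and is not the code chromadbx uses; `ulid_sketch` is a hypothetical name.

```python
import os
import time
from typing import Optional

# Crockford Base32 alphabet: digits and uppercase letters without I, L, O, U.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def ulid_sketch(timestamp_ms: Optional[int] = None) -> str:
    """Return a 26-character ULID-style string (illustrative, not monotonic)."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    timestamp_ms &= (1 << 48) - 1  # keep the low 48 bits
    randomness = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    value = (timestamp_ms << 80) | randomness
    chars = []
    for _ in range(26):
        chars.append(CROCKFORD[value & 0x1F])  # lowest 5 bits first
        value >>= 5
    return "".join(reversed(chars))


new_id = ulid_sketch()
```

Since the timestamp occupies the leading characters, IDs created in later milliseconds sort after earlier ones as plain strings, which is the usual reason to prefer ULIDs over random UUIDs.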

Hashes

Random SHA256:

import chromadb
from chromadbx import RandomSHA256Generator
client = chromadb.Client()
col = client.get_or_create_collection("test")
my_docs = [f"Document {_}" for _ in range(10)]
col.add(ids=RandomSHA256Generator(len(my_docs)), documents=my_docs)

Document-based SHA256:

import chromadb
from chromadbx import DocumentSHA256Generator
client = chromadb.Client()
col = client.get_or_create_collection("test")
my_docs = [f"Document {_}" for _ in range(10)]
col.add(ids=DocumentSHA256Generator(documents=my_docs), documents=my_docs)
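
The idea behind document-based IDs can be sketched with `hashlib`: hashing the text yields the same ID every time the same document is seen. The details below (UTF-8 encoding, hex digest) are assumptions for illustration and may differ from what `DocumentSHA256Generator` does internally; `document_ids` is a hypothetical helper.

```python
import hashlib
from typing import List


def document_ids(documents: List[str]) -> List[str]:
    """Derive one deterministic hex ID per document from its text (illustrative)."""
    return [hashlib.sha256(doc.encode("utf-8")).hexdigest() for doc in documents]


docs = ["Document 0", "Document 1", "Document 0"]
ids = document_ids(docs)

# Identical text gives identical IDs; different text gives different IDs.
print(ids[0] == ids[2])  # True
print(ids[0] == ids[1])  # False
```

Deterministic IDs make re-ingestion idempotent, but duplicate documents in one batch map to the same ID, so you may want to deduplicate before adding them.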

NanoID

import chromadb
from chromadbx import NanoIDGenerator
client = chromadb.Client()
col = client.get_or_create_collection("test")
my_docs = [f"Document {_}" for _ in range(10)]
col.add(ids=NanoIDGenerator(len(my_docs)), documents=my_docs)
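
As background, a Nano ID is a short random string drawn from a URL-safe alphabet of 64 symbols, 21 characters long by default. The sketch below reproduces that idea with the standard library; it is not the chromadbx implementation, and `nanoid_sketch` and `URL_SAFE_ALPHABET` are hypothetical names.

```python
import secrets
import string

# 64 URL-safe symbols: A-Z, a-z, 0-9, underscore, and hyphen.
URL_SAFE_ALPHABET = string.ascii_letters + string.digits + "_-"


def nanoid_sketch(size: int = 21, alphabet: str = URL_SAFE_ALPHABET) -> str:
    """Return a random ID of `size` characters from `alphabet` (illustrative)."""
    # secrets.choice draws from a cryptographically strong random source.
    return "".join(secrets.choice(alphabet) for _ in range(size))


new_id = nanoid_sketch()
batch = [nanoid_sketch() for _ in range(10)]
```

Unlike ULIDs, these IDs carry no timestamp, so they are compact but do not sort by creation time.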
