Skip to main content

A collection of experimental Chroma extensions.

Project description

ChromaX: An experimental utilities package for Chroma vector database

Installation

pip install chromadbx

Features

  • ID generation
  • Embeddings
    • OnnxRuntime embeddings
    • Llama.cpp embeddings

Usage

Queries

Supported filters:

  • $eq - equal to (string, int, float)
  • $ne - not equal to (string, int, float)
  • $gt - greater than (int, float)
  • $gte - greater than or equal to (int, float)
  • $lt - less than (int, float)
  • $lte - less than or equal to (int, float)
  • $in - in (list of strings, ints, floats,bools)
  • $nin - not in (list of strings, ints, floats,bools)

Where:

import chromadb

from chromadbx.core.queries import eq, where, ne, and_

client = chromadb.PersistentClient(path="path/to/db")
collection = client.get_collection("collection_name")
collection.query(where=where(and_(eq("a", 1), ne("b", "2"))))
# {'$and': [{'a': ['$eq', 1]}, {'b': ['$ne', '2']}]}

Where Document:

import chromadb

from chromadbx.core.queries import where_document, contains, not_contains, LogicalOperator

client = chromadb.PersistentClient(path="path/to/db")
collection = client.get_collection("collection_name")
collection.query(where_document=where_document(contains("this is a document", "this is another document")))
# {'$and': [{'$contains': 'this is a document'}, {'$contains': 'this is another document'}]}
collection.query(
    where_document=where_document(contains("this is a document", "this is another document", op=LogicalOperator.OR)))
# {'$or': [{'$contains': 'this is a document'}, {'$contains': 'this is another document'}]}

ID Generation

import chromadb
from chromadbx import IDGenerator
from functools import partial
from typing import Generator

def sequential_generator(start: int = 0) -> Generator[str, None, None]:
        _next = start
        while True:
            yield f"{_next}"
            _next += 1
client = chromadb.Client()
col = client.get_or_create_collection("test")
my_docs = [f"Document {_}" for _ in range(10)]
idgen = IDGenerator(len(my_docs), generator=partial(sequential_generator, start=10))
col.add(ids=idgen, documents=my_docs)

UUIDs (default)

import chromadb
from chromadbx import UUIDGenerator

client = chromadb.Client()
col = client.get_or_create_collection("test")
my_docs = [f"Document {_}" for _ in range(10)]
col.add(ids=UUIDGenerator(len(my_docs)), documents=my_docs)

ULIDs

import chromadb
from chromadbx import ULIDGenerator
client = chromadb.Client()
col = client.get_or_create_collection("test")
my_docs = [f"Document {_}" for _ in range(10)]
col.add(ids=ULIDGenerator(len(my_docs)), documents=my_docs)

Hashes

Random SHA256:

import chromadb
from chromadbx import RandomSHA256Generator
client = chromadb.Client()
col = client.get_or_create_collection("test")
my_docs = [f"Document {_}" for _ in range(10)]
col.add(ids=RandomSHA256Generator(len(my_docs)), documents=my_docs)

Document-based SHA256:

import chromadb
from chromadbx import DocumentSHA256Generator
client = chromadb.Client()
col = client.get_or_create_collection("test")
my_docs = [f"Document {_}" for _ in range(10)]
col.add(ids=DocumentSHA256Generator(documents=my_docs), documents=my_docs)

NanoID

import chromadb
from chromadbx import NanoIDGenerator
client = chromadb.Client()
col = client.get_or_create_collection("test")
my_docs = [f"Document {_}" for _ in range(10)]
col.add(ids=NanoIDGenerator(len(my_docs)), documents=my_docs)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

chromadbx-0.0.4.tar.gz (7.7 kB view details)

Uploaded Source

Built Distribution

chromadbx-0.0.4-py3-none-any.whl (9.1 kB view details)

Uploaded Python 3

File details

Details for the file chromadbx-0.0.4.tar.gz.

File metadata

  • Download URL: chromadbx-0.0.4.tar.gz
  • Upload date:
  • Size: 7.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.9.19 Linux/6.5.0-1025-azure

File hashes

Hashes for chromadbx-0.0.4.tar.gz
Algorithm Hash digest
SHA256 6ed42d7eac5c722e18d0255852185c8f2a15c649302b64d7e2498c977fe0f087
MD5 f4312b758b12531546a0265381d9b3cb
BLAKE2b-256 f2defceb150591d1e6310723617ea29ff2854b1956630e6cc33276fd9d5797f4

See more details on using hashes here.

File details

Details for the file chromadbx-0.0.4-py3-none-any.whl.

File metadata

  • Download URL: chromadbx-0.0.4-py3-none-any.whl
  • Upload date:
  • Size: 9.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.9.19 Linux/6.5.0-1025-azure

File hashes

Hashes for chromadbx-0.0.4-py3-none-any.whl
Algorithm Hash digest
SHA256 779a8ff35726973d52916834d2335c824d1e7315e66afa06ba4a192e1345040c
MD5 67293476362701c7da3c97373cf81ecc
BLAKE2b-256 6a7c81f2acbdd83d9cd30f1adfc30f9059b65ab83d726c300796e6f1e3357db0

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page