A package for cleaning and curating data with LLMs

Project description

databonsai

clean & curate your data with LLMs.

databonsai is a Python library that uses Large Language Models (LLMs) to clean, transform, and categorize data. It provides tools that simplify working with LLMs and integrating them into your data pipelines.

Features

  • Categorization of data into predefined categories using LLMs
  • Transformation of data based on custom prompts and schemas
  • Decomposition of data into structured formats using LLMs
  • Retry logic with exponential backoff for handling rate limits and transient errors
  • Pydantic-based validation and configuration management
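
To make the retry feature above concrete, here is a minimal, generic sketch of retrying with exponential backoff. It is not databonsai's own implementation, which this page does not show: the names `retry_with_backoff` and `TransientError`, and every parameter, are hypothetical and chosen only for illustration.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a rate-limit or other transient API error."""


def retry_with_backoff(func, max_retries=5, base_delay=1.0, max_delay=30.0,
                       sleep=time.sleep):
    """Call func(), retrying on TransientError with exponentially growing delays.

    The delay doubles after each failed attempt (base_delay, 2x, 4x, ...),
    is capped at max_delay, and gets up to 10% random jitter added so that
    many clients do not retry in lockstep.
    """
    for attempt in range(max_retries + 1):
        try:
            return func()
        except TransientError:
            if attempt == max_retries:
                raise  # out of retries: surface the last error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))


if __name__ == "__main__":
    attempts = {"count": 0}
    recorded_delays = []

    def flaky_llm_call():
        # Assumed behavior for the demo: fail twice, then succeed.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise TransientError("rate limited")
        return "ok"

    # Record the delays instead of actually sleeping, to keep the demo fast.
    result = retry_with_backoff(flaky_llm_call, sleep=recorded_delays.append)
    print(result, recorded_delays)
```

Passing the `sleep` function in as a parameter is a design choice for this sketch: it lets the demo and any tests observe the delays without waiting for them.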

Installation

You can install databonsai using pip:

pip install databonsai

Usage


Download files

Source Distribution

databonsai-0.1.0.tar.gz (8.1 kB)

Uploaded Source

Built Distribution

databonsai-0.1.0-py3-none-any.whl (11.7 kB)

Uploaded Python 3

File details

Details for the file databonsai-0.1.0.tar.gz.

File metadata

  • Download URL: databonsai-0.1.0.tar.gz
  • Upload date:
  • Size: 8.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.12.2

File hashes

Hashes for databonsai-0.1.0.tar.gz

Algorithm    Hash digest
SHA256       d637fde356410730fd49bda9e38abc8d5a16445a27ce3b71e2d6216f095db7a0
MD5          6eaddcd979bb9707b8b18bfd150853f7
BLAKE2b-256  a348839efe59f57ad757b330b04330bea73684b6b89fbafe7af22e41fc0379f9

File details

Details for the file databonsai-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: databonsai-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 11.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.12.2

File hashes

Hashes for databonsai-0.1.0-py3-none-any.whl

Algorithm    Hash digest
SHA256       0474a29d30c4e432afc64b6cfdd73521614a1703e6529a5ed50e5300c31874c4
MD5          1a3930ab3cefdfd115e26f603107b735
BLAKE2b-256  99e2150cfe8c6aa351faeebbc531eb51b41296011c4f0dd667fa9200379e04ba
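
The SHA256 digests listed above can be used to check a downloaded file before installing it. The following is a minimal sketch using only the Python standard library. The helper names `sha256_of` and `verify_download` are hypothetical, and the script assumes the files were saved under their original names.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file hash listings for the 0.1.0 release.
PUBLISHED_SHA256 = {
    "databonsai-0.1.0.tar.gz":
        "d637fde356410730fd49bda9e38abc8d5a16445a27ce3b71e2d6216f095db7a0",
    "databonsai-0.1.0-py3-none-any.whl":
        "0474a29d30c4e432afc64b6cfdd73521614a1703e6529a5ed50e5300c31874c4",
}


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path):
    """Return True if the file's SHA256 matches the published digest.

    Raises KeyError if the file name is not one of the known release files.
    """
    expected = PUBLISHED_SHA256[Path(path).name]
    return sha256_of(path) == expected


if __name__ == "__main__":
    # Check whichever release files are present in the current directory.
    for name in PUBLISHED_SHA256:
        if Path(name).exists():
            status = "OK" if verify_download(name) else "MISMATCH"
            print(f"{name}: {status}")
```

A mismatch means the file differs from the one that was uploaded, so it should be downloaded again rather than installed.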
