Skip to main content

AI-powered taxonomy generation for your data

Project description

Delve: AI-Powered Taxonomy Generation

Delve is a production-ready SDK and CLI for automatically generating taxonomies from your data using state-of-the-art language models.

📚 Read the full documentation →

Quick Start

Installation

pip install delve-taxonomy

# Set API keys
export ANTHROPIC_API_KEY="your-key-here"
export OPENAI_API_KEY="your-key-here"  # Required for classifier embeddings

CLI

# Basic usage (shows progress spinners)
delve run data.csv --text-column text

# With progress bars and ETA
delve run data.csv --text-column text -v

# Quiet mode (errors only)
delve run data.csv --text-column text -q

# JSON with nested data
delve run data.json --json-path "$.messages[*].content"

Python SDK

from delve import Delve, Verbosity

# Initialize client (silent by default - library best practice)
delve = Delve()

# Or with progress output
delve = Delve(verbosity=Verbosity.NORMAL)

# Run taxonomy generation
result = delve.run_sync("data.csv", text_column="text")

# Access results
print(f"Generated {len(result.taxonomy)} categories")
for category in result.taxonomy:
    print(f"  - {category.name}: {category.description}")

# Access labeled documents
for doc in result.labeled_documents[:5]:
    print(f"  [{doc.category}] {doc.content[:50]}...")

Features

  • Automated Taxonomy Generation - No manual category creation using Claude 3.5 Sonnet
  • Multiple Data Sources - CSV, JSON/JSONL, LangSmith runs, pandas DataFrames
  • Smart Categorization - Iterative refinement with minibatch clustering
  • Flexible Exports - JSON, CSV, and Markdown reports

Requirements

  • Python 3.9+
  • Anthropic API key (for taxonomy generation)
  • OpenAI API key (for classifier embeddings when sample_size > 0)

Documentation

Development

# Install dependencies
uv sync

# Run tests
pytest tests/

# Run linting
ruff check src/

# Format code
ruff format src/

Documentation Development

To work on the documentation locally, you'll need Node.js 20.17+ (for Mintlify):

# If using nvm, the project includes .nvmrc
nvm use

# Install Mintlify CLI (if not already installed)
npm install -g mintlify

# Run the docs server
cd docs
mintlify dev

See the full documentation for more details on contributing and development.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

delve_taxonomy-0.1.10.tar.gz (41.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

delve_taxonomy-0.1.10-py3-none-any.whl (54.5 kB view details)

Uploaded Python 3

File details

Details for the file delve_taxonomy-0.1.10.tar.gz.

File metadata

  • Download URL: delve_taxonomy-0.1.10.tar.gz
  • Upload date:
  • Size: 41.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for delve_taxonomy-0.1.10.tar.gz
Algorithm Hash digest
SHA256 f3caabf7f65b0a94ee2cb5cfaac65db5c0313ada62daf9996c97474d4c2b8245
MD5 4890028a9ac16546cd24c9ac0413af16
BLAKE2b-256 236f87e2f54a78a8743bbf8896f6ca73960408766d21a01e408003a66f3f47bd

See more details on using hashes here.

File details

Details for the file delve_taxonomy-0.1.10-py3-none-any.whl.

File metadata

File hashes

Hashes for delve_taxonomy-0.1.10-py3-none-any.whl
Algorithm Hash digest
SHA256 45ab203b5dd777256154b9054138aad59735924cfb673a517cf51533b99cb77f
MD5 f6b2dec50806b4057f1ee3d0eb0460c7
BLAKE2b-256 4f0d8be44e8df79d9389d76da3b13d032acc84d7efe6e3ed0c1840b96c8ef713

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page