Skip to main content

AI-powered taxonomy generation for your data

Project description

Delve: AI-Powered Taxonomy Generation

Delve is a production-ready SDK and CLI for automatically generating taxonomies from your data using state-of-the-art language models.

📚 Read the full documentation →

Quick Start

Installation

pip install delve-taxonomy

# Set API keys
export ANTHROPIC_API_KEY="your-key-here"
export OPENAI_API_KEY="your-key-here"  # Required for classifier embeddings

CLI

# Basic usage (shows progress spinners)
delve run data.csv --text-column text

# With progress bars and ETA
delve run data.csv --text-column text -v

# Quiet mode (errors only)
delve run data.csv --text-column text -q

# JSON with nested data
delve run data.json --json-path "$.messages[*].content"

Python SDK

from delve import Delve, Verbosity

# Initialize client (silent by default - library best practice)
delve = Delve()

# Or with progress output
delve = Delve(verbosity=Verbosity.NORMAL)

# Run taxonomy generation
result = delve.run_sync("data.csv", text_column="text")

# Access results
print(f"Generated {len(result.taxonomy)} categories")
for category in result.taxonomy:
    print(f"  - {category.name}: {category.description}")

# Access labeled documents
for doc in result.labeled_documents[:5]:
    print(f"  [{doc.category}] {doc.content[:50]}...")

Features

  • Automated Taxonomy Generation - No manual category creation using Claude 3.5 Sonnet
  • Multiple Data Sources - CSV, JSON/JSONL, LangSmith runs, pandas DataFrames
  • Smart Categorization - Iterative refinement with minibatch clustering
  • Flexible Exports - JSON, CSV, and Markdown reports

Requirements

  • Python 3.9+
  • Anthropic API key (for taxonomy generation)
  • OpenAI API key (for classifier embeddings when sample_size > 0)

Documentation

Development

# Install dependencies
uv sync

# Run tests
pytest tests/

# Run linting
ruff check src/

# Format code
ruff format src/

Documentation Development

To work on the documentation locally, you'll need Node.js 20.17+ (for Mintlify):

# If using nvm, the project includes .nvmrc
nvm use

# Install Mintlify CLI (if not already installed)
npm install -g mintlify

# Run the docs server
cd docs
mintlify dev

See the full documentation for more details on contributing and development.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

delve_taxonomy-0.1.11.tar.gz (49.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

delve_taxonomy-0.1.11-py3-none-any.whl (62.5 kB view details)

Uploaded Python 3

File details

Details for the file delve_taxonomy-0.1.11.tar.gz.

File metadata

  • Download URL: delve_taxonomy-0.1.11.tar.gz
  • Upload date:
  • Size: 49.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for delve_taxonomy-0.1.11.tar.gz
Algorithm Hash digest
SHA256 e8a7ae193d4430aa61d3732f3681e2df7d3fc805c069a6121f3a81c3749b7e10
MD5 0e0c69f5d45d6668a21d3a1d1cd25760
BLAKE2b-256 f27204d12b1d676e9b45e92e387d981411f6825e678a55d34eb0782e92d83f5b

See more details on using hashes here.

File details

Details for the file delve_taxonomy-0.1.11-py3-none-any.whl.

File metadata

File hashes

Hashes for delve_taxonomy-0.1.11-py3-none-any.whl
Algorithm Hash digest
SHA256 e4f35eb3bb9bdc8dca4f38a1d676834c43647832b6598dce70dd58f8cb26de76
MD5 8fdcb0c2e9f542d9431a615f22c9d70c
BLAKE2b-256 655dcd86db91a7c18c303165b9769761b5cf6bc775ce94d4a012833ebedb612f

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page