AUGR - AI Dataset Augmentation Tool

AI-powered dataset augmentation tool using Braintrust proxy with structured outputs.

Features

  • 🤖 Structured AI Outputs: Uses OpenAI's beta.chat.completions.parse with Pydantic schemas
  • 🧠 Braintrust Integration: Works with Braintrust proxy for multiple AI providers
  • 🔄 Interactive Workflows: Guided dataset augmentation with iterative refinement
  • 📊 Schema-aware Generation: Automatically infers and respects dataset schemas
  • Modern Tooling: Built with uv for fast dependency management
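
To make "schema-aware generation" concrete, here is a minimal sketch of how a schema could be inferred from existing dataset rows. It is an illustration only and does not reproduce AUGR's actual inference logic; the function name infer_schema and the sample records are hypothetical.

```python
from collections import defaultdict
from typing import Any


def infer_schema(records: list[dict[str, Any]]) -> dict[str, dict[str, Any]]:
    """Infer a simple field -> {types, required} schema from sample records.

    Hypothetical helper for illustration; not part of the augr package.
    """
    seen_types: dict[str, set[str]] = defaultdict(set)
    counts: dict[str, int] = defaultdict(int)

    for record in records:
        for field, value in record.items():
            seen_types[field].add(type(value).__name__)
            counts[field] += 1

    return {
        field: {
            "types": sorted(seen_types[field]),
            # A field is required only if every sample record contains it.
            "required": counts[field] == len(records),
        }
        for field in seen_types
    }


# Made-up sample rows standing in for an existing dataset.
samples = [
    {"question": "What is 2 + 2?", "answer": "4", "difficulty": 1},
    {"question": "Capital of France?", "answer": "Paris"},
]
schema = infer_schema(samples)
print(schema)
```

A schema of this shape could then be used to check that newly generated samples carry the same fields and types as the originals.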

Installation

Option 1: Install from PyPI

# Install globally
pip install augr

# Or with pipx (recommended for CLI tools)
pipx install augr

# Or with uv
uv tool install augr

# Then use anywhere
augr

Option 2: Install from GitHub

# Install latest version
pip install git+https://github.com/Marviel/augr.git

# Or with uv
uv tool install git+https://github.com/Marviel/augr.git

# Then use anywhere
augr

Option 3: Development Setup

For development or local installation:

git clone https://github.com/Marviel/augr.git
cd augr
uv sync --all-extras --dev

# Test the installation
uv run python test_installation.py

# Run from the project directory
uv run augr

Usage

First Run Setup

The first time you run AUGR, it will guide you through setup:

augr

AUGR will:

  1. Check for a Braintrust API key
  2. If none found, guide you to get one from https://www.braintrust.dev/app/settings/api-keys
  3. Save the key securely in ~/.augr/config.json
  4. Start the interactive tool

Configuration

AUGR checks for your API key in this order:

  1. BRAINTRUST_API_KEY environment variable
  2. ~/.augr/config.json file
  3. Interactive setup (first time)
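
The lookup order above can be sketched roughly as follows. This is not AUGR's own code: the function resolve_api_key and the JSON field name braintrust_api_key are assumptions made for illustration, since the layout of config.json is not documented here. A return value of None stands for "fall through to interactive setup".

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Mapping, Optional

CONFIG_PATH = Path.home() / ".augr" / "config.json"
CONFIG_KEY = "braintrust_api_key"  # hypothetical field name inside config.json


def resolve_api_key(
    env: Mapping[str, str] = os.environ,
    config_path: Path = CONFIG_PATH,
) -> Optional[str]:
    """Return the API key from the environment, then the config file, else None."""
    # 1. The environment variable takes precedence.
    key = env.get("BRAINTRUST_API_KEY")
    if key:
        return key

    # 2. Fall back to the config file, tolerating a missing or invalid file.
    try:
        data = json.loads(config_path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return None
    if not isinstance(data, dict):
        return None
    value = data.get(CONFIG_KEY)

    # 3. None tells the caller to start interactive setup.
    return value if isinstance(value, str) and value else None


# Demonstrate the three cases against a temporary config file.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "config.json"
    missing = resolve_api_key(env={}, config_path=path)
    path.write_text(json.dumps({CONFIG_KEY: "key-from-file"}), encoding="utf-8")
    from_file = resolve_api_key(env={}, config_path=path)
    from_env = resolve_api_key(
        env={"BRAINTRUST_API_KEY": "key-from-env"}, config_path=path
    )

print(missing, from_file, from_env)
```

Passing env and config_path as parameters keeps the sketch testable without touching a real ~/.augr directory.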

Running

The tool provides an interactive CLI with two main modes:

  1. Guided Dataset Augmentation: Interactive workflow with iterative refinement
  2. Direct JSON Upload: Upload pre-generated samples directly

augr

Uninstalling

To completely remove AUGR and all its configuration:

augr uninstall
# or
augr-uninstall

This will:

  • Remove ~/.augr/ directory and all configuration
  • Uninstall the AUGR package

Development

Install with development dependencies:

uv pip install -e ".[dev]"

Run linting and formatting:

uv run black .
uv run ruff check .

Architecture

  • ai_client.py: Core AI interface with structured outputs
  • augmentation_service.py: Main service for dataset augmentation
  • cli.py: Interactive command-line interface
  • models.py: Pydantic models for data structures
  • braintrust_client.py: Braintrust API integration

API Example

import asyncio

from augr.ai_client import create_ai
from pydantic import BaseModel


class Response(BaseModel):
    message: str
    confidence: float


async def main() -> None:
    # Create AI client (reads BRAINTRUST_API_KEY from env)
    ai = create_ai(model="gpt-4o", temperature=0.0)

    # Generate structured output; gen_obj is awaited, so it must run inside a coroutine
    result = await ai.gen_obj(
        schema=Response,
        messages=[{"role": "user", "content": "Hello!"}],
        thinking_enabled=True,  # For reasoning models
    )

    print(result.message)  # Structured output


asyncio.run(main())

Contributing

Making a Release

This project uses automated releases via GitHub Actions:

  1. Update version in pyproject.toml
  2. Create and push a git tag: git tag -a v0.2.0 -m "Release v0.2.0" && git push origin v0.2.0
  3. GitHub Actions will automatically:
    • Run tests
    • Build the package
    • Upload to PyPI
    • Create GitHub release

See RELEASE.md for detailed instructions.

Development

# Clone and setup
git clone https://github.com/Marviel/augr.git
cd augr
uv sync --all-extras --dev

# Run tests
uv run python test_installation.py

# Format code
uv run black .
uv run ruff check --fix .

# Build package
uv run python -m build

License

MIT
