Tree-based prompt compression library using cut-then-transform strategy for LLM applications

CUTIA - Cut-Then-Implement-Augment Prompt Compressor

CUTIA is a tree-based prompt compression library. It uses a cut-then-transform strategy to shorten prompts while keeping task quality above a configurable threshold.

Features

  • Tree-based Segmentation: Recursively splits prompts into segments for fine-grained optimization
  • Cut-then-Rewrite Strategy: Tries to remove each segment outright, and falls back to rewriting it when removal fails the quality check
  • Quality-Aware Compression: Maintains quality thresholds during compression
  • Multi-Candidate Generation: Generates multiple compression variants with different random seeds
  • DSPy Integration: First-class support for DSPy programs via the DSPy adapter

Installation

Basic Installation

pip install cutia

Development Installation

For development with testing and linting tools:

# Clone the repository
git clone <repository-url>
cd cutia

# Install with development dependencies
uv sync --extra dev

Usage

DSPy Adapter

The DSPy adapter allows you to compress DSPy programs:

import dspy
from cutia.adapters.dspy_adapter import CUTIA

# Configure models
prompt_model = dspy.LM("gpt-4o-mini")
task_model = dspy.LM("gpt-4o-mini")

# Define your metric
def your_metric(example, prediction, trace=None):
    return example.output == prediction.output

# Create optimizer
optimizer = CUTIA(
    prompt_model=prompt_model,
    task_model=task_model,
    metric=your_metric,
    quality_mode="strict",  # "strict", "balanced", or "aggressive"
    target_compression_ratio=0.5,
    num_candidates=4,
    traversal_strategy="pre_order",  # "pre_order", "post_order", or "random"
)

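# Compile your program.
# Placeholders you must supply: your_program (typically a dspy.Module),
# train_examples and val_examples (typically lists of dspy.Example).
compressed_program = optimizer.compile(
    student=your_program,
    trainset=train_examples,
    valset=val_examples,
)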
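
The metric in the example above uses exact equality, which can reject answers that differ only in case or surrounding whitespace. A more forgiving variant might look like the sketch below. The function name `normalized_match_metric` is hypothetical, and the `SimpleNamespace` objects only stand in for a DSPy example and prediction that expose an `output` attribute.

```python
from types import SimpleNamespace


def normalized_match_metric(example, prediction, trace=None):
    """Return True when the outputs match after trimming whitespace and ignoring case."""
    expected = str(example.output).strip().lower()
    actual = str(prediction.output).strip().lower()
    return expected == actual


# Stand-ins for an example and a prediction; only the `output` attribute is used.
example = SimpleNamespace(output="Paris")
prediction = SimpleNamespace(output="  paris\n")
print(normalized_match_metric(example, prediction))  # True
```

Because the signature matches the metric in the example above, such a function could be passed as `metric=` in the same way.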

Quality Modes

CUTIA supports three quality modes:

  • "strict": No score degradation allowed (threshold: baseline + 0.0%)

    • Use case: Safety-critical prompts, zero quality loss tolerance
    • Expected compression: 10-20%
  • "balanced": Moderate degradation allowed (threshold: baseline - 5.0%); this is the default

    • Use case: Most applications, good quality/compression balance
    • Expected compression: 25-40%
  • "aggressive": Larger degradation allowed (threshold: baseline - 10.0%)

    • Use case: Maximum compression priority, quality less critical
    • Expected compression: 40-60%
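
The thresholds above can be read as a simple acceptance rule. The sketch below is an illustration, not CUTIA's internal code: it assumes scores are on a 0-100 scale and that the tolerance is subtracted in percentage points, and the names `QUALITY_TOLERANCE`, `quality_threshold`, and `is_acceptable` are hypothetical.

```python
# Allowed score drop per quality mode, in percentage points (values from the list above).
QUALITY_TOLERANCE = {
    "strict": 0.0,
    "balanced": 5.0,
    "aggressive": 10.0,
}


def quality_threshold(baseline_score: float, quality_mode: str = "balanced") -> float:
    """Lowest score a compressed prompt may reach and still be accepted."""
    if quality_mode not in QUALITY_TOLERANCE:
        raise ValueError(f"unknown quality mode: {quality_mode!r}")
    return baseline_score - QUALITY_TOLERANCE[quality_mode]


def is_acceptable(candidate_score: float, baseline_score: float, quality_mode: str = "balanced") -> bool:
    """Check a candidate's score against the threshold for the chosen mode."""
    return candidate_score >= quality_threshold(baseline_score, quality_mode)


baseline = 80.0
for mode in QUALITY_TOLERANCE:
    print(mode, quality_threshold(baseline, mode), is_acceptable(76.0, baseline, mode))
```

With a baseline of 80, a candidate scoring 76 would be rejected under "strict" and accepted under "balanced" and "aggressive".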

Traversal Strategies

  • "post_order": Process children before parent (bottom-up)
  • "pre_order": Process parent before children (top-down)
  • "random": Randomly choose between post-order and pre-order for each candidate
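
The difference between the two orders is easiest to see on a small tree. This is a minimal sketch rather than CUTIA's implementation: the `Node` class and the helper names are hypothetical, and the "random" branch simply picks one of the two orders per candidate, as described above.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list["Node"] = field(default_factory=list)


def pre_order(node):
    """Visit the parent first, then its children (top-down)."""
    yield node.label
    for child in node.children:
        yield from pre_order(child)


def post_order(node):
    """Visit the children first, then the parent (bottom-up)."""
    for child in node.children:
        yield from post_order(child)
    yield node.label


def traversal_for_candidate(strategy, rng):
    """Resolve a strategy name to a traversal function; "random" picks one per call."""
    if strategy == "random":
        strategy = rng.choice(["pre_order", "post_order"])
    return {"pre_order": pre_order, "post_order": post_order}[strategy]


tree = Node("root", [Node("a", [Node("a1"), Node("a2")]), Node("b")])
print(list(pre_order(tree)))   # ['root', 'a', 'a1', 'a2', 'b']
print(list(post_order(tree)))  # ['a1', 'a2', 'a', 'b', 'root']
```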

Development

Running Tests

The project uses pytest for testing. All tests are designed to run without making actual LLM calls.

# Install development dependencies (if not already installed)
uv sync --extra dev

# Run all tests
uv run pytest tests/

# Run with verbose output
uv run pytest tests/ -v

# Run specific test file
uv run pytest tests/adapters/dspy_adapter/test_cutia_basic.py

# Run specific test function
uv run pytest tests/adapters/dspy_adapter/test_cutia_basic.py::test_cutia_basic_compile

# Run tests with coverage (requires pytest-cov)
uv run pytest tests/ --cov=cutia --cov-report=term-missing
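
One common way to keep tests free of real LLM calls is to pass in a stub that returns canned replies. The sketch below shows the idea only. `FakeLM` and `shorten` are hypothetical and are not taken from the project's test suite.

```python
class FakeLM:
    """Hypothetical stand-in for a language model: returns canned replies and records prompts."""

    def __init__(self, replies):
        self.replies = list(replies)
        self.calls = []

    def __call__(self, prompt):
        self.calls.append(prompt)
        return self.replies.pop(0) if self.replies else ""


def shorten(prompt, lm):
    """Toy function under test: ask the model for a shorter prompt, keep it only if shorter."""
    candidate = lm(f"Shorten: {prompt}")
    return candidate if 0 < len(candidate) < len(prompt) else prompt


def test_shorten_uses_stub():
    lm = FakeLM(["Be brief."])
    assert shorten("Please be as brief as you possibly can.", lm) == "Be brief."
    assert lm.calls == ["Shorten: Please be as brief as you possibly can."]
```

Saved in a file whose name starts with `test_`, a function like this would be collected by pytest and run without any network access.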

Code Quality

The project uses Ruff for linting and formatting, and Pyright for type checking:

# Linting
ruff check src/

# Formatting
ruff format src/

# Type checking
uv run pyright

Alternatively, use the Makefile commands:

# Individual checks
make lint          # Run linting only
make format        # Run formatting only
make typecheck     # Run type checking only

# Combined checks
make check         # Run all quality checks (linting, formatting, type checking)
make fix           # Auto-fix linting and formatting issues

Type Checking

The project uses Pyright for static type checking. The configuration is in pyproject.toml under [tool.pyright].

# Run type checking manually
uv run pyright

# Run in watch mode (useful during development)
make typecheck-watch

Pre-commit Hooks

# Install pre-commit hooks
pre-commit install

# Run manually
pre-commit run --all-files

How It Works

  1. Tree Building: The prompt is recursively split into segments (left, chunk, right)
  2. Node Processing: For each node in the tree:
    • Attempt to cut the chunk entirely
    • If the cut fails the quality check, attempt to rewrite the chunk
    • Keep the original chunk if both attempts fail
  3. Multi-Candidate: Generate multiple compression variants with different random seeds
  4. Selection: Evaluate candidates on validation set and select the best
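
The four steps above can be sketched end to end with toy components. This is an illustration under stated assumptions, not CUTIA's code: the metric counts keywords instead of running a task model, the rewriter drops filler words instead of calling an LLM, the tree is flattened into a parent-first list of chunk indices, and every name (`score`, `rewrite`, `chunk_order`, `compress`, `best_candidate`) is hypothetical.

```python
import random

FILLER = {"please", "very", "really"}   # words the toy rewriter drops
REQUIRED = {"summarize", "json"}        # toy stand-in for what the task needs


def score(prompt: str) -> float:
    """Toy metric: percentage of required keywords still present in the prompt."""
    words = set(prompt.lower().replace(".", " ").split())
    return 100.0 * len(REQUIRED & words) / len(REQUIRED)


def rewrite(chunk: str) -> str:
    """Toy rewriter: drop filler words. A real system would ask an LLM instead."""
    return " ".join(w for w in chunk.split() if w.lower().strip(".,") not in FILLER)


def chunk_order(lo: int, hi: int) -> list[int]:
    """Split the index range [lo, hi) into (left, chunk, right) recursively and
    return the chunk indices parent-first (pre-order)."""
    if lo >= hi:
        return []
    mid = (lo + hi) // 2
    return [mid] + chunk_order(lo, mid) + chunk_order(mid + 1, hi)


def render(chunks: dict[int, str]) -> str:
    """Join the surviving (non-empty) chunks in their original order."""
    return " ".join(chunks[i] for i in sorted(chunks) if chunks[i])


def compress(sentences: list[str], threshold: float, seed: int) -> str:
    rng = random.Random(seed)
    order = chunk_order(0, len(sentences))
    if rng.random() < 0.5:  # vary the visiting order between candidates
        order.reverse()
    kept = dict(enumerate(sentences))
    for i in order:
        # Try the cut first, then the rewrite; if neither passes, keep the original.
        for replacement in ("", rewrite(kept[i])):
            trial = {**kept, i: replacement}
            if score(render(trial)) >= threshold:
                kept = trial
                break
    return render(kept)


def best_candidate(sentences: list[str], threshold: float, num_candidates: int = 4) -> str:
    """Generate one candidate per seed, then prefer higher score and shorter text."""
    candidates = [compress(sentences, threshold, seed) for seed in range(num_candidates)]
    return min(candidates, key=lambda p: (-score(p), len(p)))


sentences = [
    "You are a really helpful assistant.",
    "Please summarize the text very carefully.",
    "Thank you in advance.",
    "Return JSON.",
]
baseline = score(" ".join(sentences))
print(best_candidate(sentences, threshold=baseline))
```

In this toy run the first and third sentences are cut, the second is rewritten because cutting it would lose a required keyword, and the last is kept unchanged.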

Dependencies

Core

  • No required dependencies for the base library

Optional: DSPy Adapter

  • dspy-ai>=3.0.0 - For DSPy integration

Development

  • pytest>=8.0.0 - Testing framework
  • ruff>=0.3.0 - Linting and formatting
  • pyright>=1.1.0 - Static type checking
  • pre-commit - Git hooks

Install optional dependencies:

# For testing
uv sync --extra test

# For development (includes test dependencies)
uv sync --extra dev

Future Plans

  • Framework-agnostic core implementation (not tied to DSPy)
  • Additional adapters for other frameworks and platforms (LangChain, MLflow, etc.)
  • Standalone Python API for direct use
  • Enhanced chunking strategies

Contributing

Contributions are welcome! Please ensure:

  1. All tests pass: uv run pytest tests/
  2. Code is formatted: ruff format src/
  3. No linting errors: ruff check src/
  4. Type checking passes: uv run pyright (optional but recommended)
  5. Pre-commit hooks pass: pre-commit run --all-files
  6. New features come with tests

Or run make check to verify linting, formatting, and type checking in one command.

Note: Type checking is verified by make check but not enforced in pre-commit hooks, allowing for faster commits while still maintaining quality standards.


