Tree-based prompt compression library using a cut-then-transform strategy for LLM applications

These details have been verified by PyPI

Maintainers

napmany

CUTIA: Quality-Aware Prompt Compressor

A prompt optimizer that cuts token usage while maintaining quality.

Features

Tree-based Segmentation: Recursively splits prompts into segments for fine-grained optimization
Cut-then-Rewrite Strategy: Attempts to remove redundant content, then rewrites if cutting fails
Quality-Aware Compression: Maintains quality thresholds during compression
Multi-Candidate Generation: Generates multiple compression variants and chooses the best
DSPy Integration: First-class support for DSPy programs via the DSPy adapter

Installation

pip install cutia

Usage

DSPy Adapter

The DSPy adapter allows you to compress DSPy programs:

import dspy
from cutia.adapters.dspy_adapter import CUTIA

# Configure models
# prompt_model generates rewrite candidates
prompt_model = dspy.LM(
    model="openai/gpt-4o-mini",
    max_tokens=10000,
    temperature=1,
)
# task_model runs the task/program for scoring and validation
task_model = dspy.LM(
    model="openai/gpt-4.1-nano",
    max_tokens=2000,
    temperature=1,
)

# Define your metric
def your_metric(example, prediction, trace=None):
    return example.output == prediction.output

# Create optimizer
optimizer = CUTIA(
    prompt_model=prompt_model,
    task_model=task_model,
    metric=your_metric,
)

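# Compile your program
# (your_program, train_examples, and val_examples are placeholders for your
# own DSPy program and example sets)
compressed_program = optimizer.compile(
    student=your_program,
    trainset=train_examples,
    valset=val_examples,
)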

Local AI

If you’re running CUTIA (or other prompt optimizers) against locally hosted LLMs, vLLM is a solid option for serving models: it supports high-throughput inference and handles concurrent requests efficiently.

If you’d like to use a prompt model separate from the task model, llmsnap can help: it enables fast model switching via vLLM’s sleep/wake mode, so you can swap models in seconds.

How It Works

1. Tree Building: The prompt is recursively split into segments (left, chunk, right).
2. Node Processing: For each node in the tree:
   - Attempt to cut the chunk entirely
   - If cutting fails the quality check, attempt to rewrite the chunk
   - Keep the original if both fail
3. Multi-Candidate: Generate multiple compression variants with different random seeds.
4. Selection: Evaluate candidates on the validation set and select the best.
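
The steps above can be illustrated with a small, self-contained sketch. It is not CUTIA's actual implementation or API: every name here (`Node`, `build_tree`, `compress`, `best_candidate`, `toy_score`, `toy_rewrite`) is hypothetical. The sketch assumes the prompt is already split into sentence-like parts, uses a keyword check in place of a real task metric, and uses filler-word removal in place of an LLM rewrite. The seed only changes the order in which nodes are visited, which is one plausible way seeds could produce different candidates.

```python
import random
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Node:
    chunk: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_tree(parts: List[str]) -> Optional[Node]:
    """Split parts into (left, chunk, right) around the middle element, recursively."""
    if not parts:
        return None
    mid = len(parts) // 2
    return Node(parts[mid], build_tree(parts[:mid]), build_tree(parts[mid + 1:]))


def render(node: Optional[Node]) -> str:
    """Reassemble the prompt in left-to-right order, skipping chunks that were cut."""
    if node is None:
        return ""
    pieces = [render(node.left), node.chunk, render(node.right)]
    return " ".join(p for p in pieces if p)


def collect(node: Optional[Node]) -> List[Node]:
    """Return every node in the tree as a flat list."""
    if node is None:
        return []
    return [node] + collect(node.left) + collect(node.right)


def compress(parts: List[str], score: Callable[[str], float],
             rewrite: Callable[[str], str], threshold: float, seed: int) -> str:
    """Produce one candidate: cut each chunk, else rewrite it, else keep it."""
    root = build_tree(parts)
    nodes = collect(root)
    random.Random(seed).shuffle(nodes)  # assumption: the seed varies visiting order
    for node in nodes:
        original = node.chunk
        node.chunk = ""  # 1) attempt to cut the chunk entirely
        if score(render(root)) >= threshold:
            continue
        node.chunk = rewrite(original)  # 2) cut failed: attempt a rewrite
        if score(render(root)) >= threshold:
            continue
        node.chunk = original  # 3) both failed: keep the original
    return render(root)


def best_candidate(parts: List[str], score: Callable[[str], float],
                   rewrite: Callable[[str], str], threshold: float,
                   seeds: Iterable[int]) -> str:
    """Generate one candidate per seed; prefer higher score, then shorter text."""
    candidates = [compress(parts, score, rewrite, threshold, s) for s in seeds]
    return min(candidates, key=lambda c: (-score(c), len(c)))


# Toy stand-ins for a real metric and an LLM-based rewriter.
REQUIRED = ("count", "letters")
FILLER = {"please", "really", "very"}


def toy_score(prompt: str) -> float:
    """Fraction of required keywords still present in the prompt."""
    return sum(word in prompt for word in REQUIRED) / len(REQUIRED)


def toy_rewrite(chunk: str) -> str:
    """Drop filler words from a chunk."""
    return " ".join(w for w in chunk.split() if w.lower().strip(".,!") not in FILLER)


parts = [
    "You are a really helpful assistant.",
    "Please count the letters very carefully.",
    "Thanks a lot in advance!",
]
result = best_candidate(parts, toy_score, toy_rewrite, threshold=1.0, seeds=range(3))
print(result)
```

In this toy run the first and last parts can be cut without losing either keyword, while the middle part cannot be cut and is rewritten instead. In a real setting the score would come from running the task model on validation examples rather than from a keyword check.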

Examples

Strawberry Problem (Letter Counting)

Demonstrates prompt compression on a character counting task using the CharBench dataset.

See src/cutia/examples/README.md for details.

Development

Development Installation

For development with testing and linting tools:

# Clone the repository
git clone https://github.com/napmany/cutia.git
cd cutia

# Install with development dependencies
uv sync --extra dev

Running Tests

# Install development dependencies (if not already installed)
uv sync --extra dev

# Run tests
make test

Code Quality

The project uses Ruff for linting and formatting, and Pyright for type checking:

# Run all checks (linting, formatting, and type checking)
make check

Dependencies

Core

No required dependencies for the base library

Install optional dependencies:

# For testing
uv sync --extra test

# For development (includes test dependencies)
uv sync --extra dev

Future Plans

Framework-agnostic core implementation (not tied to DSPy)
Additional adapters for other frameworks and platforms (LangChain, MLflow, etc.)
Standalone Python API for direct use
Enhanced chunking strategies

Star History

Note: ⭐️ Star this project to help others discover it!

Release history

0.0.2 (this version): Dec 22, 2025
0.0.1: Dec 19, 2025

Download files

Source Distribution

cutia-0.0.2.tar.gz (26.7 kB), uploaded Dec 22, 2025, Source

Built Distribution

cutia-0.0.2-py3-none-any.whl (29.6 kB), uploaded Dec 22, 2025, Python 3

File details

Details for the file cutia-0.0.2.tar.gz.

File metadata

Download URL: cutia-0.0.2.tar.gz
Upload date: Dec 22, 2025
Size: 26.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for cutia-0.0.2.tar.gz
Algorithm	Hash digest
SHA256	`6036340a4c8b7c736c0c30646c886248238c6ac9aacc7e99d3bc263a24c58610`
MD5	`ed9f30094f0059ff3b70d11ab90dd4d6`
BLAKE2b-256	`e01312599f0937444a0a5f48315265a1bdaaba9f21ad6f36d05d916683010c8a`

Provenance

The following attestation bundles were made for cutia-0.0.2.tar.gz:

Publisher: publish.yml on napmany/cutia

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: cutia-0.0.2.tar.gz
- Subject digest: 6036340a4c8b7c736c0c30646c886248238c6ac9aacc7e99d3bc263a24c58610
- Sigstore transparency entry: 775538343
- Sigstore integration time: Dec 22, 2025
Source repository:
- Permalink: napmany/cutia@8a930b2e5dc6b79c36d01ab000e40c3a29b17153
- Branch / Tag: refs/tags/v0.0.2
- Owner: https://github.com/napmany
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@8a930b2e5dc6b79c36d01ab000e40c3a29b17153
- Trigger Event: push

File details

Details for the file cutia-0.0.2-py3-none-any.whl.

File metadata

Download URL: cutia-0.0.2-py3-none-any.whl
Upload date: Dec 22, 2025
Size: 29.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for cutia-0.0.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`367b14e35f75bb0398ef68267d4a74ff531db78503f9f1e4303bcb7f679f3a8b`
MD5	`f24a26683067aa066d6826c7020e6b7f`
BLAKE2b-256	`7b264ecc725da8446bb3163c0c400c875581aa1ee6bcf1a37faab372ac6ac1cf`

Provenance

The following attestation bundles were made for cutia-0.0.2-py3-none-any.whl:

Publisher: publish.yml on napmany/cutia

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: cutia-0.0.2-py3-none-any.whl
- Subject digest: 367b14e35f75bb0398ef68267d4a74ff531db78503f9f1e4303bcb7f679f3a8b
- Sigstore transparency entry: 775538409
- Sigstore integration time: Dec 22, 2025
Source repository:
- Permalink: napmany/cutia@8a930b2e5dc6b79c36d01ab000e40c3a29b17153
- Branch / Tag: refs/tags/v0.0.2
- Owner: https://github.com/napmany
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@8a930b2e5dc6b79c36d01ab000e40c3a29b17153
- Trigger Event: push
