Skip to main content

A high-performance Python library for semantically compressing and optimizing data before sending it to a Large Language Model (LLM).

Project description

Brevit.py

A high-performance Python library for semantically compressing and optimizing data before sending it to a Large Language Model (LLM). Dramatically reduce token costs while maintaining data integrity and readability.

Table of Contents

Why Brevit.py?

Python-Specific Advantages

  • Async/Await: Built with modern Python async/await patterns
  • Type Hints: Full type annotations for better IDE support
  • LangChain Integration: Ready for LangChain workflows
  • FastAPI/Flask Compatible: Works seamlessly with popular web frameworks
  • Pydantic Support: Integrates with Pydantic models

Performance Benefits

  • 40-60% Token Reduction: Dramatically reduce LLM API costs
  • Async Operations: Non-blocking I/O for better concurrency
  • Memory Efficient: Processes data in-place where possible
  • Fast Execution: Optimized algorithms for minimal overhead

Example Cost Savings

# Before: 234 tokens = $0.000468 per request
json_str = json.dumps(complex_order)

# After: 127 tokens = $0.000254 per request (46% reduction)
optimized = await brevit.brevity(complex_order)  # Automatic optimization

# Or with explicit configuration
explicit = await brevit.optimize(complex_order)

# Savings: $0.000214 per request
# At 1M requests/month: $214/month savings

Automatic Strategy Selection

Brevit.py now includes the .brevity() method that automatically analyzes your data and selects the optimal optimization strategy:

data = {
    "friends": ["ana", "luis", "sam"],
    "hikes": [
        {"id": 1, "name": "Blue Lake Trail", "distanceKm": 7.5},
        {"id": 2, "name": "Ridge Overlook", "distanceKm": 9.2}
    ]
}

# Automatically detects uniform arrays and applies tabular format
optimized = await brevit.brevity(data)
# No configuration needed - Brevit analyzes and optimizes automatically!

Key Features

  • JSON Optimization: Flatten nested JSON structures into token-efficient key-value pairs
  • Text Optimization: Clean and summarize long text documents
  • Image Optimization: Extract text from images via OCR
  • Async/Await: Built with modern Python async/await patterns
  • Extensible: Plugin architecture for custom optimizers
  • Lightweight: Minimal dependencies, high performance
  • Type Hints: Full type annotations for better IDE support

Installation

Prerequisites

  • Python 3.8 or later
  • pip or poetry

Install via pip

pip install brevit

Install from Source

  1. Clone the repository:
git clone https://github.com/JavianDev/Brevit.py.git
cd Brevit.py
  1. Install in development mode:
pip install -e .

Optional Dependencies

For YAML support:

pip install brevit[yaml]
# or
pip install PyYAML

For JSON path filtering:

pip install brevit[jsonpath]
# or
pip install jsonpath-ng

Quick Start

Basic Usage

from brevit import BrevitClient, BrevitConfig, JsonOptimizationMode
import asyncio

async def main():
    # 1. Create configuration
    config = BrevitConfig(
        json_mode=JsonOptimizationMode.Flatten,
        text_mode=TextOptimizationMode.Clean,
        image_mode=ImageOptimizationMode.Ocr,
        long_text_threshold=1000  # Summarize text over 1000 chars
    )

    # 2. Create client
    brevit = BrevitClient(config)

    # 3. Optimize data
    order = {
        "orderId": "o-456",
        "status": "SHIPPED",
        "items": [
            {"sku": "A-88", "name": "Brevit Pro License", "quantity": 1}
        ]
    }

    optimized = await brevit.optimize(order)
    # Result (with abbreviations enabled by default):
    # "@o=order\n@o.orderId:o-456\n@o.status:SHIPPED\n@o.items[1]{name,quantity,sku}:\nBrevit Pro License,1,A-88"
    print(optimized)

asyncio.run(main())

Abbreviation Feature (New in v0.1.2)

Brevit automatically creates abbreviations for frequently repeated prefixes, reducing token usage by 10-25%:

from brevit import BrevitClient, BrevitConfig, JsonOptimizationMode

config = BrevitConfig(
    json_mode=JsonOptimizationMode.Flatten,
    enable_abbreviations=True,    # Enabled by default
    abbreviation_threshold=2           # Minimum occurrences to abbreviate
)
brevit = BrevitClient(config)

data = {
    "user": {
        "name": "John Doe",
        "email": "john@example.com",
        "age": 30
    },
    "order": {
        "id": "o-456",
        "status": "SHIPPED"
    }
}

optimized = await brevit.brevity(data)
# Output with abbreviations:
# @u=user
# @o=order
# @u.name:John Doe
# @u.email:john@example.com
# @u.age:30
# @o.id:o-456
# @o.status:SHIPPED

Token Savings: The abbreviation feature reduces tokens by replacing repeated prefixes like "user." and "order." with short aliases like "@u" and "@o", saving 10-25% on typical nested JSON structures.

Complete Usage Examples

Brevit.py supports three main data types: JSON objects/strings, text files/strings, and images. Here's how to use each:

1. JSON Optimization Examples

Example 1.1: Simple JSON Object

from brevit import BrevitClient, BrevitConfig, JsonOptimizationMode

brevit = BrevitClient(BrevitConfig(json_mode=JsonOptimizationMode.Flatten))

data = {
    "user": {
        "name": "John Doe",
        "email": "john@example.com",
        "age": 30
    }
}

# Method 1: Automatic optimization (recommended)
optimized = await brevit.brevity(data)
# Output:
# user.name:John Doe
# user.email:john@example.com
# user.age:30

# Method 2: Explicit optimization
explicit = await brevit.optimize(data)

Example 1.2: JSON String

json_string = '{"order": {"id": "o-456", "status": "SHIPPED"}}'

# Brevit automatically detects JSON strings
optimized = await brevit.brevity(json_string)
# Output:
# order.id:o-456
# order.status:SHIPPED

Example 1.3: Complex Nested JSON with Arrays

complex_data = {
    "context": {
        "task": "Our favorite hikes together",
        "location": "Boulder",
        "season": "spring_2025"
    },
    "friends": ["ana", "luis", "sam"],
    "hikes": [
        {
            "id": 1,
            "name": "Blue Lake Trail",
            "distanceKm": 7.5,
            "elevationGain": 320,
            "companion": "ana",
            "wasSunny": True
        },
        {
            "id": 2,
            "name": "Ridge Overlook",
            "distanceKm": 9.2,
            "elevationGain": 540,
            "companion": "luis",
            "wasSunny": False
        }
    ]
}

optimized = await brevit.brevity(complex_data)
# Output:
# context.task:Our favorite hikes together
# context.location:Boulder
# context.season:spring_2025
# friends[3]:ana,luis,sam
# hikes[2]{companion,distanceKm,elevationGain,id,name,wasSunny}:
# ana,7.5,320,1,Blue Lake Trail,true
# luis,9.2,540,2,Ridge Overlook,false

Example 1.4: Different JSON Optimization Modes

# Flatten Mode (Default)
flatten_config = BrevitConfig(json_mode=JsonOptimizationMode.Flatten)
# Converts nested JSON to flat key-value pairs

# YAML Mode
yaml_config = BrevitConfig(json_mode=JsonOptimizationMode.ToYaml)
# Converts JSON to YAML format (requires PyYAML)

# Filter Mode
filter_config = BrevitConfig(
    json_mode=JsonOptimizationMode.Filter,
    json_paths_to_keep=["user.name", "order.id"]
)
# Keeps only specified paths, removes everything else

2. Text Optimization Examples

Example 2.1: Long Text String

long_text = "This is a very long document..." * 100

config = BrevitConfig(
    json_mode=JsonOptimizationMode.None,
    text_mode=TextOptimizationMode.Clean,
    long_text_threshold=500
)
brevit = BrevitClient(config)

# Automatic detection
optimized = await brevit.brevity(long_text)

# Explicit text optimization
cleaned = await brevit.optimize(long_text)

Example 2.2: Reading Text from File

# Read text file
with open('document.txt', 'r', encoding='utf-8') as f:
    text_content = f.read()

# Optimize the text
optimized = await brevit.brevity(text_content)

Example 2.3: Text Optimization Modes

# Clean Mode (Remove Boilerplate)
clean_config = BrevitConfig(text_mode=TextOptimizationMode.Clean)
# Removes signatures, headers, repetitive content

# Summarize Fast
fast_config = BrevitConfig(text_mode=TextOptimizationMode.SummarizeFast)
# Fast summarization (requires custom text optimizer implementation)

# Summarize High Quality
quality_config = BrevitConfig(text_mode=TextOptimizationMode.SummarizeHighQuality)
# High-quality summarization (requires custom text optimizer with LLM integration)

3. Image Optimization Examples

Example 3.1: Image from File (OCR)

# Read image file
with open('receipt.jpg', 'rb') as f:
    image_bytes = f.read()

# Brevit automatically detects bytes as image data
extracted_text = await brevit.brevity(image_bytes)
# Output: OCR-extracted text from the image

Example 3.2: Image from URL

import requests

# Fetch image from URL
response = requests.get('https://example.com/invoice.png')
image_bytes = response.content

# Optimize image
extracted_text = await brevit.brevity(image_bytes)

Example 3.3: Image Optimization Modes

# OCR Mode (Extract Text)
ocr_config = BrevitConfig(image_mode=ImageOptimizationMode.Ocr)
# Extracts text from images using OCR (requires custom image optimizer)

# Metadata Mode
metadata_config = BrevitConfig(image_mode=ImageOptimizationMode.Metadata)
# Extracts only image metadata (dimensions, format, etc.)

4. Method Comparison: .brevity() vs .optimize()

.brevity() - Automatic Strategy Selection

Use when: You want Brevit to automatically analyze and select the best optimization strategy.

# Automatically detects data type and applies optimal strategy
result = await brevit.brevity(data)
# - JSON objects → Flatten with tabular optimization
# - Long text → Text optimization
# - Images → OCR extraction

Advantages:

  • Zero configuration needed
  • Intelligent strategy selection
  • Works with any data type
  • Best for general-purpose use

.optimize() - Explicit Configuration

Use when: You want explicit control over optimization mode.

config = BrevitConfig(
    json_mode=JsonOptimizationMode.Flatten,
    text_mode=TextOptimizationMode.Clean,
    image_mode=ImageOptimizationMode.Ocr
)
brevit = BrevitClient(config)

# Uses explicit configuration
result = await brevit.optimize(data)

Advantages:

  • Full control over optimization
  • Predictable behavior
  • Best for specific use cases

5. Custom Optimizers

You can provide custom optimizers for text and images:

# Custom text optimizer
class CustomTextOptimizer:
    async def optimize_text(self, text: str, config: BrevitConfig) -> str:
        # Call your summarization service
        return await summarize_service.summarize(text)

# Custom image optimizer
class CustomImageOptimizer:
    async def optimize_image(self, image_data: bytes, config: BrevitConfig) -> str:
        # Call your OCR service (e.g., Azure AI Vision, Tesseract)
        return await ocr_service.extract_text(image_data)

brevit = BrevitClient(
    config,
    text_optimizer=CustomTextOptimizer(),
    image_optimizer=CustomImageOptimizer()
)

6. Complete Workflow Examples

Example 6.1: E-Commerce Order Processing

# Step 1: Optimize order JSON
order = {
    "orderId": "o-456",
    "customer": {"name": "John", "email": "john@example.com"},
    "items": [
        {"sku": "A-88", "quantity": 2, "price": 29.99},
        {"sku": "B-22", "quantity": 1, "price": 49.99}
    ]
}

optimized_order = await brevit.brevity(order)

# Step 2: Send to LLM
prompt = f"Analyze this order:\n\n{optimized_order}\n\nExtract total amount."
# Send prompt to OpenAI, Anthropic, etc.

Example 6.2: Document Processing Pipeline

# Step 1: Read and optimize text document
with open('contract.txt', 'r') as f:
    contract_text = f.read()

optimized_text = await brevit.brevity(contract_text)

# Step 2: Process with LLM
prompt = f"Summarize this contract:\n\n{optimized_text}"
# Send to LLM for summarization

Example 6.3: Receipt OCR Pipeline

# Step 1: Read receipt image
with open('receipt.jpg', 'rb') as f:
    receipt_image = f.read()

# Step 2: Extract text via OCR
extracted_text = await brevit.brevity(receipt_image)

# Step 3: Optimize extracted text (if it's long)
optimized = await brevit.brevity(extracted_text)

# Step 4: Send to LLM for analysis
prompt = f"Extract items and total from this receipt:\n\n{optimized}"
# Send to LLM

Flask/FastAPI Example

from flask import Flask, request, jsonify
from brevit import BrevitClient, BrevitConfig, JsonOptimizationMode

app = Flask(__name__)

# Initialize Brevit client
config = BrevitConfig(json_mode=JsonOptimizationMode.Flatten)
brevit = BrevitClient(config)

@app.route('/optimize', methods=['POST'])
async def optimize_data():
    data = request.json
    
    # Optimize the data
    optimized = await brevit.optimize(data)
    
    # Send to LLM API
    prompt = f"Context:\n{optimized}\n\nTask: Summarize the data."
    
    # response = await call_llm_api(prompt)
    
    return jsonify({"optimized": optimized, "prompt": prompt})

if __name__ == '__main__':
    app.run()

FastAPI Example

from fastapi import FastAPI
from brevit import BrevitClient, BrevitConfig, JsonOptimizationMode
from pydantic import BaseModel

app = FastAPI()

config = BrevitConfig(json_mode=JsonOptimizationMode.Flatten)
brevit = BrevitClient(config)

class OrderData(BaseModel):
    orderId: str
    status: str
    items: list

@app.post("/optimize")
async def optimize_order(order: OrderData):
    optimized = await brevit.optimize(order.dict())
    return {"optimized": optimized}

Configuration Options

BrevitConfig

config = BrevitConfig(
    json_mode=JsonOptimizationMode.Flatten,      # JSON optimization strategy
    text_mode=TextOptimizationMode.Clean,        # Text optimization strategy
    image_mode=ImageOptimizationMode.Ocr,        # Image optimization strategy
    json_paths_to_keep=[],                       # Paths to keep for Filter mode
    long_text_threshold=500                      # Character threshold for text optimization
)

JsonOptimizationMode

  • NONE: No optimization, pass JSON as-is
  • Flatten: Convert nested JSON to flat key-value pairs (most token-efficient)
  • ToYaml: Convert JSON to YAML format (requires PyYAML)
  • Filter: Keep only specified JSON paths

TextOptimizationMode

  • NONE: No optimization
  • Clean: Remove boilerplate and excessive whitespace
  • SummarizeFast: Use a fast model for summarization (requires custom ITextOptimizer)
  • SummarizeHighQuality: Use a high-quality model for summarization (requires custom ITextOptimizer)

ImageOptimizationMode

  • NONE: Skip image processing
  • Ocr: Extract text from images (requires custom IImageOptimizer)
  • Metadata: Extract basic metadata only

Advanced Usage

Custom Text Optimizer

Implement ITextOptimizer to use LangChain or your own LLM service:

from brevit import ITextOptimizer, BrevitConfig
from langchain.llms import OpenAI
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate

class LangChainTextOptimizer:
    def __init__(self):
        self.llm = OpenAI(temperature=0)
        self.prompt = PromptTemplate(
            input_variables=["text"],
            template="Summarize the following text: {text}"
        )
        self.chain = LLMChain(llm=self.llm, prompt=self.prompt)

    async def optimize_text(self, long_text: str, config: BrevitConfig) -> str:
        result = await self.chain.arun(text=long_text)
        return result

# Use custom optimizer
config = BrevitConfig(text_mode=TextOptimizationMode.SummarizeFast)
brevit = BrevitClient(config, text_optimizer=LangChainTextOptimizer())

Custom Image Optimizer

Implement IImageOptimizer to use Azure AI Vision or Tesseract:

from brevit import IImageOptimizer, BrevitConfig
from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.core.credentials import AzureKeyCredential

class AzureVisionImageOptimizer:
    def __init__(self, endpoint: str, key: str):
        self.client = ImageAnalysisClient(
            endpoint=endpoint,
            credential=AzureKeyCredential(key)
        )

    async def optimize_image(self, image_data: bytes, config: BrevitConfig) -> str:
        result = self.client.analyze(
            image_data=image_data,
            visual_features=["read"]
        )
        return result.read.text

# Use custom optimizer
config = BrevitConfig(image_mode=ImageOptimizationMode.Ocr)
brevit = BrevitClient(
    config,
    image_optimizer=AzureVisionImageOptimizer(endpoint="...", key="...")
)

Using Tesseract OCR

from brevit import IImageOptimizer, BrevitConfig
from PIL import Image
import pytesseract
import io

class TesseractImageOptimizer:
    async def optimize_image(self, image_data: bytes, config: BrevitConfig) -> str:
        image = Image.open(io.BytesIO(image_data))
        text = pytesseract.image_to_string(image)
        return text

config = BrevitConfig(image_mode=ImageOptimizationMode.Ocr)
brevit = BrevitClient(config, image_optimizer=TesseractImageOptimizer())

YAML Mode

To use YAML mode, install PyYAML:

pip install PyYAML

Then update the ToYaml case in brevit.py:

import yaml

# In the optimize method:
elif mode == JsonOptimizationMode.ToYaml:
    return yaml.dump(input_object)

Filter Mode

Use Filter mode to keep only specific JSON paths:

config = BrevitConfig(
    json_mode=JsonOptimizationMode.Filter,
    json_paths_to_keep=[
        "user.name",
        "order.orderId",
        "order.items[*].sku"
    ]
)

Examples

Example 1: Optimize Complex Object

user = {
    "id": "u-123",
    "name": "Javian",
    "isActive": True,
    "contact": {
        "email": "support@javianpicardo.com",
        "phone": None
    },
    "orders": [
        {"orderId": "o-456", "status": "SHIPPED"}
    ]
}

optimized = await brevit.optimize(user)
# Output:
# id: u-123
# name: Javian
# isActive: True
# contact.email: support@javianpicardo.com
# contact.phone: None
# orders[0].orderId: o-456
# orders[0].status: SHIPPED

Example 2: Optimize JSON String

json_str = """{
    "order": {
        "orderId": "o-456",
        "status": "SHIPPED",
        "items": [
            {"sku": "A-88", "name": "Brevit Pro", "quantity": 1}
        ]
    }
}"""

optimized = await brevit.optimize(json_str)

Example 3: Process Long Text

with open("document.txt", "r") as f:
    long_document = f.read()

optimized = await brevit.optimize(long_document)
# Will trigger text optimization if length > long_text_threshold

Example 4: Process Image

with open("receipt.jpg", "rb") as f:
    image_data = f.read()

optimized = await brevit.optimize(image_data)
# Will trigger image optimization

When Not to Use Brevit.py

Consider alternatives when:

  1. API Responses: If returning JSON to HTTP clients, use standard JSON
  2. Data Contracts: When strict JSON schema validation is required
  3. Small Objects: Objects under 100 tokens may not benefit significantly
  4. Real-Time APIs: For REST APIs serving JSON, standard formatting is better
  5. Database Storage: Databases expect standard JSON format

Best Use Cases:

  • ✅ LLM prompt optimization
  • ✅ Reducing OpenAI/Anthropic API costs
  • ✅ Processing large datasets for AI
  • ✅ Document summarization workflows
  • ✅ OCR and image processing pipelines
  • ✅ LangChain integrations

Benchmarks

Token Reduction

Object Type Original Tokens Brevit Tokens Reduction
Simple Dict 45 28 38%
Complex Dict 234 127 46%
Nested Lists 156 89 43%
API Response 312 178 43%

Performance

Operation Objects/sec Avg Latency Memory
Flatten (1KB) 1,600 0.6ms 2.1MB
Flatten (10KB) 380 2.6ms 8.5MB
Flatten (100KB) 48 21ms 45MB

Benchmarks: Python 3.11, Intel i7-12700K, asyncio

Playgrounds

Interactive Playground

# Clone and run
git clone https://github.com/JavianDev/Brevit.git
cd Brevit/Brevit.py
pip install -e .
python playground.py

Online Playground

CLI

Installation

pip install brevit-cli

Usage

# Optimize a JSON file
brevit optimize input.json -o output.txt

# Optimize from stdin
cat data.json | brevit optimize

# Optimize with custom config
brevit optimize input.json --mode flatten --threshold 1000

# Help
brevit --help

Examples

# Flatten JSON
brevit optimize order.json --mode flatten

# Convert to YAML
brevit optimize data.json --mode yaml

# Filter paths
brevit optimize data.json --mode filter --paths "user.name,order.id"

Format Overview

Flattened Format (Hybrid Optimization)

Brevit intelligently converts Python dictionaries to flat key-value pairs with automatic tabular optimization:

Input:

order = {
    "orderId": "o-456",
    "friends": ["ana", "luis", "sam"],
    "items": [
        {"sku": "A-88", "quantity": 1},
        {"sku": "T-22", "quantity": 2}
    ]
}

Output (with tabular optimization):

orderId: o-456
friends[3]: ana,luis,sam
items[2]{quantity,sku}:
  1,A-88
  2,T-22

For non-uniform arrays (fallback):

mixed = {
    "items": [
        {"sku": "A-88", "quantity": 1},
        "special-item",
        {"sku": "T-22", "quantity": 2}
    ]
}

Output (fallback to indexed format):

items[0].sku: A-88
items[0].quantity: 1
items[1]: special-item
items[2].sku: T-22
items[2].quantity: 2

Key Features

  • Dictionary Keys: Uses Python dictionary keys as-is
  • Nested Dicts: Dot notation for nested dictionaries
  • Tabular Arrays: Uniform object arrays automatically formatted in compact tabular format (items[2]{field1,field2}:)
  • Primitive Arrays: Comma-separated format (friends[3]: ana,luis,sam)
  • Hybrid Approach: Automatically detects optimal format, falls back to indexed format for mixed data
  • None Handling: Explicit None values
  • Type Preservation: Numbers, booleans preserved as strings

API

BrevitClient

Main client class for optimization.

class BrevitClient:
    def __init__(
        self,
        config: BrevitConfig,
        text_optimizer: Optional[ITextOptimizer] = None,
        image_optimizer: Optional[IImageOptimizer] = None,
    ):
    
    # Automatic optimization - analyzes data and selects best strategy
    async def brevity(self, raw_data: Any, intent: Optional[str] = None) -> str:
    
    # Explicit optimization with configured settings
    async def optimize(self, raw_data: Any, intent: Optional[str] = None) -> str:
    
    # Register custom optimization strategy
    def register_strategy(self, name: str, analyzer: Any, optimizer: Any) -> None:

Example - Automatic Optimization:

# Automatically analyzes data structure and selects best strategy
optimized = await brevit.brevity(order)
# Automatically detects uniform arrays, long text, etc.

Example - Explicit Optimization:

# Use explicit configuration
optimized = await brevit.optimize(order, "extract_total")

Example - Custom Strategy:

# Register custom optimization strategy
brevit.register_strategy('custom', custom_analyzer, custom_optimizer)

BrevitConfig

Configuration dataclass for BrevitClient.

@dataclass
class BrevitConfig:
    json_mode: JsonOptimizationMode = JsonOptimizationMode.Flatten
    text_mode: TextOptimizationMode = TextOptimizationMode.Clean
    image_mode: ImageOptimizationMode = ImageOptimizationMode.Ocr
    json_paths_to_keep: List[str] = field(default_factory=list)
    long_text_threshold: int = 500

Enums

JsonOptimizationMode

  • NONE - No optimization
  • Flatten - Flatten to key-value pairs (default)
  • ToYaml - Convert to YAML
  • Filter - Keep only specified paths

TextOptimizationMode

  • NONE - No optimization
  • Clean - Remove boilerplate
  • SummarizeFast - Fast summarization
  • SummarizeHighQuality - High-quality summarization

ImageOptimizationMode

  • NONE - Skip processing
  • Ocr - Extract text via OCR
  • Metadata - Extract metadata only

Using Brevit.py in LLM Prompts

Best Practices

  1. Context First: Provide context before optimized data
  2. Clear Instructions: Tell the LLM what format to expect
  3. Examples: Include format examples in prompts

Example Prompt Template

optimized = await brevit.optimize(order)

prompt = f"""You are analyzing order data. The data is in Brevit flattened format:

Context:
{optimized}

Task: Extract the order total and shipping address.

Format your response as JSON with keys: total, address"""

Real-World Example

async def analyze_order(order: dict):
    optimized = await brevit.optimize(order)
    
    prompt = f"""Analyze this order:

{optimized}

Questions:
1. What is the order total?
2. How many items?
3. Average item price?

Respond in JSON."""
    
    # Call OpenAI API
    response = await openai.ChatCompletion.acreate(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}]
    )
    
    return response.choices[0].message.content

Syntax Cheatsheet

Python to Brevit Format

Python Structure Brevit Format Example
Dictionary key key: value orderId: o-456
Nested key parent.child: value customer.name: John
Primitive list list[count]: val1,val2,val3 friends[3]: ana,luis,sam
Uniform object list list[count]{field1,field2}:
val1,val2
val3,val4
items[2]{sku,qty}:
A-88,1
T-22,2
List element (fallback) list[index].key: value items[0].sku: A-88
Nested list parent[index].child[index] orders[0].items[1].sku
None value key: None phone: None
Boolean key: True isActive: True
Number key: 123 quantity: 5

Special Cases

  • Empty Lists: items: []items: []
  • Empty Dicts: metadata: {}metadata: {}
  • None: Explicit None values
  • Datetime: Converted to ISO string
  • Tabular Arrays: Automatically detected when all dicts have same keys
  • Primitive Arrays: Automatically detected when all elements are primitives

Other Implementations

Brevit is available in multiple languages:

Language Package Status
Python brevit ✅ Stable (This)
C# (.NET) Brevit ✅ Stable
JavaScript brevit ✅ Stable

Full Specification

Format Specification

  1. Key-Value Pairs: One pair per line
  2. Separator: : (colon + space)
  3. Key Format: Dictionary keys with dot/bracket notation
  4. Value Format: String representation of values
  5. Line Endings: \n (newline)

Grammar

brevit := line*
line := key ": " value "\n"
key := identifier ("." identifier | "[" number "]")*
value := string | number | boolean | None
identifier := [a-zA-Z_][a-zA-Z0-9_]*

Examples

Simple Dict:

orderId: o-456
status: SHIPPED

Nested Dict:

customer.name: John Doe
customer.email: john@example.com

List:

items[0].sku: A-88
items[0].quantity: 1
items[1].sku: T-22
items[1].quantity: 2

Complex Structure:

orderId: o-456
customer.name: John Doe
items[0].sku: A-88
items[0].price: 29.99
items[1].sku: T-22
items[1].price: 39.99
shipping.address.street: 123 Main St
shipping.address.city: Toronto

Performance Considerations

  • Flatten Mode: Reduces token count by 40-60% compared to standard JSON
  • Async/Await: All operations are asynchronous for better scalability
  • Memory Efficient: Processes data in-place where possible
  • Type Hints: Full type annotations for better performance with type checkers

Best Practices

  1. Use Async/Await: Always use await when calling optimize()
  2. Implement Custom Optimizers: Replace default stubs with real LLM integrations
  3. Configure Thresholds: Adjust long_text_threshold based on your use case
  4. Monitor Token Usage: Track token counts before/after optimization
  5. Error Handling: Wrap optimize calls in try-except blocks
  6. Use Type Hints: Leverage type hints for better IDE support

Troubleshooting

Issue: "ToYaml mode requires 'pip install PyYAML'"

Solution: Install PyYAML: pip install PyYAML and update the code as shown in Advanced Usage.

Issue: Text summarization returns stub

Solution: Implement a custom ITextOptimizer using LangChain, Semantic Kernel, or your LLM service (see Advanced Usage).

Issue: Image OCR returns stub

Solution: Implement a custom IImageOptimizer using Azure AI Vision, Tesseract, or your OCR service (see Advanced Usage).

Issue: "Filter mode is not implemented"

Solution: Install jsonpath-ng and implement JSON path filtering logic.

Contributing

Contributions are welcome! Please read our contributing guidelines and submit pull requests to our GitHub repository.

License

MIT License - see LICENSE file for details.

Support

Version History

  • 0.1.0 (Current): Initial release with core optimization features

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

brevit-0.1.2.tar.gz (29.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

brevit-0.1.2-py3-none-any.whl (11.7 kB view details)

Uploaded Python 3

File details

Details for the file brevit-0.1.2.tar.gz.

File metadata

  • Download URL: brevit-0.1.2.tar.gz
  • Upload date:
  • Size: 29.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for brevit-0.1.2.tar.gz
Algorithm Hash digest
SHA256 f51a4d4f456ab9dff5d99af1daa808b57d2cd1f28652c1d343004ebca3c7d1c8
MD5 aeacbdf2c68ed69f0817372e4281a246
BLAKE2b-256 9e8b81c6f2dd493368d63807153a66a77be761f409122dbe9bfbba5797a114f2

See more details on using hashes here.

File details

Details for the file brevit-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: brevit-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 11.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for brevit-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 34538fb2791e3e6abef2e35b30f2ed4f5929332408c92c489d123a3285f1af7a
MD5 75f27bc192c1b79943424d47325886c1
BLAKE2b-256 dc7d8134e8e1221cd8ff6cd787841282fa0d2a8ce4d857bc57decb62615ceee3

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page