A high-performance Python library for semantically compressing and optimizing data before sending it to a Large Language Model (LLM).
Project description
Brevit.py
A high-performance Python library for semantically compressing and optimizing data before sending it to a Large Language Model (LLM). Dramatically reduce token costs while maintaining data integrity and readability.
Table of Contents
- Why Brevit.py?
- Key Features
- When Not to Use Brevit.py
- Benchmarks
- Installation & Quick Start
- Playgrounds
- CLI
- Format Overview
- API
- Using Brevit.py in LLM Prompts
- Syntax Cheatsheet
- Other Implementations
- Full Specification
Why Brevit.py?
Python-Specific Advantages
- Async/Await: Built with modern Python async/await patterns
- Type Hints: Full type annotations for better IDE support
- LangChain Integration: Ready for LangChain workflows
- FastAPI/Flask Compatible: Works seamlessly with popular web frameworks
- Pydantic Support: Integrates with Pydantic models
Performance Benefits
- 40-60% Token Reduction: Dramatically reduce LLM API costs
- Async Operations: Non-blocking I/O for better concurrency
- Memory Efficient: Processes data in-place where possible
- Fast Execution: Optimized algorithms for minimal overhead
Example Cost Savings
# Before: 234 tokens = $0.000468 per request
json_str = json.dumps(complex_order)
# After: 127 tokens = $0.000254 per request (46% reduction)
optimized = await brevit.brevity(complex_order) # Automatic optimization
# Or with explicit configuration
explicit = await brevit.optimize(complex_order)
# Savings: $0.000214 per request
# At 1M requests/month: $214/month savings
Automatic Strategy Selection
Brevit.py now includes the .brevity() method that automatically analyzes your data and selects the optimal optimization strategy:
data = {
"friends": ["ana", "luis", "sam"],
"hikes": [
{"id": 1, "name": "Blue Lake Trail", "distanceKm": 7.5},
{"id": 2, "name": "Ridge Overlook", "distanceKm": 9.2}
]
}
# Automatically detects uniform arrays and applies tabular format
optimized = await brevit.brevity(data)
# No configuration needed - Brevit analyzes and optimizes automatically!
Key Features
- JSON Optimization: Flatten nested JSON structures into token-efficient key-value pairs
- Text Optimization: Clean and summarize long text documents
- Image Optimization: Extract text from images via OCR
- Async/Await: Built with modern Python async/await patterns
- Extensible: Plugin architecture for custom optimizers
- Lightweight: Minimal dependencies, high performance
- Type Hints: Full type annotations for better IDE support
Installation
Prerequisites
- Python 3.8 or later
- pip or poetry
Install via pip
pip install brevit
Install from Source
- Clone the repository:
git clone https://github.com/JavianDev/Brevit.py.git
cd Brevit.py
- Install in development mode:
pip install -e .
Optional Dependencies
For YAML support:
pip install brevit[yaml]
# or
pip install PyYAML
For JSON path filtering:
pip install brevit[jsonpath]
# or
pip install jsonpath-ng
Quick Start
Basic Usage
from brevit import BrevitClient, BrevitConfig, JsonOptimizationMode
import asyncio
async def main():
# 1. Create configuration
config = BrevitConfig(
json_mode=JsonOptimizationMode.Flatten,
text_mode=TextOptimizationMode.Clean,
image_mode=ImageOptimizationMode.Ocr,
long_text_threshold=1000 # Summarize text over 1000 chars
)
# 2. Create client
brevit = BrevitClient(config)
# 3. Optimize data
order = {
"orderId": "o-456",
"status": "SHIPPED",
"items": [
{"sku": "A-88", "name": "Brevit Pro License", "quantity": 1}
]
}
optimized = await brevit.optimize(order)
# Result (with abbreviations enabled by default):
# "@o=order\n@o.orderId:o-456\n@o.status:SHIPPED\n@o.items[1]{name,quantity,sku}:\nBrevit Pro License,1,A-88"
print(optimized)
asyncio.run(main())
Abbreviation Feature (New in v0.1.2)
Brevit automatically creates abbreviations for frequently repeated prefixes, reducing token usage by 10-25%:
from brevit import BrevitClient, BrevitConfig, JsonOptimizationMode
config = BrevitConfig(
json_mode=JsonOptimizationMode.Flatten,
enable_abbreviations=True, # Enabled by default
abbreviation_threshold=2 # Minimum occurrences to abbreviate
)
brevit = BrevitClient(config)
data = {
"user": {
"name": "John Doe",
"email": "john@example.com",
"age": 30
},
"order": {
"id": "o-456",
"status": "SHIPPED"
}
}
optimized = await brevit.brevity(data)
# Output with abbreviations:
# @u=user
# @o=order
# @u.name:John Doe
# @u.email:john@example.com
# @u.age:30
# @o.id:o-456
# @o.status:SHIPPED
Token Savings: The abbreviation feature reduces tokens by replacing repeated prefixes like "user." and "order." with short aliases like "@u" and "@o", saving 10-25% on typical nested JSON structures.
Complete Usage Examples
Brevit.py supports three main data types: JSON objects/strings, text files/strings, and images. Here's how to use each:
1. JSON Optimization Examples
Example 1.1: Simple JSON Object
from brevit import BrevitClient, BrevitConfig, JsonOptimizationMode
brevit = BrevitClient(BrevitConfig(json_mode=JsonOptimizationMode.Flatten))
data = {
"user": {
"name": "John Doe",
"email": "john@example.com",
"age": 30
}
}
# Method 1: Automatic optimization (recommended)
optimized = await brevit.brevity(data)
# Output:
# user.name:John Doe
# user.email:john@example.com
# user.age:30
# Method 2: Explicit optimization
explicit = await brevit.optimize(data)
Example 1.2: JSON String
json_string = '{"order": {"id": "o-456", "status": "SHIPPED"}}'
# Brevit automatically detects JSON strings
optimized = await brevit.brevity(json_string)
# Output:
# order.id:o-456
# order.status:SHIPPED
Example 1.3: Complex Nested JSON with Arrays
complex_data = {
"context": {
"task": "Our favorite hikes together",
"location": "Boulder",
"season": "spring_2025"
},
"friends": ["ana", "luis", "sam"],
"hikes": [
{
"id": 1,
"name": "Blue Lake Trail",
"distanceKm": 7.5,
"elevationGain": 320,
"companion": "ana",
"wasSunny": True
},
{
"id": 2,
"name": "Ridge Overlook",
"distanceKm": 9.2,
"elevationGain": 540,
"companion": "luis",
"wasSunny": False
}
]
}
optimized = await brevit.brevity(complex_data)
# Output:
# context.task:Our favorite hikes together
# context.location:Boulder
# context.season:spring_2025
# friends[3]:ana,luis,sam
# hikes[2]{companion,distanceKm,elevationGain,id,name,wasSunny}:
# ana,7.5,320,1,Blue Lake Trail,true
# luis,9.2,540,2,Ridge Overlook,false
Example 1.4: Different JSON Optimization Modes
# Flatten Mode (Default)
flatten_config = BrevitConfig(json_mode=JsonOptimizationMode.Flatten)
# Converts nested JSON to flat key-value pairs
# YAML Mode
yaml_config = BrevitConfig(json_mode=JsonOptimizationMode.ToYaml)
# Converts JSON to YAML format (requires PyYAML)
# Filter Mode
filter_config = BrevitConfig(
json_mode=JsonOptimizationMode.Filter,
json_paths_to_keep=["user.name", "order.id"]
)
# Keeps only specified paths, removes everything else
2. Text Optimization Examples
Example 2.1: Long Text String
long_text = "This is a very long document..." * 100
config = BrevitConfig(
json_mode=JsonOptimizationMode.None,
text_mode=TextOptimizationMode.Clean,
long_text_threshold=500
)
brevit = BrevitClient(config)
# Automatic detection
optimized = await brevit.brevity(long_text)
# Explicit text optimization
cleaned = await brevit.optimize(long_text)
Example 2.2: Reading Text from File
# Read text file
with open('document.txt', 'r', encoding='utf-8') as f:
text_content = f.read()
# Optimize the text
optimized = await brevit.brevity(text_content)
Example 2.3: Text Optimization Modes
# Clean Mode (Remove Boilerplate)
clean_config = BrevitConfig(text_mode=TextOptimizationMode.Clean)
# Removes signatures, headers, repetitive content
# Summarize Fast
fast_config = BrevitConfig(text_mode=TextOptimizationMode.SummarizeFast)
# Fast summarization (requires custom text optimizer implementation)
# Summarize High Quality
quality_config = BrevitConfig(text_mode=TextOptimizationMode.SummarizeHighQuality)
# High-quality summarization (requires custom text optimizer with LLM integration)
3. Image Optimization Examples
Example 3.1: Image from File (OCR)
# Read image file
with open('receipt.jpg', 'rb') as f:
image_bytes = f.read()
# Brevit automatically detects bytes as image data
extracted_text = await brevit.brevity(image_bytes)
# Output: OCR-extracted text from the image
Example 3.2: Image from URL
import requests
# Fetch image from URL
response = requests.get('https://example.com/invoice.png')
image_bytes = response.content
# Optimize image
extracted_text = await brevit.brevity(image_bytes)
Example 3.3: Image Optimization Modes
# OCR Mode (Extract Text)
ocr_config = BrevitConfig(image_mode=ImageOptimizationMode.Ocr)
# Extracts text from images using OCR (requires custom image optimizer)
# Metadata Mode
metadata_config = BrevitConfig(image_mode=ImageOptimizationMode.Metadata)
# Extracts only image metadata (dimensions, format, etc.)
4. Method Comparison: .brevity() vs .optimize()
.brevity() - Automatic Strategy Selection
Use when: You want Brevit to automatically analyze and select the best optimization strategy.
# Automatically detects data type and applies optimal strategy
result = await brevit.brevity(data)
# - JSON objects → Flatten with tabular optimization
# - Long text → Text optimization
# - Images → OCR extraction
Advantages:
- Zero configuration needed
- Intelligent strategy selection
- Works with any data type
- Best for general-purpose use
.optimize() - Explicit Configuration
Use when: You want explicit control over optimization mode.
config = BrevitConfig(
json_mode=JsonOptimizationMode.Flatten,
text_mode=TextOptimizationMode.Clean,
image_mode=ImageOptimizationMode.Ocr
)
brevit = BrevitClient(config)
# Uses explicit configuration
result = await brevit.optimize(data)
Advantages:
- Full control over optimization
- Predictable behavior
- Best for specific use cases
5. Custom Optimizers
You can provide custom optimizers for text and images:
# Custom text optimizer
class CustomTextOptimizer:
async def optimize_text(self, text: str, config: BrevitConfig) -> str:
# Call your summarization service
return await summarize_service.summarize(text)
# Custom image optimizer
class CustomImageOptimizer:
async def optimize_image(self, image_data: bytes, config: BrevitConfig) -> str:
# Call your OCR service (e.g., Azure AI Vision, Tesseract)
return await ocr_service.extract_text(image_data)
brevit = BrevitClient(
config,
text_optimizer=CustomTextOptimizer(),
image_optimizer=CustomImageOptimizer()
)
6. Complete Workflow Examples
Example 6.1: E-Commerce Order Processing
# Step 1: Optimize order JSON
order = {
"orderId": "o-456",
"customer": {"name": "John", "email": "john@example.com"},
"items": [
{"sku": "A-88", "quantity": 2, "price": 29.99},
{"sku": "B-22", "quantity": 1, "price": 49.99}
]
}
optimized_order = await brevit.brevity(order)
# Step 2: Send to LLM
prompt = f"Analyze this order:\n\n{optimized_order}\n\nExtract total amount."
# Send prompt to OpenAI, Anthropic, etc.
Example 6.2: Document Processing Pipeline
# Step 1: Read and optimize text document
with open('contract.txt', 'r') as f:
contract_text = f.read()
optimized_text = await brevit.brevity(contract_text)
# Step 2: Process with LLM
prompt = f"Summarize this contract:\n\n{optimized_text}"
# Send to LLM for summarization
Example 6.3: Receipt OCR Pipeline
# Step 1: Read receipt image
with open('receipt.jpg', 'rb') as f:
receipt_image = f.read()
# Step 2: Extract text via OCR
extracted_text = await brevit.brevity(receipt_image)
# Step 3: Optimize extracted text (if it's long)
optimized = await brevit.brevity(extracted_text)
# Step 4: Send to LLM for analysis
prompt = f"Extract items and total from this receipt:\n\n{optimized}"
# Send to LLM
Flask/FastAPI Example
from flask import Flask, request, jsonify
from brevit import BrevitClient, BrevitConfig, JsonOptimizationMode
app = Flask(__name__)
# Initialize Brevit client
config = BrevitConfig(json_mode=JsonOptimizationMode.Flatten)
brevit = BrevitClient(config)
@app.route('/optimize', methods=['POST'])
async def optimize_data():
data = request.json
# Optimize the data
optimized = await brevit.optimize(data)
# Send to LLM API
prompt = f"Context:\n{optimized}\n\nTask: Summarize the data."
# response = await call_llm_api(prompt)
return jsonify({"optimized": optimized, "prompt": prompt})
if __name__ == '__main__':
app.run()
FastAPI Example
from fastapi import FastAPI
from brevit import BrevitClient, BrevitConfig, JsonOptimizationMode
from pydantic import BaseModel
app = FastAPI()
config = BrevitConfig(json_mode=JsonOptimizationMode.Flatten)
brevit = BrevitClient(config)
class OrderData(BaseModel):
orderId: str
status: str
items: list
@app.post("/optimize")
async def optimize_order(order: OrderData):
optimized = await brevit.optimize(order.dict())
return {"optimized": optimized}
Configuration Options
BrevitConfig
config = BrevitConfig(
json_mode=JsonOptimizationMode.Flatten, # JSON optimization strategy
text_mode=TextOptimizationMode.Clean, # Text optimization strategy
image_mode=ImageOptimizationMode.Ocr, # Image optimization strategy
json_paths_to_keep=[], # Paths to keep for Filter mode
long_text_threshold=500 # Character threshold for text optimization
)
JsonOptimizationMode
- NONE: No optimization, pass JSON as-is
- Flatten: Convert nested JSON to flat key-value pairs (most token-efficient)
- ToYaml: Convert JSON to YAML format (requires PyYAML)
- Filter: Keep only specified JSON paths
TextOptimizationMode
- NONE: No optimization
- Clean: Remove boilerplate and excessive whitespace
- SummarizeFast: Use a fast model for summarization (requires custom ITextOptimizer)
- SummarizeHighQuality: Use a high-quality model for summarization (requires custom ITextOptimizer)
ImageOptimizationMode
- NONE: Skip image processing
- Ocr: Extract text from images (requires custom IImageOptimizer)
- Metadata: Extract basic metadata only
Advanced Usage
Custom Text Optimizer
Implement ITextOptimizer to use LangChain or your own LLM service:
from brevit import ITextOptimizer, BrevitConfig
from langchain.llms import OpenAI
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
class LangChainTextOptimizer:
def __init__(self):
self.llm = OpenAI(temperature=0)
self.prompt = PromptTemplate(
input_variables=["text"],
template="Summarize the following text: {text}"
)
self.chain = LLMChain(llm=self.llm, prompt=self.prompt)
async def optimize_text(self, long_text: str, config: BrevitConfig) -> str:
result = await self.chain.arun(text=long_text)
return result
# Use custom optimizer
config = BrevitConfig(text_mode=TextOptimizationMode.SummarizeFast)
brevit = BrevitClient(config, text_optimizer=LangChainTextOptimizer())
Custom Image Optimizer
Implement IImageOptimizer to use Azure AI Vision or Tesseract:
from brevit import IImageOptimizer, BrevitConfig
from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.core.credentials import AzureKeyCredential
class AzureVisionImageOptimizer:
def __init__(self, endpoint: str, key: str):
self.client = ImageAnalysisClient(
endpoint=endpoint,
credential=AzureKeyCredential(key)
)
async def optimize_image(self, image_data: bytes, config: BrevitConfig) -> str:
result = self.client.analyze(
image_data=image_data,
visual_features=["read"]
)
return result.read.text
# Use custom optimizer
config = BrevitConfig(image_mode=ImageOptimizationMode.Ocr)
brevit = BrevitClient(
config,
image_optimizer=AzureVisionImageOptimizer(endpoint="...", key="...")
)
Using Tesseract OCR
from brevit import IImageOptimizer, BrevitConfig
from PIL import Image
import pytesseract
import io
class TesseractImageOptimizer:
async def optimize_image(self, image_data: bytes, config: BrevitConfig) -> str:
image = Image.open(io.BytesIO(image_data))
text = pytesseract.image_to_string(image)
return text
config = BrevitConfig(image_mode=ImageOptimizationMode.Ocr)
brevit = BrevitClient(config, image_optimizer=TesseractImageOptimizer())
YAML Mode
To use YAML mode, install PyYAML:
pip install PyYAML
Then update the ToYaml case in brevit.py:
import yaml
# In the optimize method:
elif mode == JsonOptimizationMode.ToYaml:
return yaml.dump(input_object)
Filter Mode
Use Filter mode to keep only specific JSON paths:
config = BrevitConfig(
json_mode=JsonOptimizationMode.Filter,
json_paths_to_keep=[
"user.name",
"order.orderId",
"order.items[*].sku"
]
)
Examples
Example 1: Optimize Complex Object
user = {
"id": "u-123",
"name": "Javian",
"isActive": True,
"contact": {
"email": "support@javianpicardo.com",
"phone": None
},
"orders": [
{"orderId": "o-456", "status": "SHIPPED"}
]
}
optimized = await brevit.optimize(user)
# Output:
# id: u-123
# name: Javian
# isActive: True
# contact.email: support@javianpicardo.com
# contact.phone: None
# orders[0].orderId: o-456
# orders[0].status: SHIPPED
Example 2: Optimize JSON String
json_str = """{
"order": {
"orderId": "o-456",
"status": "SHIPPED",
"items": [
{"sku": "A-88", "name": "Brevit Pro", "quantity": 1}
]
}
}"""
optimized = await brevit.optimize(json_str)
Example 3: Process Long Text
with open("document.txt", "r") as f:
long_document = f.read()
optimized = await brevit.optimize(long_document)
# Will trigger text optimization if length > long_text_threshold
Example 4: Process Image
with open("receipt.jpg", "rb") as f:
image_data = f.read()
optimized = await brevit.optimize(image_data)
# Will trigger image optimization
When Not to Use Brevit.py
Consider alternatives when:
- API Responses: If returning JSON to HTTP clients, use standard JSON
- Data Contracts: When strict JSON schema validation is required
- Small Objects: Objects under 100 tokens may not benefit significantly
- Real-Time APIs: For REST APIs serving JSON, standard formatting is better
- Database Storage: Databases expect standard JSON format
Best Use Cases:
- ✅ LLM prompt optimization
- ✅ Reducing OpenAI/Anthropic API costs
- ✅ Processing large datasets for AI
- ✅ Document summarization workflows
- ✅ OCR and image processing pipelines
- ✅ LangChain integrations
Benchmarks
Token Reduction
| Object Type | Original Tokens | Brevit Tokens | Reduction |
|---|---|---|---|
| Simple Dict | 45 | 28 | 38% |
| Complex Dict | 234 | 127 | 46% |
| Nested Lists | 156 | 89 | 43% |
| API Response | 312 | 178 | 43% |
Performance
| Operation | Objects/sec | Avg Latency | Memory |
|---|---|---|---|
| Flatten (1KB) | 1,600 | 0.6ms | 2.1MB |
| Flatten (10KB) | 380 | 2.6ms | 8.5MB |
| Flatten (100KB) | 48 | 21ms | 45MB |
Benchmarks: Python 3.11, Intel i7-12700K, asyncio
Playgrounds
Interactive Playground
# Clone and run
git clone https://github.com/JavianDev/Brevit.git
cd Brevit/Brevit.py
pip install -e .
python playground.py
Online Playground
- Web Playground: https://brevit.dev/playground (Coming Soon)
- Replit: https://replit.com/@brevit/playground (Coming Soon)
- Colab: https://colab.research.google.com/brevit (Coming Soon)
CLI
Installation
pip install brevit-cli
Usage
# Optimize a JSON file
brevit optimize input.json -o output.txt
# Optimize from stdin
cat data.json | brevit optimize
# Optimize with custom config
brevit optimize input.json --mode flatten --threshold 1000
# Help
brevit --help
Examples
# Flatten JSON
brevit optimize order.json --mode flatten
# Convert to YAML
brevit optimize data.json --mode yaml
# Filter paths
brevit optimize data.json --mode filter --paths "user.name,order.id"
Format Overview
Flattened Format (Hybrid Optimization)
Brevit intelligently converts Python dictionaries to flat key-value pairs with automatic tabular optimization:
Input:
order = {
"orderId": "o-456",
"friends": ["ana", "luis", "sam"],
"items": [
{"sku": "A-88", "quantity": 1},
{"sku": "T-22", "quantity": 2}
]
}
Output (with tabular optimization):
orderId: o-456
friends[3]: ana,luis,sam
items[2]{quantity,sku}:
1,A-88
2,T-22
For non-uniform arrays (fallback):
mixed = {
"items": [
{"sku": "A-88", "quantity": 1},
"special-item",
{"sku": "T-22", "quantity": 2}
]
}
Output (fallback to indexed format):
items[0].sku: A-88
items[0].quantity: 1
items[1]: special-item
items[2].sku: T-22
items[2].quantity: 2
Key Features
- Dictionary Keys: Uses Python dictionary keys as-is
- Nested Dicts: Dot notation for nested dictionaries
- Tabular Arrays: Uniform object arrays automatically formatted in compact tabular format (
items[2]{field1,field2}:) - Primitive Arrays: Comma-separated format (
friends[3]: ana,luis,sam) - Hybrid Approach: Automatically detects optimal format, falls back to indexed format for mixed data
- None Handling: Explicit
Nonevalues - Type Preservation: Numbers, booleans preserved as strings
API
BrevitClient
Main client class for optimization.
class BrevitClient:
def __init__(
self,
config: BrevitConfig,
text_optimizer: Optional[ITextOptimizer] = None,
image_optimizer: Optional[IImageOptimizer] = None,
):
# Automatic optimization - analyzes data and selects best strategy
async def brevity(self, raw_data: Any, intent: Optional[str] = None) -> str:
# Explicit optimization with configured settings
async def optimize(self, raw_data: Any, intent: Optional[str] = None) -> str:
# Register custom optimization strategy
def register_strategy(self, name: str, analyzer: Any, optimizer: Any) -> None:
Example - Automatic Optimization:
# Automatically analyzes data structure and selects best strategy
optimized = await brevit.brevity(order)
# Automatically detects uniform arrays, long text, etc.
Example - Explicit Optimization:
# Use explicit configuration
optimized = await brevit.optimize(order, "extract_total")
Example - Custom Strategy:
# Register custom optimization strategy
brevit.register_strategy('custom', custom_analyzer, custom_optimizer)
BrevitConfig
Configuration dataclass for BrevitClient.
@dataclass
class BrevitConfig:
json_mode: JsonOptimizationMode = JsonOptimizationMode.Flatten
text_mode: TextOptimizationMode = TextOptimizationMode.Clean
image_mode: ImageOptimizationMode = ImageOptimizationMode.Ocr
json_paths_to_keep: List[str] = field(default_factory=list)
long_text_threshold: int = 500
Enums
JsonOptimizationMode
NONE- No optimizationFlatten- Flatten to key-value pairs (default)ToYaml- Convert to YAMLFilter- Keep only specified paths
TextOptimizationMode
NONE- No optimizationClean- Remove boilerplateSummarizeFast- Fast summarizationSummarizeHighQuality- High-quality summarization
ImageOptimizationMode
NONE- Skip processingOcr- Extract text via OCRMetadata- Extract metadata only
Using Brevit.py in LLM Prompts
Best Practices
- Context First: Provide context before optimized data
- Clear Instructions: Tell the LLM what format to expect
- Examples: Include format examples in prompts
Example Prompt Template
optimized = await brevit.optimize(order)
prompt = f"""You are analyzing order data. The data is in Brevit flattened format:
Context:
{optimized}
Task: Extract the order total and shipping address.
Format your response as JSON with keys: total, address"""
Real-World Example
async def analyze_order(order: dict):
optimized = await brevit.optimize(order)
prompt = f"""Analyze this order:
{optimized}
Questions:
1. What is the order total?
2. How many items?
3. Average item price?
Respond in JSON."""
# Call OpenAI API
response = await openai.ChatCompletion.acreate(
model="gpt-4",
messages=[{"role": "user", "content": prompt}]
)
return response.choices[0].message.content
Syntax Cheatsheet
Python to Brevit Format
| Python Structure | Brevit Format | Example |
|---|---|---|
| Dictionary key | key: value |
orderId: o-456 |
| Nested key | parent.child: value |
customer.name: John |
| Primitive list | list[count]: val1,val2,val3 |
friends[3]: ana,luis,sam |
| Uniform object list | list[count]{field1,field2}: val1,val2 val3,val4 |
items[2]{sku,qty}: A-88,1 T-22,2 |
| List element (fallback) | list[index].key: value |
items[0].sku: A-88 |
| Nested list | parent[index].child[index] |
orders[0].items[1].sku |
| None value | key: None |
phone: None |
| Boolean | key: True |
isActive: True |
| Number | key: 123 |
quantity: 5 |
Special Cases
- Empty Lists:
items: []→items: [] - Empty Dicts:
metadata: {}→metadata: {} - None: Explicit
Nonevalues - Datetime: Converted to ISO string
- Tabular Arrays: Automatically detected when all dicts have same keys
- Primitive Arrays: Automatically detected when all elements are primitives
Other Implementations
Brevit is available in multiple languages:
| Language | Package | Status |
|---|---|---|
| Python | brevit |
✅ Stable (This) |
| C# (.NET) | Brevit |
✅ Stable |
| JavaScript | brevit |
✅ Stable |
Full Specification
Format Specification
- Key-Value Pairs: One pair per line
- Separator:
:(colon + space) - Key Format: Dictionary keys with dot/bracket notation
- Value Format: String representation of values
- Line Endings:
\n(newline)
Grammar
brevit := line*
line := key ": " value "\n"
key := identifier ("." identifier | "[" number "]")*
value := string | number | boolean | None
identifier := [a-zA-Z_][a-zA-Z0-9_]*
Examples
Simple Dict:
orderId: o-456
status: SHIPPED
Nested Dict:
customer.name: John Doe
customer.email: john@example.com
List:
items[0].sku: A-88
items[0].quantity: 1
items[1].sku: T-22
items[1].quantity: 2
Complex Structure:
orderId: o-456
customer.name: John Doe
items[0].sku: A-88
items[0].price: 29.99
items[1].sku: T-22
items[1].price: 39.99
shipping.address.street: 123 Main St
shipping.address.city: Toronto
Performance Considerations
- Flatten Mode: Reduces token count by 40-60% compared to standard JSON
- Async/Await: All operations are asynchronous for better scalability
- Memory Efficient: Processes data in-place where possible
- Type Hints: Full type annotations for better performance with type checkers
Best Practices
- Use Async/Await: Always use
awaitwhen callingoptimize() - Implement Custom Optimizers: Replace default stubs with real LLM integrations
- Configure Thresholds: Adjust
long_text_thresholdbased on your use case - Monitor Token Usage: Track token counts before/after optimization
- Error Handling: Wrap optimize calls in try-except blocks
- Use Type Hints: Leverage type hints for better IDE support
Troubleshooting
Issue: "ToYaml mode requires 'pip install PyYAML'"
Solution: Install PyYAML: pip install PyYAML and update the code as shown in Advanced Usage.
Issue: Text summarization returns stub
Solution: Implement a custom ITextOptimizer using LangChain, Semantic Kernel, or your LLM service (see Advanced Usage).
Issue: Image OCR returns stub
Solution: Implement a custom IImageOptimizer using Azure AI Vision, Tesseract, or your OCR service (see Advanced Usage).
Issue: "Filter mode is not implemented"
Solution: Install jsonpath-ng and implement JSON path filtering logic.
Contributing
Contributions are welcome! Please read our contributing guidelines and submit pull requests to our GitHub repository.
License
MIT License - see LICENSE file for details.
Support
- Documentation: https://brevit.dev/docs
- Issues: https://github.com/JavianDev/Brevit.py/issues
- Email: support@javianpicardo.com
Version History
- 0.1.0 (Current): Initial release with core optimization features
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file brevit-0.1.2.tar.gz.
File metadata
- Download URL: brevit-0.1.2.tar.gz
- Upload date:
- Size: 29.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
f51a4d4f456ab9dff5d99af1daa808b57d2cd1f28652c1d343004ebca3c7d1c8
|
|
| MD5 |
aeacbdf2c68ed69f0817372e4281a246
|
|
| BLAKE2b-256 |
9e8b81c6f2dd493368d63807153a66a77be761f409122dbe9bfbba5797a114f2
|
File details
Details for the file brevit-0.1.2-py3-none-any.whl.
File metadata
- Download URL: brevit-0.1.2-py3-none-any.whl
- Upload date:
- Size: 11.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
34538fb2791e3e6abef2e35b30f2ed4f5929332408c92c489d123a3285f1af7a
|
|
| MD5 |
75f27bc192c1b79943424d47325886c1
|
|
| BLAKE2b-256 |
dc7d8134e8e1221cd8ff6cd787841282fa0d2a8ce4d857bc57decb62615ceee3
|