High-performance binary serialization format for Python

BTOON for Python

High-performance binary serialization format for Python applications with native C++ performance.

Features

  • 🚀 High Performance - Native C++ implementation with Python bindings
  • 📦 Compact Binary Format - Smaller than JSON, faster than Pickle
  • 🗜️ Built-in Compression - ZLIB, LZ4, ZSTD, Brotli, Snappy support
  • 📊 NumPy & Pandas Integration - Efficient array and DataFrame serialization
  • 🔄 Schema Evolution - Forward and backward compatibility
  • Zero-Copy Operations - Memory-efficient for large data
  • 🐍 Pythonic API - Native Python types and async support
  • 💰 Financial Types - Decimal, Currency, and Percentage support

Installation

pip install btoon

Or using conda:

conda install -c conda-forge btoon

Quick Start

import btoon

# Encode data
data = {
    'name': 'BTOON',
    'version': '0.0.1',
    'features': ['fast', 'compact', 'typed'],
    'metrics': {
        'speed': 9000,
        'size': 0.5
    }
}

encoded = btoon.encode(data)
print(f'Encoded size: {len(encoded)} bytes')

# Decode data
decoded = btoon.decode(encoded)
print(f'Decoded: {decoded}')

Advanced Features

Compression

# Enable compression with different algorithms
compressed = btoon.encode(data, compress=True)  # Default: zlib

# Specify algorithm and level
compressed = btoon.encode(data, 
    compress=True,
    algorithm='zstd',  # 'zlib', 'lz4', 'zstd', 'brotli', 'snappy'
    level=3
)
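
The effect of the level parameter can be explored without BTOON installed. The following is a minimal sketch using only the standard library's zlib module, with JSON bytes standing in for a serialized payload (an assumption for illustration; it does not reproduce BTOON's own output or its other algorithms):

```python
import json
import zlib

# Repetitive sample payload; JSON bytes stand in for serialized data.
records = [{"id": i, "name": f"user_{i}", "active": True} for i in range(1000)]
raw = json.dumps(records).encode("utf-8")

# Compress at a fast level, the usual default, and the maximum level.
sizes = {level: len(zlib.compress(raw, level)) for level in (1, 6, 9)}

for level, size in sizes.items():
    print(f"zlib level {level}: {size} bytes ({size / len(raw):.1%} of original)")

# Round-trip check: decompression restores the exact input bytes.
restored = zlib.decompress(zlib.compress(raw, 9))
assert restored == raw
```

Higher levels generally trade encoding time for smaller output, so it is worth measuring both on representative data before picking one.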

NumPy Integration

import numpy as np
import btoon

# Encode NumPy arrays efficiently
array = np.random.rand(1000, 100)
encoded = btoon.from_numpy(array)

# Decode back to NumPy
decoded = btoon.to_numpy(encoded)
assert np.array_equal(array, decoded)

Pandas Integration

import numpy as np
import pandas as pd
import btoon

# Encode DataFrames with columnar optimization
df = pd.DataFrame({
    'id': range(1000),
    'name': [f'user_{i}' for i in range(1000)],
    'score': np.random.rand(1000)
})

encoded = btoon.from_dataframe(df)
print(f'Encoded size: {len(encoded)} bytes')

# Decode back to DataFrame
decoded_df = btoon.to_dataframe(encoded)
assert df.equals(decoded_df)

Extended Types

from btoon import Timestamp, Decimal, Currency, Percentage
from datetime import datetime

# Timestamp with nanosecond precision
ts = Timestamp.now()
ts_from_dt = Timestamp.from_datetime(datetime.now())

# Financial types with arbitrary precision
price = Decimal("19.99")
tax_rate = Percentage("8.25")
total = Currency("USD", price * (1 + tax_rate.to_decimal()))

data = {
    'timestamp': ts,
    'price': price,
    'tax_rate': tax_rate,
    'total': total
}

encoded = btoon.encode(data)
decoded = btoon.decode(encoded)
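
The arithmetic in the example above can be checked by hand. The sketch below uses the standard library's decimal module as a stand-in; whether BTOON's Decimal and Percentage types behave identically is an assumption, not something this README confirms:

```python
from decimal import Decimal, ROUND_HALF_UP

price = Decimal("19.99")
tax_percent = Decimal("8.25")

# Convert the percentage to a fraction: 8.25% -> 0.0825
tax_fraction = tax_percent / Decimal("100")

# Exact decimal product: 19.99 * 1.0825
total = price * (1 + tax_fraction)

# Round to whole cents for display.
total_rounded = total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

print(total)          # 21.639175
print(total_rounded)  # 21.64

# The same computation with binary floats is only an approximation.
print(19.99 * 1.0825)
```

Decimal arithmetic keeps every digit of the product, which is why financial values are usually carried as decimals and rounded only at the point of display.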

Async Support

import asyncio
import btoon

async def process_data():
    # Async encoding
    async with btoon.async_stream('output.btoon', 'w') as stream:
        await stream.encode({'chunk': 1})
        await stream.encode({'chunk': 2})
    
    # Async decoding
    async with btoon.async_stream('output.btoon', 'r') as stream:
        async for obj in stream:
            print(f'Decoded: {obj}')

asyncio.run(process_data())

File Operations

# Context manager for file operations
with btoon.open_btoon('data.btoon', 'w') as f:
    f.encode({'record': 1})
    f.encode({'record': 2})

with btoon.open_btoon('data.btoon', 'r') as f:
    for record in f:
        print(record)

Streaming

from btoon import StreamEncoder, StreamDecoder

# Stream encoding
encoder = StreamEncoder()
chunks = []
for i in range(100):
    chunk = encoder.encode_chunk({'id': i})
    chunks.append(chunk)
final = encoder.finalize()

# Stream decoding
decoder = StreamDecoder()
for chunk in chunks:
    objs = decoder.decode_chunk(chunk)
    for obj in objs:
        print(obj)
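
To illustrate why a stream decoder returns a list of objects per chunk, here is a sketch of incremental decoding with length-prefixed frames. It uses JSON and a hypothetical FrameDecoder class; BTOON's actual wire format and the internals of StreamDecoder are not described in this README, so treat this only as an illustration of the general technique:

```python
import json
import struct


def encode_frame(obj) -> bytes:
    """Serialize one object as a 4-byte big-endian length plus the payload."""
    payload = json.dumps(obj).encode("utf-8")
    return struct.pack(">I", len(payload)) + payload


class FrameDecoder:
    """Buffers incoming bytes and returns objects once full frames arrive."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> list:
        self._buffer.extend(chunk)
        objs = []
        while len(self._buffer) >= 4:
            (length,) = struct.unpack(">I", self._buffer[:4])
            if len(self._buffer) < 4 + length:
                break  # frame incomplete; wait for more bytes
            payload = bytes(self._buffer[4:4 + length])
            del self._buffer[:4 + length]
            objs.append(json.loads(payload))
        return objs


stream = b"".join(encode_frame({"id": i}) for i in range(5))

decoder = FrameDecoder()
decoded = []
# Deliver the stream in 7-byte pieces to simulate arbitrary chunk boundaries.
for start in range(0, len(stream), 7):
    decoded.extend(decoder.feed(stream[start:start + 7]))

print(decoded)
```

A single call to feed may return zero, one, or several objects, depending on where the chunk boundaries fall relative to the frames.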

Schema Support

import btoon

# Define schema
schema = btoon.Schema({
    'type': 'object',
    'properties': {
        'id': {'type': 'integer', 'required': True},
        'name': {'type': 'string', 'required': True},
        'age': {'type': 'integer', 'min': 0, 'max': 120}
    }
})

# Example record matching the schema above
data = {'id': 1, 'name': 'Ada', 'age': 36}

# Validate and encode with schema
if schema.validate(data):
    encoded = btoon.encode(data, schema=schema)
    decoded = btoon.decode(encoded, schema=schema)
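
The rules in the schema dictionary ('type', 'required', 'min', 'max') can be read as simple per-field checks. The following pure-Python sketch shows one plausible interpretation; the validate function and TYPE_MAP here are hypothetical and are not btoon.Schema's implementation:

```python
SCHEMA = {
    "type": "object",
    "properties": {
        "id": {"type": "integer", "required": True},
        "name": {"type": "string", "required": True},
        "age": {"type": "integer", "min": 0, "max": 120},
    },
}

# Hypothetical mapping from schema type names to Python types.
TYPE_MAP = {"integer": int, "string": str}


def validate(record: dict, schema: dict) -> list:
    """Return a list of error messages; an empty list means the record is valid."""
    errors = []
    for field, rules in schema.get("properties", {}).items():
        if field not in record:
            if rules.get("required", False):
                errors.append(f"{field}: missing required field")
            continue
        value = record[field]
        expected = TYPE_MAP.get(rules.get("type"))
        if expected is not None:
            wrong_type = not isinstance(value, expected)
            # bool is a subclass of int in Python, so reject it for integers.
            bool_as_int = expected is int and isinstance(value, bool)
            if wrong_type or bool_as_int:
                errors.append(f"{field}: expected {rules['type']}")
                continue
        if "min" in rules and value < rules["min"]:
            errors.append(f"{field}: below minimum {rules['min']}")
        if "max" in rules and value > rules["max"]:
            errors.append(f"{field}: above maximum {rules['max']}")
    return errors


print(validate({"id": 1, "name": "Ada", "age": 36}, SCHEMA))  # []
print(validate({"id": 1, "age": 130}, SCHEMA))
```

Returning a list of messages rather than a single boolean makes it easier to report every problem in a record at once.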

Performance

The project reports the following benchmark figures (hardware and methodology are not stated):

Operation     JSON       Pickle    BTOON     Improvement (vs. JSON)
Encode 1MB    125ms      85ms      12ms      10x faster
Decode 1MB    95ms       45ms      8ms       11x faster
Size          1024KB     780KB     412KB     60% smaller

DataFrame Performance (10,000 rows × 20 columns)

Format     Encode    Decode    Size
CSV        450ms     380ms     8.2MB
Pickle     120ms     95ms      5.1MB
Parquet    85ms      72ms      2.3MB
BTOON      35ms      28ms      1.8MB
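
Figures like these depend on hardware and data shape, so it is worth re-measuring on your own workload. Below is a minimal timing harness, sketched with only the standard library; the benchmark helper and the sample data are illustrative assumptions, not the project's benchmark suite:

```python
import json
import pickle
import time


def benchmark(name, encode, decode, data, repeats=5):
    """Return best-of-N encode/decode times (ms) and the encoded size (bytes)."""
    encode_times, decode_times = [], []
    for _ in range(repeats):
        start = time.perf_counter()
        blob = encode(data)
        encode_times.append(time.perf_counter() - start)

        start = time.perf_counter()
        result = decode(blob)
        decode_times.append(time.perf_counter() - start)
    assert result == data  # sanity check: the round trip is lossless
    return {
        "name": name,
        "encode_ms": min(encode_times) * 1000,
        "decode_ms": min(decode_times) * 1000,
        "size": len(blob),
    }


data = [{"id": i, "name": f"user_{i}", "score": i / 7} for i in range(10_000)]

codecs = [
    ("json", lambda d: json.dumps(d).encode("utf-8"), json.loads),
    ("pickle", pickle.dumps, pickle.loads),
    # With btoon installed, ("btoon", btoon.encode, btoon.decode) could be added.
]

results = [benchmark(name, enc, dec, data) for name, enc, dec in codecs]
for r in results:
    print(f"{r['name']:<8}{r['encode_ms']:>10.2f} ms{r['decode_ms']:>10.2f} ms{r['size']:>12} bytes")
```

Taking the best of several runs reduces noise from other processes; for publishable numbers, a dedicated tool such as timeit or pytest-benchmark is a better fit.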

API Reference

Core Functions

encode(data, compress=False, auto_tabular=True, **kwargs)

Encode Python data to BTOON format.

decode(data, decompress=False, **kwargs)

Decode BTOON data to Python objects.

NumPy Integration

from_numpy(array, compress=False)

Encode NumPy array to BTOON.

to_numpy(data)

Decode BTOON to NumPy array.

Pandas Integration

from_dataframe(df, compress=True)

Encode DataFrame to BTOON with columnar optimization.

to_dataframe(data)

Decode BTOON to DataFrame.

Extended Types

Timestamp

High-precision timestamp with nanoseconds and timezone.

Decimal

Arbitrary precision decimal numbers.

Currency

Currency values with code and amount.

Percentage

Percentage values with proper arithmetic.

Async Support

async_stream(path, mode='r')

Async context manager for file operations.

AsyncStreamEncoder

Async encoder for streaming data.

AsyncStreamDecoder

Async decoder for streaming data.

Examples

See the examples/ directory for more usage examples:

  • Basic encoding/decoding
  • NumPy and Pandas integration
  • Financial calculations
  • Async operations
  • Schema validation
  • Performance benchmarks

Requirements

  • Python >= 3.7
  • C++ compiler (for building from source)
  • NumPy (optional, for array support)
  • Pandas (optional, for DataFrame support)

Building from Source

git clone https://github.com/BTOON-project/btoon-python.git
cd btoon-python
pip install -e .
pytest tests/

Contributing

Contributions are welcome! Please read our Contributing Guide for details.

License

MIT License - see LICENSE file for details.

Part of the BTOON project - High-performance binary serialization for modern applications.
