lean4url

A high-performance lzstring compression library that is fully compatible with the JavaScript implementation.

Features

  • Fully Compatible - 100% compatible with the pieroxy/lz-string JavaScript implementation
  • Unicode Support - Correctly handles all Unicode characters, including emoji and special symbols
  • URL Friendly - Built-in URL encoding/decoding functionality
  • High Performance - Optimized algorithm implementation
  • Type Safe - Complete type annotation support
  • Thoroughly Tested - Includes comparative tests against the JavaScript version

Background

Existing Python lzstring packages have issues with Unicode character handling. For example, for the character "𝔓":

  • Existing package output: sirQ
  • JavaScript original output: qwbmRdo=
  • lean4url output: qwbmRdo=

lean4url solves this problem by correctly simulating JavaScript's UTF-16 encoding behavior.

Installation

pip install lean4url

Quick Start

Basic Compression/Decompression

from lean4url import LZString

# Create instance
lz = LZString()

# Compress string
original = "Hello, 世界! 🌍"
compressed = lz.compress_to_base64(original)
print(f"Compressed: {compressed}")

# Decompress string
decompressed = lz.decompress_from_base64(compressed)
print(f"Decompressed: {decompressed}")
# Output: Hello, 世界! 🌍
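
A note on the Base64 alphabet: standard Base64 text can contain "+", "/" and "=", which matters once the result is placed in a URL. The sketch below uses only Python's standard base64 and urllib.parse modules to illustrate the point. It does not reproduce lz-string's own bit packing, and whether lean4url escapes these characters internally is an open assumption, not something this README states.

```python
import base64
from urllib.parse import quote

# Arbitrary bytes chosen so the output uses the last two alphabet slots.
raw = bytes([0xFB, 0xFF])

standard = base64.b64encode(raw).decode("ascii")
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")
escaped = quote(standard, safe="")  # percent-encode every reserved character

print(standard)  # +/8=
print(url_safe)  # -_8=
print(escaped)   # %2B%2F8%3D
```

If a compressed string is ever moved into a query string by hand, percent-encoding it first avoids "+" being read back as a space by form-style parsers.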

URL Encoding/Decoding

from lean4url import encode_url, decode_url

# Encode data to URL
data = "This is data to be encoded"
url = encode_url(data, base_url="https://example.com/share")
print(f"Encoded URL: {url}")
# Output: https://example.com/share/#codez=BIUwNmD2A0AEDukBOYAmBMYAZhAY...

# Decode data from URL
result = decode_url(url)
print(f"Decoded result: {result['codez']}")
# Output: This is data to be encoded

URL Encoding with Parameters

from lean4url import encode_url, decode_url

# Add extra parameters when encoding
code = "function hello() { return 'world'; }"
url = encode_url(
    code, 
    base_url="https://playground.example.com",
    lang="javascript",
    theme="dark",
    url="https://docs.example.com"  # This parameter will be URL encoded
)

print(f"Complete URL: {url}")
# Output: https://playground.example.com/#codez=BIUwNmD2A0A...&lang=javascript&theme=dark&url=https%3A//docs.example.com

# Decode URL to get all parameters
params = decode_url(url)
print(f"Code: {params['codez']}")
print(f"Language: {params['lang']}")
print(f"Theme: {params['theme']}")
print(f"Documentation link: {params['url']}")
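
To make the URL layout above concrete, here is a minimal sketch of how key/value pairs could be packed into and read back from a URL fragment using only the standard library. The helpers build_fragment_url and parse_fragment are hypothetical names for illustration; they are not lean4url functions, they do no compression, and the real encode_url/decode_url may differ in how they escape values.

```python
from urllib.parse import quote, unquote, urlsplit


def build_fragment_url(base_url: str, **params: str) -> str:
    """Hypothetical helper: append params as '#key=value&key=value'."""
    # quote() keeps '/' by default, so ':' becomes %3A while '//' is preserved.
    fragment = "&".join(f"{key}={quote(value)}" for key, value in params.items())
    return f"{base_url.rstrip('/')}/#{fragment}"


def parse_fragment(url: str) -> dict[str, str]:
    """Hypothetical helper: split the fragment back into a dict."""
    fragment = urlsplit(url).fragment
    pairs = (item.split("=", 1) for item in fragment.split("&") if item)
    # unquote() leaves '+' alone, unlike unquote_plus(), which matters for Base64.
    return {key: unquote(value) for key, value in pairs}


demo_url = build_fragment_url(
    "https://playground.example.com",
    codez="abc+/=",  # stand-in for a compressed payload
    lang="javascript",
    url="https://docs.example.com",
)
print(demo_url)
print(parse_fragment(demo_url))
```

Keeping the data in the fragment (after "#") means it is not sent to the server with the request, which suits share links that are decoded client-side.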

API Reference

LZString Class

class LZString:
    def compress_to_base64(self, input_str: str) -> str:
        """Compress string to Base64 format"""
        
    def decompress_from_base64(self, input_str: str) -> str:
        """Decompress string from Base64 format"""
        
    def compress_to_utf16(self, input_str: str) -> str:
        """Compress string to UTF16 format"""
        
    def decompress_from_utf16(self, input_str: str) -> str:
        """Decompress string from UTF16 format"""

URL Utility Functions

from typing import Optional

def encode_url(data: str, base_url: Optional[str] = None, **kwargs) -> str:
    """
    Encode input string and build complete URL.
    
    Args:
        data: Data to be encoded
        base_url: URL prefix
        **kwargs: Additional URL parameters
        
    Returns:
        Built complete URL
    """

def decode_url(url: str) -> dict:
    """
    Decode original data from URL.
    
    Args:
        url: Complete URL
        
    Returns:
        Dictionary containing all parameters, with codez decoded
    """

Development

Environment Setup

# Clone repository
git clone https://github.com/rexwzh/lean4url.git
cd lean4url

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Linux/Mac
# or
venv\Scripts\activate  # Windows

# Install development dependencies
pip install -e ".[dev]"

Running Tests

# Start JavaScript test service
cd tests/js_service
npm install
node server.js &
cd ../.. 

# Run Python tests
pytest

# Run tests with coverage
pytest --cov=lean4url --cov-report=html

Code Formatting

# Format code
black src tests
isort src tests

# Type checking
mypy src

# Code checking
flake8 src tests

Algorithm Principles

lean4url is based on a variant of the LZ78 dictionary compression algorithm. Its core ideas are:

  1. Dictionary Building - Dynamically build a dictionary of character sequences while reading the input
  2. Sequence Matching - Find the longest sequence that is already in the dictionary
  3. UTF-16 Compatibility - Simulate JavaScript's UTF-16 surrogate pair behavior
  4. Base64 Encoding - Encode the compressed result as Base64 text that can be embedded in a URL
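
The first two steps can be illustrated with textbook LZW, a well-known member of the LZ78 family. This is a sketch for intuition only: it emits a plain list of integer codes and seeds the dictionary from the input's own characters, both of which are simplifying assumptions. It is not lz-string's bit-packed format and will not produce output compatible with lean4url.

```python
def lzw_compress(text: str) -> tuple[list[str], list[int]]:
    """Textbook LZW: return the seed alphabet and the emitted codes."""
    alphabet = sorted(set(text))
    table = {ch: i for i, ch in enumerate(alphabet)}
    codes: list[int] = []
    current = ""
    for ch in text:
        candidate = current + ch
        if candidate in table:
            # Sequence matching: keep extending the longest known match.
            current = candidate
        else:
            # Dictionary building: emit the match, then learn the new sequence.
            codes.append(table[current])
            table[candidate] = len(table)
            current = ch
    if current:
        codes.append(table[current])
    return alphabet, codes


def lzw_decompress(alphabet: list[str], codes: list[int]) -> str:
    """Rebuild the same dictionary on the fly while decoding."""
    if not codes:
        return ""
    table = list(alphabet)
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code < len(table):
            entry = table[code]
        else:
            # The code refers to the entry being defined right now.
            entry = previous + previous[0]
        out.append(entry)
        table.append(previous + entry[0])
        previous = entry
    return "".join(out)


sample = "TOBEORNOTTOBEORTOBEORNOT"
seed, packed = lzw_compress(sample)
print(len(sample), len(packed))
print(lzw_decompress(seed, packed) == sample)
```

The decoder never receives the dictionary itself; it reconstructs it from the codes, which is why only the seed alphabet and the code stream need to be stored.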

Unicode Handling

The key difference from existing Python packages is in Unicode character handling:

  • JavaScript: Uses UTF-16 surrogate pairs, "𝔓" → [0xD835, 0xDD13]
  • Existing Python packages: Use Unicode code points, "𝔓" → [0x1D513]
  • lean4url: Simulates JavaScript behavior, ensuring compatibility
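
The difference between code points and UTF-16 code units can be checked directly in Python. The sketch below uses the 🌍 emoji from the Quick Start example; utf16_code_units and from_utf16_code_units are hypothetical helper names, and this is one plausible way to simulate JavaScript's view of a string, not necessarily how lean4url does it internally.

```python
def utf16_code_units(text: str) -> list[int]:
    """Return the 16-bit units JavaScript would see via charCodeAt()."""
    # 'surrogatepass' lets lone surrogates through instead of raising.
    data = text.encode("utf-16-le", "surrogatepass")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def from_utf16_code_units(units: list[int]) -> str:
    """Inverse of utf16_code_units: join 16-bit units back into a str."""
    data = b"".join(unit.to_bytes(2, "little") for unit in units)
    return data.decode("utf-16-le", "surrogatepass")


globe = "\U0001F30D"  # 🌍
print(len(globe))                                 # 1 code point in Python
print([hex(u) for u in utf16_code_units(globe)])  # ['0xd83c', '0xdf0d']
print(from_utf16_code_units(utf16_code_units(globe)) == globe)
```

A compressor that iterates over Python characters sees one symbol here, while JavaScript sees two, so the two dictionaries diverge from the first non-BMP character onward.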

License

MIT License - See the LICENSE file for details.

Contributing

Issues and Pull Requests are welcome!

Changelog

v0.1.0

  • Initial version release
  • Complete lzstring algorithm implementation
  • JavaScript compatibility
  • URL encoding/decoding functionality
  • Complete test suite
