A flexible data transformation library with a plugin system

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

🌀 Tukuy

A flexible data transformation library with a plugin system for Python.

🚀 Overview

Tukuy (meaning "to transform" or "to become" in Quechua) is an extensible data transformation library for manipulating, validating, and extracting data from various formats. Its plugin architecture provides a unified interface for working with text, HTML, JSON, dates, numbers, and more.

✨ Features

🧩 Plugin System: Easily extend functionality with custom plugins
🔄 Chainable Transformers: Compose multiple transformations in sequence
🧪 Type-safe Transformations: With built-in validation
📊 Rich Set of Built-in Transformers:
- 📝 Text manipulation (case conversion, trimming, regex, etc.)
- 🌐 HTML processing and extraction
- 📅 Date parsing and calculations
- 🔢 Numerical operations
- ✅ Data validation
- 📋 JSON parsing and extraction
🔍 Pattern-based Data Extraction: Extract structured data from HTML and JSON
🛡️ Error Handling: Comprehensive error handling with detailed messages

📦 Installation

pip install tukuy

🛠️ Basic Usage

from tukuy import ToolsTransformer

# Create transformer
TUKUY = ToolsTransformer()

# Basic text transformation
text = " Hello World! "
result = TUKUY.transform(text, [
    "strip",
    "lowercase",
    {"function": "truncate", "length": 5}
])
print(result)  # "hello..."

# HTML transformation
html = "<div>Hello <b>World</b>!</div>"
result = TUKUY.transform(html, [
    "strip_html_tags",
    "lowercase"
])
print(result)  # "hello world!"

# Date transformation
date_str = "2023-01-01"
age = TUKUY.transform(date_str, [
    {"function": "age_calc"}
])
print(age)  # age in whole years; the value depends on the date the code runs (1 during 2024)

# Validation
email = "test@example.com"
valid = TUKUY.transform(email, ["email_validator"])
print(valid)  # "test@example.com" or None if invalid
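The transformation lists above mix plain names ("strip") with dictionaries that carry parameters ({"function": "truncate", "length": 5}). The sketch below shows one way such a spec could be interpreted. It is not Tukuy's implementation: `apply_chain`, `STEPS`, and the truncation behaviour (cut at `length`, then append "...") are assumptions made for illustration.

```python
from typing import Any, Callable, Dict, List, Union


def _truncate(value: str, length: int = 50, suffix: str = "...") -> str:
    # Assumed behaviour: keep the first `length` characters, then add a suffix.
    return value if len(value) <= length else value[:length] + suffix


# Hypothetical stand-ins for built-in transformers; the names mirror the README.
STEPS: Dict[str, Callable[..., Any]] = {
    "strip": lambda v: v.strip(),
    "lowercase": lambda v: v.lower(),
    "truncate": _truncate,
}

Step = Union[str, Dict[str, Any]]


def apply_chain(value: Any, chain: List[Step]) -> Any:
    """Apply each step in order, feeding one step's output into the next."""
    for step in chain:
        if isinstance(step, str):
            name, params = step, {}
        else:
            params = dict(step)  # copy so the caller's spec is left untouched
            name = params.pop("function")
        value = STEPS[name](value, **params)
    return value


result = apply_chain(
    " Hello World! ",
    ["strip", "lowercase", {"function": "truncate", "length": 5}],
)
print(result)  # hello...
```

Keeping the spec as plain data (strings and dicts) means a chain can be stored in configuration or built at runtime without touching code.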

🧩 Plugin System

Tukuy uses a plugin system to organize transformers into logical groups and make it easy to extend functionality.

📚 Built-in Plugins

📝 text: Basic text transformations (strip, lowercase, regex, etc.)
🌐 html: HTML manipulation and extraction
📅 date: Date parsing and calculations
✅ validation: Data validation and formatting
🔢 numerical: Number manipulation and calculations
📋 json: JSON parsing and extraction
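To make the idea of grouping transformers by plugin concrete, here is a minimal registry sketch. `SketchPlugin` and `SketchRegistry` are hypothetical names, and how Tukuy resolves names internally is not described in this README, so treat the lookup strategy as an assumption.

```python
from typing import Callable, Dict, List


class SketchPlugin:
    """Hypothetical plugin: a name plus a mapping of transformer names to callables."""

    def __init__(self, name: str, transformers: Dict[str, Callable[[str], str]]) -> None:
        self.name = name
        self.transformers = transformers


class SketchRegistry:
    """Keeps plugins by name and exposes a flat lookup of their transformers."""

    def __init__(self) -> None:
        self._plugins: Dict[str, SketchPlugin] = {}
        self._lookup: Dict[str, Callable[[str], str]] = {}

    def register(self, plugin: SketchPlugin) -> None:
        self._plugins[plugin.name] = plugin
        # Later registrations override earlier ones that share a transformer name.
        self._lookup.update(plugin.transformers)

    def get(self, transformer_name: str) -> Callable[[str], str]:
        return self._lookup[transformer_name]

    def plugin_names(self) -> List[str]:
        return sorted(self._plugins)


registry = SketchRegistry()
registry.register(SketchPlugin("text", {"strip": str.strip, "lowercase": str.lower}))
registry.register(SketchPlugin("custom", {"reverse": lambda s: s[::-1]}))

print(registry.get("reverse")("hello"))  # olleh
print(registry.plugin_names())           # ['custom', 'text']
```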

🔌 Creating Custom Plugins

You can create custom plugins by extending the TransformerPlugin class:

from tukuy import ToolsTransformer
from tukuy.plugins import TransformerPlugin
from tukuy.base import ChainableTransformer

class ReverseTransformer(ChainableTransformer[str, str]):
    def validate(self, value: str) -> bool:
        return isinstance(value, str)
    
    def _transform(self, value: str, context=None) -> str:
        return value[::-1]

class MyPlugin(TransformerPlugin):
    def __init__(self):
        super().__init__("my_plugin")
    
    @property
    def transformers(self):
        return {
            'reverse': lambda _: ReverseTransformer('reverse')
        }

# Usage
TUKUY = ToolsTransformer()
TUKUY.register_plugin(MyPlugin())

result = TUKUY.transform("hello", ["reverse"])  # "olleh"

See the example plugin for a more detailed example.

🔄 Plugin Lifecycle

Plugins can implement initialize() and cleanup() methods for setup and teardown:

class MyPlugin(TransformerPlugin):
    def initialize(self) -> None:
        super().initialize()
        # Load resources, connect to databases, etc.
    
    def cleanup(self) -> None:
        super().cleanup()
        # Close connections, free resources, etc.
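One way to guarantee that cleanup() always follows initialize() is to wrap the pair in a context manager. The sketch below uses a stand-in class rather than Tukuy's TransformerPlugin; `LifecyclePlugin` and `active` are hypothetical helpers, and whether Tukuy calls these hooks automatically is not stated here.

```python
from contextlib import contextmanager
from typing import Iterator, List


class LifecyclePlugin:
    """Hypothetical stand-in exposing the same initialize()/cleanup() pair."""

    def __init__(self, name: str, log: List[str]) -> None:
        self.name = name
        self._log = log

    def initialize(self) -> None:
        self._log.append(f"init:{self.name}")

    def cleanup(self) -> None:
        self._log.append(f"cleanup:{self.name}")


@contextmanager
def active(plugin: LifecyclePlugin) -> Iterator[LifecyclePlugin]:
    """Initialize the plugin on entry and clean it up on exit."""
    plugin.initialize()
    try:
        yield plugin
    finally:
        plugin.cleanup()  # runs even if the body raises


events: List[str] = []
with active(LifecyclePlugin("db", events)):
    events.append("work")

print(events)  # ['init:db', 'work', 'cleanup:db']
```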

🔍 Pattern-based Extraction

Tukuy provides powerful pattern-based extraction capabilities for both HTML and JSON data.

🌐 HTML Extraction

pattern = {
    "properties": [
        {
            "name": "title",
            "selector": "h1",
            "transform": ["strip", "lowercase"]
        },
        {
            "name": "links",
            "selector": "a",
            "attribute": "href",
            "type": "array"
        }
    ]
}

data = TUKUY.extract_html_with_pattern(html, pattern)
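The README does not show what the call returns. As an illustration only, the standard-library sketch below applies the same two rules (text of the first h1, stripped and lowercased; href of every a) and returns a dictionary keyed by property name. The output shape and the helper names `TagCollector` and `extract_sketch` are assumptions, not Tukuy's API.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional


class TagCollector(HTMLParser):
    """Collects the text of <h1> elements and the href of <a> elements."""

    def __init__(self) -> None:
        super().__init__()
        self.h1_texts: List[str] = []
        self.hrefs: List[str] = []
        self._in_h1 = False

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self._in_h1 = True
            self.h1_texts.append("")
        elif tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.hrefs.append(href)

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_h1 = False

    def handle_data(self, data):
        if self._in_h1:
            self.h1_texts[-1] += data


def extract_sketch(html: str) -> Dict[str, object]:
    parser = TagCollector()
    parser.feed(html)
    parser.close()
    # "title": first <h1>, with the "strip" and "lowercase" steps applied.
    title: Optional[str] = None
    if parser.h1_texts:
        title = parser.h1_texts[0].strip().lower()
    # "links": every href, kept as a list (the "array" type in the pattern).
    return {"title": title, "links": parser.hrefs}


sample = '<h1> Tukuy Docs </h1><p><a href="/a">A</a> <a href="/b">B</a></p>'
print(extract_sketch(sample))  # {'title': 'tukuy docs', 'links': ['/a', '/b']}
```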

📋 JSON Extraction

pattern = {
    "properties": [
        {
            "name": "user",
            "selector": "data.user",
            "properties": [
                {
                    "name": "name",
                    "selector": "fullName",
                    "transform": ["strip"]
                }
            ]
        }
    ]
}

data = TUKUY.extract_json_with_pattern(json_str, pattern)
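The nested pattern above can be read as: follow the dotted selector, then apply the child properties relative to whatever was found. The sketch below implements that reading in plain Python. The helpers `walk` and `extract`, the returned shape, and the choice to return None for missing keys are assumptions for illustration, not Tukuy's documented behaviour.

```python
import json
from typing import Any, Dict, List

# Assumed subset of transformers, enough for this example.
TRANSFORMS = {"strip": str.strip, "lowercase": str.lower}


def walk(data: Any, selector: str) -> Any:
    """Follow a dotted path such as 'data.user'; return None if a key is missing."""
    for key in selector.split("."):
        if not isinstance(data, dict) or key not in data:
            return None
        data = data[key]
    return data


def extract(data: Any, properties: List[Dict[str, Any]]) -> Dict[str, Any]:
    out: Dict[str, Any] = {}
    for prop in properties:
        value = walk(data, prop["selector"])
        if "properties" in prop and value is not None:
            # Nested properties are resolved relative to the selected node.
            value = extract(value, prop["properties"])
        else:
            for name in prop.get("transform", []):
                if value is not None:
                    value = TRANSFORMS[name](value)
        out[prop["name"]] = value
    return out


json_str = '{"data": {"user": {"fullName": "  Ada Lovelace "}}}'
pattern = {
    "properties": [
        {
            "name": "user",
            "selector": "data.user",
            "properties": [
                {"name": "name", "selector": "fullName", "transform": ["strip"]}
            ],
        }
    ]
}

print(extract(json.loads(json_str), pattern["properties"]))
# {'user': {'name': 'Ada Lovelace'}}
```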

🚀 Use Cases

Tukuy is designed to handle a wide range of data transformation scenarios:

🌐 Web Scraping: Extract structured data from HTML pages
📊 Data Cleaning: Normalize and validate data from various sources
🔄 Format Conversion: Transform data between different formats
📝 Text Processing: Apply complex text transformations
🔍 Data Extraction: Extract specific information from complex structures
✅ Validation: Ensure data meets specific criteria

⚡ Performance Tips

🔗 Chain Transformations: Use chained transformations to avoid intermediate objects
🧩 Use Built-in Transformers: Built-in transformers are optimized for performance
🔍 Be Specific with Selectors: More specific selectors are faster to process
🛠️ Custom Transformers: For performance-critical operations, create custom transformers
📦 Batch Processing: Process data in batches for better performance
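The batch-processing tip is not spelled out further, so the following is only one possible reading: split the input into fixed-size chunks and run the same transformation function over each chunk. `batches` and `transform_in_batches` are hypothetical helpers, and `fn` stands in for something like a call that wraps `TUKUY.transform`.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def batches(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def transform_in_batches(
    items: Iterable[T], fn: Callable[[T], R], size: int = 1000
) -> List[R]:
    """Apply `fn` to every item, one chunk at a time, preserving order."""
    results: List[R] = []
    for chunk in batches(items, size):
        results.extend(fn(item) for item in chunk)
    return results


print(list(batches(range(5), 2)))                                     # [[0, 1], [2, 3], [4]]
print(transform_in_batches([" A ", " B ", " C "], str.strip, size=2))  # ['A', 'B', 'C']
```

Chunking keeps memory bounded when the input is a large or lazy iterable; whether it also speeds up Tukuy itself would need to be measured.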

🛡️ Error Handling

Tukuy raises distinct exception types for validation, parsing, and transformation failures, so each case can be handled separately:

from tukuy.exceptions import ValidationError, TransformationError, ParseError

try:
    result = TUKUY.transform(data, transformations)
except ValidationError as e:
    print(f"Validation failed: {e}")
except ParseError as e:
    print(f"Parsing failed: {e}")
except TransformationError as e:
    print(f"Transformation failed: {e}")
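When processing many records, it is often more useful to collect failures than to stop at the first one. The sketch below shows that pattern with a locally defined stand-in exception; `SketchTransformationError`, `to_int`, and `transform_all` are hypothetical, and in real code the `except` clause would name the Tukuy exceptions imported above.

```python
from typing import Any, Callable, Iterable, List, Tuple


class SketchTransformationError(Exception):
    """Hypothetical stand-in for tukuy.exceptions.TransformationError."""


def to_int(value: str) -> int:
    """Example transformation that fails on non-numeric input."""
    try:
        return int(value.strip())
    except ValueError as exc:
        raise SketchTransformationError(f"cannot convert {value!r} to int") from exc


def transform_all(
    values: Iterable[Any], fn: Callable[[Any], Any]
) -> Tuple[List[Any], List[Tuple[Any, str]]]:
    """Return (successful results, list of (input, error message) pairs)."""
    results: List[Any] = []
    failures: List[Tuple[Any, str]] = []
    for value in values:
        try:
            results.append(fn(value))
        except SketchTransformationError as exc:
            failures.append((value, str(exc)))
    return results, failures


ok, bad = transform_all([" 1", "x", "3 "], to_int)
print(ok)   # [1, 3]
print(bad)  # [('x', "cannot convert 'x' to int")]
```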

🤝 Contributing

Contributions are welcome! Here's how you can help:

🍴 Fork the repository
🌿 Create a feature branch (git checkout -b feature/amazing-feature)
💻 Make your changes
✅ Run tests with pytest
📝 Update documentation if needed
🔄 Commit your changes (git commit -m 'Add amazing feature')
🚀 Push to the branch (git push origin feature/amazing-feature)
🔍 Open a Pull Request

Release history

- 0.0.36: Mar 8, 2026
- 0.0.35: Mar 8, 2026
- 0.0.34: Mar 8, 2026
- 0.0.33: Mar 7, 2026
- 0.0.32: Mar 7, 2026
- 0.0.31: Feb 28, 2026
- 0.0.30: Feb 18, 2026
- 0.0.29: Feb 17, 2026
- 0.0.27: Feb 17, 2026
- 0.0.26: Feb 17, 2026
- 0.0.25: Feb 17, 2026
- 0.0.24: Feb 16, 2026
- 0.0.23: Feb 16, 2026
- 0.0.22: Feb 9, 2026
- 0.0.21: Feb 9, 2026
- 0.0.20: Feb 8, 2026
- 0.0.19: Feb 8, 2026
- 0.0.18: Feb 8, 2026
- 0.0.17: Feb 8, 2026
- 0.0.16: Feb 8, 2026
- 0.0.15: Feb 8, 2026
- 0.0.14: Feb 8, 2026
- 0.0.13: Feb 8, 2026
- 0.0.12: Feb 8, 2026
- 0.0.11: Feb 8, 2026
- 0.0.6: Sep 26, 2025
- 0.0.5: Sep 26, 2025
- 0.0.4: Sep 8, 2025
- 0.0.3: Mar 24, 2025
- 0.0.2: Mar 24, 2025 (this version)
- 0.0.1: Mar 24, 2025

Download files

Source Distribution

tukuy-0.0.2.tar.gz (30.6 kB view details)

Uploaded Mar 24, 2025 Source

Built Distribution

tukuy-0.0.2-py3-none-any.whl (29.2 kB view details)

Uploaded Mar 24, 2025 Python 3

File details

Details for the file tukuy-0.0.2.tar.gz.

File metadata

Download URL: tukuy-0.0.2.tar.gz
Upload date: Mar 24, 2025
Size: 30.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.2

File hashes

Hashes for tukuy-0.0.2.tar.gz
Algorithm	Hash digest
SHA256	`ad25e93d2f2b31d947104d70ab78d345deb8718fa233081a00b9e133942bc4a3`
MD5	`7a87c3c4b2adc1a214ff4d255cb53321`
BLAKE2b-256	`7f5650bbd9c7674140cdcccc3c3c22ae47356911949b030209ab917f33f4bb2f`

File details

Details for the file tukuy-0.0.2-py3-none-any.whl.

File metadata

Download URL: tukuy-0.0.2-py3-none-any.whl
Upload date: Mar 24, 2025
Size: 29.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.2

File hashes

Hashes for tukuy-0.0.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e5664382069101f220a6bb259e065d60ffa3890f9579dc8cec536703b502799d`
MD5	`c5cf2b678ee591813daa3812424a22d1`
BLAKE2b-256	`42c5da9b8f65ded8c5627b56ea6e0c5ff8955d49e3455794fed9f2b1b6e74932`

