Skip to main content

A flexible data transformation library with a plugin system

Project description

🌀 Tukuy

Portable agent skills library and data transformation toolkit for Python.

PyPI version Python versions License: MIT Downloads GitHub stars


Tukuy (meaning "to transform" or "to become" in Quechua) is a cross-platform skills layer that any Python agent framework can use. It provides typed skill descriptors, agent-framework bridges (OpenAI, Anthropic), async-first execution, smart composition, and runtime safety enforcement — all built on top of a proven plugin-based transformation engine.

from tukuy import skill

@skill(name="parse_date", description="Parse a date string into ISO format")
def parse_date(text: str) -> str:
    from dateutil import parser
    return parser.parse(text).isoformat()

result = parse_date.__skill__.invoke("January 15, 2025")
print(result.value)  # "2025-01-15T00:00:00"

Installation

pip install tukuy

Quick Start

Define a skill

from tukuy import skill

@skill(
    name="parse_date",
    description="Parse a date string into ISO format",
    category="date",
    tags=["parsing", "datetime"],
)
def parse_date(text: str, format: str = "auto") -> str:
    """Parse date from text, return ISO 8601."""
    from dateutil import parser
    return parser.parse(text).isoformat()

The @skill decorator infers input/output schemas from type hints, detects async functions automatically, and attaches a Skill instance as fn.__skill__.

Invoke a skill

result = parse_date.__skill__.invoke("January 15, 2025")
print(result.value)      # "2025-01-15T00:00:00"
print(result.success)    # True
print(result.duration_ms) # 0.42

Use with an agent framework

from tukuy import to_openai_tools, to_anthropic_tools

skills = [parse_date, extract_entities, summarize]

# OpenAI function-calling format
tools = to_openai_tools(skills)

# Anthropic tool_use format
tools = to_anthropic_tools(skills)

Dispatch tool calls back to skills:

from tukuy import dispatch_openai, dispatch_anthropic

# OpenAI
result_msg = dispatch_openai(tool_call, skills)

# Anthropic
result_block = dispatch_anthropic(tool_use, skills)

Core Concepts

Skill Descriptors

Every skill has a declared-upfront contract via SkillDescriptor:

from tukuy import SkillDescriptor

descriptor = SkillDescriptor(
    name="web_scraper",
    description="Scrape and extract text from a URL",
    input_schema=str,
    output_schema=str,
    category="web",
    tags=["scraping", "extraction"],
    is_async=True,
    requires_network=True,
    required_imports=["aiohttp", "beautifulsoup4"],
    idempotent=True,
    side_effects=False,
)

Descriptors carry identity, typed I/O schemas, discovery metadata, operational hints, and safety declarations — everything an agent framework needs to discover and invoke a skill.

Async Support

Skills work with both sync and async functions:

@skill(name="fetch_page", requires_network=True)
async def fetch_page(url: str) -> str:
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as resp:
            return await resp.text()

# Async invocation
result = await fetch_page.__skill__.ainvoke("https://example.com")

Sync skills also work with ainvoke() — they're called normally without blocking the event loop.

Composition

Chain, branch, and fan-out skills with Chain, Branch, and Parallel:

from tukuy import Chain, branch, parallel

# Sequential pipeline
chain = Chain(["strip", "lowercase", parse_date])
result = chain.run("  January 15, 2025  ")

# Conditional branching
chain = Chain([
    "strip",
    branch(
        on_match=lambda v: "@" in v,
        true_path=["email_validator"],
        false_path=["url_validator"],
    ),
])

# Parallel fan-out with merge
chain = Chain([
    parallel(
        steps=["extract_dates", "extract_emails", "extract_phones"],
        merge="dict",  # {"extract_dates": [...], "extract_emails": [...], ...}
    ),
])

# Async execution with asyncio.gather for parallel steps
result = await chain.arun(input_text)

Steps can be transformer names (strings), parametrized transforms (dicts), Skill instances, @skill-decorated functions, plain callables, or nested Chain objects.

Context

Skills share state through a typed, scoped SkillContext:

from tukuy import skill, SkillContext

@skill(name="extract_entities")
def extract_entities(text: str, ctx: SkillContext) -> dict:
    entities = do_extraction(text)
    ctx.set("last_entities", entities)
    return entities

@skill(name="format_entities")
def format_entities(ctx: SkillContext) -> str:
    entities = ctx.get("last_entities")
    return format_them(entities)

Context supports namespaced scoping (for parallel branches), parent-child delegation, snapshot/merge, and bridging to plain dicts.

Safety Policy

Each skill declares what resources it needs. The runtime enforces these declarations against an active policy:

from tukuy import SafetyPolicy, set_policy

# Define what the environment allows
policy = SafetyPolicy(
    allowed_imports={"json", "re", "datetime"},
    blocked_imports={"os", "subprocess"},
    allow_network=False,
    allow_filesystem=False,
)

# Activate globally (async-safe via contextvars)
set_policy(policy)

# Skills that violate the policy are blocked before execution
@skill(name="web_scraper", requires_network=True)
async def web_scraper(url: str) -> str: ...

result = web_scraper.__skill__.invoke("https://example.com")
# result.success == False
# result.error == "Safety policy violated: Skill requires network access but policy denies it"

Convenience constructors for common scenarios:

SafetyPolicy.restrictive()    # No imports, no network, no filesystem
SafetyPolicy.permissive()     # Everything allowed
SafetyPolicy.network_only()   # Network yes, filesystem no
SafetyPolicy.filesystem_only() # Filesystem yes, network no

Policies can be exported/imported as sandbox configurations for integration with external sandbox runtimes:

config = policy.to_sandbox_config()
# {"allowed_imports": ["json", "re"], "network": False, "filesystem": False}

policy = SafetyPolicy.from_sandbox_config(config)

Data Transformations

Tukuy includes a full transformation engine with six built-in plugins. This is the foundation that the skills layer is built on.

from tukuy import TukuyTransformer

t = TukuyTransformer()

# Text
t.transform("  Hello World!  ", ["strip", "lowercase"])
# "hello world!"

# HTML
t.transform("<div>Hello <b>World</b>!</div>", ["strip_html_tags", "lowercase"])
# "hello world!"

# Chained with parameters
t.transform("  Hello World!  ", [
    "strip",
    "lowercase",
    {"function": "truncate", "length": 5},
])
# "hello..."

Built-in Plugins

Textstrip, lowercase, uppercase, truncate, replace, regex_replace, split, join, normalize

HTMLstrip_html_tags, extract_text, select, extract_links, extract_tables, clean_html

JSONparse_json, stringify, extract, flatten, merge, validate_schema

Dateparse_date, format_date, age_calc, add_days, diff_days, is_weekend

Numericalround, format_number, to_currency, percentage, math_eval, scale, statistics

Validationemail_validator, url_validator, phone_validator, length_validator, range_validator, regex_validator, type_validator

Custom Plugins

Extend Tukuy with your own transformer plugins:

from tukuy import TransformerPlugin
from tukuy.base import ChainableTransformer

class ReverseTransformer(ChainableTransformer):
    def validate(self, value):
        return isinstance(value, str)

    def _transform(self, value, context=None):
        return value[::-1]

class MyPlugin(TransformerPlugin):
    def __init__(self):
        super().__init__("my_plugin")

    @property
    def transformers(self):
        return {"reverse": lambda _: ReverseTransformer("reverse")}

t = TukuyTransformer()
t.register_plugin(MyPlugin())
t.transform("hello", ["reverse"])  # "olleh"

Plugins support lifecycle hooks (initialize() / cleanup()) and can expose skills alongside transformers via the skills property.

Dynamic Tool Registration

Tukuy makes it easy to add tools at runtime without restarting your application.

Register a plugin on the fly:

from tukuy import TukuyTransformer, TransformerPlugin

class MyPlugin(TransformerPlugin):
    def __init__(self):
        super().__init__("my_plugin")

    @property
    def transformers(self):
        return {"reverse": lambda _: ReverseTransformer("reverse")}

t = TukuyTransformer()
t.register_plugin(MyPlugin())       # available immediately
t.transform("hello", ["reverse"])   # "olleh"
t.unregister_plugin("my_plugin")    # remove when no longer needed

Create skills at runtime:

from tukuy import skill, to_openai_tools

@skill(name="sentiment", description="Classify sentiment", category="nlp")
def sentiment(text: str) -> str:
    return "positive" if "good" in text.lower() else "negative"

# Instantly usable — invoke directly or convert to agent tool format
result = sentiment.__skill__.invoke("This is good!")
tools = to_openai_tools([sentiment])  # ready for OpenAI function-calling

Use the @tukuy_plugin decorator for metadata:

from tukuy import tukuy_plugin

@tukuy_plugin("analytics", "Real-time analytics transforms", "1.0.0")
class AnalyticsPlugin(TransformerPlugin):
    @property
    def transformers(self):
        return {"moving_avg": lambda p: MovingAvgTransformer("moving_avg", **p)}

Hot-reload plugins without restarting:

from tukuy import hot_reload

hot_reload("my_plugin")  # reload a specific plugin
hot_reload()             # reload all plugins

Discover what's available:

from tukuy import browse_tools, get_tool_details, search_tools

index = browse_tools()                      # compact index of all tools
details = get_tool_details("reverse")       # full metadata for a specific tool
results = search_tools("date", limit=5)     # keyword search across all tools

Pattern-based Extraction

HTML

pattern = {
    "properties": [
        {"name": "title", "selector": "h1", "transform": ["strip", "lowercase"]},
        {"name": "links", "selector": "a", "attribute": "href", "type": "array"},
    ]
}
data = t.extract_html_with_pattern(html, pattern)

JSON

pattern = {
    "properties": [
        {
            "name": "user",
            "selector": "data.user",
            "properties": [
                {"name": "name", "selector": "fullName", "transform": ["strip"]},
            ],
        }
    ]
}
data = t.extract_json_with_pattern(json_str, pattern)

Error Handling

from tukuy.exceptions import ValidationError, TransformationError, ParseError

try:
    result = t.transform(data, transformations)
except ValidationError as e:
    print(f"Validation failed: {e}")
except ParseError as e:
    print(f"Parsing failed: {e}")
except TransformationError as e:
    print(f"Transformation failed: {e}")

Architecture

tukuy/
    skill.py          Skill descriptors, @skill decorator, invoke/ainvoke
    context.py        SkillContext with scoping, snapshot, merge
    safety.py         SafetyPolicy, manifest validation, sandbox integration
    bridges.py        OpenAI and Anthropic tool format bridges
    chain.py          Chain, Branch, Parallel composition
    async_base.py     Async transformer base classes
    base.py           Sync transformer base classes
    plugins/          Built-in plugins (text, html, json, date, numerical, validation)
    core/             Registration, introspection, unified registry
    transformers/     Transformer implementations

Contributing

Contributions are welcome.

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/my-feature)
  3. Make your changes
  4. Run tests with pytest
  5. Commit and push
  6. Open a Pull Request

License

See LICENSE for details.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

tukuy-0.0.17.tar.gz (167.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

tukuy-0.0.17-py3-none-any.whl (151.6 kB view details)

Uploaded Python 3

File details

Details for the file tukuy-0.0.17.tar.gz.

File metadata

  • Download URL: tukuy-0.0.17.tar.gz
  • Upload date:
  • Size: 167.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for tukuy-0.0.17.tar.gz
Algorithm Hash digest
SHA256 fd47f8d11f2c437e30c0f4857f3a6c53ae07abd21483930f69892e2d5265de5d
MD5 64dab46f37d63c013c25af01b2804d7a
BLAKE2b-256 e94df6e47b34be37da52c43046ac3063754d2f61c0cbfc7be2536bb15b403e1b

See more details on using hashes here.

File details

Details for the file tukuy-0.0.17-py3-none-any.whl.

File metadata

  • Download URL: tukuy-0.0.17-py3-none-any.whl
  • Upload date:
  • Size: 151.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for tukuy-0.0.17-py3-none-any.whl
Algorithm Hash digest
SHA256 e862bbc6f5a1b9b51e433c4ea4d9d6dac24b2d683692d6dab0c5cc5986beac71
MD5 5e1e657b0a24ec74c4f6276c1aaab98c
BLAKE2b-256 bf459b4a60e9bdea26dad4bc5c4ad0183d63dcc21d9dc40b1b1114378deda533

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page