YTsaurus pipeline framework with utilities and common modules

YT Framework

PyPI | Docs | DeepWiki | Examples


Overview

Python helpers and conventions for YTsaurus pipelines: YAML config, ordered stages under stages/, dev mode that mirrors many prod behaviors on disk, and prod mode that uploads src/ bundles to the cluster.

Architecture

  • Pipeline — loads config, builds the YT client, walks enabled_stages.
  • Stage — one BaseStage subclass plus config.yaml (and optional src/ for jobs).
  • Operations — map, vanilla, map-reduce/reduce, YQL via the client, S3 helpers, sorts, etc.
  • Configuration — OmegaConf-backed YAML; secrets in configs/secrets.env.
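As a rough mental model of how these pieces fit together, the sketch below uses only the standard library. It is not the framework's implementation: `ToyStage`, `ToyPipeline`, and the dict-shaped config are hypothetical names chosen for illustration, and treating `debug` as a dict is an assumption. Only the `run(self, debug)` signature and the `enabled_stages` key come from this README.

```python
from typing import Any


class ToyStage:
    """Hypothetical stand-in for a BaseStage subclass."""

    def __init__(self, name: str) -> None:
        self.name = name

    def run(self, debug: dict[str, Any]) -> dict[str, Any]:
        # Record that this stage ran, then hand the context to the next stage.
        debug.setdefault("visited", []).append(self.name)
        return debug


class ToyPipeline:
    """Hypothetical pipeline that walks enabled stages in config order."""

    def __init__(self, config: dict[str, Any], stages: dict[str, ToyStage]) -> None:
        self.config = config
        self.stages = stages

    def run(self) -> dict[str, Any]:
        debug: dict[str, Any] = {"mode": self.config["pipeline"]["mode"]}
        for name in self.config["stages"]["enabled_stages"]:
            debug = self.stages[name].run(debug)
        return debug


# Same shape as configs/config.yaml, written as a plain dict (assumption).
config = {
    "stages": {"enabled_stages": ["prepare", "train"]},
    "pipeline": {"mode": "dev"},
}
available = {n: ToyStage(n) for n in ("prepare", "train", "export")}
result = ToyPipeline(config, available).run()
print(result)  # {'mode': 'dev', 'visited': ['prepare', 'train']}
```

The `export` stage exists but is not listed in `enabled_stages`, so it never runs; that is the behaviour the "walks enabled_stages" bullet describes.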

What ships in the box

  • Stage discovery (DefaultPipeline) from the filesystem layout.
  • dev / prod switch on the same code paths where possible.
  • Map, vanilla, YQL helpers, S3 listing/download patterns, table helpers, checkpoint upload wiring.
  • Optional custom Docker images, tokenizer tarballs, and multi-operation stages.
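To make "stage discovery from the filesystem layout" concrete, here is a minimal standard-library sketch. The exact rules `DefaultPipeline` applies are not spelled out in this README, so the rule used here (a stage is any subdirectory of `stages/` that contains a `stage.py`) and the helper name `discover_stages` are assumptions for illustration only.

```python
import tempfile
from pathlib import Path


def discover_stages(project_root: Path) -> list[str]:
    """Return names of subdirectories of stages/ that contain a stage.py."""
    stages_dir = project_root / "stages"
    if not stages_dir.is_dir():
        return []
    return sorted(
        child.name
        for child in stages_dir.iterdir()
        if child.is_dir() and (child / "stage.py").is_file()
    )


# Build a throwaway project tree and scan it.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "stages" / "my_stage").mkdir(parents=True)
    (root / "stages" / "my_stage" / "stage.py").write_text("# stage code\n")
    (root / "stages" / "notes").mkdir()  # no stage.py, so it is skipped
    found = discover_stages(root)

print(found)  # ['my_stage']
```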

Installation

For Users

Install from PyPI into any Python 3.11+ environment (system Python, a virtualenv, or a Conda env):

pip install yt-framework

For Developers and Contributors

Recommended: one Conda environment for tests, formatting, pre-commit, and local documentation builds (avoids reinstalling tooling for each task):

git clone https://github.com/GregoryKogan/yt-framework.git
cd yt-framework
conda create -n yt-framework python=3.11
conda activate yt-framework
pip install -e ".[dev,docs]"

Use conda-forge as the channel when creating the env if that matches your setup (conda create -n yt-framework python=3.11 -c conda-forge).

Alternative: pip only — install in editable mode from source:

git clone https://github.com/GregoryKogan/yt-framework.git
cd yt-framework
pip install -e .

For development with testing tools (without the docs extra):

pip install -e ".[dev]"

For local Sphinx builds without the full dev extra, use pip install -e ".[docs]".

See CONTRIBUTING.md for the full development setup and Installation Guide for prerequisites.

Quick start

Three steps: create the layout, add the entrypoint, then add a stage and the pipeline config.

  1. Layout

    mkdir my_pipeline && cd my_pipeline
    mkdir -p stages/my_stage configs
    
  2. pipeline.py

    from yt_framework.core.pipeline import DefaultPipeline
    
    if __name__ == "__main__":
        DefaultPipeline.main()
    
  3. Stage + config

    # stages/my_stage/stage.py
    from yt_framework.core.stage import BaseStage
    
    class MyStage(BaseStage):
        def run(self, debug):
            self.logger.info("Hello from YT Framework!")
            return debug
    
    # configs/config.yaml
    stages:
      enabled_stages:
        - my_stage
    
    pipeline:
      mode: "dev"  # Use "dev" for local development
    
Run the pipeline from the my_pipeline directory:

python pipeline.py
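The same layout can be generated with a short script, which is handy when creating several pipelines. This is a sketch using only the standard library; the `scaffold` helper is a hypothetical name, and the file contents are copied from the quick start steps. It writes into a temporary directory here so that it is safe to run as-is.

```python
import tempfile
from pathlib import Path

PIPELINE_PY = '''from yt_framework.core.pipeline import DefaultPipeline

if __name__ == "__main__":
    DefaultPipeline.main()
'''

STAGE_PY = '''from yt_framework.core.stage import BaseStage


class MyStage(BaseStage):
    def run(self, debug):
        self.logger.info("Hello from YT Framework!")
        return debug
'''

CONFIG_YAML = '''stages:
  enabled_stages:
    - my_stage

pipeline:
  mode: "dev"
'''


def scaffold(root: Path) -> list[str]:
    """Write the quick start files under root and return their relative paths."""
    files = {
        "pipeline.py": PIPELINE_PY,
        "stages/my_stage/stage.py": STAGE_PY,
        "configs/config.yaml": CONFIG_YAML,
    }
    for rel, text in files.items():
        path = root / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(text)
    return sorted(files)


with tempfile.TemporaryDirectory() as tmp:
    created = scaffold(Path(tmp) / "my_pipeline")
    print(created)
```

To create a real project, call `scaffold(Path("my_pipeline"))` instead of using the temporary directory.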

Next: Docs quick start (table write), examples/, Pipelines and stages.

Examples

examples/ holds runnable trees; each folder has a README with scope and commands.

Requirements

Prerequisites

  • Python 3.11+
  • YT proxy + token when you run pipeline.mode: prod

YT Cluster Requirements

When running pipelines in production mode, code from ytjobs executes on YT cluster nodes. The cluster's Docker image (default or custom) must include:

  • Python 3.11+
  • ytsaurus-client >= 0.13.0 (for checkpoint operations)
  • boto3 == 1.35.99 (for S3 operations)
  • botocore == 1.35.99 (auto-installed with boto3)

If the cluster's default image does not meet these requirements, build a custom Docker image. Background: Cluster requirements.
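One way to check an image against the list above is a small script run inside the container. The sketch below uses only the standard library (`importlib.metadata`); the helper names `parse_version`, `installed_version`, and `check_image` are hypothetical, and the version parsing is deliberately simple (leading dotted integers only), so treat it as a starting point rather than a complete validator.

```python
import re
import sys
from importlib import metadata
from typing import Callable, Optional


def parse_version(text: str) -> tuple[int, ...]:
    """Parse leading dotted integers, padded to three parts: '0.13' -> (0, 13, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    nums = tuple(int(p) for p in match.group().split(".")) if match else ()
    return nums + (0,) * (3 - len(nums))


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_image(
    lookup: Callable[[str], Optional[str]] = installed_version,
    python: tuple = sys.version_info[:2],
) -> list[str]:
    """Return a list of unmet requirements; an empty list means the image passes."""
    problems = []
    if tuple(python) < (3, 11):
        problems.append(f"Python 3.11+ required, found {python[0]}.{python[1]}")
    yt = lookup("ytsaurus-client")
    if yt is None or parse_version(yt) < (0, 13, 0):
        problems.append(f"ytsaurus-client >= 0.13.0 required, found {yt}")
    for name in ("boto3", "botocore"):
        found = lookup(name)
        if found != "1.35.99":
            problems.append(f"{name} == 1.35.99 required, found {found}")
    return problems


# Demo with a fake lookup so the output does not depend on this machine.
fake = {"ytsaurus-client": "0.13.20", "boto3": "1.35.99", "botocore": "1.34.0"}
print(check_image(lookup=fake.get, python=(3, 11)))
# ['botocore == 1.35.99 required, found 1.34.0']
```

Inside a real image, call `check_image()` with no arguments to inspect the actual interpreter and installed packages.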

Contributing

See CONTRIBUTING.md.
