YTsaurus pipeline framework with utilities and common modules

YT Framework


Overview

Python helpers and conventions for YTsaurus pipelines: YAML configuration, ordered stages under stages/, a dev mode that mirrors many prod behaviors on disk, and a prod mode that uploads src/ bundles to the cluster.

Architecture

  • Pipeline — loads config, builds the YT client, walks enabled_stages.
  • Stage — one BaseStage subclass plus config.yaml (and optional src/ for jobs).
  • Operations — map, vanilla, map-reduce/reduce, YQL via the client, S3 helpers, sorts, etc.
  • Configuration — OmegaConf-backed YAML; secrets in configs/secrets.env.
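The flow described above (load config, walk enabled_stages, call each stage) can be pictured with a small stand-alone sketch. It does not import yt_framework and is not the framework's implementation: SketchStage, run_pipeline, the registry mapping, and the use of a plain dict for debug are all assumptions made for illustration.

```python
import logging


class SketchStage:
    """Hypothetical stand-in for a BaseStage subclass (not the framework's class)."""

    name = "sketch"

    def __init__(self):
        self.logger = logging.getLogger(self.name)

    def run(self, debug):
        raise NotImplementedError


class Extract(SketchStage):
    name = "extract"

    def run(self, debug):
        # Put some data into the shared context for later stages.
        debug["rows"] = [1, 2, 3]
        return debug


class Summarize(SketchStage):
    name = "summarize"

    def run(self, debug):
        # Read what an earlier stage left behind.
        debug["total"] = sum(debug.get("rows", []))
        return debug


def run_pipeline(config, registry):
    """Run every stage named in stages.enabled_stages, in the listed order."""
    debug = {}  # assumption: the context handed from stage to stage is a dict
    for stage_name in config["stages"]["enabled_stages"]:
        stage = registry[stage_name]()
        debug = stage.run(debug)
    return debug


# Same shape as the YAML in the quick start, written as a plain dict.
config = {
    "stages": {"enabled_stages": ["extract", "summarize"]},
    "pipeline": {"mode": "dev"},
}
registry = {cls.name: cls for cls in (Extract, Summarize)}
result = run_pipeline(config, registry)
```

The order of enabled_stages is the execution order in this sketch, so Summarize can rely on the rows that Extract stored.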

What ships in the box

  • Stage discovery (DefaultPipeline) from the filesystem layout.
  • dev / prod switch on the same code paths where possible.
  • Map, vanilla, YQL helpers, S3 listing/download patterns, table helpers, checkpoint upload wiring.
  • Optional custom Docker images, tokenizer tarballs, and multi-operation stages.
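To illustrate what stage discovery from a filesystem layout can look like, here is a minimal sketch that uses only the standard library. It is an assumption about the general technique, not DefaultPipeline's actual code: discover_stages and the rule "first class in stage.py that defines run()" are hypothetical.

```python
import importlib.util
import inspect
import tempfile
from pathlib import Path


def discover_stages(root):
    """Map each stages/<name>/stage.py to the first class in it that defines run()."""
    found = {}
    for stage_file in sorted(Path(root).glob("stages/*/stage.py")):
        name = stage_file.parent.name
        # Load the file as a module under a unique, made-up module name.
        spec = importlib.util.spec_from_file_location(f"sketch_stage_{name}", stage_file)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        for _, cls in inspect.getmembers(module, inspect.isclass):
            # Skip classes that were only imported into the module.
            defined_here = cls.__module__ == module.__name__
            if defined_here and callable(getattr(cls, "run", None)):
                found[name] = cls
                break
    return found


# Demo: build a throwaway layout with one stage and discover it.
with tempfile.TemporaryDirectory() as tmp:
    stage_dir = Path(tmp) / "stages" / "my_stage"
    stage_dir.mkdir(parents=True)
    (stage_dir / "stage.py").write_text(
        "class MyStage:\n"
        "    def run(self, debug):\n"
        "        return debug\n",
        encoding="utf-8",
    )
    stages = discover_stages(tmp)
    discovered_names = list(stages)
    stage_output = stages["my_stage"]().run({"ok": True})
```

Keying the result by directory name keeps it aligned with the names listed under enabled_stages in the config.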

Installation

For Users

Install from PyPI into any Python 3.11+ environment (system Python, a virtualenv, or a Conda env):

pip install yt-framework

For Developers and Contributors

Recommended: one Conda environment for tests, formatting, pre-commit, and local documentation builds (avoids reinstalling tooling for each task):

git clone https://github.com/GregoryKogan/yt-framework.git
cd yt-framework
conda create -n yt-framework python=3.11
conda activate yt-framework
pip install -e ".[dev,docs]"

Use conda-forge as the channel when creating the env if that matches your setup (conda create -n yt-framework python=3.11 -c conda-forge).

Alternative: pip only — install in editable mode from source:

git clone https://github.com/GregoryKogan/yt-framework.git
cd yt-framework
pip install -e .

For development with testing tools (without the docs extra):

pip install -e ".[dev]"

For local Sphinx builds without the full dev extra, use pip install -e ".[docs]".

See CONTRIBUTING.md for the full development setup and Installation Guide for prerequisites.

Quick start

Three steps: create the layout, add the entrypoint, then add a stage and the pipeline config.

  1. Layout

    mkdir my_pipeline && cd my_pipeline
    mkdir -p stages/my_stage configs
    
  2. pipeline.py

    from yt_framework.core.pipeline import DefaultPipeline
    
    if __name__ == "__main__":
        DefaultPipeline.main()
    
  3. Stage + config

    # stages/my_stage/stage.py
    from yt_framework.core.stage import BaseStage
    
    class MyStage(BaseStage):
        def run(self, debug):
            self.logger.info("Hello from YT Framework!")
            return debug
    
    # configs/config.yaml
    stages:
      enabled_stages:
        - my_stage
    
    pipeline:
      mode: "dev"  # Use "dev" for local development
    
Then run the entrypoint from the my_pipeline directory:

python pipeline.py
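If you prefer to generate the quick-start tree from a script, the sketch below writes the same three files with pathlib. The file contents are copied from the steps above; the scaffold helper itself is hypothetical and not part of yt-framework. The demo writes into a temporary directory so it leaves nothing behind.

```python
import tempfile
from pathlib import Path

PIPELINE_PY = '''from yt_framework.core.pipeline import DefaultPipeline

if __name__ == "__main__":
    DefaultPipeline.main()
'''

STAGE_PY = '''from yt_framework.core.stage import BaseStage


class MyStage(BaseStage):
    def run(self, debug):
        self.logger.info("Hello from YT Framework!")
        return debug
'''

CONFIG_YAML = '''stages:
  enabled_stages:
    - my_stage

pipeline:
  mode: "dev"
'''


def scaffold(root):
    """Create the quick-start files under root and return their relative paths."""
    root = Path(root)
    files = {
        root / "pipeline.py": PIPELINE_PY,
        root / "stages" / "my_stage" / "stage.py": STAGE_PY,
        root / "configs" / "config.yaml": CONFIG_YAML,
    }
    for path, content in files.items():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content, encoding="utf-8")
    return sorted(path.relative_to(root).as_posix() for path in files)


# Demo in a temporary directory; pass a real path to keep the files.
with tempfile.TemporaryDirectory() as tmp:
    created = scaffold(Path(tmp) / "my_pipeline")
```

Calling scaffold("my_pipeline") in a working directory should leave a tree equivalent to the manual steps, after which python pipeline.py can be run from inside it.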

Next: Docs quick start (table write), examples/, Pipelines and stages.

Examples

examples/ holds runnable trees; each folder has a README with scope and commands.

Requirements

Prerequisites

  • Python 3.11+
  • A YT proxy address and token when you run with pipeline.mode: prod

YT Cluster Requirements

When running pipelines in production mode, code from ytjobs executes on YT cluster nodes. The cluster's Docker image (default or custom) must include:

  • Python 3.11+
  • ytsaurus-client >= 0.13.0 (for checkpoint operations)
  • boto3 == 1.35.99 (for S3 operations)
  • botocore == 1.35.99 (auto-installed with boto3)

If the cell default image lacks those pins, build a custom Docker image. Background: Cluster requirements.
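One way to check whether an image meets the pins listed above is to run a small script inside it. The sketch below uses only the standard library (importlib.metadata); the helper names and the simplified version parsing are assumptions for illustration, and the script is not shipped with yt-framework.

```python
import re
import sys
from importlib import metadata

REQUIRED_PYTHON = (3, 11)
MINIMUM = {"ytsaurus-client": (0, 13, 0)}
EXACT = {"boto3": (1, 35, 99), "botocore": (1, 35, 99)}


def parse_version(text):
    """Parse the leading numeric release, e.g. '1.35.99' -> (1, 35, 99)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    parts = [int(piece) for piece in match.group(0).split(".")] if match else []
    parts += [0] * (3 - len(parts))  # pad so '0.13' compares equal to '0.13.0'
    return tuple(parts)


def check_environment(python_version, installed):
    """Return a list of problems; installed maps package name -> version string or None."""
    problems = []
    if tuple(python_version[:2]) < REQUIRED_PYTHON:
        problems.append("Python 3.11+ is required")
    for name, minimum in MINIMUM.items():
        found = installed.get(name)
        if found is None or parse_version(found) < minimum:
            wanted = ".".join(map(str, minimum))
            problems.append(f"{name}: need >= {wanted}, found {found}")
    for name, pinned in EXACT.items():
        found = installed.get(name)
        if found is None or parse_version(found) != pinned:
            wanted = ".".join(map(str, pinned))
            problems.append(f"{name}: need == {wanted}, found {found}")
    return problems


def installed_versions(names):
    """Look up installed distribution versions; missing packages map to None."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Problems in the interpreter that runs this script.
current_problems = check_environment(sys.version_info, installed_versions([*MINIMUM, *EXACT]))

# Fixed example: old Python, old ytsaurus-client, botocore missing.
example_problems = check_environment(
    (3, 10), {"ytsaurus-client": "0.12.5", "boto3": "1.35.99"}
)
```

An empty current_problems list suggests the image satisfies the list above; anything else points at what a custom image would need to add.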

Contributing

See CONTRIBUTING.md.

Download files

Source Distribution

yt_framework-1.3.7.tar.gz (142.2 kB)

Uploaded Source

Built Distribution

yt_framework-1.3.7-py3-none-any.whl (108.2 kB)

Uploaded Python 3

File details

Details for the file yt_framework-1.3.7.tar.gz.

File metadata

  • Download URL: yt_framework-1.3.7.tar.gz
  • Upload date:
  • Size: 142.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for yt_framework-1.3.7.tar.gz
Algorithm    Hash digest
SHA256       1c1cf79bab573d094eff0c93d539a51a353af4650d94bee94e3f35506d18c8e9
MD5          95ad54372d0373af2d13d79ac7853bd9
BLAKE2b-256  a8c20c95ab2837104e60a498bd41fa595cd83b00d97410e955029e6fc04699c0
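A downloaded file can be compared against the SHA256 digest listed above with hashlib from the standard library. This is a minimal sketch: sha256_of and matches are hypothetical helper names, and the demo hashes a temporary file rather than the real archive.

```python
import hashlib
import tempfile
from pathlib import Path

# SHA256 digest listed above for yt_framework-1.3.7.tar.gz.
EXPECTED_SDIST_SHA256 = "1c1cf79bab573d094eff0c93d539a51a353af4650d94bee94e3f35506d18c8e9"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """True when the file's SHA256 equals the expected hex digest (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()


# Demo on a throwaway file; the expected value is computed from the same bytes.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    payload = b"example payload"
    sample.write_bytes(payload)
    good = matches(sample, hashlib.sha256(payload).hexdigest())
    bad = matches(sample, EXPECTED_SDIST_SHA256)
```

For the real archive, matches("yt_framework-1.3.7.tar.gz", EXPECTED_SDIST_SHA256) should return True for an intact download.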

Provenance

The following attestation bundles were made for yt_framework-1.3.7.tar.gz:

Publisher: publish.yml on GregoryKogan/yt-framework

File details

Details for the file yt_framework-1.3.7-py3-none-any.whl.

File metadata

  • Download URL: yt_framework-1.3.7-py3-none-any.whl
  • Upload date:
  • Size: 108.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for yt_framework-1.3.7-py3-none-any.whl
Algorithm    Hash digest
SHA256       bbb457054d0e1943db8db87d564bc9014e5a05ce97b8c1ba4d3fc67958a55019
MD5          55353a813ddb29edf2069926f8f2d357
BLAKE2b-256  4d2aad123e8708748a264949d1798660615da4c48605db6147ecb84d64defa92

Provenance

The following attestation bundles were made for yt_framework-1.3.7-py3-none-any.whl:

Publisher: publish.yml on GregoryKogan/yt-framework
