
YTsaurus pipeline framework with utilities and common modules


YT Framework



Overview

YT Framework is a set of Python helpers and conventions for building YTsaurus pipelines. It provides YAML configuration, ordered stages under stages/, a dev mode that mirrors many prod behaviors on disk, and a prod mode that uploads src/ bundles to the cluster.

Architecture

  • Pipeline — loads config, builds the YT client, walks enabled_stages.
  • Stage — one BaseStage subclass plus config.yaml (and optional src/ for jobs).
  • Operations — map, vanilla, map-reduce/reduce, YQL via the client, S3 helpers, sorts, etc.
  • Configuration — OmegaConf-backed YAML; secrets in configs/secrets.env.
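The flow described in the bullets above can be sketched in plain Python. This is a minimal illustration, not the framework's implementation: SketchStage, run_pipeline, and the registry dict are hypothetical stand-ins, and the only details taken from this README are the enabled_stages key and the run(debug) method that returns debug.

```python
from __future__ import annotations

from typing import Any


class SketchStage:
    """Hypothetical stand-in for a stage; NOT yt_framework's BaseStage."""

    name = "base"

    def run(self, debug: dict[str, Any]) -> dict[str, Any]:
        raise NotImplementedError


class Extract(SketchStage):
    name = "extract"

    def run(self, debug: dict[str, Any]) -> dict[str, Any]:
        # Pretend this stage produced a few rows.
        debug["rows"] = [1, 2, 3]
        return debug


class Total(SketchStage):
    name = "total"

    def run(self, debug: dict[str, Any]) -> dict[str, Any]:
        # Consume what the previous stage left in the shared dict.
        debug["total"] = sum(debug["rows"])
        return debug


def run_pipeline(config: dict[str, Any], registry: dict[str, type[SketchStage]]) -> dict[str, Any]:
    """Run the enabled stages in the order the config lists them."""
    debug: dict[str, Any] = {}
    for stage_name in config["stages"]["enabled_stages"]:
        stage = registry[stage_name]()  # a KeyError here means an unknown stage name
        debug = stage.run(debug)
    return debug


# Plain dict mirroring the shape of configs/config.yaml (assumed for the sketch).
config = {"stages": {"enabled_stages": ["extract", "total"]}, "pipeline": {"mode": "dev"}}
registry = {cls.name: cls for cls in (Extract, Total)}
result = run_pipeline(config, registry)
print(result["total"])
```

The point of the sketch is that stage order comes from configuration rather than from code, so changing enabled_stages changes what runs without touching the stage classes.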

What ships in the box

  • Stage discovery (DefaultPipeline) from the filesystem layout.
  • dev / prod switch on the same code paths where possible.
  • Map, vanilla, YQL helpers, S3 listing/download patterns, table helpers, checkpoint upload wiring.
  • Optional custom Docker images, tokenizer tarballs, and multi-operation stages.
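To make "stage discovery from the filesystem layout" concrete, here is a rough sketch of what scanning such a layout could look like. How DefaultPipeline actually discovers stages is not described in this README, so the rule used here (a stage is any subdirectory of stages/ that contains stage.py) and the function names discover_stage_names and missing_stages are assumptions for illustration only.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def discover_stage_names(project_root: Path) -> list[str]:
    """Return names of stages/ subdirectories that contain a stage.py file.

    Assumed rule for this sketch; the framework may apply different checks.
    """
    stages_dir = project_root / "stages"
    if not stages_dir.is_dir():
        return []
    return sorted(
        child.name
        for child in stages_dir.iterdir()
        if child.is_dir() and (child / "stage.py").is_file()
    )


def missing_stages(enabled: list[str], discovered: list[str]) -> list[str]:
    """List enabled stage names that have no matching directory on disk."""
    return [name for name in enabled if name not in discovered]


# Demo on a throwaway tree so the sketch has no side effects on a real project.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "stages" / "my_stage").mkdir(parents=True)
    (root / "stages" / "my_stage" / "stage.py").write_text("", encoding="utf-8")
    (root / "stages" / "notes").mkdir()  # no stage.py, so it is ignored
    found = discover_stage_names(root)
    not_found = missing_stages(["my_stage", "other_stage"], found)

print(found, not_found)
```

A check like missing_stages can be handy before a run to catch a typo in enabled_stages early.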

Installation

For Users

Install from PyPI into any Python 3.11+ environment (system Python, a virtualenv, or a Conda env):

pip install yt-framework

For Developers and Contributors

Recommended: one Conda environment for tests, formatting, pre-commit, and local documentation builds (avoids reinstalling tooling for each task):

git clone https://github.com/GregoryKogan/yt-framework.git
cd yt-framework
conda create -n yt-framework python=3.11
conda activate yt-framework
pip install -e ".[dev,docs]"

Use conda-forge as the channel when creating the env if that matches your setup (conda create -n yt-framework python=3.11 -c conda-forge).

Alternative: pip only — install in editable mode from source:

git clone https://github.com/GregoryKogan/yt-framework.git
cd yt-framework
pip install -e .

For development with testing tools (without the docs extra):

pip install -e ".[dev]"

For local Sphinx builds without the full dev extra, use pip install -e ".[docs]".

See CONTRIBUTING.md for the full development setup and the Installation Guide for prerequisites.

Quick start

Three steps: create the layout, add an entrypoint, then add a stage and the pipeline config.

  1. Layout

    mkdir my_pipeline && cd my_pipeline
    mkdir -p stages/my_stage configs
    
  2. pipeline.py

    from yt_framework.core.pipeline import DefaultPipeline
    
    if __name__ == "__main__":
        DefaultPipeline.main()
    
  3. Stage + config

    # stages/my_stage/stage.py
    from yt_framework.core.stage import BaseStage
    
    class MyStage(BaseStage):
        def run(self, debug):
            self.logger.info("Hello from YT Framework!")
            return debug
    
    # configs/config.yaml
    stages:
      enabled_stages:
        - my_stage
    
    pipeline:
      mode: "dev"  # Use "dev" for local development
    
Run the pipeline from the my_pipeline directory:

python pipeline.py
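If you prefer to generate the quick-start tree from a script, the steps above can be reproduced with pathlib. This is only a convenience sketch: the scaffold function is hypothetical and not part of yt-framework, and the file contents are copied from the snippets in this section.

```python
from __future__ import annotations

import tempfile
from pathlib import Path

# File contents copied from the quick-start snippets in this README.
PIPELINE_PY = '''from yt_framework.core.pipeline import DefaultPipeline

if __name__ == "__main__":
    DefaultPipeline.main()
'''

STAGE_PY = '''from yt_framework.core.stage import BaseStage


class MyStage(BaseStage):
    def run(self, debug):
        self.logger.info("Hello from YT Framework!")
        return debug
'''

CONFIG_YAML = '''stages:
  enabled_stages:
    - my_stage

pipeline:
  mode: "dev"
'''


def scaffold(root: Path) -> list[Path]:
    """Write the three quick-start files under root and return their paths."""
    files = {
        root / "pipeline.py": PIPELINE_PY,
        root / "stages" / "my_stage" / "stage.py": STAGE_PY,
        root / "configs" / "config.yaml": CONFIG_YAML,
    }
    for path, text in files.items():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(text, encoding="utf-8")
    return sorted(files)


# Demo in a temporary directory; point scaffold() at a real path to keep the files.
with tempfile.TemporaryDirectory() as tmp:
    created = [p.relative_to(tmp).as_posix() for p in scaffold(Path(tmp) / "my_pipeline")]

print(created)
```

To create a real project, call scaffold(Path("my_pipeline")) and then run python pipeline.py from inside that directory.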

Next: Docs quick start (table write), examples/, Pipelines and stages.

Examples

examples/ holds runnable trees; each folder has a README with scope and commands.

Requirements

Prerequisites

  • Python 3.11+
  • A YT proxy address and token when you run with pipeline.mode: prod

YT Cluster Requirements

When running pipelines in production mode, code from ytjobs executes on YT cluster nodes. The cluster's Docker image (default or custom) must include:

  • Python 3.11+
  • ytsaurus-client >= 0.13.0 (for checkpoint operations)
  • boto3 == 1.35.99 (for S3 operations)
  • botocore == 1.35.99 (auto-installed with boto3)

If the cluster's default image lacks those pins, build a custom Docker image. Background: Cluster requirements.
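One way to verify an image against the pins listed above is a small self-check run inside the container. The sketch below uses only the standard library (importlib.metadata); the names PINS, parse, check_pins, and installed_versions are made up for this example, and the version parser is deliberately simple: it handles plain dotted numeric versions only and assumes both sides have the same number of components.

```python
from __future__ import annotations

import sys
from importlib import metadata

# Pins copied from the "YT Cluster Requirements" list in this README.
PINS = {
    "ytsaurus-client": (">=", "0.13.0"),
    "boto3": ("==", "1.35.99"),
    "botocore": ("==", "1.35.99"),
}
MIN_PYTHON = (3, 11)


def parse(version: str) -> tuple[int, ...]:
    """Parse the leading numeric dotted part, e.g. '1.35.99' -> (1, 35, 99)."""
    parts: list[int] = []
    for piece in version.split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as 'rc1'; good enough for a sketch
        parts.append(int(piece))
    return tuple(parts)


def check_pins(installed: dict[str, str | None], python_version: tuple[int, int]) -> list[str]:
    """Return human-readable problems; an empty list means every pin is met."""
    problems: list[str] = []
    if python_version < MIN_PYTHON:
        problems.append(f"Python {python_version[0]}.{python_version[1]} is older than 3.11")
    for name, (op, wanted) in PINS.items():
        found = installed.get(name)
        if found is None:
            problems.append(f"{name}: not installed")
        elif op == "==" and parse(found) != parse(wanted):
            problems.append(f"{name}: found {found}, need == {wanted}")
        elif op == ">=" and parse(found) < parse(wanted):
            problems.append(f"{name}: found {found}, need >= {wanted}")
    return problems


def installed_versions() -> dict[str, str | None]:
    """Look up the installed version of each pinned distribution, or None."""
    versions: dict[str, str | None] = {}
    for name in PINS:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for line in check_pins(installed_versions(), sys.version_info[:2]):
        print(line)
```

Keeping check_pins separate from the lookup makes it easy to test with hand-written version maps before running it inside an image.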

Contributing

See CONTRIBUTING.md.
