
YTsaurus pipeline framework with utilities and common modules


YT Framework


PyPI | Docs | DeepWiki | Examples


Overview

Python helpers and conventions for YTsaurus pipelines: YAML config, ordered stages under stages/, dev mode that mirrors many prod behaviors on disk, and prod mode that uploads src/ bundles to the cluster.

Architecture

  • Pipeline — loads config, builds the YT client, walks enabled_stages.
  • Stage — one BaseStage subclass plus config.yaml (and optional src/ for jobs).
  • Operations — map, vanilla, map-reduce/reduce, YQL via the client, S3 helpers, sorts, etc.
  • Configuration — OmegaConf-backed YAML; secrets in configs/secrets.env.
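
The relationship between Pipeline and Stage can be pictured with a small standalone sketch. It does not import yt_framework: ToyStage, ToyPipeline, and the registry dict are hypothetical names, and treating debug as a dict handed from one stage to the next is an assumption based on the run(self, debug) signature used in the quick start.

```python
# Hypothetical stand-ins for illustration; the real classes live in yt_framework.core.
class ToyStage:
    name = "base"

    def run(self, debug: dict) -> dict:
        raise NotImplementedError


class Extract(ToyStage):
    name = "extract"

    def run(self, debug: dict) -> dict:
        debug["rows"] = [1, 2, 3]
        return debug


class Summarize(ToyStage):
    name = "summarize"

    def run(self, debug: dict) -> dict:
        debug["total"] = sum(debug.get("rows", []))
        return debug


class ToyPipeline:
    def __init__(self, config: dict, registry: dict):
        self.config = config
        self.registry = registry

    def run(self) -> dict:
        debug = {}
        # Walk the stages in the order listed under enabled_stages.
        for stage_name in self.config["stages"]["enabled_stages"]:
            stage = self.registry[stage_name]()
            debug = stage.run(debug)
        return debug


config = {"stages": {"enabled_stages": ["extract", "summarize"]}}
registry = {cls.name: cls for cls in (Extract, Summarize)}
result = ToyPipeline(config, registry).run()
print(result)
```

Only the stages named in enabled_stages run, in the listed order, which is the behavior the Pipeline bullet describes.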

What ships in the box

  • Stage discovery (DefaultPipeline) from the filesystem layout.
  • dev / prod switch on the same code paths where possible.
  • Map, vanilla, YQL helpers, S3 listing/download patterns, table helpers, checkpoint upload wiring.
  • Optional custom Docker images, tokenizer tarballs, and multi-operation stages.
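
As a rough illustration of discovering stages from the filesystem layout, the sketch below scans a stages/ directory for sub-folders that contain a stage.py file. The rules DefaultPipeline actually applies are not spelled out here, so discover_stages and the "must contain stage.py" criterion are assumptions modeled on the quick-start layout.

```python
import tempfile
from pathlib import Path


def discover_stages(root: Path) -> list[str]:
    """Return names of sub-directories of root/stages that contain stage.py."""
    stages_dir = root / "stages"
    if not stages_dir.is_dir():
        return []
    return sorted(
        entry.name
        for entry in stages_dir.iterdir()
        if entry.is_dir() and (entry / "stage.py").is_file()
    )


# Build a throwaway project tree to demonstrate the scan.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("my_stage", "other_stage"):
        (root / "stages" / name).mkdir(parents=True)
        (root / "stages" / name / "stage.py").write_text("# stage code\n")
    # A directory without stage.py is skipped by this sketch.
    (root / "stages" / "not_a_stage").mkdir()
    found = discover_stages(root)
    print(found)
```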

Installation

For Users

Install from PyPI into any Python 3.11+ environment (system Python, a virtualenv, or a Conda env):

pip install yt-framework

For Developers and Contributors

Recommended: one Conda environment for tests, formatting, pre-commit, and local documentation builds (avoids reinstalling tooling for each task):

git clone https://github.com/GregoryKogan/yt-framework.git
cd yt-framework
conda create -n yt-framework python=3.11
conda activate yt-framework
pip install -e ".[dev,docs]"

Use conda-forge as the channel when creating the env if that matches your setup (conda create -n yt-framework python=3.11 -c conda-forge).

Alternative: pip only — install in editable mode from source:

git clone https://github.com/GregoryKogan/yt-framework.git
cd yt-framework
pip install -e .

For development with testing tools (without the docs extra):

pip install -e ".[dev]"

For local Sphinx builds without the full dev extra, use pip install -e ".[docs]".

See CONTRIBUTING.md for the full development setup and Installation Guide for prerequisites.

Quick start

Three steps: create the layout, add the entrypoint, then add a stage and the pipeline config.

  1. Layout

    mkdir my_pipeline && cd my_pipeline
    mkdir -p stages/my_stage configs
    
  2. pipeline.py

    from yt_framework.core.pipeline import DefaultPipeline
    
    if __name__ == "__main__":
        DefaultPipeline.main()
    
  3. Stage + config

    # stages/my_stage/stage.py
    from yt_framework.core.stage import BaseStage
    
    class MyStage(BaseStage):
        def run(self, debug):
            self.logger.info("Hello from YT Framework!")
            return debug
    
    # configs/config.yaml
    stages:
      enabled_stages:
        - my_stage
    
    pipeline:
      mode: "dev"  # Use "dev" for local development
    
Run the pipeline:

python pipeline.py

Next: Docs quick start (table write), examples/, Pipelines and stages.

Examples

examples/ holds runnable trees; each folder has a README with scope and commands.

Requirements

Prerequisites

  • Python 3.11+
  • A YT proxy address and token when you run with pipeline.mode: prod

YT Cluster Requirements

When running pipelines in production mode, code from ytjobs executes on YT cluster nodes. The cluster's Docker image (default or custom) must include:

  • Python 3.11+
  • ytsaurus-client >= 0.13.0 (for checkpoint operations)
  • boto3 == 1.35.99 (for S3 operations)
  • botocore == 1.35.99 (auto-installed with boto3)

If the cell default image lacks those pins, build a custom Docker image. Background: Cluster requirements.
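
One way to check whether an image satisfies the pins listed above is to run a small script inside it. The sketch below uses only the standard library; check, parse, and from_environment are hypothetical helper names, not part of yt-framework, and the version comparison is deliberately rough (it reads only the leading numeric components and ignores suffixes such as .post1).

```python
import re
import sys
from importlib import metadata

# Pins copied from the requirements list: (comparison, version).
REQUIRED = {
    "ytsaurus-client": (">=", "0.13.0"),
    "boto3": ("==", "1.35.99"),
    "botocore": ("==", "1.35.99"),
}


def parse(version: str) -> tuple:
    """Turn '1.35.99' into (1, 35, 99), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    # Pad short versions such as '0.13' so they compare equal to '0.13.0'.
    return tuple(parts + [0] * (3 - len(parts)))


def check(installed_version, python_version=sys.version_info[:2]) -> list:
    """Return a list of problems; an empty list means every pin holds.

    installed_version maps a distribution name to its version string,
    or to None when the distribution is missing.
    """
    problems = []
    if tuple(python_version) < (3, 11):
        problems.append(f"Python {python_version[0]}.{python_version[1]} < 3.11")
    for name, (op, wanted) in REQUIRED.items():
        have = installed_version(name)
        if have is None:
            problems.append(f"{name} is not installed")
        elif op == "==" and parse(have) != parse(wanted):
            problems.append(f"{name} {have} != {wanted}")
        elif op == ">=" and parse(have) < parse(wanted):
            problems.append(f"{name} {have} < {wanted}")
    return problems


def from_environment(name: str):
    """Look up an installed distribution's version, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for line in check(from_environment) or ["all pins satisfied"]:
        print(line)
```

Passing the lookup as a callable keeps check testable without installing anything; inside an image, running the script directly reports each pin that is not met.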

Documentation

Getting help

Contributing

See CONTRIBUTING.md.
