Type-friendly utilities for moving data between Python objects, Arrow, Polars, Pandas, Spark, and Databricks

Yggdrasil (Python)

Schema-aware utilities for moving data between Python objects, Arrow, Polars, pandas, Spark, and Databricks. Define types once — cast everywhere.

Install

pip install ygg                           # core (Arrow, requests, pyutils)
pip install "ygg[polars]"                # + Polars
pip install "ygg[pandas]"                # + pandas
pip install "ygg[spark]"                 # + PySpark
pip install "ygg[databricks]"            # + Databricks SDK

From source (dev):

cd python/
uv venv .venv && source .venv/bin/activate
uv pip install -e ".[dev]"
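To see which optional backends are usable in the current environment, a small standard-library check like the one below can help. This is only a sketch: the mapping from extras to module names (`pyspark` for `spark`, `databricks.sdk` for `databricks`) is an assumption based on the install comments above, and `available_extras` is a hypothetical helper, not part of the package.

```python
from importlib.util import find_spec

# Extra name -> module it is assumed to pull in (not taken from package metadata).
EXTRA_MODULES = {
    "polars": "polars",
    "pandas": "pandas",
    "spark": "pyspark",
    "databricks": "databricks.sdk",
}


def is_importable(module_name: str) -> bool:
    """Return True if Python can locate the module."""
    try:
        return find_spec(module_name) is not None
    except (ImportError, ValueError):
        # For dotted names find_spec imports the parent package first and
        # raises ModuleNotFoundError when that parent is missing.
        return False


def available_extras() -> dict[str, bool]:
    """Report, per extra, whether its assumed module can be found."""
    return {extra: is_importable(mod) for extra, mod in EXTRA_MODULES.items()}


if __name__ == "__main__":
    for extra, ok in available_extras().items():
        print(f"{extra:<12}{'installed' if ok else 'missing'}")
```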

Quickstart

Infer Arrow schema from a dataclass

from dataclasses import dataclass

import pyarrow as pa
from yggdrasil.dataclasses import dataclass_to_arrow_field

@dataclass
class Order:
    id: int
    amount: float
    country: str | None = None

field = dataclass_to_arrow_field(Order)
print(field.type)           # struct<id: int64, amount: double, country: string>
# A pyarrow StructType has no to_schema(); build the schema from its child fields.
schema = pa.schema(list(field.type))

Cast any table to an Arrow schema

import pyarrow as pa
from yggdrasil.arrow.cast import cast_arrow_tabular
from yggdrasil.data.cast import CastOptions

target = pa.schema([
    pa.field("id", pa.int64()),
    pa.field("amount", pa.float64()),
])
raw = pa.table({"id": ["1", "2"], "amount": ["10.5", "20.0"]})
out = cast_arrow_tabular(raw, CastOptions(target_field=target))
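For intuition about what a schema-driven cast should produce for this input, here is a plain-Python sketch with no Arrow involved. The `cast_columns` helper and the `TARGET` mapping are hypothetical. How the library treats nulls, extra columns, or values that fail to parse is not stated in the quickstart, so the choices below (nulls pass through, extra columns are dropped, bad values raise) are assumptions.

```python
# Target "schema" as column name -> Python type (stand-in for an Arrow schema).
TARGET = {"id": int, "amount": float}

raw = {"id": ["1", "2"], "amount": ["10.5", "20.0"]}


def cast_columns(columns: dict[str, list], target: dict[str, type]) -> dict[str, list]:
    """Cast each target column to its declared type, keeping None as None."""
    out = {}
    for name, py_type in target.items():
        if name not in columns:
            raise KeyError(f"missing column: {name}")
        out[name] = [None if v is None else py_type(v) for v in columns[name]]
    return out


print(cast_columns(raw, TARGET))
# {'id': [1, 2], 'amount': [10.5, 20.0]}
```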

Retry + parallel

from yggdrasil.pyutils import retry, parallelize

@retry(tries=5, delay=0.5, backoff=2.0)
def fetch(url: str) -> bytes: ...

@parallelize(max_workers=8)
def process(item: str) -> dict:
    return {"result": item.upper()}

results = list(process(["a", "b", "c"]))
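The parameter names suggest retry with exponential backoff and a thread pool that maps a function over items. The standard-library sketch below illustrates those two ideas under that assumption. `retry_sketch` and `map_in_threads` are hypothetical names, and the real `retry` and `parallelize` decorators may behave differently (for example in which exceptions they catch or how results are returned).

```python
import time
from concurrent.futures import ThreadPoolExecutor
from functools import wraps


def retry_sketch(tries: int = 3, delay: float = 0.0, backoff: float = 1.0):
    """Retry a function up to `tries` times, multiplying the wait by `backoff`."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            wait = delay
            for attempt in range(1, tries + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == tries:
                        raise  # out of attempts: surface the last error
                    time.sleep(wait)
                    wait *= backoff
        return wrapper
    return decorator


def map_in_threads(func, items, max_workers: int = 8) -> list:
    """Apply func to every item in a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(func, items))


calls = {"n": 0}


@retry_sketch(tries=5, delay=0.01, backoff=2.0)
def flaky() -> str:
    """Fail twice, then succeed, to exercise the retry loop."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


print(flaky(), calls["n"])                          # ok 3
print(map_in_threads(str.upper, ["a", "b", "c"]))   # ['A', 'B', 'C']
```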

Databricks — SQL with typed results

from yggdrasil.databricks.workspaces import Workspace
from yggdrasil.databricks.sql import SQLEngine

ws = Workspace(host="https://<workspace>", token="<pat>").connect()
engine = SQLEngine(catalog_name="main", schema_name="analytics", workspace=ws)

result = engine.execute("SELECT id, amount FROM transactions LIMIT 100")
df = result.to_pandas()
arrow_table = result.to_arrow_table()
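Rather than hard-coding a personal access token, the credentials can be read from the environment and passed to `Workspace` as keyword arguments. The sketch below assumes the conventional `DATABRICKS_HOST` and `DATABRICKS_TOKEN` variable names. `databricks_credentials` is a hypothetical helper, and whether the library also reads these variables on its own is not stated here.

```python
import os


def databricks_credentials(env=None) -> dict[str, str]:
    """Build host/token keyword arguments from environment variables."""
    env = os.environ if env is None else env
    required = ("DATABRICKS_HOST", "DATABRICKS_TOKEN")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return {
        "host": env["DATABRICKS_HOST"].rstrip("/"),  # drop any trailing slash
        "token": env["DATABRICKS_TOKEN"],
    }


# Usage, mirroring the quickstart keyword arguments:
#   ws = Workspace(**databricks_credentials()).connect()
```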

Module map

Module                  Key exports
yggdrasil.arrow         arrow_field_from_hint
yggdrasil.arrow.cast    cast_arrow_tabular, cast_arrow_array
yggdrasil.data.cast     CastOptions, convert, register_converter
yggdrasil.dataclasses   dataclass_to_arrow_field
yggdrasil.pandas.cast   cast_pandas_dataframe
yggdrasil.polars.cast   cast_polars_dataframe, cast_polars_lazyframe
yggdrasil.spark.cast    cast_spark_dataframe
yggdrasil.pyutils       retry, parallelize
yggdrasil.concurrent    JobPoolExecutor, Job
yggdrasil.requests      YGGSession
yggdrasil.io            BytesIO, Codec, MediaType
yggdrasil.deltalake     DeltaTable
yggdrasil.databricks    Workspace, SQLEngine, Cluster, NotebookConfig

Docs

Module reference →

Test

cd python/
pytest
ruff check .
mypy

This version

0.4.9
