Create safer Spark pipelines

Project description

Safer Stubs for PySpark

This project provides safer type stubs for PySpark. The goal is to let a type checker catch mistakes, such as references to columns that do not exist, before a pipeline runs, so you do not have to trace them through the underlying implementation. It covers only a subset of the PySpark API: the aim is a minimal set of functions that can be typed and used safely.

How to use

To use the stubs, you need to install the package:

pip install pyspark-safestubs

Then, you can use the stubs in your code:

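# Keep the subscripted annotation below from being evaluated at runtime.
from __future__ import annotations

from pyspark.sql import DataFrame
from pyspark.sql import functions as F

def some_transform(df: DataFrame['col1', 'col2']):
    # Adding a column extends the column set tracked in the type.
    df = df.withColumn('col5', F.col('col1').cast('int'))
    # df has type DataFrame['col1', 'col2', 'col5']
    return df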
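
The idea of carrying column names in the type can be illustrated without Spark. The sketch below is not part of pyspark-safestubs and does not show how the stubs are implemented; `Schema`, `with_column`, and `require` are hypothetical names used only to model the bookkeeping that the comment in the example above describes, under the assumption that adding a column appends its name to the tracked set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schema:
    """Hypothetical stand-in for the column names carried in a DataFrame type."""

    columns: tuple

    def with_column(self, name: str) -> "Schema":
        # Assumed to mirror withColumn: a new name is appended,
        # an existing name leaves the set of columns unchanged.
        if name in self.columns:
            return self
        return Schema(self.columns + (name,))

    def require(self, *names: str) -> None:
        # The kind of mistake a type checker could report before the job runs.
        missing = [n for n in names if n not in self.columns]
        if missing:
            raise KeyError(f"unknown columns: {missing}")


before = Schema(("col1", "col2"))
after = before.with_column("col5")
print(after.columns)  # ('col1', 'col2', 'col5')
```

Calling `after.require("col5")` passes, while `before.require("col5")` raises `KeyError`. With typed stubs, the equivalent mistake would ideally surface in the type checker rather than at runtime.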

Download files

Source Distribution

pyspark_safestubs-0.1.1.tar.gz (119.0 kB view details)

Uploaded Source

Built Distribution

pyspark_safestubs-0.1.1-py3-none-any.whl (207.9 kB view details)

Uploaded Python 3

File details

Details for the file pyspark_safestubs-0.1.1.tar.gz.

File metadata

  • Download URL: pyspark_safestubs-0.1.1.tar.gz
  • Upload date:
  • Size: 119.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.11

File hashes

Hashes for pyspark_safestubs-0.1.1.tar.gz
Algorithm Hash digest
SHA256 19a0c79f65918ad1b27e3aac1b8786f4942c4a48307da934c1611d04367a0d79
MD5 8139247a3277b3ec4d879402975dbc19
BLAKE2b-256 f8ed5b04d0ce811146199780863c58dfc37d676ec3a8273d176a8ee5cd19e76f

File details

Details for the file pyspark_safestubs-0.1.1-py3-none-any.whl.

File hashes

Hashes for pyspark_safestubs-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 5e09d1a400dcdb0a8b03350d69658ac40402d67d5c3a80d49c4961b407d57947
MD5 1071b487608ae2318763a12e3ca5d336
BLAKE2b-256 eee6ed28d1a472f429047d9f6e38516dc874f51f777159fad6f78e32a9dde23e
