Create safer Spark pipelines

Project description

Safer Stubs for PySpark

This project provides safer stubs for PySpark. The goal is to make it easier to use PySpark safely, without having to worry about the underlying implementation. It implements only a subset of the working PySpark APIs, aiming for a minimal set of functions that can be used safely.

How to use

To use the stubs, install the package:

pip install pyspark-safestubs

Then, you can use the stubs in your code:

# DataFrame must be imported for the annotation below.
from pyspark.sql import DataFrame
from pyspark.sql import functions as F

def some_transform(df: DataFrame['col1', 'col2']):
    # Add an integer column derived from col1.
    df = df.withColumn('col5', F.col('col1').cast('int'))
    # df has type DataFrame['col1', 'col2', 'col5']
    return df
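
The comment in the example above says the column set grows from ('col1', 'col2') to ('col1', 'col2', 'col5') after withColumn. As a rough mental model only, the sketch below tracks column names in plain Python. The helpers with_column and require_columns are hypothetical names invented for this illustration; they are not part of pyspark-safestubs and say nothing about how the stubs are implemented.

```python
from __future__ import annotations


def with_column(columns: tuple[str, ...], name: str) -> tuple[str, ...]:
    """Return the column names after a withColumn-style call.

    Hypothetical helper for illustration; not part of pyspark-safestubs.
    Re-using an existing name replaces that column, so it is not duplicated.
    """
    if name in columns:
        return columns
    return columns + (name,)


def require_columns(columns: tuple[str, ...], needed: tuple[str, ...]) -> None:
    """Raise KeyError if any needed column is absent from the tracked set."""
    missing = [c for c in needed if c not in columns]
    if missing:
        raise KeyError(f"missing columns: {missing}")


before = ("col1", "col2")
after = with_column(before, "col5")
require_columns(after, ("col1", "col5"))  # passes: both names are tracked
```

A static checker working from the stubs would report this kind of mismatch before the job runs, whereas this sketch only raises at runtime.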

Download files

Source Distribution

pyspark_safestubs-0.1.5.tar.gz (119.7 kB)

Uploaded Source

Built Distribution

pyspark_safestubs-0.1.5-py3-none-any.whl (207.9 kB)

Uploaded Python 3

File details

Details for the file pyspark_safestubs-0.1.5.tar.gz.

File metadata

  • Download URL: pyspark_safestubs-0.1.5.tar.gz
  • Upload date:
  • Size: 119.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.11

File hashes

Hashes for pyspark_safestubs-0.1.5.tar.gz
Algorithm Hash digest
SHA256 7da983bbf677a1b76d87b1e055eaac39dfba0a651d788cd7ec56871bce19fe15
MD5 2cfe3f3001ae2dc2d25d8515cd9ece0b
BLAKE2b-256 8b9a5c5a19d8c34269ea87422dc1de8191bebdcd076db20d722b75a6ed9b28b8
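
The SHA256 digest listed above can be compared against a locally downloaded copy of the file. The following is a minimal sketch using only the standard library; the function names sha256_hex and matches_published_hash are chosen for this example, and the local file path is whatever location you downloaded the archive to.

```python
import hashlib
from pathlib import Path


def sha256_hex(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published value, ignoring case."""
    return sha256_hex(path) == expected_hex.strip().lower()
```

For example, assuming the source distribution was saved in the current directory, matches_published_hash(Path("pyspark_safestubs-0.1.5.tar.gz"), "7da983bbf677a1b76d87b1e055eaac39dfba0a651d788cd7ec56871bce19fe15") should return True for an unmodified download.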

File details

Details for the file pyspark_safestubs-0.1.5-py3-none-any.whl.

File hashes

Hashes for pyspark_safestubs-0.1.5-py3-none-any.whl
Algorithm Hash digest
SHA256 a544b45674e76332fffb7f66238bfcf7a2a3b5133350056ac82985f53346d59e
MD5 f2f9d509dad25381c98c38fd38c0ffc7
BLAKE2b-256 f48559930a4a2cc7b641242f53c27710aaab7d4d40418068b475a2a24d4bebd0
