Create safer Spark pipelines

Project description

Safer Stubs for PySpark

This project provides safer type stubs for PySpark. The goal is to make PySpark easier to use safely, without having to worry about the underlying implementation. It covers only a subset of the working PySpark APIs, aiming for a minimal set of functions that can be used safely.

How to use

To use the stubs, you need to install the package:

pip install pyspark-safestubs

Then, you can use the stubs in your code:

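# Defer annotation evaluation so the subscripted DataFrame annotation is read
# only by the type checker (assumption: it is not subscriptable at runtime).
from __future__ import annotations

from pyspark.sql import DataFrame
from pyspark.sql import functions as F

def some_transform(df: DataFrame['col1', 'col2']):
    # Add col5 as an integer cast of col1.
    df = df.withColumn('col5', F.col('col1').cast('int'))
    # df has type DataFrame['col1', 'col2', 'col5']
    return df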
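
The comment in the example above says that the column list carried by the type grows from 'col1', 'col2' to 'col1', 'col2', 'col5' after withColumn. As a rough runtime analogy of that bookkeeping, here is a small pure-Python sketch. It is not how pyspark-safestubs is implemented and it does not use PySpark at all; the TrackedColumns class and its with_column method are hypothetical names invented for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class TrackedColumns:
    """Hypothetical stand-in for the column names a typed DataFrame carries."""

    names: tuple[str, ...]

    def with_column(self, new_name: str, source: str) -> TrackedColumns:
        # Reading from a column that is not tracked is the kind of mistake
        # that column-aware types are meant to surface early.
        if source not in self.names:
            raise KeyError(f"unknown column {source!r}; known: {self.names}")
        # Overwriting an existing column leaves the set of names unchanged.
        if new_name in self.names:
            return self
        return TrackedColumns(self.names + (new_name,))


before = TrackedColumns(("col1", "col2"))
after = before.with_column("col5", source="col1")
print(after.names)  # ('col1', 'col2', 'col5')
```

A static type checker working from stubs would do the equivalent reasoning before the code runs, which is where the safety benefit would come from; the sketch only mimics the idea at runtime.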

Download files

Source Distribution

pyspark_safestubs-0.1.2.tar.gz (119.2 kB)

Uploaded Source

Built Distribution

pyspark_safestubs-0.1.2-py3-none-any.whl (207.9 kB)

Uploaded Python 3

File details

Details for the file pyspark_safestubs-0.1.2.tar.gz.

File metadata

  • Download URL: pyspark_safestubs-0.1.2.tar.gz
  • Upload date:
  • Size: 119.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.11

File hashes

Hashes for pyspark_safestubs-0.1.2.tar.gz

Algorithm    Hash digest
SHA256       8b3558f979d82bc153b9c5d65319d8009a51b5a6a643f1197b4fa2ad331636b4
MD5          238128c7a38f1cf4de06e60c2a386c0a
BLAKE2b-256  d9e9189ecc941c64ca90b2fc079ab7902eb39ae262f3bbe70c6375b5b17db515
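
To check a downloaded file against the digests listed for it, something like the following sketch could be used. It relies only on Python's standard hashlib module; the helper names file_digests and matches_published are made up for this example, and it assumes that BLAKE2b-256 means BLAKE2b with a 32-byte digest.

```python
import hashlib
from pathlib import Path


def file_digests(path):
    """Return hex digests of a file for the three listed algorithms."""
    data = Path(path).read_bytes()
    return {
        "SHA256": hashlib.sha256(data).hexdigest(),
        "MD5": hashlib.md5(data).hexdigest(),
        # Assumption: BLAKE2b-256 is BLAKE2b with a 32-byte (256-bit) digest.
        "BLAKE2b-256": hashlib.blake2b(data, digest_size=32).hexdigest(),
    }


def matches_published(path, algorithm, expected_hex):
    """Compare one computed digest with a published value, ignoring case."""
    return file_digests(path)[algorithm] == expected_hex.strip().lower()
```

For example, with the source distribution saved in the current directory, matches_published("pyspark_safestubs-0.1.2.tar.gz", "SHA256", "8b3558f979d82bc153b9c5d65319d8009a51b5a6a643f1197b4fa2ad331636b4") should return True if the download is intact. The same helpers apply to the wheel file and its digests.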

File details

Details for the file pyspark_safestubs-0.1.2-py3-none-any.whl.

File hashes

Hashes for pyspark_safestubs-0.1.2-py3-none-any.whl

Algorithm    Hash digest
SHA256       96e2c08cc2b2a0b8b67ecff87925ef8bdb4b10c9a151c1a01b404488aee5bdd5
MD5          5d78e856e7502142207b4bba9db53e1e
BLAKE2b-256  6b4e1ccb39c50a857213abfea545dc5bb2898eb69bcace90bdf5b7f521ab5079
