
Create safer Spark pipelines


Safer Stubs for PySpark

This project provides safer stubs for PySpark. The goal is to make PySpark easier to use safely, without having to worry about the underlying implementation. It implements only a subset of the PySpark APIs, aiming for a minimal set of functions that can be used safely.

How to use

To use the stubs, you need to install the package:

pip install pyspark-safestubs

Then, you can use the stubs in your code:

from pyspark.sql import DataFrame, SparkSession
from pyspark.sql import functions as F

def some_transform(df: DataFrame['col1', 'col2']):
    # Adding a column extends the set of columns tracked in the type
    df = df.withColumn('col5', F.col('col1').cast('int'))
    # df has type DataFrame['col1', 'col2', 'col5']
    return df
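
The type comment in the example above describes column bookkeeping: a new column is added to the set the DataFrame type carries, and a reference to an unknown column would be an error. The plain-Python sketch below mimics that bookkeeping at runtime so the idea can be tried without Spark. `ColumnSet`, `require`, and `with_column` are hypothetical names invented for this illustration; they are not part of pyspark-safestubs or PySpark, and the package itself presumably performs the equivalent check through its stubs rather than at runtime.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ColumnSet:
    """Hypothetical stand-in for the column names a typed DataFrame carries."""

    names: Tuple[str, ...]

    def require(self, name: str) -> None:
        # A reference to an untracked column is the mistake we want to catch.
        if name not in self.names:
            raise KeyError(f"unknown column: {name!r}")

    def with_column(self, new_name: str, source: str) -> "ColumnSet":
        # Loosely mirrors df.withColumn(new_name, F.col(source)...):
        # the source column must already exist.
        self.require(source)
        if new_name in self.names:
            # Overwriting an existing column leaves the set unchanged.
            return self
        return ColumnSet(self.names + (new_name,))


cols = ColumnSet(("col1", "col2"))
after = cols.with_column("col5", "col1")
print(after.names)  # ('col1', 'col2', 'col5')
```

Calling `cols.with_column("col6", "missing")` raises `KeyError` in this sketch, which is the runtime analogue of the error a type checker would report before the pipeline runs.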
