Create safer Spark pipelines

Project description

Safer Stubs for PySpark

This project provides safer stubs for PySpark. The goal is to make PySpark easier to use safely, without having to worry about the underlying implementation. It implements only a subset of the PySpark APIs, aiming for a minimal set of functions that can be used safely.

How to use

To use the stubs, you need to install the package:

pip install pyspark-safestubs

Then, you can use the stubs in your code:

from pyspark.sql import DataFrame, SparkSession
from pyspark.sql import functions as F

def some_transform(df: DataFrame['col1', 'col2']):
    # Add 'col5' by casting 'col1' to an integer column.
    df = df.withColumn('col5', F.col('col1').cast('int'))
    # df has type DataFrame['col1', 'col2', 'col5']
    return df
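
To make the idea of column tracking concrete, here is a minimal runtime sketch in plain Python. It is not how pyspark-safestubs is implemented, and it does not use PySpark at all: `TrackedFrame`, `with_column`, and `select` are hypothetical names invented for this illustration. It only mimics, at runtime, the bookkeeping that a type checker would presumably do statically when a transform adds a column.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackedFrame:
    """Hypothetical stand-in that records column names only, not data."""
    columns: tuple

    def with_column(self, name):
        # Like DataFrame.withColumn: an existing name is replaced in place,
        # a new name is appended to the end of the column list.
        if name in self.columns:
            return self
        return TrackedFrame(self.columns + (name,))

    def select(self, *names):
        # Referencing a column that is not tracked is reported immediately.
        missing = [n for n in names if n not in self.columns]
        if missing:
            raise KeyError(f"unknown columns: {missing}")
        return TrackedFrame(tuple(names))


def some_transform(frame):
    # Same shape as the example above: the result carries one more column.
    return frame.with_column('col5')


before = TrackedFrame(('col1', 'col2'))
after = some_transform(before)
print(after.columns)  # ('col1', 'col2', 'col5')
```

In this sketch, `before.select('col5')` raises `KeyError`, while `after.select('col5')` succeeds. The stubs aim to let a type checker distinguish such cases before the pipeline runs, rather than at runtime as here.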

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pyspark_safestubs-0.1.3.tar.gz (119.5 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pyspark_safestubs-0.1.3-py3-none-any.whl (207.9 kB)

Uploaded Python 3

File details

Details for the file pyspark_safestubs-0.1.3.tar.gz.

File metadata

  • Download URL: pyspark_safestubs-0.1.3.tar.gz
  • Upload date:
  • Size: 119.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.11

File hashes

Hashes for pyspark_safestubs-0.1.3.tar.gz
Algorithm    Hash digest
SHA256       808d10a2d9a6f8dd7ea01e18bababea055cee74e11c6c9aaa35344ccec585666
MD5          fb1929bd440f9f2aae19a9a8ee6a2f6d
BLAKE2b-256  3cd96a427861d1b1fe2c35602bbffa732fe26d3e2b7cb39d9b00f10b952e533d

See more details on using hashes here.
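
As a rough sketch of how the SHA256 digest listed above for the source distribution could be checked after a manual download, the snippet below uses only the standard library's `hashlib`. The helper names `sha256_of` and `matches_published_digest` are made up for this example, and the file is assumed to sit in the current directory under its published name.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large archives are not loaded at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's digest with a published one, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()


# SHA256 digest copied from the listing above for pyspark_safestubs-0.1.3.tar.gz.
EXPECTED_SDIST_SHA256 = (
    "808d10a2d9a6f8dd7ea01e18bababea055cee74e11c6c9aaa35344ccec585666"
)
```

Calling `matches_published_digest("pyspark_safestubs-0.1.3.tar.gz", EXPECTED_SDIST_SHA256)` should return `True` for an unmodified download. pip can also enforce hashes itself through its hash-checking mode, which may be more convenient when installing from a requirements file.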

File details

Details for the file pyspark_safestubs-0.1.3-py3-none-any.whl.

File metadata

File hashes

Hashes for pyspark_safestubs-0.1.3-py3-none-any.whl
Algorithm    Hash digest
SHA256       5eb9f0baa19286203002f59819cb91a9795584d48a23bacd7ef36c71e8f8bddf
MD5          5572ba390eebd15335d6108dc181b8bc
BLAKE2b-256  5a60e8691303250e6f53f3ccd2f05f475f7dc4bc09545c495e767f93665c46e6
