A collection of the Apache Spark stub files

Project description

A collection of the Apache Spark stub files. These files were generated by stubgen and manually edited to include accurate type hints.

Tests and configuration files were originally contributed to the Typeshed project. Please refer to its contributors list and license for details.

Motivation

  • Static error detection (see SPARK-20631)

  • Improved completion for chained method calls.
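
Both points can be illustrated without a Spark installation. The sketch below uses a hypothetical `Frame` class (an assumption made for illustration, not a PySpark class) whose methods carry the kind of annotations a stub file supplies. A checker such as Mypy reads them to reject a wrongly typed argument, and an editor reads the return types to offer completions along a chain of calls.

```python
from typing import List


class Frame:
    """Hypothetical stand-in for an annotated API; not a PySpark class."""

    def __init__(self, rows: List[int]) -> None:
        self.rows = rows

    def limit(self, n: int) -> "Frame":
        # The declared return type tells an editor what can follow the call.
        return Frame(self.rows[:n])

    def count(self) -> int:
        return len(self.rows)


# Every link in the chain has a known type, so completion keeps working.
total = Frame([1, 2, 3]).limit(2).count()

# Annotations are ordinary metadata that tools can inspect.
limit_annotations = Frame.limit.__annotations__

# A static checker would reject the next line before the program runs,
# because "2" is a str where an int is declared:
# Frame([1, 2, 3]).limit("2")
```

Without annotations, the commented-out call would only fail at run time, inside the slice expression.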

Installation and usage

Please note that the guidelines for distributing type information are still a work in progress (PEP 561 - Distributing and Packaging Type Information). Currently, the installation script overlays existing Spark installations (pyi stub files are copied next to their py counterparts in the PySpark installation directory). If this approach is not acceptable, you can add the stub files to the search path manually.

According to PEP 484:

Third-party stub packages can use any location for stub storage. Type checkers should search for them using PYTHONPATH. A default fallback directory that is always checked is shared/typehints/python3.5/ (or 3.6, etc.)

Please check usage before proceeding.
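
Before choosing between the overlay and a manual search-path setup, it can help to see where a package lives and whether stub files already sit next to its modules. The following is a minimal sketch using only the standard library. The helper names `package_dir` and `stub_files` are illustrative assumptions and are not part of pyspark-stubs.

```python
import importlib.util
from pathlib import Path
from typing import List, Optional


def package_dir(name: str) -> Optional[Path]:
    """Return the directory of an importable package, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.submodule_search_locations:
        # Not importable, or a plain module rather than a package.
        return None
    return Path(list(spec.submodule_search_locations)[0])


def stub_files(directory: Path) -> List[Path]:
    """List .pyi stub files found anywhere under a directory."""
    return sorted(directory.rglob("*.pyi"))


if __name__ == "__main__":
    location = package_dir("pyspark")
    if location is None:
        print("pyspark is not importable from this interpreter")
    else:
        print(location, len(stub_files(location)), "stub files")
```

If the stubs are kept in a separate directory instead, Mypy can be pointed at that directory through its `MYPYPATH` environment variable.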

The package is available on PyPI:

pip install pyspark-stubs

Depending on your environment you might also need a type checker, like Mypy or Pytype.

  • PyCharm - Works out-of-the-box, though as of today (PyCharm 2018.2.4) the built-in type checker is somewhat limited compared to Mypy.

  • Atom - Requires atom-mypy or equivalent.

  • Jupyter Notebooks - It is possible to use magics to type check directly in the notebook.

  • Environment independent - Just use your favorite checker directly, optionally combined with a tool like entr.

Version Compatibility

Package versions follow PySpark versions, with the exception of maintenance releases - i.e. pyspark-stubs==2.3.0 should be compatible with pyspark>=2.3.0,<2.4.0. Maintenance releases (post1, post2, …, postN) are reserved for internal annotation updates.
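
The rule above can be sketched as a small helper that derives the matching PySpark requirement from a pyspark-stubs version string. The function name `compatible_pyspark_range` is hypothetical. The sketch also assumes that only the major and minor components matter, so patch numbers and postN suffixes are ignored.

```python
def compatible_pyspark_range(stubs_version: str) -> str:
    """Map a pyspark-stubs version to the PySpark range it should match."""
    # Assumption: only major and minor decide compatibility; the patch
    # component and any postN suffix are dropped.
    major, minor = (int(part) for part in stubs_version.split(".")[:2])
    return f"pyspark>={major}.{minor}.0,<{major}.{minor + 1}.0"


if __name__ == "__main__":
    print(compatible_pyspark_range("2.3.0"))
    print(compatible_pyspark_range("2.4.0.post5"))
```

For `2.3.0` this yields the same range quoted in the text, `pyspark>=2.3.0,<2.4.0`.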

API Coverage

| Module | Dynamically typed | Statically typed | Notes |
| --- | --- | --- | --- |
| pyspark | | | |
| pyspark.accumulators | | | |
| pyspark.broadcast | | | Mixed |
| pyspark.cloudpickle | | | Internal |
| pyspark.conf | | | |
| pyspark.context | | | |
| pyspark.daemon | | | Internal |
| pyspark.files | | | |
| pyspark.find_spark_home | | | Internal |
| pyspark.heapq3 | | | Internal |
| pyspark.java_gateway | | | Internal |
| pyspark.join | | | |
| pyspark.ml | | | |
| pyspark.ml.base | | | |
| pyspark.ml.classification | | | |
| pyspark.ml.clustering | | | |
| pyspark.ml.common | | | |
| pyspark.ml.evaluation | | | |
| pyspark.ml.feature | | | |
| pyspark.ml.fpm | | | |
| pyspark.ml.image | | | |
| pyspark.ml.linalg | | | |
| pyspark.ml.param | | | |
| pyspark.ml.param._shared_params_code_gen | | | Internal |
| pyspark.ml.param.shared | | | |
| pyspark.ml.pipeline | | | |
| pyspark.ml.recommendation | | | |
| pyspark.ml.regression | | | |
| pyspark.ml.stat | | | |
| pyspark.ml.tests | | | Tests |
| pyspark.ml.tuning | | | |
| pyspark.ml.util | | | |
| pyspark.ml.wrapper | | | Mixed |
| pyspark.mllib | | | |
| pyspark.mllib.classification | | | |
| pyspark.mllib.clustering | | | |
| pyspark.mllib.common | | | |
| pyspark.mllib.evaluation | | | |
| pyspark.mllib.feature | | | |
| pyspark.mllib.fpm | | | |
| pyspark.mllib.linalg | | | |
| pyspark.mllib.linalg.distributed | | | |
| pyspark.mllib.random | | | |
| pyspark.mllib.recommendation | | | |
| pyspark.mllib.regression | | | |
| pyspark.mllib.stat | | | |
| pyspark.mllib.stat.KernelDensity | | | |
| pyspark.mllib.stat._statistics | | | |
| pyspark.mllib.stat.distribution | | | |
| pyspark.mllib.stat.test | | | |
| pyspark.mllib.tests | | | Tests |
| pyspark.mllib.tree | | | |
| pyspark.mllib.util | | | |
| pyspark.profiler | | | |
| pyspark.rdd | | | |
| pyspark.rddsampler | | | |
| pyspark.resultiterable | | | |
| pyspark.serializers | | | |
| pyspark.shell | | | Internal |
| pyspark.shuffle | | | Internal |
| pyspark.sql | | | |
| pyspark.sql.catalog | | | |
| pyspark.sql.column | | | |
| pyspark.sql.conf | | | |
| pyspark.sql.context | | | |
| pyspark.sql.dataframe | | | |
| pyspark.sql.functions | | | |
| pyspark.sql.group | | | |
| pyspark.sql.readwriter | | | |
| pyspark.sql.session | | | |
| pyspark.sql.streaming | | | |
| pyspark.sql.tests | | | Tests |
| pyspark.sql.types | | | |
| pyspark.sql.utils | | | |
| pyspark.sql.window | | | |
| pyspark.statcounter | | | |
| pyspark.status | | | |
| pyspark.storagelevel | | | |
| pyspark.streaming | | | |
| pyspark.streaming.context | | | |
| pyspark.streaming.dstream | | | |
| pyspark.streaming.flume | | | |
| pyspark.streaming.kafka | | | |
| pyspark.streaming.kinesis | | | |
| pyspark.streaming.listener | | | |
| pyspark.streaming.tests | | | Tests |
| pyspark.streaming.util | | | |
| pyspark.taskcontext | | | |
| pyspark.tests | | | Tests |
| pyspark.traceback_utils | | | Internal |
| pyspark.util | | | |
| pyspark.version | | | |
| pyspark.worker | | | Internal |

Disclaimer

Apache Spark, Spark, PySpark, Apache, and the Spark logo are trademarks of The Apache Software Foundation. This project is not owned, endorsed, or sponsored by The Apache Software Foundation.
