
pytest plugin to run tests with support for pyspark.

Project description

A pytest plugin to run tests with support for pyspark (Apache Spark).

This plugin allows you to specify the SPARK_HOME directory in pytest.ini and thus makes “pyspark” importable in the tests executed by pytest.

You can also define “spark_options” in pytest.ini to customize pyspark, including the “spark.jars.packages” option, which allows you to load external libraries (e.g. “com.databricks:spark-xml”).

pytest-spark provides the session-scoped fixtures spark_context and spark_session, which can be used in your tests.

Note: there is no need to define SPARK_HOME if you’ve installed pyspark using pip (e.g. pip install pyspark) - it should already be importable. In this case, simply don’t define SPARK_HOME either in pytest (pytest.ini / --spark_home) or as an environment variable.

Install

$ pip install pytest-spark

Usage

Set Spark location

To run tests with the required spark_home location, define it using one of the following methods:

  1. Specify the command line option “--spark_home”:

    $ pytest --spark_home=/opt/spark
  2. Add “spark_home” value to pytest.ini in your project directory:

    [pytest]
    spark_home = /opt/spark
  3. Set the “SPARK_HOME” environment variable.

pytest-spark will try to import pyspark from the provided location.

Customize spark_options

Just define “spark_options” in your pytest.ini, e.g.:

[pytest]
spark_home = /opt/spark
spark_options =
    spark.app.name: my-pytest-spark-tests
    spark.executor.instances: 1
    spark.jars.packages: com.databricks:spark-xml_2.12:0.5.0
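With the package on the classpath, a test can read XML through the spark_session fixture described below. A minimal sketch (the file path and the rowTag value are illustrative placeholders, not part of the plugin):

def test_read_xml(spark_session):
    df = (
        spark_session.read.format("xml")   # reader provided by spark-xml
        .option("rowTag", "book")          # XML element treated as a row (placeholder)
        .load("tests/data/books.xml")      # placeholder path to a test fixture file
    )
    assert df.count() > 0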

Using the spark_context fixture

Use the spark_context fixture in your tests as a regular pytest fixture. The SparkContext instance will be created once and reused for the whole test session.

Example:

def test_my_case(spark_context):
    test_rdd = spark_context.parallelize([1, 2, 3, 4])
    # ...
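For instance, a test can run a transformation on the shared context and assert on the result. An illustrative sketch:

def test_word_lengths(spark_context):
    # spark_context is the shared, session-scoped SparkContext
    rdd = spark_context.parallelize(["pytest", "spark"])
    assert rdd.map(len).collect() == [6, 5]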

Warning: spark_context isn’t supported when Spark Connect is used!

Using the spark_session fixture (Spark 2.0 and above)

Use the spark_session fixture in your tests as a regular pytest fixture. A SparkSession instance with Hive support enabled will be created once and reused for the whole test session.

Example:

def test_spark_session_dataframe(spark_session):
    test_df = spark_session.createDataFrame([[1,3],[2,4]], "a: int, b: int")
    # ...
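For instance, a test can apply DataFrame transformations on the shared session and assert on the result. An illustrative sketch:

def test_filter_dataframe(spark_session):
    df = spark_session.createDataFrame([[1, 3], [2, 4]], "a: int, b: int")
    # filter on column "a" and collect the matching rows
    rows = df.filter(df.a > 1).collect()
    assert [row.b for row in rows] == [4]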

Overriding default parameters of the spark_session fixture

By default, spark_session will be created with the following configuration:

Example:

{
    'spark.app.name': 'pytest-spark',
    'spark.default.parallelism': 1,
    'spark.dynamicAllocation.enabled': 'false',
    'spark.executor.cores': 1,
    'spark.executor.instances': 1,
    'spark.io.compression.codec': 'lz4',
    'spark.rdd.compress': 'false',
    'spark.sql.shuffle.partitions': 1,
    'spark.shuffle.compress': 'false',
    'spark.sql.catalogImplementation': 'hive',
}

You can override some of these parameters in your pytest.ini. For example, to remove Hive support from the Spark session:

Example:

[pytest]
spark_home = /opt/spark
spark_options =
    spark.sql.catalogImplementation: in-memory
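If you want to sanity-check that an override took effect, a test can read the value back from the session configuration. A minimal sketch assuming the pytest.ini above:

def test_catalog_implementation(spark_session):
    # 'in-memory' with the override above, 'hive' with the plugin defaults
    assert spark_session.conf.get("spark.sql.catalogImplementation") == "in-memory"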

Using spark_session fixture with Spark Connect

pytest-spark also works with Spark Connect, which allows executing code on remote servers. You need Spark 3.4+ with pyspark installed with the connect extra (pyspark[connect] for PySpark 3.4+), or the pyspark-connect package (for PySpark 4.x).

It can be enabled with one of the following options:

  • by setting the SPARK_REMOTE environment variable to the URL of the Spark Connect server;

  • by specifying the URL of the Spark Connect server as the spark_connect_url option in pytest.ini;

  • with the --spark_connect_url command-line argument.

Note: in this mode, some of the Spark configurations, such as spark.executor.cores, spark.executor.instances, etc., will be ignored because they have no effect on an existing Spark session.
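For example, any of the following point the fixtures at a Spark Connect server (the address sc://localhost:15002 is just a placeholder):

$ SPARK_REMOTE="sc://localhost:15002" pytest

or in pytest.ini:

[pytest]
spark_connect_url = sc://localhost:15002

or on the command line:

$ pytest --spark_connect_url=sc://localhost:15002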

Development

Tests

Run tests locally:

$ docker-compose up --build

Download files

Download the file for your platform.

Source Distribution

pytest_spark-0.8.0.tar.gz (7.4 kB)


Built Distribution

pytest_spark-0.8.0-py3-none-any.whl (7.7 kB)


File details

Details for the file pytest_spark-0.8.0.tar.gz.

File metadata

  • Download URL: pytest_spark-0.8.0.tar.gz
  • Upload date:
  • Size: 7.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.6.9

File hashes

Hashes for pytest_spark-0.8.0.tar.gz
Algorithm Hash digest
SHA256 26f57d10862fa7cd34567a217e4a3b965d6c4927bd15b196dfde1c6ee5eaf5b1
MD5 4fe5326bb85d3f315d5c37cdf214bb77
BLAKE2b-256 793b712c34be12b7ca353032ab42239e46eea769772adf633d10df7b8362f014


File details

Details for the file pytest_spark-0.8.0-py3-none-any.whl.

File metadata

File hashes

Hashes for pytest_spark-0.8.0-py3-none-any.whl
Algorithm Hash digest
SHA256 1c7e4e1f6dfdb842f5f651605c1ac80c6cccfff458b22ce3319fcdc26ed8e940
MD5 28010ff7bae5abec1cfe813dc901a6b0
BLAKE2b-256 77f6d226e99f6545890337a82bce6a044d9039756e070709bcfecb9039a67f68

