pytest-spark
############
pytest_ plugin to run tests with support for pyspark (`Apache Spark`_).

This plugin allows you to specify the SPARK_HOME directory in ``pytest.ini``
and thus make "pyspark" importable in your tests, which are executed
by pytest.

You can also define "spark_options" in ``pytest.ini`` to customize pyspark,
including the "spark.jars.packages" option, which allows you to load external
libraries (e.g. "com.databricks:spark-xml").

pytest-spark provides the session-scoped fixtures ``spark_context`` and
``spark_session``, which can be used in your tests.
Install
=======
.. code-block:: shell

    $ pip install pytest-spark
Usage
=====
Set Spark location
------------------
To run tests, pytest-spark needs to know the Spark installation location
(spark_home). Define it using one of the following methods:

1. Specify the command line option "--spark_home"::

    $ pytest --spark_home=/opt/spark

2. Add a "spark_home" value to ``pytest.ini`` in your project directory::

    [pytest]
    spark_home = /opt/spark

3. Set the "SPARK_HOME" environment variable.

pytest-spark will try to import ``pyspark`` from the provided location.

.. note::
    "spark_home" will be read in the order specified above, i.e. you can
    override the ``pytest.ini`` value with the command line option.
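The precedence described in the note can be sketched in plain Python. The
function below is purely illustrative (it is not part of the plugin's API);
it just shows the lookup order: command line option, then ``pytest.ini``,
then the environment variable::

    import os

    def resolve_spark_home(cli_option=None, ini_value=None):
        # Hypothetical sketch of the documented precedence:
        # --spark_home > pytest.ini value > SPARK_HOME env var.
        if cli_option:
            return cli_option
        if ini_value:
            return ini_value
        return os.environ.get("SPARK_HOME")

    # The command line option wins over the ini file value:
    print(resolve_spark_home(cli_option="/opt/spark-3", ini_value="/opt/spark"))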
Customize spark_options
-----------------------
Just define "spark_options" in your ``pytest.ini``, e.g.::

    [pytest]
    spark_home = /opt/spark
    spark_options =
        spark.app.name: my-pytest-spark-tests
        spark.executor.instances: 1
        spark.jars.packages: com.databricks:spark-xml_2.12:0.5.0
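Each line of "spark_options" is a ``key: value`` pair. A minimal sketch of
how such a block could be parsed into a dict (the helper function is
hypothetical, not the plugin's actual code); note that splitting on the
*first* colon keeps values like the Maven coordinate above intact::

    def parse_spark_options(raw):
        # Illustrative parser: one "key: value" pair per non-empty line.
        options = {}
        for line in raw.strip().splitlines():
            key, _, value = line.strip().partition(":")
            if key:
                options[key.strip()] = value.strip()
        return options

    raw = """
        spark.app.name: my-pytest-spark-tests
        spark.jars.packages: com.databricks:spark-xml_2.12:0.5.0
    """
    print(parse_spark_options(raw))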
Using the ``spark_context`` fixture
-----------------------------------
Use the fixture ``spark_context`` in your tests as a regular pytest fixture.
The SparkContext instance will be created once and reused for the whole test
session.

Example::

    def test_my_case(spark_context):
        test_rdd = spark_context.parallelize([1, 2, 3, 4])
        # ...
Using the ``spark_session`` fixture (Spark 2.0 and above)
---------------------------------------------------------
Use the fixture ``spark_session`` in your tests as a regular pytest fixture.
A SparkSession instance with Hive support enabled will be created once and
reused for the whole test session.

Example::

    def test_spark_session_dataframe(spark_session):
        test_df = spark_session.createDataFrame([[1, 3], [2, 4]], "a: int, b: int")
        # ...
.. _pytest: http://pytest.org/
.. _Apache Spark: https://spark.apache.org/