Unity Catalog PySpark fixtures

pytest-mock-unity-catalog

Pytest plugin that provides PySpark fixtures for testing code that reads and writes Unity Catalog tables — without a live Databricks cluster. Table operations are redirected to a local Delta directory so tests run fully offline.

Installation

pip install pytest-mock-unity-catalog

Pytest discovers the plugin automatically via its entry point. No imports or conftest.py changes are needed in the consuming project.

Fixtures

`spark`

A session-scoped SparkSession configured for local testing with Delta Lake enabled.

def test_something(spark):
    df = spark.createDataFrame([(1, "a")], ["id", "value"])
    assert df.count() == 1

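By default the session loads the Delta Lake package delta-spark_4.1_2.13:4.1.0 (PySpark 4.1, Scala 2.13). To use another version, set the SPARK_VERSION environment variable. In the examples here, the value is the part of the artifact name that follows delta-spark_ (Scala suffix and Delta Lake version), not a PySpark version number: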

# PySpark 3.5 / Scala 2.12
SPARK_VERSION=2.12:3.2.1 pytest

# PySpark 4.0 / Scala 2.13
SPARK_VERSION=4.0_2.13:4.0.0 pytest
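
The following sketch shows one way the SPARK_VERSION value could be turned into a full Maven coordinate. It is an illustration only: `delta_package` and `DEFAULT_SPARK_VERSION` are hypothetical names, and the assumption that the plugin appends the value verbatim to `delta-spark_` is inferred from the examples above rather than taken from the plugin's source.

```python
import os

# Default stated in this README.
DEFAULT_SPARK_VERSION = "4.1_2.13:4.1.0"


def delta_package(environ=None) -> str:
    """Build a Delta Lake Maven coordinate from SPARK_VERSION (hypothetical helper)."""
    env = os.environ if environ is None else environ
    suffix = env.get("SPARK_VERSION", DEFAULT_SPARK_VERSION)
    # Assumption: the value is appended to "delta-spark_" without further parsing.
    return f"io.delta:delta-spark_{suffix}"


if __name__ == "__main__":
    print(delta_package())
```

A coordinate of this shape is what Spark expects in its `spark.jars.packages` setting.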

`mock_save_as_table`

Patches DataFrame.write.saveAsTable to write a Delta table to a local temp directory instead of Unity Catalog. The Unity Catalog-style three-part name (catalog.schema.table) is mapped to a directory path.

def test_write(spark, mock_save_as_table):
    df = spark.createDataFrame([(1, "a")], ["id", "value"])
    df.write.saveAsTable("my_catalog.my_schema.my_table")  # writes locally

`mock_read_table`

Patches both spark.read.table and spark.table to read from the same local Delta path that mock_save_as_table writes to. Use both fixtures together to round-trip through a table.

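def test_read(spark, mock_read_table):
    # Assumes a two-row table was already written under this name
    # (for example by a test that used mock_save_as_table).
    df = spark.read.table("my_catalog.my_schema.my_table")
    assert df.count() == 2

    # spark.table is patched as well and reads the same local path
    df2 = spark.table("my_catalog.my_schema.my_table")
    assert df2.count() == 2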

`local_table_base_path`

The Path to the session-scoped temp directory used as the root for all table storage. Useful for asserting on the filesystem directly or for sharing the path in custom fixtures.

def test_path(local_table_base_path):
    # Assumes my_catalog.my_schema.my_table was written earlier in the session
    assert (local_table_base_path / "my_catalog" / "my_schema" / "my_table").exists()

`mock_volume`

Redirects all /Volumes/... filesystem access to a local temp directory for the duration of the test. The fixture yields the local base Path so tests can seed files before exercising the code under test.

Intercepted access patterns:

Pattern	Mechanism
`open("/Volumes/...")`	patches `builtins.open`
`open(Path("/Volumes/..."))`	patches `builtins.open` via PathLike
`Path("/Volumes/...").read_text()`	patches `Path.__fspath__`
`Path("/Volumes/...").write_text(...)`	patches `Path.__fspath__`
`Path("/Volumes/...").exists()` / `.stat()` / `.mkdir()`	patches `Path.__fspath__`
`pd.read_csv("/Volumes/...")`	pandas delegates to `open()`
`pd.DataFrame.to_csv("/Volumes/...")`	pandas delegates to `open()`

Limitation: binary/columnar readers that bypass Python's open() — e.g. pandas.read_parquet backed by pyarrow — are not intercepted.

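Parent directories under the temp root are created automatically when code writes through a redirected /Volumes/... path, so no explicit mkdir is needed there. Files seeded directly under the yielded local Path are ordinary filesystem writes, which is why the seeding example creates the parent directory itself.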

def test_read_volume(mock_volume):
    # Seed a file at the equivalent of /Volumes/cat/schema/vol/data.csv
    seed = mock_volume / "cat" / "schema" / "vol" / "data.csv"
    seed.parent.mkdir(parents=True, exist_ok=True)
    seed.write_text("id,value\n1,a\n2,b\n")

    # Code under test uses the real /Volumes path — it is transparently redirected
    import pandas as pd
    df = pd.read_csv("/Volumes/cat/schema/vol/data.csv")
    assert len(df) == 2

Works with pathlib.Path too:

def test_write_volume(mock_volume):
    from pathlib import Path

    Path("/Volumes/cat/schema/vol/out.txt").write_text("hello")

    result = Path("/Volumes/cat/schema/vol/out.txt").read_text()
    assert result == "hello"
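
The open() rows of the interception table can be approximated with the standard library alone. The sketch below is not the plugin's implementation: `redirect_volumes` is a hypothetical helper, it covers only `builtins.open` (not the `Path.__fspath__` patching listed in the table), and the rule for when to create parent directories is an assumption.

```python
import builtins
import os
from contextlib import contextmanager
from pathlib import Path
from unittest import mock

VOLUME_PREFIX = "/Volumes/"


@contextmanager
def redirect_volumes(base: Path):
    """Send open() calls on /Volumes/... paths to `base` (hypothetical helper)."""
    real_open = builtins.open

    def fake_open(file, mode="r", *args, **kwargs):
        # Only str and PathLike arguments can carry the prefix; ints are file descriptors.
        if isinstance(file, (str, os.PathLike)):
            name = os.fspath(file)
            if isinstance(name, str) and name.startswith(VOLUME_PREFIX):
                target = base / name[len(VOLUME_PREFIX):]
                # Assumption: create parent directories for any mode that can write.
                if any(flag in mode for flag in "wax+"):
                    target.parent.mkdir(parents=True, exist_ok=True)
                file = target
        return real_open(file, mode, *args, **kwargs)

    # mock.patch restores the real open() when the block exits.
    with mock.patch("builtins.open", fake_open):
        yield base
```

Because pyarrow and similar readers do their own file I/O instead of calling Python's `open()`, a patch of this kind never sees them, which matches the limitation noted above.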

`volume_base_path`

The session-scoped Path used as the root for all volume storage. Injected automatically into mock_volume; only needed directly when building custom fixtures on top of the volume base.

`mock_dbutils`

Injects a dbutils-compatible object into builtins for the duration of the test, so code under test can reference dbutils as a bare name — exactly as it does inside a Databricks notebook — without any import or fixture argument.

All dbutils.fs.* calls that target /Volumes/... paths are redirected to the same local temp directory as mock_volume, so both open() and dbutils.fs.* access the same files.

# Production code — no imports, bare dbutils reference
def list_files(path):
    return dbutils.fs.ls(path)

# Test — just request the fixture; dbutils is available globally
def test_list(mock_dbutils):
    dbutils.fs.put("/Volumes/cat/schema/vol/data.txt", "hello", overwrite=True)
    assert any(e.name == "data.txt" for e in list_files("/Volumes/cat/schema/vol"))

The fixture also yields the mock object, so tests can reference it via the parameter name when that reads more clearly.
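
Making a bare name such as dbutils resolvable without an import relies on Python's name lookup falling back to the builtins module. A minimal sketch of that technique follows; `inject_builtin` is a hypothetical helper and the `SimpleNamespace` object is only a stand-in for the much fuller mock the fixture provides.

```python
import builtins
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def inject_builtin(name: str, obj):
    """Expose `obj` as a bare global name for the duration of the block (hypothetical helper)."""
    missing = object()
    previous = getattr(builtins, name, missing)
    setattr(builtins, name, obj)
    try:
        yield obj
    finally:
        # Restore the previous state so the name does not leak into other tests.
        if previous is missing:
            delattr(builtins, name)
        else:
            setattr(builtins, name, previous)


def list_files(path):
    # No import: `dbutils` is resolved through builtins at call time.
    return dbutils.fs.ls(path)  # noqa: F821


# Stand-in object with just enough surface for the example.
fake = SimpleNamespace(fs=SimpleNamespace(ls=lambda path: [f"{path}/data.txt"]))

with inject_builtin("dbutils", fake):
    print(list_files("/Volumes/cat/schema/vol"))
```

Restoring the previous value in `finally` keeps the injection scoped to one test, which is the behavior described for the fixture.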

Supported dbutils.fs methods:

Method	Signature
`ls`	`ls(path) → list[FileInfo]`
`put`	`put(path, contents, overwrite=False) → bool`
`head`	`head(path, max_bytes=65536) → str`
`mkdirs`	`mkdirs(path) → bool`
`rm`	`rm(path, recurse=False) → bool`
`cp`	`cp(from_path, to_path, recurse=False) → bool`
`mv`	`mv(from_path, to_path, recurse=False) → bool`

ls returns a list of FileInfo(path, name, size, modificationTime) namedtuples that match the Databricks shape. Directory entries have a trailing / in name and size=0.
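
The FileInfo shape described above can be reproduced over a local directory as follows. This is a sketch under stated assumptions: `local_ls` is a hypothetical helper, the format of the `path` field and the millisecond unit for `modificationTime` are guesses at the Databricks convention, and the plugin's own values may differ.

```python
from collections import namedtuple
from pathlib import Path

FileInfo = namedtuple("FileInfo", ["path", "name", "size", "modificationTime"])


def local_ls(base: Path, volume_path: str) -> list:
    """List a redirected /Volumes/... directory as FileInfo tuples (hypothetical helper)."""
    relative = volume_path.removeprefix("/Volumes/").strip("/")
    entries = []
    for child in sorted((base / relative).iterdir()):
        is_dir = child.is_dir()
        # Directory entries carry a trailing slash in their name and report size 0.
        name = child.name + ("/" if is_dir else "")
        stat = child.stat()
        entries.append(
            FileInfo(
                path=f"/Volumes/{relative}/{name}",  # assumed path format
                name=name,
                size=0 if is_dir else stat.st_size,
                modificationTime=int(stat.st_mtime * 1000),  # assumed: epoch milliseconds
            )
        )
    return entries
```

Using a namedtuple lets callers write either `entry.name` or positional unpacking, as code written against the real dbutils tends to do.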

Files seeded via mock_volume (or via open()) are immediately visible to dbutils.fs, and vice versa:

def test_cross_access(mock_volume, mock_dbutils):
    # Write via pathlib, read via dbutils
    (mock_volume / "cat" / "schema" / "vol").mkdir(parents=True, exist_ok=True)
    (mock_volume / "cat" / "schema" / "vol" / "file.txt").write_text("shared")
    assert dbutils.fs.head("/Volumes/cat/schema/vol/file.txt") == "shared"

    # Write via dbutils, read via open()
    dbutils.fs.put("/Volumes/cat/schema/vol/out.txt", "also shared", overwrite=True)
    with open("/Volumes/cat/schema/vol/out.txt") as f:
        assert f.read() == "also shared"

On Databricks the real DBUtils(spark) instance is injected instead, so the same tests run against the live Unity Catalog volume without modification.

Example: full round-trip

def test_round_trip(spark, mock_save_as_table, mock_read_table):
    df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])
    df.write.saveAsTable("my_catalog.my_schema.my_table")

    result = spark.read.table("my_catalog.my_schema.my_table")
    assert result.count() == 2

Databricks / on-cluster usage

When tests run inside a Databricks notebook or job (i.e. DATABRICKS_RUNTIME_VERSION is set), the plugin detects this automatically:

- spark returns the active SparkSession instead of creating a local one.
- mock_read_table is a no-op: spark.read.table hits Unity Catalog as normal.
- mock_save_as_table is a no-op: df.write.saveAsTable writes to Unity Catalog as normal. The table is dropped with DROP TABLE IF EXISTS in teardown.
- mock_volume is a no-op: /Volumes/... paths reach the real Unity Catalog volume.

No code changes are needed; the same tests run locally (mocked) and on Databricks (real).
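
Test code that needs to branch on the environment can apply the same check itself. A minimal sketch, assuming only the presence of the variable matters; `running_on_databricks` is a hypothetical helper and not part of the plugin's documented API.

```python
import os


def running_on_databricks(environ=None) -> bool:
    """Return True when DATABRICKS_RUNTIME_VERSION is set (hypothetical helper)."""
    env = os.environ if environ is None else environ
    return "DATABRICKS_RUNTIME_VERSION" in env


if __name__ == "__main__":
    mode = "real Unity Catalog" if running_on_databricks() else "local mocks"
    print(f"Tests will run against: {mode}")
```

Accepting the environment as an argument keeps the helper easy to test without modifying `os.environ`.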

How it works

mock_save_as_table and mock_read_table patch PySpark's DataFrameWriter.saveAsTable and DataFrameReader.table (along with spark.table, as described for mock_read_table) for the duration of the test. The Unity Catalog table name is converted to a filesystem path by replacing the . separators with /:

my_catalog.my_schema.my_table  →  <tmp>/my_catalog/my_schema/my_table
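
The mapping shown above amounts to a few lines of path handling. The sketch below uses a hypothetical helper, `table_path`; rejecting names that are not exactly three parts is an assumption made for the example, and the plugin may be more lenient or handle quoted identifiers differently.

```python
from pathlib import Path


def table_path(base: Path, table_name: str) -> Path:
    """Map catalog.schema.table to base/catalog/schema/table (hypothetical helper)."""
    parts = table_name.split(".")
    # Assumption: only plain three-part names are accepted.
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected catalog.schema.table, got {table_name!r}")
    return base.joinpath(*parts)


if __name__ == "__main__":
    print(table_path(Path("/tmp/tables"), "my_catalog.my_schema.my_table"))
```

A path computed this way can be compared against local_table_base_path in assertions that inspect the filesystem directly.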

The temp directory is managed by pytest (tmp_path_factory) and lives under the OS temp space (e.g. /var/folders/.../pytest-of-<user>/pytest-<N>/). Pytest retains the last three runs before pruning.

Release history

1.0.1 (Mar 12, 2026)

1.0.0 (Mar 1, 2026)

0.0.6 (Mar 1, 2026)

0.0.5 (Feb 28, 2026, this version)

0.0.4 (Feb 28, 2026)

Download files

Source distribution: pytest_mock_unity_catalog-0.0.5.tar.gz (31.4 kB), uploaded Feb 28, 2026

Built distribution: pytest_mock_unity_catalog-0.0.5-py3-none-any.whl (10.3 kB, Python 3), uploaded Feb 28, 2026


Publisher: run_build.yml on marianreuss/pytest-mock-unity-catalog


pytest-mock-unity-catalog 0.0.5
