# Databricks Notebook Test Framework

A Python-based automated testing framework for Databricks notebooks with native serverless support and Databricks Asset Bundle integration.
## Features
- ✅ Serverless-first - Automatic inline environment management for dependencies
- ✅ Databricks Asset Bundles - Auto-detects bundle projects and resolves workspace paths
- ✅ Simple, intuitive test pattern with setup/test/cleanup lifecycle
- ✅ Execute tests remotely on Databricks (serverless or cluster)
- ✅ Parallel test execution for faster test runs
- ✅ Clean developer workflow for writing tests
- ✅ JUnit XML results compatible with CI/CD pipelines
- ✅ Parameterized testing support
- ✅ Automatic test discovery (pytest-style patterns)
- ✅ CLI-driven with rich output
- ✅ Run multiple test classes in a single notebook
- ✅ Zero external test framework dependencies
## Installation

```bash
# Install from source
pip install -e .
```

Or from PyPI (once published):

```bash
pip install dbx_test
```
## Quick Start

### 1. Create a Test Notebook

Create a test notebook (e.g., `tests/my_notebook_test.py`):
```python
from dbx_test import NotebookTestFixture


class TestMyNotebook(NotebookTestFixture):
    def run_setup(self):
        """Setup code runs before tests"""
        self.data = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])
        self.data.createOrReplaceTempView("test_data")

    def test_row_count(self):
        """Test that we have the expected row count"""
        result = spark.sql("SELECT * FROM test_data")
        assert result.count() == 2, "Expected 2 rows"

    def test_schema(self):
        """Test that the schema is correct"""
        result = spark.sql("SELECT * FROM test_data")
        assert "id" in result.columns
        assert "value" in result.columns

    def run_cleanup(self):
        """Cleanup runs after all tests"""
        spark.sql("DROP VIEW IF EXISTS test_data")
```
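The lifecycle implied by the fixture above is: `run_setup` once, then every `test_*` method, then `run_cleanup` once. A minimal sketch of that ordering follows; the stand-in base class is defined locally only so the sketch is self-contained, and `run_fixture` is illustrative rather than the framework's actual runner:

```python
class NotebookTestFixture:
    """Stand-in base class for illustration only (the real one comes from dbx_test)."""
    def run_setup(self): pass
    def run_cleanup(self): pass


def run_fixture(fixture_cls) -> dict:
    """Run setup once, every test_* method, then cleanup once."""
    fixture = fixture_cls()
    results = {}
    fixture.run_setup()
    try:
        for name in dir(fixture):  # dir() is sorted, so test order is deterministic
            if name.startswith("test_"):
                try:
                    getattr(fixture, name)()
                    results[name] = "passed"
                except AssertionError as exc:
                    results[name] = f"failed: {exc}"
    finally:
        fixture.run_cleanup()  # cleanup runs even if a test raised
    return results


class Demo(NotebookTestFixture):
    def run_setup(self):
        self.items = [1, 2]

    def test_count(self):
        assert len(self.items) == 2

    def test_fails(self):
        assert self.items[0] == 99, "wrong first item"


out = run_fixture(Demo)
```

Note that a failing test does not stop the run: every test method executes, and cleanup always runs afterwards.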
### 2. Scaffold Your Project

```bash
# Creates the test file and config automatically
dbx_test scaffold my_feature

# For bundle projects, this detects the bundle and provides bundle-specific tips
```
### 3. Configure Your Environment

Create `config/test_config.yml`:

```yaml
workspace:
  # Use a Databricks CLI profile
  profile: "default"

cluster:
  # Option 1: Use serverless (recommended, fastest).
  # Leave cluster_id empty for serverless with inline dependencies.
  # Dependencies are installed automatically:
  libraries:
    - whl: "git+https://github.com/your-org/your-package.git"
    - pypi:
        package: "pandas==2.0.0"

  # Option 2: Use a pre-created environment (serverless)
  # environment_key: "my_environment"

  # Option 3: Use an existing cluster
  # cluster_id: "1234-567890-abcdef"

execution:
  timeout: 600
  parallel: false

reporting:
  output_dir: ".dbx-test-results"
  formats: ["junit", "console"]
```
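The three compute options above are mutually exclusive. A minimal sketch of how such a config could be resolved, assuming a precedence of explicit cluster, then named environment, then serverless (the `ComputeMode` names are illustrative, not the framework's internals):

```python
from enum import Enum


class ComputeMode(Enum):
    SERVERLESS_INLINE = "serverless_inline"  # Option 1: no cluster_id, inline libraries
    SERVERLESS_NAMED = "serverless_named"    # Option 2: pre-created environment
    CLASSIC_CLUSTER = "classic_cluster"      # Option 3: existing cluster


def resolve_compute_mode(cluster_cfg: dict) -> ComputeMode:
    """Pick a compute mode from the `cluster:` section of test_config.yml."""
    if cluster_cfg.get("cluster_id"):
        return ComputeMode.CLASSIC_CLUSTER
    if cluster_cfg.get("environment_key"):
        return ComputeMode.SERVERLESS_NAMED
    return ComputeMode.SERVERLESS_INLINE


# A config with neither cluster_id nor environment_key runs on serverless
mode = resolve_compute_mode({"libraries": [{"pypi": {"package": "pandas==2.0.0"}}]})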
### 4. Run Tests

For Databricks Asset Bundle projects:

```bash
# Auto-detects the bundle and resolves the workspace path
dbx_test run --target dev --profile my-profile

# With a custom subdirectory
dbx_test run --target dev --tests-dir src/tests --profile my-profile
```

For non-bundle projects:

```bash
# Run tests from a workspace path
dbx_test run --tests-dir /Workspace/Users/you@company.com/tests --profile my-profile

# Or from Repos
dbx_test run --tests-dir /Repos/production/my-project/tests --profile my-profile
```

Test discovery: all notebooks matching the `test_*` or `*_test` patterns are found automatically, just like pytest.
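That pytest-style matching can be sketched with the standard library's `fnmatch`; this is a simplified illustration, not the framework's actual discovery code:

```python
from fnmatch import fnmatch
from pathlib import Path

TEST_PATTERNS = ("test_*", "*_test")


def is_test_notebook(path: str) -> bool:
    """Return True if a notebook's base name matches a pytest-style pattern."""
    stem = Path(path).stem  # strip the directory and the .py extension
    return any(fnmatch(stem, pattern) for pattern in TEST_PATTERNS)


assert is_test_notebook("tests/test_feature_a.py")
assert is_test_notebook("tests/my_notebook_test.py")
assert not is_test_notebook("tests/helpers.py")
```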
## Databricks Asset Bundle Support

The framework automatically detects Databricks Asset Bundle projects and simplifies test execution.

Example bundle structure:

```
my_bundle/
├── databricks.yml
├── src/
│   └── my_code.py
└── tests/
    ├── test_feature_a.py
    └── test_feature_b.py
```

Example `databricks.yml`:

```yaml
bundle:
  name: my_project

targets:
  dev:
    workspace:
      host: https://your-workspace.cloud.databricks.com/
```

Run tests:

```bash
# The framework auto-detects the bundle and constructs the workspace path
dbx_test run --target dev --profile my-profile

# Resolves to: /Workspace/Users/you@company.com/.bundle/my_project/dev/files/tests
```
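The resolved path follows the standard bundle layout, `/Workspace/Users/<user>/.bundle/<bundle_name>/<target>/files/...`. A sketch of that construction (the function name is illustrative, not part of the framework's API):

```python
def bundle_tests_path(user: str, bundle_name: str, target: str,
                      tests_dir: str = "tests") -> str:
    """Build the workspace path where `databricks bundle deploy` places test files."""
    return f"/Workspace/Users/{user}/.bundle/{bundle_name}/{target}/files/{tests_dir}"


path = bundle_tests_path("you@company.com", "my_project", "dev")
# Matches the resolved path shown above
```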
Benefits:

- ✅ No manual workspace path configuration
- ✅ Works seamlessly with `databricks bundle deploy`
- ✅ Automatic path resolution based on target
- ✅ Supports custom test directories
## Serverless Compute with Inline Dependencies

The framework automatically creates inline environments for serverless compute:

```yaml
cluster:
  # Dependencies are automatically installed in the serverless environment
  libraries:
    - whl: "git+https://github.com/your-org/your-package.git"
    - pypi:
        package: "pandas==2.0.0"
    - whl: "/Workspace/Shared/wheels/custom-1.0.0-py3-none-any.whl"
```
How it works:

- The framework detects serverless compute (no `cluster_id` specified)
- It creates an inline environment with your dependencies
- Tests execute with all libraries installed
- The environment is cleaned up automatically
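Conceptually, building such an inline environment means flattening the `libraries:` entries into pip-style dependency strings. A sketch of how that translation might look, assuming the two entry shapes shown above; the function is illustrative, not the framework's API:

```python
def libraries_to_dependencies(libraries: list) -> list:
    """Flatten `libraries:` entries from test_config.yml into pip-style
    dependency strings, as used by a serverless environment spec."""
    deps = []
    for lib in libraries:
        if "whl" in lib:        # wheel path, or a pip-installable Git URL
            deps.append(lib["whl"])
        elif "pypi" in lib:     # e.g. {"pypi": {"package": "pandas==2.0.0"}}
            deps.append(lib["pypi"]["package"])
    return deps


libs = [
    {"whl": "git+https://github.com/your-org/your-package.git"},
    {"pypi": {"package": "pandas==2.0.0"}},
    {"whl": "/Workspace/Shared/wheels/custom-1.0.0-py3-none-any.whl"},
]
deps = libraries_to_dependencies(libs)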
For production, you can pre-create environments:

```yaml
cluster:
  environment_key: "production_test_env"  # Reference a pre-created environment
```

See the Serverless Environments Guide for details.
## Interactive Notebook Development

Run tests directly in a Databricks notebook:

```python
from dbx_test import NotebookTestFixture, run_notebook_tests
import json


class TestMyData(NotebookTestFixture):
    def run_setup(self):
        self.df = spark.createDataFrame([(1, "Alice")], ["id", "name"])

    def test_count(self):
        assert self.df.count() == 1


# Run tests (automatically discovers all test classes)
results = run_notebook_tests()

# Return results to the CLI (required for remote execution)
dbutils.notebook.exit(json.dumps(results))
```

📘 See the Notebook Usage Guide for detailed examples and patterns.
## CLI Commands

### dbx_test run

Execute tests remotely on Databricks.

Options:

- `--target TARGET` - Databricks Asset Bundle target (auto-detects workspace path)
- `--profile PROFILE` - Databricks CLI profile to use
- `--tests-dir DIR` - Directory containing tests (workspace path, or relative path for bundles)
- `--env ENV` - Environment (dev/test/prod)
- `--parallel` - Enable parallel execution
- `--output-format FORMAT` - Output format (junit/console/json/html)
- `--config PATH` - Path to config file (default: `config/test_config.yml`)
- `--verbose` - Enable verbose output
Examples:

```bash
# Bundle project
dbx_test run --target dev --profile my-profile

# Workspace path
dbx_test run --tests-dir /Workspace/Users/you@company.com/tests --profile my-profile

# With multiple output formats
dbx_test run --target dev --profile prod \
  --output-format junit \
  --output-format html
```
### dbx_test scaffold

Create a new test notebook from a template.

```bash
# Create test and config files
dbx_test scaffold my_feature

# Detects bundle projects and provides bundle-specific instructions
```
### dbx_test report

Generate a test report from a previous run.

```bash
# Generate a report from the latest run
dbx_test report --format junit

# Generate from a specific run
dbx_test report --run-id <run_id> --format html
```
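Because the JUnit output is standard XML, downstream tooling can consume it directly. For example, a quick pass/fail summary using only the stdlib (the report snippet here is hand-written for illustration, following common JUnit conventions):

```python
import xml.etree.ElementTree as ET


def summarize_junit(xml_text: str) -> dict:
    """Count tests and failures in a JUnit XML report."""
    root = ET.fromstring(xml_text)
    # Reports may be rooted at <testsuites> or at a single <testsuite>
    suites = [root] if root.tag == "testsuite" else root.findall("testsuite")
    total = sum(int(s.get("tests", 0)) for s in suites)
    failed = sum(int(s.get("failures", 0)) + int(s.get("errors", 0)) for s in suites)
    return {"total": total, "failed": failed, "passed": total - failed}


report = """<testsuite name="TestMyNotebook" tests="2" failures="1" errors="0">
  <testcase name="test_row_count"/>
  <testcase name="test_schema"><failure message="boom"/></testcase>
</testsuite>"""
summary = summarize_junit(report)
# → {'total': 2, 'failed': 1, 'passed': 1}
```

The same report file feeds CI systems (GitHub Actions, Azure DevOps, Jenkins) without extra conversion.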
## Configuration

See the Configuration Guide for detailed configuration options.
## Documentation

### Getting Started
- Quick Start Guide - Get started in 5 minutes
- Installation - Detailed installation instructions
### Core Features
- Databricks Asset Bundle Support - Auto-detection and path resolution
- Serverless Environments - Inline dependencies for serverless 🚀
- Installing Libraries - PyPI, wheels, and Git repos
- Pytest-Style Discovery - Automatic test discovery 🔍
- Notebook Usage Guide - Run tests in notebooks 📘
- Workspace Tests - Run tests from workspace 🔄
### Advanced Topics
- Multiple Test Classes - Multiple test classes per notebook
- Parallel Execution - Faster test runs
- Testing Application Code - Test `src/` from `tests/` 📦
- Cluster Configuration - Serverless vs cluster options
### Integration
- Databricks CLI Authentication - Authentication setup
- CI/CD Integration - GitHub Actions, Azure DevOps, etc.
### Examples
- Testing src/ Code Example - Real workspace pattern
## Architecture

```
src/dbx_test/
├── cli.py               # CLI entry point
├── config.py            # Configuration management
├── runner_remote.py     # Remote Databricks execution (serverless/cluster)
├── notebook_runner.py   # Notebook test execution
├── testing.py           # Test fixture base class
├── reporting.py         # Report generation
├── artifacts.py         # Artifact management
├── bundle.py            # Databricks Asset Bundle integration
└── utils/               # Utility functions
    ├── databricks.py    # Databricks API helpers (inline environments)
    ├── notebook.py      # Notebook parsing
    └── validation.py    # Validation utilities
```
## Why This Framework?

### ✅ Serverless-First Design
- Automatic inline environment creation
- No cluster management overhead
- Fast startup times
- Cost-effective pay-per-use model
### ✅ Databricks Asset Bundle Native

- Auto-detects bundle projects
- Resolves workspace paths automatically
- Seamless integration with `databricks bundle deploy`
- No manual path configuration
### ✅ Developer-Friendly
- Pytest-style test discovery
- Simple test patterns (setup → test → cleanup)
- Rich CLI output
- Works in notebooks and CI/CD
### ✅ Production-Ready
- JUnit XML for CI/CD integration
- Parallel execution support
- Comprehensive error reporting
- Battle-tested on real projects
## License

MIT License - see the LICENSE file for details.
## Contributing

Contributions welcome! Please see CONTRIBUTING.md.