Fetch and display Databricks job logs from Unity Catalog Volumes

Maintainers

alonisser

dbr-logs

Fetch and display Databricks job logs from Unity Catalog Volumes.

Merges driver and executor logs chronologically with source labels, so you can pipe them to grep, jq, or feed them to an LLM.

Why

Debugging a failed Databricks job means navigating a deeply nested, inconsistently structured log directory tree:

dbfs:/Volumes/catalog/schema/logs/prod/my-spark-job/0311-170011-t5450avl/
├── driver/
│   ├── stderr
│   ├── stderr--2026-03-11--18-00        # rotated, plain text
│   ├── stderr--2026-03-11--19-00        # rotated, plain text
│   ├── stdout
│   ├── log4j-active.log
│   └── log4j-2026-03-11-17.log.gz      # rotated, gzipped
├── executor/
│   └── app-20260311170849-0000/         # opaque app ID
│       ├── 0/
│       │   ├── stderr
│       │   ├── stderr--2026-03-11--18.gz   # rotated, gzipped
│       │   └── stdout
│       ├── 1/
│       ├── 2/
│       ...
│       └── 8/
└── eventlog/

The manual process to find what went wrong:

Find the run: log directories are named with opaque IDs like 0311-170011-t5450avl, not human-readable names.
Navigate the tree: driver logs, executor logs, or both? Which of the 9 executors had the error?
Handle mixed compression: rotated driver stderr files are plain text, while rotated executor files (and rotated driver log4j files) are .gz, so you need databricks fs cp plus gunzip to read them.
Concatenate rotated files: a single stream is split across the active file and multiple rotated files that must be read in chronological order.
Repeat across executors: for a job with 9 executors, that's potentially 9 x 4 files to check.
Cross-reference timestamps: the root cause is often in one source, while the symptoms appear in another.
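
To make the compression and rotation steps concrete, here is a minimal Python sketch of reading one stream from a directory that has already been copied to local disk. It is not dbr-logs' actual implementation; the function names are hypothetical, and it assumes rotated files follow the `stderr--<date>...` naming shown in the tree above, so that sorting by name gives chronological order.

```python
import gzip
from pathlib import Path


def read_log_text(path: Path) -> str:
    """Return the text of a log file, decompressing it when the name ends in .gz."""
    if path.suffix == ".gz":
        with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
            return fh.read()
    return path.read_text(encoding="utf-8", errors="replace")


def ordered_stream_files(directory: Path, stream: str) -> list[Path]:
    """List rotated files for one stream oldest first, with the active file last."""
    # Assumption: rotated names embed a zero-padded date, so name order is time order.
    rotated = sorted(
        p for p in directory.iterdir() if p.name.startswith(f"{stream}--")
    )
    active = directory / stream
    return rotated + ([active] if active.exists() else [])


def read_stream(directory: Path, stream: str = "stderr") -> str:
    """Concatenate every file of a stream in chronological order."""
    return "".join(read_log_text(p) for p in ordered_stream_files(directory, stream))
```

Calling `read_stream(Path("driver"))` on a local copy of the driver directory would return the rotated stderr files followed by the active one, whether or not any of them are gzipped.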

For background on how Python logging works in Databricks and why it ends up in this structure, see Everything You Wanted to Know About Python Logging in Databricks.

There are heavier alternatives — Databricks' own Practitioner's Ultimate Guide to Scalable Logging describes a full logging pipeline, and you could also route Databricks logs to Datadog or similar observability platforms. But these solutions carry significant ongoing costs and infrastructure overhead for something most teams only need occasionally when debugging a failed job. dbr-logs is a zero-cost, zero-infrastructure alternative: install a CLI tool, run one command, get your answer.

dbr-logs replaces the manual process with a single command. It discovers the log structure, downloads and decompresses all files, merges everything chronologically with source labels, and lets you filter by level, source, or regex.
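
The chronological merge can be illustrated with a short Python sketch built on `heapq.merge`. This is an illustration of the idea rather than the tool's own code: the line format (an ISO timestamp, a space, then the message) and the names `parse_lines` and `merge_sources` are assumptions, and each input stream is assumed to be in time order already.

```python
import heapq
from datetime import datetime
from typing import Iterable, Iterator

Entry = tuple[datetime, str, str]  # (timestamp, source label, message)


def parse_lines(source: str, lines: Iterable[str]) -> Iterator[Entry]:
    """Yield (timestamp, source, message) for each line of one source.

    Lines without a parseable leading timestamp (for example stack trace
    continuations) reuse the previous timestamp so they stay attached to
    the entry they belong to. Lines before the first timestamp are dropped.
    """
    last: datetime | None = None
    for line in lines:
        text = line.rstrip("\n")
        stamp, _, rest = text.partition(" ")
        try:
            last = datetime.fromisoformat(stamp)
            message = rest
        except ValueError:
            message = text
        if last is not None:
            yield last, source, message


def merge_sources(streams: dict[str, Iterable[str]]) -> Iterator[Entry]:
    """Merge already-sorted sources into one chronological iterator."""
    parsed = [parse_lines(name, lines) for name, lines in streams.items()]
    # heapq.merge is lazy and breaks timestamp ties by input order.
    return heapq.merge(*parsed, key=lambda entry: entry[0])
```

Because `heapq.merge` is lazy, this approach never needs to hold every log line in memory at once, which matters when a job has many executors.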

Prerequisites

Python 3.11+ (tested on 3.11, 3.12, 3.13, 3.14)
Databricks CLI configured with at least one profile in ~/.databrickscfg (setup guide)
Unity Catalog Volumes log destination configured on your Databricks jobs (cluster_log_conf pointing to a Volumes path)
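
As a rough illustration of the last prerequisite, the Python snippet below builds the `cluster_log_conf` fragment of a job cluster definition. The Volumes path is a placeholder, and the `volumes`/`destination` key layout reflects our reading of the Databricks cluster specification, so check it against the current API reference before relying on it.

```python
import json

# Hypothetical Volumes path; substitute your own catalog, schema, and volume.
LOG_DESTINATION = "/Volumes/catalog/schema/logs/prod/my-spark-job"

# Fragment to merge into the job's new_cluster settings (assumed key layout).
new_cluster_fragment = {
    "cluster_log_conf": {
        "volumes": {"destination": LOG_DESTINATION},
    },
}

print(json.dumps(new_cluster_fragment, indent=2))
```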

Installation

# Install as a CLI tool with uv (recommended)
uv tool install dbr-logs

# Or with pipx (isolated environment)
pipx install dbr-logs

# Or with pip (use --user to install globally without affecting your venv)
pip install --user dbr-logs

# Or run directly without installing
uvx dbr-logs <job-name>

Usage

# Fetch logs for the latest run of a job
dbr-logs my-job-name

# Fetch logs from a specific run
dbr-logs my-job-name --run-id 12345

# Use a Databricks workspace URL
dbr-logs "https://dbc-xxx.cloud.databricks.com/jobs/12345/runs/67890?o=123"

# Show only errors
dbr-logs my-job-name --level ERROR

# Focus on application logs (suppress Spark/JVM noise)
dbr-logs my-job-name --focus

# Show only executor logs
dbr-logs my-job-name --source executor

# Show last 50 lines from a specific executor
dbr-logs my-job-name --source executor:3 --tail 50

# JSONL output for piping to jq
dbr-logs my-job-name --format jsonl | jq '.level'

# Logs from the last hour
dbr-logs my-job-name --since 1h
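
If you would rather post-process the JSONL output in Python than in jq, a small sketch like the following can tally records by level. Only the `level` field is taken from the jq example above; the function name is hypothetical and any other fields in a record are simply ignored.

```python
import json
from collections import Counter
from typing import Iterable


def count_levels(jsonl_lines: Iterable[str]) -> Counter:
    """Count JSONL records per log level, skipping blank lines."""
    counts: Counter = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Records without a "level" key are grouped under UNKNOWN.
        counts[record.get("level", "UNKNOWN")] += 1
    return counts
```

One way to use it is to pass `sys.stdin` as the argument in a small script and pipe `dbr-logs my-job-name --format jsonl` into that script.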

Claude Code Integration

dbr-logs includes a Claude Code plugin so you can fetch and analyze logs directly from a Claude conversation.

Install the plugin

# Option 1: Claude Code Plugin
/marketplace add https://github.com/zencity/databricks-logs-reader
/plugin add dbr-logs

# Option 2: skills.sh (works with Claude Code, Cursor, Windsurf, etc.)
npx skills add zencity/databricks-logs-reader

The CLI tool still needs to be installed separately (pip install dbr-logs or uv tool install dbr-logs), but the skill can also use uvx as a zero-install fallback.

Example interaction

You: check the logs for my-spark-job

Claude:
  Runs: dbr-logs my-spark-job --level ERROR,WARN --focus --format jsonl
  Analyzes output, then responds:

  "The latest run (0311-170011-t5450avl) has 3 errors:

  - driver/stderr: TransportChannelHandler connection timeout to 10.177.10.232 after 600s
  - executor/0-7 stderr: HangingTaskDetector reports tasks 140953, 140957 stuck for 10+ min

  Root cause: Shuffle read stall — executors are hanging on remote block fetches.

  Suggested actions:
  - Check executor at 10.177.10.232 for resource pressure
  - Enable spark.shuffle.io.retryWait / spark.shuffle.io.maxRetries
  - Review shuffle partitions to reduce per-task data volume"

Options

Option	Short	Description
`--run-id`	`-r`	Databricks run ID (numeric). Defaults to latest run.
`--env`	`-e`	Environment: prod, staging, ondemand. Default: prod.
`--dbr-profile`	`-p`	Databricks CLI profile name.
`--source`	`-s`	`driver`, `executor`, `executor:N`, or `all` (default).
`--stream`		`stderr`, `stdout`, or `all` (default).
`--level`	`-l`	Exact match, comma-separated: ERROR, WARN, INFO, DEBUG.
`--include-log4j`		Include driver log4j files.
`--include-stacktrace`		Include driver stacktrace files.
`--format`	`-f`	`text` (default) or `jsonl`.
`--tail`	`-n`	Show only last N lines.
`--since`		Show logs since time (e.g. `1h`, `30m`, ISO datetime).
`--focus`		Suppress Spark/JVM noise (thread dumps, shuffle, task lifecycle).
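
To show how a value such as `--since 1h` can be turned into a cutoff time, here is a hedged Python sketch. It handles only the forms listed in the table (`1h`, `30m`, and an ISO datetime); the function name is hypothetical and the tool's real parser may accept more or behave differently.

```python
import re
from datetime import datetime, timedelta

# Only minutes and hours appear in the documented examples.
_RELATIVE = re.compile(r"^(\d+)([mh])$")
_UNITS = {"m": "minutes", "h": "hours"}


def parse_since(value: str, now: datetime) -> datetime:
    """Convert a relative duration or ISO datetime into an absolute cutoff."""
    value = value.strip()
    match = _RELATIVE.match(value)
    if match:
        amount, unit = int(match.group(1)), match.group(2)
        return now - timedelta(**{_UNITS[unit]: amount})
    # Anything else is treated as an ISO datetime; invalid input raises ValueError.
    return datetime.fromisoformat(value)
```

Passing `now` explicitly keeps the function deterministic and easy to test; a caller would normally supply `datetime.now()`.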

Configuration

On first run with multiple Databricks profiles, you'll be prompted to select a default. Config is saved to ~/.config/dbr-logs/config.toml.
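
To see which profiles the prompt would have to choose between, you can list the sections of ~/.databrickscfg yourself. The sketch below assumes the file uses the usual INI layout with one section per profile; the function name is hypothetical and this is not how dbr-logs necessarily reads the file.

```python
import configparser


def list_profiles(cfg_text: str) -> list[str]:
    """Return profile names found in the text of a .databrickscfg file."""
    # Interpolation is disabled so '%' characters in values are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    # configparser keeps [DEFAULT] apart from the other sections.
    profiles = ["DEFAULT"] if parser.defaults() else []
    return profiles + parser.sections()
```

For a real file, read it with `Path.home().joinpath(".databrickscfg").read_text()` (from `pathlib`) and pass the result in.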

Releasing

Version is derived from git tags via hatch-vcs — no version string to maintain in source code.

git tag v0.1.0
git push origin v0.1.0

This triggers the CI pipeline, which builds the package, creates a GitHub Release with auto-generated notes, and publishes to PyPI.

Limitations

Only Unity Catalog Volumes log destinations are supported. S3 destinations are not yet supported.
Jobs must have cluster_log_conf configured.

Release history

0.1.3 (this version): Apr 2, 2026
0.1.2: Mar 14, 2026
0.1.1: Mar 14, 2026
0.1.0: Mar 14, 2026

Download files

Source Distribution

dbr_logs-0.1.3.tar.gz (84.8 kB)

Uploaded Apr 2, 2026 Source

Built Distribution

dbr_logs-0.1.3-py3-none-any.whl (20.1 kB)

Uploaded Apr 2, 2026 Python 3

File details

Details for the file dbr_logs-0.1.3.tar.gz.

File metadata

Download URL: dbr_logs-0.1.3.tar.gz
Upload date: Apr 2, 2026
Size: 84.8 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dbr_logs-0.1.3.tar.gz
Algorithm	Hash digest
SHA256	`78e3696180bd63dcf7eb92a683daf5310e4d0390f8490f0b955b9141b96fb610`
MD5	`248c6eb2d653c8e62b7b14e2682dd084`
BLAKE2b-256	`2b4eb402fc1a8f742bd0ad9118c4eaa8625475adfd66d47ffc3510ad278feb3e`

Provenance

The following attestation bundles were made for dbr_logs-0.1.3.tar.gz:

Publisher: publish.yml on zencity/databricks-logs-reader

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: dbr_logs-0.1.3.tar.gz
- Subject digest: 78e3696180bd63dcf7eb92a683daf5310e4d0390f8490f0b955b9141b96fb610
- Sigstore transparency entry: 1219161054
- Sigstore integration time: Apr 2, 2026
Source repository:
- Permalink: zencity/databricks-logs-reader@3f2d9c5b83158eb0485a38cc7f83e904c425f2f5
- Branch / Tag: refs/tags/v0.1.3
- Owner: https://github.com/zencity
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@3f2d9c5b83158eb0485a38cc7f83e904c425f2f5
- Trigger Event: push

File details

Details for the file dbr_logs-0.1.3-py3-none-any.whl.

File metadata

Download URL: dbr_logs-0.1.3-py3-none-any.whl
Upload date: Apr 2, 2026
Size: 20.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dbr_logs-0.1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e458e17f3ea862f1f2cff0901d027871d65764fbb2dcea1e6e3d37ee483272be`
MD5	`0aaea21563bebd75e4ae6e2bb6e4b6ae`
BLAKE2b-256	`c0ddd94c22f7eeb57972f8a9b3de921f616488c784306b5431f788476e3f95d5`

Provenance

The following attestation bundles were made for dbr_logs-0.1.3-py3-none-any.whl:

Publisher: publish.yml on zencity/databricks-logs-reader

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: dbr_logs-0.1.3-py3-none-any.whl
- Subject digest: e458e17f3ea862f1f2cff0901d027871d65764fbb2dcea1e6e3d37ee483272be
- Sigstore transparency entry: 1219161155
- Sigstore integration time: Apr 2, 2026
Source repository:
- Permalink: zencity/databricks-logs-reader@3f2d9c5b83158eb0485a38cc7f83e904c425f2f5
- Branch / Tag: refs/tags/v0.1.3
- Owner: https://github.com/zencity
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@3f2d9c5b83158eb0485a38cc7f83e904c425f2f5
- Trigger Event: push
