CLI tool for exploring Apache Iceberg table metadata

iceberg-meta

CLI and TUI for exploring Apache Iceberg table metadata. Inspect snapshots, schemas, manifests, data files, partition health, and column-level statistics -- without Spark or notebooks.

Install

pip install iceberg-meta

# With the interactive TUI
pip install "iceberg-meta[tui]"   # quoted so shells like zsh do not glob the brackets

Try it instantly

No config, no Docker, no credentials:

iceberg-meta demo

Creates a temporary catalog with sample tables, launches the TUI, and cleans up on exit.

Quick Start

Connect to your data:

iceberg-meta init              # interactive config setup
iceberg-meta doctor            # verify config + connectivity
iceberg-meta list-tables       # explore
iceberg-meta tui               # interactive browser

Or spin up a Docker playground:

iceberg-meta quickstart        # starts MinIO, seeds data, configures everything
iceberg-meta quickstart --down # tear down when done
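
When the playground is used from a script, it helps if the teardown runs even when an intermediate step fails. The following is a minimal sketch, assuming quickstart returns once the playground is up (the paired --down command suggests this, but it is not stated here). The names with_playground and ICEBERG_META_BIN are hypothetical helpers introduced for this example, not part of iceberg-meta.

```shell
# Hypothetical helper: run a command while the playground is up,
# then tear the playground down whether or not the command succeeded.
# ICEBERG_META_BIN is an assumed override hook, not a documented variable.
with_playground() {
  bin="${ICEBERG_META_BIN:-iceberg-meta}"
  "$bin" quickstart || return 1   # give up if the playground did not start
  "$@"                            # run the caller's command
  rc=$?                           # remember its exit status
  "$bin" quickstart --down        # always tear down
  return "$rc"
}

# Example (not executed here):
#   with_playground iceberg-meta list-tables
```

The helper returns the exit status of the wrapped command, so a CI step still fails when the wrapped command fails.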

Commands

Command                       Description
demo                          Try instantly -- temp local catalog, no setup needed
quickstart                    Docker playground with MinIO + sample data
init                          Interactive config setup with catalog presets
doctor                        Validate config, env vars, and connectivity
list-tables                   Discover namespaces and tables
summary <table>               Row counts, file counts, recent operations
health <table>                File sizes, partition skew, column nulls, column bounds
table-info <table>            Format version, UUID, schema, partition spec, properties
snapshots <table>             Snapshot history (--watch N for live monitoring)
schema <table>                Schema tree (--history for evolution with diffs)
manifests <table>             Manifest files for current or specified snapshot
files <table>                 Data files with sizes, row counts, format
partitions <table>            Partition statistics
snapshot-detail <table> <id>  Deep dive into one snapshot
diff <table> <s1> <s2>        What changed between two snapshots
tree <table>                  Full metadata hierarchy as a tree
tui                           Interactive terminal UI

Every data command supports --output json and --output csv (short form: -o, as used in the examples below).

Use Cases

Pre-merge validation in CI/CD

Confirm a pipeline write actually landed before merging:

iceberg-meta summary staging.orders          # row count, file count, latest snapshot
iceberg-meta diff staging.orders $OLD $NEW   # what changed between two snapshots
iceberg-meta files staging.orders -o csv     # pipe file-level stats into a check script
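
The "check script" on the receiving end of that pipe can be very small. The sketch below counts data rows in CSV read from stdin and fails when there are too few. It assumes the CSV output starts with a single header line and contains no embedded newlines, which this page does not confirm. The function names csv_data_rows and require_csv_rows are hypothetical.

```shell
# Count data rows in CSV read from stdin.
# Assumes one header line and no newlines inside quoted fields.
csv_data_rows() {
  awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }'
}

# Succeed only if stdin holds at least MIN data rows (default 1).
require_csv_rows() {
  min="${1:-1}"
  rows=$(csv_data_rows)
  if [ "$rows" -lt "$min" ]; then
    echo "expected at least $min data files, found $rows" >&2
    return 1
  fi
  echo "ok: $rows data files"
}

# Example (not executed here):
#   iceberg-meta files staging.orders -o csv | require_csv_rows 1
```

Because require_csv_rows returns a non-zero status on failure, it can serve directly as a gating step in a CI job.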

Debugging failing writes

Spark job "succeeded" but downstream dashboards are empty:

iceberg-meta snapshots staging.orders        # did the snapshot actually land?
iceberg-meta schema staging.orders --history # did a schema evolution break compatibility?
iceberg-meta files staging.orders            # are the new data files present and non-empty?

Monitoring table health

Spot small-file problems, partition skew, and compaction needs before they impact query performance:

iceberg-meta health warehouse.events

The health report covers file sizes (min/avg/median/max with small-file warnings), delete file accumulation, partition skew detection, column null rates, column storage distribution, and column value bounds.

Live monitoring

Watch for new snapshots as a pipeline runs:

iceberg-meta snapshots warehouse.events --watch 5

Onboarding and knowledge transfer

New team member needs to understand the data platform:

iceberg-meta tui

Incident response

Production data looks wrong, and you need to find when the issue was introduced:

iceberg-meta snapshots prod.customers        # find the suspicious snapshot IDs
iceberg-meta diff prod.customers 111 222     # compare record counts and file changes
iceberg-meta tree prod.customers             # drill into manifests and data files
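
When several snapshots are suspect, diffing each consecutive pair narrows down where the change first appeared. This is a minimal sketch: adjacent_pairs is a hypothetical helper, and the snapshot IDs are placeholders that you would copy from the snapshots output by hand.

```shell
# Print each adjacent pair from the argument list, one pair per line.
adjacent_pairs() {
  prev=""
  for id in "$@"; do
    if [ -n "$prev" ]; then
      printf '%s %s\n' "$prev" "$id"
    fi
    prev="$id"
  done
}

# Example (not executed here); IDs are listed oldest to newest:
#   adjacent_pairs 111 222 333 | while read -r old new; do
#     iceberg-meta diff prod.customers "$old" "$new" </dev/null
#   done
```

Redirecting stdin from /dev/null inside the loop keeps the diff command from consuming the remaining pairs on the pipe.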

Scripting and automation

Pipe machine-readable output into alerts or dashboards:

FILE_COUNT=$(iceberg-meta -o json summary db.events | jq '.file_count')
[ "$FILE_COUNT" -gt 1000 ] && echo "Small file problem detected"

iceberg-meta -o csv snapshots db.events > snapshots.csv
iceberg-meta -o json health db.events | jq '.[] | select(.Section == "Column Nulls")'
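
One caveat with the threshold check above: if jq finds no file_count key it prints null, the numeric test then errors out, and the alert silently never fires. The sketch below validates the value before comparing it. The names is_uint and check_file_count are hypothetical, and the limit of 1000 is only the example value used above.

```shell
# True if the argument is a non-negative integer.
is_uint() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Return 0 if COUNT is within LIMIT, 1 if it exceeds LIMIT,
# and 2 if COUNT is not a usable number (for example "null").
check_file_count() {
  count="$1"
  limit="$2"
  if ! is_uint "$count"; then
    echo "unusable file count: '$count'" >&2
    return 2
  fi
  if [ "$count" -gt "$limit" ]; then
    echo "file count $count exceeds $limit"
    return 1
  fi
  echo "file count $count within $limit"
}

# Example (not executed here):
#   check_file_count "$(iceberg-meta -o json summary db.events | jq '.file_count')" 1000
```

Distinct exit codes let an alerting wrapper tell "too many files" apart from "could not read the metric".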

See it in action

$ iceberg-meta summary sales.orders

┌─────────────────────────────────────────────────────────────────┐
│                     sales.orders  Summary                       │
├──────────────────────┬──────────────────────────────────────────┤
│ Format version       │ 2                                        │
│ Total snapshots      │ 4                                        │
│ Total data files     │ 1                                        │
│ Total records        │ 15                                       │
│ Total size           │ 5.2 KB                                   │
│ Partition spec       │ region (identity)                        │
├──────────────────────┴──────────────────────────────────────────┤
│ Recent Operations                                               │
│  overwrite   2025-02-28 19:04    +15 rows   -60 rows            │
│  append      2025-02-28 19:04    +20 rows   -0 rows             │
│  append      2025-02-28 19:04    +15 rows   -0 rows             │
└─────────────────────────────────────────────────────────────────┘

$ iceberg-meta schema sales.customers --history

Schema 0  (initial)
├── customer_id: long
├── name: string
└── email: string

Schema 1  (+2 fields)
├── phone: string          ← added
└── signup_date: date      ← added

Schema 2  (1 rename)
└── email_address: string  ← renamed from email

TUI Keybindings

Key   Action
1-7   Switch tabs (Summary, Snapshots, Schema, Files, Manifests, Health, Tree)
d     Diff two snapshots
s     Snapshot detail
r     Refresh
?     Help
q     Quit

Configuration

Run iceberg-meta init for interactive setup, or create ~/.iceberg-meta.yaml manually. Credentials use ${VAR} placeholders resolved from the environment -- secrets never touch disk.

See docs/configuration.md for full details, environment variable overrides, and .env file support.
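
Because ${VAR} placeholders are resolved from the environment, a variable that is missing or not exported is an easy mistake to make before running doctor. The sketch below lists which of the named variables are absent from the exported environment. missing_env is a hypothetical helper, the variable name in the example is a placeholder rather than one iceberg-meta requires, and printenv is assumed to be available (it is on common Linux and macOS systems).

```shell
# Print the name of every listed variable that is unset, empty,
# or not exported (and therefore invisible to child processes).
missing_env() {
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      printf '%s\n' "$name"
    fi
  done
}

# Example (not executed here); MY_CATALOG_TOKEN is a placeholder name:
#   missing=$(missing_env MY_CATALOG_TOKEN)
#   [ -z "$missing" ] && iceberg-meta doctor
```

Checking with printenv rather than a plain shell expansion catches variables that are set in the current shell but never exported.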

License

MIT
