CLI tool for exploring Apache Iceberg table metadata

iceberg-meta

CLI and TUI for exploring Apache Iceberg table metadata. Lightweight, terminal-native, and scriptable -- inspect snapshots, schemas, manifests, data files, partition health, and column-level statistics without spinning up a Spark shell or writing a notebook.

Why iceberg-meta?

Iceberg tables store rich metadata -- schema evolution history, snapshot lineage, manifest-level statistics, column bounds, and more. But accessing any of it usually means writing PySpark code or digging through Avro files by hand.

iceberg-meta gives you instant access to all of it from the terminal:

One command to see everything: iceberg-meta health sales.orders shows file sizes, small-file warnings, partition skew, column null rates, column storage distribution, and value bounds -- all at once.
Interactive exploration: the TUI lets you browse tables, schemas, and snapshots visually without memorizing flags.
Scriptable output: every command supports --output json and --output csv for CI/CD pipelines and alerting.
Zero infrastructure: works with any catalog pyiceberg supports (SQL, REST, Glue, Hive, Nessie, Hadoop).

Install

pip install iceberg-meta

# With the interactive TUI (optional)
pip install iceberg-meta[tui]

Quick Start

# 1. Configure — picks your catalog type, writes ~/.iceberg-meta.yaml
#    with ${VAR} placeholders (secrets stay in the environment)
iceberg-meta init

# 2. Verify — checks config, env vars, and catalog connectivity
iceberg-meta doctor

# 3. Explore
iceberg-meta list-tables
iceberg-meta summary sales.orders
iceberg-meta health sales.orders
iceberg-meta tree sales.orders

Or launch the interactive TUI to browse everything visually:

iceberg-meta tui

See the quickstart/ folder for a guided walkthrough with Docker.

Commands

Command	Description
`init`	Interactive config setup -- catalog presets, `${VAR}` placeholders, connection test
`doctor`	Validate config file, environment variables, and catalog connectivity
`list-tables`	Discover namespaces and tables
`summary <table>`	Single-screen dashboard: row counts, file counts, recent operations
`health <table>`	Comprehensive health report: file sizes, small-file detection, partition skew, column null rates, column sizes, column bounds
`table-info <table>`	Format version, UUID, location, schema, partition spec, properties
`snapshots <table>`	All snapshots with timestamps, operations, summary (`--watch N`)
`schema <table>`	Current schema as a tree (`--history` for all versions with diffs)
`manifests <table>`	Manifest files for current or specified snapshot
`files <table>`	Data files with sizes, row counts, format
`partitions <table>`	Partition statistics
`snapshot-detail <table> <id>`	Deep dive into one snapshot: manifests + files
`diff <table> <snap1> <snap2>`	What changed between two snapshots
`tree <table>`	Full metadata hierarchy as a tree (`--all-snapshots`)
`tui`	Interactive terminal UI -- browse tables and metadata visually

Use Cases

Pre-merge validation in CI/CD

Before merging a pipeline PR, confirm the write actually produced the expected outcome:

iceberg-meta summary staging.orders          # row count, file count, latest snapshot
iceberg-meta diff staging.orders $OLD $NEW   # what changed between two snapshots
iceberg-meta -o csv files staging.orders     # pipe file-level stats into a check script
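
A check script for the CSV output of the files command could look like the minimal sketch below. It assumes the CSV has a header row and a per-file row-count column; the column name `record_count` and the helper name `find_empty_files` are hypothetical, so match them to the real header before relying on this.

```python
import csv
import io


def find_empty_files(csv_text: str, count_column: str = "record_count") -> list[dict]:
    """Return CSV rows whose row-count column is zero, missing, or not a number.

    `count_column` is an assumed header name, not one confirmed by iceberg-meta.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    empty = []
    for row in reader:
        raw = (row.get(count_column) or "").strip()
        # Treat anything that is not a positive integer as a problem row.
        if not raw.isdigit() or int(raw) == 0:
            empty.append(row)
    return empty
```

In a CI step, the script would read the piped CSV from standard input, call `find_empty_files`, and exit non-zero when the returned list is not empty.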

Debugging failing writes

A Spark job "succeeded" but downstream dashboards are empty. Quickly narrow the problem:

iceberg-meta snapshots staging.orders        # did the snapshot actually land?
iceberg-meta schema staging.orders --history # did a schema evolution break compatibility?
iceberg-meta files staging.orders            # are the new data files present and non-empty?

Monitoring table health

Spot small-file problems, partition skew, and compaction needs before they impact query performance:

iceberg-meta health warehouse.events         # full health report in one command

The health report includes:

File health: min/avg/median/max sizes, small-file warnings (< 32 MB)
Delete files: data vs delete manifest counts, compaction recommendations
Partition skew: per-partition file counts and row counts with skew detection
Column null rates: percentage of nulls per column, color-coded by severity
Column sizes: storage distribution with bar charts showing which columns are largest
Column bounds: min/max values per column from file-level statistics
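
To make the file-health and skew items above concrete, here is a rough sketch of how such numbers can be derived from per-file sizes and per-partition row counts. It is not the tool's implementation: the function names are hypothetical, 32 MB is taken to mean 32 × 1024 × 1024 bytes, and the skew measure (largest partition divided by the mean) is just one common choice.

```python
from statistics import mean, median

# Assumption: "32 MB" interpreted as 32 * 1024 * 1024 bytes.
SMALL_FILE_BYTES = 32 * 1024 * 1024


def file_size_summary(sizes: list[int]) -> dict:
    """Summarize data-file sizes in bytes and count files under the threshold."""
    if not sizes:
        return {"count": 0, "small_files": 0}
    return {
        "count": len(sizes),
        "min": min(sizes),
        "avg": mean(sizes),
        "median": median(sizes),
        "max": max(sizes),
        "small_files": sum(1 for s in sizes if s < SMALL_FILE_BYTES),
    }


def partition_skew(rows_per_partition: dict[str, int]) -> float:
    """Ratio of the largest partition's row count to the mean row count.

    A value of 1.0 means rows are spread evenly; larger values mean more skew.
    """
    counts = list(rows_per_partition.values())
    if not counts or sum(counts) == 0:
        return 0.0
    return max(counts) / mean(counts)
```

A table with many files below the threshold, or a skew ratio well above 1.0, is the kind of signal that usually prompts compaction or a partitioning review.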

Live monitoring

Watch for new snapshots as a pipeline runs:

iceberg-meta snapshots warehouse.events --watch 5  # refresh every 5 seconds

Onboarding and knowledge transfer

A new team member needs to understand the data platform. The TUI lets them browse interactively without memorizing commands:

iceberg-meta tui

Incident response

Production data looks wrong. Compare snapshots to find when the issue was introduced:

iceberg-meta snapshots prod.customers        # find the suspicious snapshot IDs
iceberg-meta diff prod.customers 111 222     # compare record counts and file changes
iceberg-meta tree prod.customers             # drill into manifests and data files
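
Conceptually, comparing two snapshots comes down to comparing the sets of data files each one references. The sketch below illustrates that idea only; it is not how iceberg-meta computes its diff, and the function name and input shape (a mapping of file path to record count per snapshot) are assumptions made for the example.

```python
def diff_file_sets(old_files: dict[str, int], new_files: dict[str, int]) -> dict:
    """Compare two snapshots given as {data_file_path: record_count} mappings."""
    added = sorted(set(new_files) - set(old_files))
    removed = sorted(set(old_files) - set(new_files))
    return {
        "added": added,
        "removed": removed,
        # Net change in total records between the two snapshots.
        "record_delta": sum(new_files.values()) - sum(old_files.values()),
    }
```

A large negative `record_delta` with many removed files and few added ones is a typical fingerprint of an overwrite that dropped data.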

Scripting and automation

Pipe machine-readable output into other tools:

# Alert if file count exceeds threshold
FILE_COUNT=$(iceberg-meta -o json summary db.events | jq '.file_count')
[ "$FILE_COUNT" -gt 1000 ] && echo "Small file problem detected"

# Export snapshot history to CSV for a report
iceberg-meta -o csv snapshots db.events > snapshots.csv

# Health data as JSON for a monitoring dashboard
iceberg-meta -o json health db.events | jq '.[] | select(.Section == "Column Nulls")'
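
The same filtering can be done from Python when jq is not available. This sketch assumes, based on the jq filter above, that the health JSON is a top-level array of objects carrying a `Section` key; the helper names are hypothetical and the other fields in each row are not documented here.

```python
import json
import subprocess


def fetch_health_json(table: str) -> str:
    """Run the health command and return its JSON output as text."""
    result = subprocess.run(
        ["iceberg-meta", "-o", "json", "health", table],
        capture_output=True,
        text=True,
        check=True,  # raise if the CLI exits non-zero
    )
    return result.stdout


def rows_for_section(health_json: str, section: str) -> list[dict]:
    """Keep only the rows whose "Section" field equals `section`."""
    rows = json.loads(health_json)
    return [row for row in rows if row.get("Section") == section]
```

For example, `rows_for_section(fetch_health_json("db.events"), "Column Nulls")` would return the same rows as the jq pipeline.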

TUI

The interactive TUI (iceberg-meta tui) covers nearly all CLI functionality in a single screen. Press ? inside the TUI for a full keybinding reference.

Key	Tab / Action
`1`	Summary -- table overview, recent operations, file health indicators
`2`	Snapshots -- snapshot history with operations and summary
`3`	Schema -- schema evolution with diffs between versions
`4`	Files -- data files with size distribution stats (min/avg/median/max)
`5`	Manifests -- manifest files for current snapshot
`6`	Health -- file sizes, partition skew, column nulls, column sizes, column bounds
`7`	Tree -- full metadata hierarchy (snapshot > manifest list > manifests > files)
`d`	Diff -- compare two snapshots (modal)
`s`	Detail -- snapshot deep-dive (modal)
`r`	Refresh all panels
`?`	Help screen with all keybindings and CLI equivalents
`q`	Quit

The sidebar always shows the namespace/table tree (equivalent to list-tables).

CLI-only features: init (interactive config setup), snapshots --watch N (live-watch mode), table-info (UUID, properties, partition spec), partitions (basic table view), --output json|csv (machine-readable output).

Configuration

Config file with ${VAR} placeholders

Create ~/.iceberg-meta.yaml. Values wrapped in ${VAR} are resolved from the environment at runtime -- never hard-code credentials:

default_catalog: production

catalogs:
  production:
    type: glue
    warehouse: ${ICEBERG_WAREHOUSE}
    s3.region: ${AWS_REGION}

  staging:
    type: sql
    uri: ${ICEBERG_CATALOG_URI}
    warehouse: ${ICEBERG_WAREHOUSE}
    s3.endpoint: ${S3_ENDPOINT}
    s3.access-key-id: ${AWS_ACCESS_KEY_ID}
    s3.secret-access-key: ${AWS_SECRET_ACCESS_KEY}
    s3.region: ${AWS_REGION}

See examples/iceberg-meta.yaml for configs covering Glue, REST, Nessie, Hive, and Hadoop catalogs.
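
For readers curious how `${VAR}` resolution can work, here is a minimal sketch of the general technique. It is not taken from the project's catalog.py: the function name is hypothetical, and raising an error on an unset variable is a design choice made for this example (the tool may behave differently).

```python
import os
import re

# Matches ${NAME} where NAME is a conventional environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_placeholders(value, env=None):
    """Recursively replace ${VAR} placeholders in strings, dicts, and lists."""
    env = os.environ if env is None else env
    if isinstance(value, str):
        def _substitute(match):
            name = match.group(1)
            if name not in env:
                raise KeyError(f"environment variable {name} is not set")
            return env[name]
        return _PLACEHOLDER.sub(_substitute, value)
    if isinstance(value, dict):
        return {key: expand_placeholders(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_placeholders(item, env) for item in value]
    # Numbers, booleans, and None pass through unchanged.
    return value
```

Because substitution happens at load time, the file on disk only ever contains placeholder names, which is what keeps secrets out of version control.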

Environment variable overrides

These override any config file value without needing ${VAR} syntax:

Variable	Maps to
`ICEBERG_META_CATALOG_URI`	`uri`
`ICEBERG_META_WAREHOUSE`	`warehouse`
`ICEBERG_META_S3_ENDPOINT`	`s3.endpoint`
`ICEBERG_META_S3_ACCESS_KEY`	`s3.access-key-id`
`ICEBERG_META_S3_SECRET_KEY`	`s3.secret-access-key`
`ICEBERG_META_S3_REGION`	`s3.region`
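
The override table can be read as a simple mapping applied on top of the resolved catalog settings. The sketch below illustrates that reading; the function name is hypothetical, and treating an empty variable as unset is an assumption rather than documented behavior.

```python
import os

# Mapping copied from the table above: environment variable -> config key.
ENV_OVERRIDES = {
    "ICEBERG_META_CATALOG_URI": "uri",
    "ICEBERG_META_WAREHOUSE": "warehouse",
    "ICEBERG_META_S3_ENDPOINT": "s3.endpoint",
    "ICEBERG_META_S3_ACCESS_KEY": "s3.access-key-id",
    "ICEBERG_META_S3_SECRET_KEY": "s3.secret-access-key",
    "ICEBERG_META_S3_REGION": "s3.region",
}


def apply_env_overrides(catalog_config: dict, env=None) -> dict:
    """Return a copy of the catalog config with any set override variables applied."""
    env = os.environ if env is None else env
    merged = dict(catalog_config)
    for variable, key in ENV_OVERRIDES.items():
        # Assumption: an empty value counts as "not set".
        if env.get(variable):
            merged[key] = env[variable]
    return merged
```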

Interactive setup

iceberg-meta init

Environment variables

iceberg-meta uses python-dotenv to auto-load a .env file from your working directory:

No source or export needed -- just place a standard .env file in your project
Use any variable names you already have -- reference them in your config with ${MY_VAR}
Point to a specific file if your .env is elsewhere: iceberg-meta --env-file path/to/.env
Already-exported shell variables take precedence over .env values (standard dotenv behavior)

Data engineers who already have AWS credentials or catalog URIs in their environment don't need a .env file at all -- the ${VAR} placeholders in the config resolve against whatever is already set.

Global Options

Option	Description
`--catalog, -c`	Catalog name (as defined in config)
`--uri`	Catalog URI override
`--warehouse, -w`	Warehouse path override
`--output, -o`	Output format: `table` (default), `json`, `csv`
`--env-file, -e`	Path to `.env` file (auto-loads `.env` in cwd by default)

Architecture

┌─────────────────────────────────────────────────────────┐
│                    iceberg-meta CLI                      │
│                      (Typer app)                         │
├──────────────┬──────────────────────┬───────────────────┤
│  catalog.py  │    formatters.py     │     utils.py      │
│              │                      │                   │
│  Config      │  Rich Tables/Trees   │  format_bytes()   │
│  resolution  │  for each command    │  format_time()    │
│  + ${VAR}    │                      │  truncate_path()  │
├──────────────┴──────────────────────┴───────────────────┤
│                    pyiceberg                             │
│          (catalog, table, inspect APIs)                  │
├─────────────────────────────────────────────────────────┤
│         Any Iceberg Catalog (SQL, REST, Glue, Hive)     │
├─────────────────────────────────────────────────────────┤
│              S3 / MinIO / HDFS / Local Storage           │
│          (Parquet data + Avro metadata)                  │
└─────────────────────────────────────────────────────────┘

Project Layout

iceberg-meta/
│
├── src/iceberg_meta/      PyPI package source (what gets published)
│   ├── catalog.py         Config resolution + ${VAR} expansion
│   ├── cli.py             Typer commands
│   ├── formatters.py      Rich table / tree renderers + health analysis
│   ├── output.py          JSON, CSV, Rich table output
│   ├── utils.py           Byte / timestamp formatting helpers
│   └── tui/               Interactive terminal UI (optional)
│
├── dev/                   Development & testing
│   ├── .env.example       Environment template (credentials, endpoints)
│   ├── docker-compose.yml MinIO + seed containers
│   ├── docker/            MinIO & seed Dockerfiles
│   ├── tests/             pytest suite (integration + unit)
│   ├── scripts/           Host-side seed script
│   └── DEMO.md            Step-by-step dev walkthrough
│
├── quickstart/            End-user sandbox ("pip install and go")
│   ├── .env.example       Credentials template
│   ├── docker-compose.yml MinIO only (lightweight)
│   ├── iceberg-meta.yaml  Config with ${VAR} placeholders
│   ├── seed.py            Sample data creator
│   └── README.md          Getting-started guide
│
├── examples/              Sample configs for real catalogs
│   └── iceberg-meta.yaml  Glue, REST, Nessie, Hive, Hadoop templates
│
├── pyproject.toml         Package definition
├── Makefile               Dev commands (make test, make lint, ...)
└── LICENSE                MIT license

Development

Requires uv for dependency management.

# First-time setup
make install
make setup          # copies dev/.env.example → .env

# Start infrastructure and seed data
make infra-up
make seed

# Development workflow
make lint         # ruff check
make format       # ruff format
make typecheck    # mypy
make test         # pytest
make test-cov     # pytest with coverage
make all          # lint + format + typecheck + test

# Build & clean up
make build        # build sdist + wheel
make clean        # remove build artifacts
make infra-down   # stop & remove containers

See dev/README.md for the full contributor guide and dev/DEMO.md for a step-by-step walkthrough.

Project details

Maintainers

Mandla

Release history

Version	Released
0.2.5	Mar 1, 2026
0.2.4	Mar 1, 2026
0.2.3	Feb 28, 2026
0.2.2	Feb 28, 2026
0.2.1	Feb 28, 2026
0.2.0	Feb 28, 2026
0.1.1 (this version)	Feb 28, 2026

Download files

Source Distribution

iceberg_meta-0.1.1.tar.gz (46.7 kB)

Uploaded Feb 28, 2026 Source

Built Distribution

iceberg_meta-0.1.1-py3-none-any.whl (45.4 kB)

Uploaded Feb 28, 2026 Python 3

File details

Details for the file iceberg_meta-0.1.1.tar.gz.

File metadata

Download URL: iceberg_meta-0.1.1.tar.gz
Upload date: Feb 28, 2026
Size: 46.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.10.7

File hashes

Hashes for iceberg_meta-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`b4157d72580858f809874dd1e83885cb8bf64153327c8a026297452718f19dbb`
MD5	`097ce1daeeeb9f878e47e631a1be232c`
BLAKE2b-256	`c524f185bafe546c3c6bd3df213e40060b088c467cc048f909c36a2def7bc8eb`

File details

Details for the file iceberg_meta-0.1.1-py3-none-any.whl.

File metadata

Download URL: iceberg_meta-0.1.1-py3-none-any.whl
Upload date: Feb 28, 2026
Size: 45.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.10.7

File hashes

Hashes for iceberg_meta-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`3c035f5e8414c16ef9abdf39f2bbbffe5e2981a2a4a1cad61082255a6f60705e`
MD5	`4bbc5094b8265c549f91cbaf6f3624c4`
BLAKE2b-256	`790ef981ba58e4e60c1641378b56400a8aa3f0718aa3eed52121c6b515049947`
