Prunr

Scan your Elasticsearch / OpenSearch cluster. Find dead indices, wasted storage, and cost savings.

Prunr is a read-only CLI tool that connects to your cluster, analyzes index usage patterns, and tells you exactly what you can safely delete, shrink, or archive — with confidence levels and estimated savings.

Prunr never writes to your cluster. All operations are read-only: it calls only GET / (version check), _cat/indices, _stats, and _cluster/health.


Install

pip install prunr

Or with pipx (recommended):

pipx install prunr

Requires Python 3.11+.

Quick start

# Scan a cluster
prunr scan --host http://localhost:9200

# With authentication
prunr scan --host https://my-cluster:9200 --user admin --password secret

# Save JSON report
prunr scan --host http://localhost:9200 --format json --output report.json

# Try it without a cluster (built-in demo data)
prunr scan --demo
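
For scheduled or scripted scans, the same commands can be driven from Python. The following is a minimal sketch, assuming `prunr` is on the PATH and exits non-zero on failure (the exit-code behavior is an assumption, not documented here). The helper names `build_scan_command` and `run_scan` are hypothetical; only the flags shown in the quick start are used.

```python
import json
import subprocess
from pathlib import Path
from typing import Any


def build_scan_command(host: str, output: str) -> list[str]:
    """Build the argument list for a JSON scan, using only documented flags."""
    return ["prunr", "scan", "--host", host, "--format", "json", "--output", output]


def run_scan(host: str, output: str = "report.json") -> Any:
    """Run the scan and return the parsed JSON report.

    Assumes prunr is installed; check=True raises CalledProcessError
    if the command exits with a non-zero status.
    """
    subprocess.run(build_scan_command(host, output), check=True)
    return json.loads(Path(output).read_text(encoding="utf-8"))


# Example (requires prunr and a reachable cluster):
# report = run_scan("http://localhost:9200")
```

Passing the arguments as a list avoids shell quoting issues with hosts or file names that contain special characters.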

Sample output

╭──────────────────────────────────────────────────────────────────────────────╮
│ PRUNR CLUSTER REPORT                                                         │
│ Cluster: prod-logging-cluster  |  26 indices  |  1024.3 GB  |  12 nodes      │
│ Scanned: 2026-04-13 01:24:38 UTC                                             │
╰──────────────────────────────────────────────────────────────────────────────╯

╭─────────────────────────────── Cost Overview ────────────────────────────────╮
│   ESTIMATED MONTHLY COST:  $102                                              │
│   ESTIMATED ANNUAL WASTE:  $167                                              │
╰──────────────────────────────────────────────────────────────────────────────╯

TOP RECOMMENDATIONS (by annual savings)

  1. REVIEW  api-access-* (multiple patterns)  (90.3 GB)
     Confidence: LOW
     Annual savings: $54
     Reasoning:
       ⚠ Service 'api-access' appears in 2 different index patterns
       ⚠ These may contain overlapping data from multiple pipelines

  2. DELETE  filebeat-2024.01  (26.1 GB)
     Confidence: MEDIUM
     Annual savings: $31
     Reasoning:
       ⚠ 0 queries since cluster restart
       ⚠ 0 indexing activity

  3. DELETE  debug-traces-2024.08.15  (11.5 GB)
     Confidence: HIGH
     Annual savings: $14
     Reasoning:
       ✓ 0 queries since cluster restart
       ✓ 0 indexing activity
       ✓ Index name matches low-value pattern: debug-traces-2024.08.15

A full sample JSON report is in examples/sample-report.json.

What Prunr detects

Analyzer      What it finds                                                         Action
Dead index    Indices with 0 queries and 0 writes since last restart                DELETE
Retention     Date-based index series where old indices are never queried           REDUCE RETENTION
Storage hog   Largest indices by size; surfaces the biggest saving opportunities    REVIEW
Duplicate     Multiple versioned indices (e.g. logs-v1, logs-v2) that may overlap   REVIEW
Shard         Over-sharded indices (many tiny shards wasting cluster memory)        MERGE SHARDS
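
To make the dead-index rule concrete, here is a minimal sketch of how zero-query, zero-write indices could be picked out of a `GET /_stats` response. It is not Prunr's actual implementation. It assumes the standard per-index counters `total.search.query_total` and `total.indexing.index_total`; the function name `find_dead_indices` and the sample data (including the index name `api-access-v2`) are hypothetical.

```python
from typing import Any


def find_dead_indices(stats: dict[str, Any]) -> list[str]:
    """Return names of indices with zero searches and zero indexing operations.

    `stats` is the parsed JSON body of GET /_stats. Counters are cumulative
    since the last node restart, so "dead" only means "idle since then".
    """
    dead = []
    for name, entry in stats.get("indices", {}).items():
        total = entry.get("total", {})
        queries = total.get("search", {}).get("query_total")
        writes = total.get("indexing", {}).get("index_total")
        if queries is None or writes is None:
            continue  # Missing counters: treat as unknown, not as dead.
        if queries == 0 and writes == 0:
            dead.append(name)
    return sorted(dead)


# Illustrative data only, shaped like a trimmed _stats response.
sample = {
    "indices": {
        "filebeat-2024.01": {
            "total": {"search": {"query_total": 0}, "indexing": {"index_total": 0}}
        },
        "api-access-v2": {
            "total": {"search": {"query_total": 1842}, "indexing": {"index_total": 0}}
        },
    }
}
print(find_dead_indices(sample))
```

Indices with missing counters are skipped rather than flagged, so an incomplete response cannot produce a false DELETE candidate.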

Each recommendation includes:

  • Confidence level: HIGH, MEDIUM, or LOW
  • Evidence: specific metrics (query count, indexing rate, size, pattern matches)
  • Estimated annual savings in USD (based on --cost-per-gb, default $0.10/GB/mo)
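
The flat-rate arithmetic behind the savings figure can be sketched as size times monthly rate times 12. This is an assumption inferred from the sample output, not a confirmed formula: it reproduces the two DELETE entries above (26.1 GB and 11.5 GB) after rounding, while the REVIEW entry shows a smaller figure and presumably applies a further adjustment not described here. The function name is hypothetical.

```python
def annual_savings_usd(size_gb: float, cost_per_gb_month: float = 0.10) -> float:
    """Estimate yearly storage cost of an index at a flat $/GB/month rate.

    The default rate mirrors the documented --cost-per-gb default.
    """
    return size_gb * cost_per_gb_month * 12


for size in (26.1, 11.5):
    print(f"{size} GB -> ${round(annual_savings_usd(size))} per year")
```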

Confidence levels

Level    Meaning
HIGH     Strong evidence; safe to act on after quick verification
MEDIUM   Likely waste; worth investigating
LOW      Possible issue; needs human review before acting

CLI reference

prunr scan [OPTIONS]

Options:
  --host TEXT               ES/OpenSearch URL (e.g. https://localhost:9200)
  --api-key TEXT            API key for authentication
  --user TEXT               Username for basic auth
  --password TEXT           Password for basic auth
  --cost-per-gb FLOAT       Cost per GB per month in USD (default: 0.10)
  --format [terminal|json]  Output format (default: terminal)
  --output TEXT             Save report to file
  --no-verify-certs         Skip TLS certificate verification
  --demo                    Run with built-in sample data (no cluster needed)

Safety

Prunr is strictly read-only. It will never:

  • Delete, close, or modify any index
  • Change cluster settings
  • Write data to your cluster
  • Require any write permissions

The only APIs it calls are:

  • GET / (version check)
  • GET /_cat/indices
  • GET /_stats
  • GET /_cluster/health

You can safely run it against production clusters.
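
To see what data these endpoints expose before running a scan, the same four GET requests can be issued by hand. The sketch below uses only the Python standard library and assumes an unauthenticated cluster. The `?format=json` parameter on `_cat/indices` is an assumption made here for easier parsing; the exact query parameters Prunr sends are not documented in this README. The names `READ_ONLY_PATHS`, `build_urls`, and `fetch_json` are hypothetical.

```python
import json
import urllib.request
from typing import Any

# The four read-only endpoints listed above.
READ_ONLY_PATHS = ["/", "/_cat/indices?format=json", "/_stats", "/_cluster/health"]


def build_urls(host: str) -> list[str]:
    """Join the cluster base URL with each read-only path."""
    base = host.rstrip("/")
    return [base + path for path in READ_ONLY_PATHS]


def fetch_json(url: str, timeout: float = 10.0) -> Any:
    """Issue a single GET request and parse the JSON response body."""
    request = urllib.request.Request(url, method="GET")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example (requires a reachable cluster):
# for url in build_urls("http://localhost:9200"):
#     print(url, type(fetch_json(url)).__name__)
```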

Limitations

  • Query and indexing counts reset on node restart — Prunr can only see activity since the last restart
  • Cost estimates use a flat $/GB/month rate; real costs depend on instance types, reserved pricing, and I/O
  • Duplicate detection is heuristic (name-pattern based), not content-based
  • Currently supports a single cluster per scan

Development

git clone https://github.com/rickyruima/prunr.git
cd prunr
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
pytest -v

License

Apache-2.0 — see LICENSE for details.
