Read-only, metadata-only cloud storage audit for stale and duplicate objects (AWS S3, GCS, Azure Blob).

PendragonDI Cloud Audit

Find stale, oversized, or duplicate files in cloud storage — without touching your data.

PendragonDI Cloud Audit is a lightweight command-line tool that helps you uncover hidden cost drivers in your cloud storage buckets:

  • Unused or stale files
  • Duplicate objects
  • Oversized resources
  • Cold data that hasn’t been touched in months

It runs entirely locally using your own credentials. No file content is ever read. No objects are ever modified.


🔍 Why It Exists

Most teams store more than they realize—and rarely clean up:

  • ☁️ Cloud object stores are treated like infinite file systems
  • 💸 Many orgs pay for millions of forgotten or duplicate files
  • 🧱 Native tools are clunky, slow, or deeply integrated with billing

PendragonDI Cloud Audit gives you a fast, metadata-only snapshot of wasteful storage across AWS S3, Google Cloud Storage, and Azure Blob—before it shows up on your invoice.


✅ Features

  • 🔍 Identifies stale files based on last-modified timestamp
  • 🪞 Detects potential duplicates using file size, name, and timestamp
  • 🧾 Outputs clean, readable HTML or CSV reports
  • 💡 Estimates storage cost impact
  • 🧪 Supports limit-based sampling for fast iteration
  • 🔐 Operates with your credentials — no external access required
  • 🔒 Never reads, moves, or deletes content

🛠️ Installation

Core CLI only (no provider):

pip install pendragondi-cloud-audit

With a provider:

pip install "pendragondi-cloud-audit[aws]"
pip install "pendragondi-cloud-audit[gcs]"
pip install "pendragondi-cloud-audit[azure]"
pip install "pendragondi-cloud-audit[all]"  # for all providers

🚀 Quickstart

Run a scan:

pendragondi-cloud-audit scan aws my-bucket --days-stale 90 --output report.html

You can also limit the number of objects scanned:

pendragondi-cloud-audit scan gcs my-bucket --days-stale 60 --limit 10000 --output audit.csv
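One way to picture what a limit does is as a cap on a lazy object listing. The sketch below is an assumption about the general technique, not the tool's code: `sample_listing` and `fake_listing` are hypothetical names, and the real CLI may apply `--limit` differently.

```python
from itertools import islice


def sample_listing(listing, limit=None):
    """Yield at most `limit` items from a lazy listing; None means no cap."""
    return islice(listing, limit)


def fake_listing(total):
    """Stand-in for a provider listing; yields keys one at a time."""
    for i in range(total):
        yield f"logs/part-{i:05d}.json"


# Only the first three keys are ever generated, so the cap keeps the scan short.
sampled = list(sample_listing(fake_listing(1_000_000), limit=3))
```

A cap like this takes the first objects in listing order, which is typically key order, so the sample is quick to collect but not random.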

📄 Example Report

Total Files: 3200 • Stale: 1800 • Duplicates: 400

Results can be opened in any browser or spreadsheet tool.
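As a rough illustration of how a summary line and a cost estimate could be derived from scan totals, here is a small sketch. The function names are hypothetical, and the default rate of 0.023 USD per GiB-month is an illustrative placeholder rather than a quoted price; the tool's own pricing inputs are not documented here.

```python
def summarize(total, stale, duplicates):
    """Format scan totals in the same shape as the example report line."""
    return f"Total Files: {total} • Stale: {stale} • Duplicates: {duplicates}"


def estimate_monthly_cost(total_bytes, usd_per_gib_month=0.023):
    """Estimate monthly storage cost from a byte count (assumed flat rate)."""
    gib = total_bytes / (1024 ** 3)
    return round(gib * usd_per_gib_month, 2)
```

Real pricing varies by provider, region, and storage class, so any flat rate should be treated as a first approximation.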


🧰 Supported Providers

Provider | Install Extra | Credential Method
AWS S3   | aws           | Boto3 profile / ENV
GCS      | gcs           | Application Default / keyfile
Azure    | azure         | Connection string or container URL
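Before running a scan, it can help to check which credential-related environment variables are set. The sketch below uses variable names that are conventional for each provider's tooling; whether PendragonDI Cloud Audit reads these exact variables is an assumption, and `credential_hints` is a hypothetical helper.

```python
import os

# Conventional environment variables per provider (assumed, not confirmed
# against the tool's source).
CREDENTIAL_HINTS = {
    "aws": ("AWS_PROFILE", "AWS_ACCESS_KEY_ID"),
    "gcs": ("GOOGLE_APPLICATION_CREDENTIALS",),
    "azure": ("AZURE_STORAGE_CONNECTION_STRING",),
}


def credential_hints(provider, environ=None):
    """Return the names of known credential variables that are set."""
    env = os.environ if environ is None else environ
    return [name for name in CREDENTIAL_HINTS[provider] if env.get(name)]
```

An empty result does not prove credentials are missing, since profiles and default credentials can also live in local config files.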

🔐 Security & Compliance

PendragonDI Cloud Audit was built for zero-risk analysis:

Layer          | Behavior
Access         | Uses your own credentials
Data Privacy   | Never reads file content
Write Behavior | Read-only (no writes)
Output         | Local CSV or HTML report
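To show what metadata-only access can look like in practice, here is a sketch for S3 that relies solely on list responses. It is not the tool's implementation: `iter_object_metadata` is a hypothetical function, and it expects a boto3-style client passed in by the caller.

```python
def iter_object_metadata(s3_client, bucket, prefix=""):
    """Yield (key, size, last_modified) using list calls only.

    No object body is requested, so file content never leaves the bucket.
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Empty buckets or prefixes return pages without a "Contents" key.
        for obj in page.get("Contents", []):
            yield obj["Key"], obj["Size"], obj["LastModified"]


# Example usage (requires boto3 and valid credentials):
#   import boto3
#   for key, size, modified in iter_object_metadata(boto3.client("s3"), "my-bucket"):
#       print(key, size, modified)
```

A listing like this generally needs only list permission on the bucket (`s3:ListBucket` on AWS), which is one way to enforce the read-only guarantee at the credential level.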

📜 License

MIT License


🧭 Why PendragonDI?

Cloud billing surprises happen when small inefficiencies scale. PendragonDI Cloud Audit was designed to help teams see storage drift before it gets expensive.

  • No dashboard logins.
  • No waiting on IT.
  • Just insight.

🤝 Contributing

We welcome contributions!

To contribute:

  • Fork this repo and work from main
  • Use type hints and docstrings
  • Submit focused pull requests
  • Report bugs or ideas via Issues

Questions or feedback? Email us: pendragondi@pendragondi.dev


💖 Support the Project

PendragonDI Cloud Audit is free and open-source. If this tool saved you time or money, consider supporting us on GitHub:

Sponsor on GitHub
