Track, store and query your PyPI package download statistics

pkgdb

Track, store, and analyze PyPI package download statistics.

Fetches download stats via the pypistats API, stores historical data in SQLite, and generates HTML reports with charts.

Installation

pip install pkgdb

To build from source (requires Python 3.10+; uses uv for dependency management):

uv sync

Usage

Configure packages

Create ~/.pkgdb/packages.json to list packages to track:

["my-package", "another-package"]

Or use an object with a published key:

{"published": ["my-package", "another-package"]}

Alternatively, use pkgdb add <package> to add packages individually. By default, packages are verified to exist on PyPI before adding. Use --no-verify to skip this check.
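
The two packages.json layouts above (a bare list, or an object with a published key) could be read with a small loader like the following. This is a minimal sketch, not pkgdb's actual code; the function name load_packages is hypothetical.

```python
import json
from pathlib import Path


def load_packages(path: Path) -> list[str]:
    """Return package names from either packages.json layout (hypothetical helper)."""
    data = json.loads(path.read_text(encoding="utf-8"))
    if isinstance(data, dict):
        # Object form: {"published": ["my-package", ...]}
        data = data.get("published", [])
    if not isinstance(data, list) or not all(isinstance(p, str) for p in data):
        raise ValueError(f"{path}: expected a list of package names")
    return data
```

Accepting both shapes in one place keeps the rest of the code working with a plain list of names.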

Commands

# Add a package to tracking (verifies it exists on PyPI)
pkgdb add <package-name>

# Add without verification (offline/bulk use)
pkgdb add <package-name> --no-verify

# Remove a package from tracking
pkgdb remove <package-name>

# Show tracked packages
pkgdb packages

# Import packages from a file (JSON or plain text)
pkgdb import packages.json

# Fetch latest stats from PyPI and store in database
# (skips packages already fetched in the last 24 hours)
pkgdb fetch

# Display stats in terminal (includes trend sparklines and growth %)
pkgdb show

# Show historical stats for a specific package
pkgdb history <package-name>

# Show history since a date (absolute or relative)
pkgdb history <package-name> --since 2024-01-01
pkgdb history <package-name> --since 7d   # last 7 days
pkgdb history <package-name> --since 2w   # last 2 weeks
pkgdb history <package-name> --since 1m   # last month (30 days)

# Generate HTML report with charts (opens in browser)
pkgdb report

# Generate detailed HTML report for a single package
pkgdb report <package-name>

# Include environment summary (Python versions, OS) in report
pkgdb report -e

# Export stats in various formats
pkgdb export -f csv      # CSV format (default)
pkgdb export -f json     # JSON format
pkgdb export -f markdown # Markdown table

# Show detailed stats for a package (Python versions, OS breakdown)
pkgdb stats <package-name>

# Show database info (size, record counts, date range)
pkgdb show --info

# Generate SVG badge for a package
pkgdb badge <package-name>

# Badge for monthly downloads
pkgdb badge <package-name> --period month

# Save badge to file
pkgdb badge <package-name> -o badge.svg

# Fetch GitHub repository stats (stars, forks, activity, language)
pkgdb github

# Sort GitHub stats by name or activity instead of stars
pkgdb github fetch --sort name

# Bypass GitHub cache and fetch fresh data
pkgdb github fetch --no-cache

# Show GitHub cache statistics
pkgdb github cache

# Clear expired GitHub cache entries (or --all for everything)
pkgdb github clear

# Fetch stats and generate report in one step
# (skips packages already fetched in the last 24 hours)
pkgdb update

# Fetch stats and generate report with environment summary
pkgdb update -e

# Include GitHub stats in fetch, report, or update
pkgdb fetch --github
pkgdb report --github
pkgdb update --github

# Clean up orphaned stats (for packages no longer tracked)
pkgdb cleanup

# Prune stats older than N days
pkgdb cleanup --days 365

# Sync packages from PyPI user account (initial or refresh)
pkgdb sync --user <pypi-username>

# Sync and remove packages no longer in user's PyPI account
pkgdb sync --user <pypi-username> --prune

# Show version
pkgdb version
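
To illustrate the three export formats listed above (CSV, JSON, Markdown), here is a minimal sketch that renders the same rows three ways. The column set and function names are assumptions chosen to mirror the package_stats fields; pkgdb's real export output may differ.

```python
import csv
import io
import json

# Assumed column order; taken from the package_stats fields for illustration only.
FIELDS = ["package_name", "total", "last_month", "last_week", "last_day"]


def export_csv(rows: list[dict]) -> str:
    """Render rows as CSV with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def export_json(rows: list[dict]) -> str:
    """Render rows as an indented JSON array."""
    return json.dumps(rows, indent=2)


def export_markdown(rows: list[dict]) -> str:
    """Render rows as a pipe-delimited Markdown table."""
    lines = [
        "| " + " | ".join(FIELDS) + " |",
        "| " + " | ".join("---" for _ in FIELDS) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(row[f]) for f in FIELDS) + " |")
    return "\n".join(lines) + "\n"
```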

Options

# Use custom database file
pkgdb -d custom.db fetch

# Verbose output (show debug messages)
pkgdb -v fetch

# Quiet mode (only show warnings/errors)
pkgdb -q fetch

# Specify output file for report
pkgdb report -o custom-report.html

# Generate report without opening browser (useful for automation)
pkgdb report --no-browser

# Limit history output to N days
pkgdb history my-package -n 14

# Show history since a specific date (or relative: 7d, 2w, 1m)
pkgdb history my-package --since 2024-01-01
pkgdb history my-package --since 7d

# Skip package verification on import
pkgdb import packages.json --no-verify

# Limit show output to top N packages
pkgdb show --limit 10

# Sort show output by field (total, month, week, day, growth, name)
pkgdb show --sort-by month

# Output show in JSON format
pkgdb show --json

# Export to file instead of stdout
pkgdb export -f json -o stats.json

Architecture

Modular CLI application with the following commands:

Package management:

  • add: Add a package to tracking
  • remove: Remove a package from tracking
  • packages: Show tracked packages with their added dates
  • import: Import packages from file (JSON or text)
  • sync: Sync packages from a PyPI user account (with optional --prune)

Data operations:

  • fetch: Fetch download stats from PyPI and store in SQLite (with -g for GitHub stats)
  • show: Display stats in terminal with trend sparklines and growth %
  • history: Show historical data for a specific package
  • stats: Show detailed breakdown (Python versions, OS) for a package
  • github: Fetch and display GitHub repository stats (stars, forks, activity, language)
  • export: Export stats in CSV, JSON, or Markdown format

Reporting:

  • report: Generate an HTML report with SVG charts. The -e flag adds a Python/OS summary, the -g flag adds GitHub stats (stars, forks, language, activity) to the table, and a package argument produces a detailed single-package report
  • badge: Generate shields.io-style SVG badge for a package
  • update: Run fetch then report in one step (supports -e for environment summary, -g for GitHub stats)
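
As a rough picture of what a shields.io-style badge involves, the sketch below builds a two-segment flat SVG. The function names, colors, and the 7-pixels-per-character width estimate are all assumptions made for illustration; the badge command's actual rendering is not shown in this document.

```python
from xml.sax.saxutils import escape


def format_count(n: int) -> str:
    """Abbreviate a download count, e.g. 12345 -> '12.3k' (hypothetical helper)."""
    for limit, suffix in ((1_000_000, "M"), (1_000, "k")):
        if n >= limit:
            return f"{n / limit:.1f}{suffix}"
    return str(n)


def make_badge(label: str, value: str, color: str = "#4c1") -> str:
    """Return a flat two-segment SVG badge (label on grey, value on color)."""
    # Crude width estimate: about 7px per character plus 10px padding.
    lw = 7 * len(label) + 10
    vw = 7 * len(value) + 10
    label, value = escape(label), escape(value)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{lw + vw}" height="20">'
        f'<rect width="{lw}" height="20" fill="#555"/>'
        f'<rect x="{lw}" width="{vw}" height="20" fill="{color}"/>'
        '<g fill="#fff" text-anchor="middle" '
        'font-family="Verdana,sans-serif" font-size="11">'
        f'<text x="{lw / 2}" y="14">{label}</text>'
        f'<text x="{lw + vw / 2}" y="14">{value}</text>'
        "</g></svg>"
    )
```

For example, make_badge("downloads", format_count(12345)) yields a badge reading "downloads | 12.3k".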

Maintenance:

  • cleanup: Remove orphaned stats and optionally prune old data
  • version: Show pkgdb version

Data flow

packages.json -> pypistats API -> SQLite (pkg.db) -> HTML/terminal output
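
The first two stages of this flow can be sketched as a function that takes a package list and a fetch callable and returns the recent download counts. The payload shape is assumed to follow the JSON that pypistats.recent(package, format="json") returns (a data object with last_day, last_week and last_month); the fetch callable is injected so the sketch runs without network access. The names parse_recent and collect are hypothetical.

```python
import json
from typing import Callable


def parse_recent(payload: str) -> dict[str, int]:
    """Extract recent download counts from an assumed pypistats-style JSON payload."""
    data = json.loads(payload)["data"]
    return {key: int(data[key]) for key in ("last_day", "last_week", "last_month")}


def collect(packages: list[str], fetch: Callable[[str], str]) -> dict[str, dict[str, int]]:
    """Fetch and parse stats for each package.

    In a live run, fetch might be: lambda name: pypistats.recent(name, format="json")
    (an assumption about usage, not pkgdb's confirmed implementation).
    """
    return {name: parse_recent(fetch(name)) for name in packages}
```

The resulting dictionary is the kind of per-package record that would then be written to SQLite and rendered as terminal or HTML output.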

Database schema

The package_stats table stores:

  • package_name: Package identifier
  • fetch_date: Date stats were fetched (YYYY-MM-DD)
  • last_day, last_week, last_month: Recent download counts
  • total: Total downloads (excluding mirrors)

The python_version_stats and os_stats tables cache environment data:

  • package_name: Package identifier
  • fetch_date: Date stats were fetched (YYYY-MM-DD)
  • category: Python version (e.g. "3.12") or OS name (e.g. "Linux")
  • downloads: Download count for that category

The fetch_attempts table tracks API requests:

  • package_name: Package identifier (primary key)
  • attempt_time: ISO timestamp of last fetch attempt
  • success: Whether the fetch succeeded (1) or failed (0)
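
A table like fetch_attempts is enough to implement a "skip if attempted recently" check. The sketch below is one possible reading: the column types, function names, and the choice to count failed attempts toward the 24-hour window are assumptions.

```python
import sqlite3
from datetime import datetime, timedelta

# Assumed DDL matching the three fields listed above.
FETCH_ATTEMPTS_SCHEMA = """
CREATE TABLE IF NOT EXISTS fetch_attempts (
    package_name TEXT PRIMARY KEY,
    attempt_time TEXT NOT NULL,
    success INTEGER NOT NULL
)
"""


def record_attempt(conn: sqlite3.Connection, package: str, success: bool, now: datetime) -> None:
    """Insert or overwrite the last attempt for a package."""
    conn.execute(
        "INSERT INTO fetch_attempts (package_name, attempt_time, success) "
        "VALUES (?, ?, ?) "
        "ON CONFLICT(package_name) DO UPDATE SET "
        "attempt_time = excluded.attempt_time, success = excluded.success",
        (package, now.isoformat(), int(success)),
    )


def should_fetch(
    conn: sqlite3.Connection,
    package: str,
    now: datetime,
    min_interval: timedelta = timedelta(hours=24),
) -> bool:
    """True if the package has no recorded attempt within min_interval."""
    row = conn.execute(
        "SELECT attempt_time FROM fetch_attempts WHERE package_name = ?", (package,)
    ).fetchone()
    if row is None:
        return True
    return now - datetime.fromisoformat(row[0]) >= min_interval
```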

The github_cache table caches GitHub API responses:

  • repo_key: Lowercased owner/repo identifier (primary key)
  • data: Full JSON response from the GitHub API
  • fetched_at: When the response was cached
  • expires_at: Cache expiry time (default: 24 hours)

Stats are upserted per package per day. Fetch attempts are tracked to avoid hitting PyPI rate limits: each package is fetched at most once per 24-hour period. Environment stats are cached alongside download stats, so reports can be generated offline. GitHub API responses are cached for 24 hours to minimize API calls.
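
"Upserted per package per day" can be expressed in SQLite with a unique constraint on (package_name, fetch_date) and an ON CONFLICT clause. The DDL and function below are a sketch under that assumption; the actual column types and constraints in pkg.db are not given in this document.

```python
import sqlite3

# Assumed DDL; the UNIQUE constraint is what makes one row per package per day.
PACKAGE_STATS_SCHEMA = """
CREATE TABLE IF NOT EXISTS package_stats (
    package_name TEXT NOT NULL,
    fetch_date TEXT NOT NULL,
    last_day INTEGER,
    last_week INTEGER,
    last_month INTEGER,
    total INTEGER,
    UNIQUE (package_name, fetch_date)
)
"""


def upsert_stats(conn: sqlite3.Connection, package: str, fetch_date: str, stats: dict) -> None:
    """Insert today's row, or overwrite it if one already exists for that day."""
    conn.execute(
        "INSERT INTO package_stats "
        "(package_name, fetch_date, last_day, last_week, last_month, total) "
        "VALUES (:package_name, :fetch_date, :last_day, :last_week, :last_month, :total) "
        "ON CONFLICT(package_name, fetch_date) DO UPDATE SET "
        "last_day = excluded.last_day, last_week = excluded.last_week, "
        "last_month = excluded.last_month, total = excluded.total",
        {"package_name": package, "fetch_date": fetch_date, **stats},
    )
```

Running a fetch twice on the same day then updates the existing row instead of adding a duplicate, while a fetch on a later day adds a new historical row.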

Files

Source modules in src/pkgdb/:

  • __init__.py: Public API and version
  • cli.py: CLI argument parsing and commands
  • service.py: High-level service layer
  • db.py: Database operations and context manager
  • api.py: pypistats API wrapper with parallel fetching
  • reports.py: HTML/SVG report generation
  • github.py: GitHub API client with caching and rate limit handling
  • badges.py: SVG badge generation
  • export.py: CSV/JSON/Markdown export
  • utils.py: Helper functions and validation
  • types.py: TypedDict definitions for type safety
  • logging.py: Logging configuration

Data files (all in ~/.pkgdb/):

  • packages.json: Package list configuration (optional, can use add command instead)
  • pkg.db: SQLite database (auto-created)
  • report.html: Generated HTML report (default output)

GitHub Actions

An example workflow is provided at .github/workflows/fetch-stats.yml.example for automated daily stats fetching. To use it:

  1. Copy it to .github/workflows/fetch-stats.yml (dropping the .example suffix)
  2. Configure your package list or PyPI username
  3. The workflow will fetch stats daily and commit updates to your repo

Development

# Install dev dependencies
uv sync

# Run tests
pytest

# Run tests with verbose output
pytest -v

Dependencies

Runtime:

  • pypistats: PyPI download statistics API client
  • tabulate: Terminal table formatting

Development:

  • pytest: Testing framework

Download files

Source Distribution

pkgdb-0.1.9.tar.gz (43.3 kB)

Uploaded Source

Built Distribution

pkgdb-0.1.9-py3-none-any.whl (44.1 kB)

Uploaded Python 3

File details

Details for the file pkgdb-0.1.9.tar.gz.

File metadata

  • Download URL: pkgdb-0.1.9.tar.gz
  • Size: 43.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.2

File hashes

Hashes for pkgdb-0.1.9.tar.gz
Algorithm Hash digest
SHA256 8bd796723b33777fade96de5380e7c01ce5dfcdd8828d3bdf8d8fe5709ce3566
MD5 bae64107674483dadc758bbc3044bdca
BLAKE2b-256 5fa818e70ff0f6f920f996ee0a1573755d4e7afe47ceed66743b609c5cee6f56

File details

Details for the file pkgdb-0.1.9-py3-none-any.whl.

File metadata

  • Download URL: pkgdb-0.1.9-py3-none-any.whl
  • Size: 44.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.2

File hashes

Hashes for pkgdb-0.1.9-py3-none-any.whl
Algorithm Hash digest
SHA256 916aa182dc13b20b2427809e6c2275b54f49f043500cbe86896aec85f16103bb
MD5 a811484ba7550cebb14bd7ef74d898c1
BLAKE2b-256 275ef244860d3e811d5fe157338a77a851f5a7ee093f60de22173f20604552ea
