Track, store and query your PyPI package download statistics
pkgdb
Track, store, and analyze PyPI package download statistics.
Fetches download stats via the pypistats API, stores historical data in SQLite, and generates HTML reports with charts.
Installation
pip install pkgdb
To build:
Requires Python 3.10+. Uses uv for dependency management.
uv sync
Usage
Quick start
The fastest way to get started is pkgdb init, which walks you through setup:
pkgdb init
This prompts for your PyPI username (optional), syncs your packages, fetches current stats, and generates an HTML report -- all in one step. For non-interactive use:
pkgdb init --user <pypi-username> --no-browser
Configure packages
Create ~/.pkgdb/packages.json to list packages to track:
["my-package", "another-package"]
Or use an object with a published key:
{"published": ["my-package", "another-package"]}
Alternatively, use pkgdb add <package> to add packages individually. By default, packages are verified to exist on PyPI before adding. Use --no-verify to skip this check.
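As an illustration of the two packages.json shapes described above, here is a minimal sketch of a loader that accepts either the bare list or the object with a published key. The function name load_package_names is hypothetical and is not part of pkgdb's documented API; it only shows how the two layouts carry the same information.

```python
import json


def load_package_names(text: str) -> list[str]:
    """Parse packages.json content given as a list or a {"published": [...]} object.

    Hypothetical helper for illustration; not pkgdb's own loader.
    """
    data = json.loads(text)
    if isinstance(data, dict):
        # Object form: the package names live under the "published" key.
        data = data.get("published", [])
    if not isinstance(data, list) or not all(isinstance(n, str) for n in data):
        raise ValueError("expected a list of package names")
    return data
```

Both layouts resolve to the same list of names, so either can be used interchangeably when writing the file by hand.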
Configuration file
Create ~/.pkgdb/config.toml to set persistent defaults:
[defaults]
github = true # always include GitHub stats
environment = true # always include environment summary
no_browser = false # set to true to stop reports auto-opening in the browser
sort_by = "total" # default sort order (total, month, week, day, growth, name)
# database = "~/.pkgdb/pkg.db" # custom database path
[report]
# output = "~/.pkgdb/report.html" # custom report path
[init]
# pypi_user = "myusername" # default PyPI username for init command
CLI flags always override config values. The config file is optional -- all settings have sensible defaults.
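The precedence rule above (built-in defaults, then config.toml, then CLI flags) can be sketched as follows. This is an assumption-laden illustration, not pkgdb's implementation: the resolve_settings function and the BUILTIN_DEFAULTS values are hypothetical, and only the [defaults] keys shown earlier are taken from the README.

```python
try:
    import tomllib  # standard library on Python 3.11+
except ModuleNotFoundError:  # Python 3.10: assumes the tomli backport is installed
    import tomli as tomllib

CONFIG_TEXT = """
[defaults]
github = true
sort_by = "total"
"""

# Hypothetical built-in defaults; pkgdb's real defaults may differ.
BUILTIN_DEFAULTS = {
    "github": False,
    "environment": False,
    "no_browser": False,
    "sort_by": "total",
}


def resolve_settings(config_text: str, cli_flags: dict) -> dict:
    """Layer settings: built-in defaults < config file < CLI flags."""
    file_defaults = tomllib.loads(config_text).get("defaults", {})
    settings = {**BUILTIN_DEFAULTS, **file_defaults}
    # Only flags the user actually passed (not None) override the file.
    settings.update({k: v for k, v in cli_flags.items() if v is not None})
    return settings
```

With the sample config, passing {"sort_by": "month"} on the command line changes only the sort order, while github stays enabled from the file.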
Commands
# Guided first-run setup (sync packages, fetch stats, generate report)
pkgdb init
# Non-interactive init with PyPI username
pkgdb init --user <pypi-username>
# Add a package to tracking (verifies it exists on PyPI)
pkgdb add <package-name>
# Add without verification (offline/bulk use)
pkgdb add <package-name> --no-verify
# Remove a package from tracking
pkgdb remove <package-name>
# Show tracked packages
pkgdb packages
# Import packages from a file (JSON or plain text)
pkgdb import packages.json
# Fetch latest stats from PyPI and store in database
# (skips packages already fetched in the last 24 hours)
pkgdb fetch
# Display stats in terminal (includes trend sparklines and growth %)
pkgdb show
# Show package history (HTML report with chart + releases, opens in browser)
pkgdb history <package-name>
# Show history as text table in terminal
pkgdb history <package-name> --text
# Filter history by date (works with both HTML and text output)
pkgdb history <package-name> --since 7d # last 7 days
pkgdb history <package-name> --since 2w # last 2 weeks
pkgdb history <package-name> --since 1m # last month (30 days)
pkgdb history <package-name> --since 2024-01-01
# Compare stats between time periods
pkgdb diff # compare to previous fetch
pkgdb diff --period week # this week vs last week
pkgdb diff --period month # this month vs last month
# Show release history for a package (PyPI and GitHub)
pkgdb releases <package-name>
# Show only the most recent 10 releases
pkgdb releases <package-name> --limit 10
# Generate HTML report with charts (opens in browser)
pkgdb report
# Generate detailed HTML report for a single package
pkgdb report <package-name>
# Generate project view with release timeline overlay
pkgdb report <package-name> --project
# Include environment summary (Python versions, OS) in report
pkgdb report -e
# Export stats in various formats
pkgdb export -f csv # CSV format (default)
pkgdb export -f json # JSON format
pkgdb export -f markdown # Markdown table
# Show detailed stats for a package (Python versions, OS breakdown)
pkgdb stats <package-name>
# Show database info (size, record counts, date range)
pkgdb show --info
# Generate SVG badge for a package
pkgdb badge <package-name>
# Badge for monthly downloads
pkgdb badge <package-name> --period month
# Save badge to file
pkgdb badge <package-name> -o badge.svg
# Fetch GitHub repository stats (stars, forks, activity, language)
pkgdb github
# Sort GitHub stats by name or activity instead of stars
pkgdb github fetch --sort name
# Bypass GitHub cache and fetch fresh data
pkgdb github fetch --no-cache
# Show GitHub cache statistics
pkgdb github cache
# Clear expired GitHub cache entries (or --all for everything)
pkgdb github clear
# Launch interactive web dashboard (opens browser)
pkgdb serve
# Serve on a custom port without opening browser
pkgdb serve --port 3000 --no-browser
# Fetch stats and generate report in one step
# (skips packages already fetched in the last 24 hours)
pkgdb update
# Fetch stats and generate report with environment summary
pkgdb update -e
# Include GitHub stats in fetch, report, or update
pkgdb fetch --github
pkgdb report --github
pkgdb update --github
# Clean up orphaned stats (for packages no longer tracked)
pkgdb cleanup
# Prune stats older than N days
pkgdb cleanup --days 365
# Sync packages from PyPI user account (initial or refresh)
pkgdb sync --user <pypi-username>
# Sync and remove packages no longer in user's PyPI account
pkgdb sync --user <pypi-username> --prune
# Show version
pkgdb version
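The --since values listed above (7d, 2w, 1m, or an ISO date) suggest a small parsing rule. The sketch below is one way such values could be turned into a cutoff date; parse_since is a hypothetical name, and the only detail taken from the README is that 1m means 30 days.

```python
import re
from datetime import date, timedelta

# Days per unit; "m" is 30 days, as stated in the command comments.
_UNIT_DAYS = {"d": 1, "w": 7, "m": 30}


def parse_since(value: str, today: date) -> date:
    """Return the cutoff date for a relative (7d, 2w, 1m) or ISO (YYYY-MM-DD) value."""
    match = re.fullmatch(r"(\d+)([dwm])", value)
    if match:
        count, unit = int(match.group(1)), match.group(2)
        return today - timedelta(days=count * _UNIT_DAYS[unit])
    # Anything else must be an ISO date; malformed input raises ValueError.
    return date.fromisoformat(value)
```

Passing today explicitly keeps the function deterministic, which makes it straightforward to test.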
Options
# Use custom database file
pkgdb -d custom.db fetch
# Verbose output (show debug messages)
pkgdb -v fetch
# Quiet mode (only show warnings/errors)
pkgdb -q fetch
# Specify output file for report
pkgdb report -o custom-report.html
# Generate report without opening browser (useful for automation)
pkgdb report --no-browser
# Limit history to N days
pkgdb history my-package -n 14
# History report to file without opening browser
pkgdb history my-package -o history.html --no-browser
# Show history since a specific date (or relative: 7d, 2w, 1m)
pkgdb history my-package --since 2024-01-01
pkgdb history my-package --since 7d
# Skip package verification on import
pkgdb import packages.json --no-verify
# Limit show output to top N packages
pkgdb show --limit 10
# Sort show output by field (total, month, week, day, growth, name)
pkgdb show --sort-by month
# JSON output (available on show, packages, history, stats, cleanup, github)
pkgdb show --json
pkgdb packages --json
pkgdb history my-package --json
pkgdb stats my-package --json
pkgdb cleanup --json
pkgdb github --json
# Export to file instead of stdout
pkgdb export -f json -o stats.json
Architecture
Modular CLI application with the following commands:
Setup:
- init: Guided first-run setup (sync packages, fetch stats, generate report)
Package management:
- add: Add a package to tracking
- remove: Remove a package from tracking
- packages: Show tracked packages with their added dates
- import: Import packages from file (JSON or text)
- sync: Sync packages from a PyPI user account (with optional --prune)
Data operations:
- fetch: Fetch download stats from PyPI and store in SQLite (with -g for GitHub stats)
- show: Display stats in terminal with trend sparklines and growth %
- diff: Compare download stats between time periods (previous fetch, week-over-week, month-over-month)
- history: Show package history as HTML report (default) or text table (--text)
- stats: Show detailed breakdown (Python versions, OS) for a package
- releases: Show release history for a package (PyPI and GitHub)
- github: Fetch and display GitHub repository stats (stars, forks, activity, language)
- export: Export stats in CSV, JSON, or Markdown format
Reporting:
- report: Generate HTML report with SVG charts. With -e flag, includes Python/OS summary. With -g flag, includes GitHub stats (stars, forks, language, activity) in the table. With package argument, generates detailed single-package report. With -p/--project flag, generates project view with release timeline overlay
- badge: Generate shields.io-style SVG badge for a package
- update: Run fetch then report in one step (supports -e for environment summary, -g for GitHub stats)
- serve: Launch interactive web dashboard with live data from SQLite. Overview with sortable/filterable stats table, package detail with zoomable charts and release markers, comparison with multi-package overlay
Maintenance:
- cleanup: Remove orphaned stats and optionally prune old data
- version: Show pkgdb version
Data flow
packages.json -> pypistats API -> SQLite (pkg.db) -> HTML/terminal output
Database schema
The package_stats table stores:
- package_name: Package identifier
- fetch_date: Date stats were fetched (YYYY-MM-DD)
- last_day, last_week, last_month: Recent download counts
- total: Total downloads (excluding mirrors)
The python_version_stats and os_stats tables cache environment data:
- package_name: Package identifier
- fetch_date: Date stats were fetched (YYYY-MM-DD)
- category: Python version (e.g. "3.12") or OS name (e.g. "Linux")
- downloads: Download count for that category
The fetch_attempts table tracks API requests:
- package_name: Package identifier (primary key)
- attempt_time: ISO timestamp of last fetch attempt
- success: Whether the fetch succeeded (1) or failed (0)
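To make the role of fetch_attempts concrete, here is a minimal sketch of a 24-hour gate built on the three columns listed above. The column types, the helper names record_attempt and should_fetch, and the choice to count failed attempts toward the interval are all assumptions; pkgdb's actual logic may differ.

```python
import sqlite3
from datetime import datetime, timedelta

# Column names come from the README; the types are assumed.
FETCH_ATTEMPTS_SCHEMA = """
CREATE TABLE fetch_attempts (
    package_name TEXT PRIMARY KEY,
    attempt_time TEXT NOT NULL,
    success      INTEGER NOT NULL
)
"""


def record_attempt(conn: sqlite3.Connection, package: str, when: datetime, success: bool) -> None:
    """Store the latest attempt for a package, replacing any earlier one."""
    conn.execute(
        "INSERT INTO fetch_attempts (package_name, attempt_time, success) "
        "VALUES (?, ?, ?) "
        "ON CONFLICT(package_name) DO UPDATE SET "
        "attempt_time = excluded.attempt_time, success = excluded.success",
        (package, when.isoformat(), int(success)),
    )


def should_fetch(conn: sqlite3.Connection, package: str, now: datetime,
                 min_interval: timedelta = timedelta(hours=24)) -> bool:
    """True when the package has no recorded attempt or the last one is old enough."""
    row = conn.execute(
        "SELECT attempt_time FROM fetch_attempts WHERE package_name = ?", (package,)
    ).fetchone()
    if row is None:
        return True
    return now - datetime.fromisoformat(row[0]) >= min_interval
```

Because package_name is the primary key, the table holds one row per package, so the lookup is a single indexed read.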
The github_cache table caches GitHub API responses:
- repo_key: Lowercased owner/repo identifier (primary key)
- data: Full JSON response from the GitHub API
- fetched_at: When the response was cached
- expires_at: Cache expiry time (default: 24 hours)
The pypi_releases table caches PyPI release history:
- package_name: Package identifier
- version: Release version string
- upload_date: Date the version was uploaded (YYYY-MM-DD)
The github_releases table caches GitHub release history:
- repo_key: Lowercased owner/repo identifier
- tag_name: Release tag (e.g. "v0.1.0")
- published_at: Date the release was published (YYYY-MM-DD)
The release_cache table tracks freshness of release data:
- cache_key: Cache identifier (e.g. "pypi:my-package" or "github:owner/repo")
- fetched_at: When the data was last fetched
- expires_at: Cache expiry time (default: 24 hours)
Stats are upserted per package per day. Fetch attempts are tracked to avoid hitting PyPI rate limits: each package is fetched at most once per 24-hour period. Environment stats are cached alongside download stats, so reports can be generated offline. GitHub API responses and release data (PyPI and GitHub) are each cached for 24 hours to minimize API calls.
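The "upserted per package per day" behaviour can be sketched with SQLite's ON CONFLICT clause. The column names below are the package_stats columns listed earlier; the column types, the UNIQUE constraint on (package_name, fetch_date), and the store_stats helper are assumptions made for illustration rather than pkgdb's actual schema.

```python
import sqlite3

# Assumed DDL: one row per package per fetch date.
PACKAGE_STATS_SCHEMA = """
CREATE TABLE package_stats (
    package_name TEXT NOT NULL,
    fetch_date   TEXT NOT NULL,
    last_day     INTEGER,
    last_week    INTEGER,
    last_month   INTEGER,
    total        INTEGER,
    UNIQUE (package_name, fetch_date)
)
"""

UPSERT_SQL = """
INSERT INTO package_stats
    (package_name, fetch_date, last_day, last_week, last_month, total)
VALUES
    (:package_name, :fetch_date, :last_day, :last_week, :last_month, :total)
ON CONFLICT (package_name, fetch_date) DO UPDATE SET
    last_day   = excluded.last_day,
    last_week  = excluded.last_week,
    last_month = excluded.last_month,
    total      = excluded.total
"""


def store_stats(conn: sqlite3.Connection, row: dict) -> None:
    """Insert today's stats, or overwrite them if the package was already stored today."""
    conn.execute(UPSERT_SQL, row)
```

A second fetch on the same day overwrites that day's row instead of adding a duplicate, so the table grows by at most one row per package per day.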
Files
Source modules in src/pkgdb/:
- __init__.py: Public API and version
- cli.py: CLI argument parsing and commands
- config.py: Configuration file loading (~/.pkgdb/config.toml)
- service.py: High-level service layer
- db.py: Database operations and context manager
- api.py: pypistats API wrapper with parallel fetching
- reports.py: HTML/SVG report generation
- server.py: HTTP server for the interactive web dashboard
- dashboard.py: HTML page templates for the dashboard (overview, detail, comparison)
- github.py: GitHub API client with caching and rate limit handling
- badges.py: SVG badge generation
- export.py: CSV/JSON/Markdown export
- utils.py: Helper functions and validation
- types.py: TypedDict definitions for type safety
- logging.py: Logging configuration
Data files (all in ~/.pkgdb/):
- config.toml: Configuration file for persistent defaults (optional)
- packages.json: Package list configuration (optional, can use add command instead)
- pkg.db: SQLite database (auto-created)
- report.html: Generated HTML report (default output)
GitHub Actions
An example workflow is provided at .github/workflows/fetch-stats.yml.example for automated daily stats fetching. To use it:
- Copy to .github/workflows/fetch-stats.yml (remove .example)
- Configure your package list or PyPI username
- The workflow will fetch stats daily and commit updates to your repo
Documentation
API documentation is built with MkDocs:
# Build docs
make docs
# Serve locally with live reload
make docs-serve
# Deploy to GitHub Pages
make docs-deploy
Then open http://127.0.0.1:8000 to browse the docs locally.
Development
# Install dev dependencies
uv sync
# Run tests
pytest
# Run tests with verbose output
pytest -v
# Full QA (test + lint + typecheck + format)
make qa
Dependencies
Runtime:
- pypistats: PyPI download statistics API client
- tabulate: Terminal table formatting
Development:
pytest: Testing framework
Documentation:
- mkdocs: Static site generator
- mkdocs-material: Material theme
- mkdocstrings[python]: Auto-generated API docs from docstrings