Analyze GitHub repository dependents, ranked by stars or trust.

dep-rank

Rank GitHub dependents by stars or trust.

License: MIT | Python 3.11+

dep-rank finds the most popular repositories that depend on a given GitHub project. It scrapes GitHub's dependents page, enriches results via the GraphQL API, and works as a command-line tool.

Quick Start

pip install dep-rank
dep-rank deps https://github.com/django/django

CLI Reference

dep-rank deps — List top dependents

dep-rank deps https://github.com/django/django
dep-rank deps https://github.com/django/django --rows 20 --min-stars 100
dep-rank deps https://github.com/django/django --descriptions --format json
dep-rank deps https://github.com/django/django --packages

| Option | Default | Description |
|---|---|---|
| --rows | 10 | Number of results |
| --min-stars | 5 | Minimum star count filter |
| --format | table | Output format: table or json |
| --descriptions | off | Fetch descriptions via GitHub API (requires token) |
| --packages | off | Search packages instead of repositories |
| --token | DEP_RANK_TOKEN | GitHub token |
| --max-pages | 200 | Maximum pages to scrape (ceiling 1000) |
| --concurrency | 3 | Max concurrent page fetches (1–10) |
| --no-adaptive-stop | off | Disable adaptive early-stop; scrape continues until exhaustion or --max-pages |
| --rank-by | stars | Ranking strategy: stars or trust (heuristic, requires token) |
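
The README does not show how --concurrency is enforced. As a rough illustration only, a bounded fetch can be sketched with an asyncio semaphore. The names fetch_pages, fetch_page, and demo below are hypothetical and are not part of dep-rank's API.

```python
import asyncio


async def fetch_pages(page_numbers, fetch_page, concurrency=3):
    """Fetch pages with at most `concurrency` requests in flight.

    `fetch_page` is a hypothetical coroutine supplied by the caller.
    """
    if not 1 <= concurrency <= 10:
        raise ValueError("concurrency must be between 1 and 10")
    semaphore = asyncio.Semaphore(concurrency)

    async def bounded(page):
        # Only `concurrency` coroutines can hold the semaphore at once.
        async with semaphore:
            return await fetch_page(page)

    # gather preserves input order, so results line up with page_numbers.
    return await asyncio.gather(*(bounded(p) for p in page_numbers))


async def demo():
    """Run seven fake fetches and record the peak number in flight."""
    in_flight = 0
    peak = 0

    async def fake_fetch(page):
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)  # stand-in for network latency
        in_flight -= 1
        return f"page-{page}"

    results = await fetch_pages(range(1, 8), fake_fetch, concurrency=3)
    return results, peak
```

Running asyncio.run(demo()) returns the seven page results in order, and the recorded peak never exceeds the concurrency limit.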

dep-rank search — Search code in dependents

dep-rank search https://github.com/django/django "from django.db import"
dep-rank search https://github.com/django/django "middleware" --max-repos 20

| Option | Default | Description |
|---|---|---|
| --max-repos | 10 | Maximum repos to search |
| --min-stars | 50 | Only search repos with at least this many stars |
| --token | DEP_RANK_TOKEN | GitHub token (required) |
| --max-pages | 200 | Maximum pages to scrape (ceiling 1000) |
| --concurrency | 3 | Max concurrent page fetches (1–10) |

search always runs a bounded non-adaptive top-K scrape (--no-adaptive-stop is not exposed; adaptive early-stop is permanently disabled for this command).

Partial results

A scrape result (deps, and the search pre-pass) reports whether it finished: results include a complete flag and a reason. complete: false means the scrape stopped early, for one of these reasons:

  • max_pages_reached: the page limit was hit; raise --max-pages.
  • trend_converged: the adaptive heuristic judged the top-K stable; use --no-adaptive-stop to scrape until exhaustion or --max-pages.
  • network_failure or rate_limited: the scrape was interrupted.

When complete is false, total_count and filtered_count are lower bounds across the pages actually scraped, not population totals.
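
A script consuming --format json output could check these fields before trusting the counts. The sketch below assumes that complete, reason, total_count, and filtered_count are top-level keys of the JSON object; the README names the fields but not where they sit, so adjust to the real output. The hints for network_failure and rate_limited are suggestions of this sketch, not advice from the tool.

```python
import json

# Reasons as listed in the README. The first two hints paraphrase its advice;
# the last two are assumptions added for illustration.
HINTS = {
    "max_pages_reached": "raise --max-pages",
    "trend_converged": "pass --no-adaptive-stop to keep scraping",
    "network_failure": "retry once the network is stable",
    "rate_limited": "wait for the limit to reset or set DEP_RANK_TOKEN",
}


def describe_completeness(result: dict) -> str:
    """Summarize whether a scrape result is complete or partial."""
    if result.get("complete", False):
        return "complete"
    reason = result.get("reason", "unknown")
    hint = HINTS.get(reason, "no hint available")
    return f"partial ({reason}): counts are lower bounds; {hint}"


# Hypothetical payload; the real output layout may differ.
sample = json.loads(
    '{"complete": false, "reason": "max_pages_reached",'
    ' "total_count": 1200, "filtered_count": 85}'
)
print(describe_completeness(sample))
```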

dep-rank cache — Manage cache

dep-rank cache stats    # Show cache size
dep-rank cache clear    # Clear all cached data

Authentication

Set the DEP_RANK_TOKEN environment variable with a GitHub personal access token:

export DEP_RANK_TOKEN=ghp_your_token_here

A token is effectively required for non-trivial use: unauthenticated GitHub HTML scraping is limited to ~60 requests/hour per IP, so unauthenticated runs are suitable only for small one-off scrapes. Set DEP_RANK_TOKEN to raise the limit.

What works without a token:

  • dep-rank deps — core scraping and star ranking

What requires a token:

  • --descriptions flag — fetches repo descriptions via GitHub GraphQL API
  • --rank-by trust — fetches engagement/recency metadata via GitHub GraphQL API
  • dep-rank search — code search across dependents

Create a token at github.com/settings/tokens with public_repo scope.

How It Works

dep-rank uses a three-stage pipeline:

  1. Scrape: fetches GitHub's /network/dependents HTML pages to discover dependents and their approximate star counts
  2. Enrich (optional): one GraphQL batch query fetches accurate star counts and descriptions for the top N results, replacing up to 100 individual REST API calls
  3. Present: renders the structured results as a Rich table or as JSON (--format)
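
To make the Enrich stage concrete, here is one way a batched query can be assembled with GraphQL aliases. This is a sketch, not dep-rank's actual query: build_batch_query and chunked are hypothetical names, and only the stargazerCount and description fields of GitHub's Repository type are requested.

```python
import json


def chunked(items, size=100):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_batch_query(repos):
    """Build one GraphQL query covering every 'owner/name' in `repos`.

    Each repository gets its own alias (r0, r1, ...) so that a single
    request can return many repositories at once.
    """
    fields = []
    for index, full_name in enumerate(repos):
        owner, name = full_name.split("/", 1)
        # json.dumps produces a quoted, escaped string literal.
        fields.append(
            f"  r{index}: repository(owner: {json.dumps(owner)}, "
            f"name: {json.dumps(name)}) {{ stargazerCount description }}"
        )
    return "query {\n" + "\n".join(fields) + "\n}"


# One query string per batch of up to 100 repositories.
queries = [build_batch_query(batch)
           for batch in chunked(["django/django", "pallets/flask"])]
```

With 100 repositories per batch, a pool of 250 candidates would need three requests.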

Responses are cached in a local SQLite database (~/.cache/dep-rank/) with ETag support for conditional requests. Expired pages are served immediately and refreshed in the background (stale-while-revalidate) on authenticated runs.
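
The caching behavior can be illustrated with a small SQLite sketch. The table layout, function names, and TTL handling below are assumptions for illustration; dep-rank's real schema is not described here. A stale hit is still returned immediately, together with the If-None-Match header a background refresh would send.

```python
import sqlite3
import time


def open_cache(path=":memory:"):
    """Open (or create) a cache database with a hypothetical schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages ("
        "url TEXT PRIMARY KEY, etag TEXT, body TEXT, fetched_at REAL)"
    )
    return conn


def store(conn, url, etag, body, now=None):
    """Insert or overwrite the cached copy of `url`."""
    now = time.time() if now is None else now
    conn.execute(
        "INSERT OR REPLACE INTO pages VALUES (?, ?, ?, ?)",
        (url, etag, body, now),
    )
    conn.commit()


def lookup(conn, url, ttl_seconds, now=None):
    """Return (body, is_stale, conditional_headers), or None on a miss."""
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT etag, body, fetched_at FROM pages WHERE url = ?", (url,)
    ).fetchone()
    if row is None:
        return None
    etag, body, fetched_at = row
    is_stale = (now - fetched_at) > ttl_seconds
    # A refresh sends the stored ETag; a 304 reply means the body is unchanged.
    headers = {"If-None-Match": etag} if etag else {}
    return body, is_stale, headers
```

In this sketch the caller serves body right away and, when is_stale is true, schedules a conditional request with the returned headers.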

Trust Ranking

dep-rank deps --rank-by trust re-ranks dependents by a lightweight composite score instead of raw stars. Stars are useful but gameable; trust ranking blends stars with non-star signals — forks, total issues and pull requests, and recency of activity — fetched via low-cost GitHub GraphQL queries (batched at 100 repositories per request, so a larger pool issues more than one).

dep-rank deps https://github.com/django/django --rank-by trust --token ghp_...

Important caveats:

  • The score is a pool-relative ranking signal, not an absolute quality score — it min-max normalizes signals across the scraped candidate set.
  • It re-ranks only the scraped candidate pool (the star-top-N dependents), not every dependent.
  • It is heuristic and does not detect fake stars. It does not fetch stargazer history, GHArchive data, or external fraud datasets.
  • Trust ranking scrapes a larger candidate pool and is therefore deeper and slower than star ranking.
  • --rank-by trust requires a GitHub token; trust scores appear in --format json output under each repo's trust field.
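
To show what "pool-relative" means in practice, the sketch below min-max normalizes each signal across a candidate pool and averages them. The equal weights, field names, and sample numbers are assumptions made for illustration; the README does not publish dep-rank's actual weighting.

```python
SIGNALS = ("stars", "forks", "issues_and_prs", "recency")


def min_max(values):
    """Scale values to [0, 1] relative to the pool; constant pools map to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_by_trust(pool, weights=None):
    """Return (name, score) pairs sorted by a weighted sum of normalized signals.

    Equal weights are a placeholder, not the tool's real formula.
    """
    if not pool:
        return []
    weights = weights or {name: 0.25 for name in SIGNALS}
    normalized = {
        name: min_max([repo[name] for repo in pool]) for name in SIGNALS
    }
    scored = []
    for i, repo in enumerate(pool):
        score = sum(weights[name] * normalized[name][i] for name in SIGNALS)
        scored.append((repo["name"], round(score, 4)))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up pool; `recency` is any number where higher means more recent.
pool = [
    {"name": "a/popular", "stars": 5000, "forks": 10,
     "issues_and_prs": 20, "recency": 0},
    {"name": "b/active", "stars": 1000, "forks": 300,
     "issues_and_prs": 900, "recency": 100},
    {"name": "c/small", "stars": 100, "forks": 5,
     "issues_and_prs": 10, "recency": 50},
]
ranked = rank_by_trust(pool)
```

Here b/active ranks above a/popular despite having fewer stars, because it leads the pool on the three non-star signals. Adding or removing a repository changes every score, which is why the result is a ranking signal and not an absolute measure.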

The premise that stars are gameable is motivated by StarScout (repository and preprint; ICSE 2026). The low-cost query design is based on GitHub's GraphQL rate-limit documentation.

Development

# Prerequisites: Python 3.11+, uv
uv sync
uv run pytest
uv run ruff check .
uv run ruff format --check .
uv run mypy dep_rank/

Acknowledgments

dep-rank is a full rewrite of ghtopdep by Andriy Orehov. The original project is licensed under MIT.

License

MIT — see LICENSE for details.
