🎯 Good First Issue Finder

Discover, rank, and get personalized recommendations for "good first issue" contributions in Canonical's GitHub repositories.

Features

  • Scrape all open "good first issues" across 76+ Canonical repos (206 issues)
  • Rank using a weighted heuristic (freshness, competition, availability, popularity, activity, linked PRs)
  • Match issues to your developer profile using GPT-5.5 (single API call, ~$0.002)
  • Cache & diff between runs — see what's new since last time
  • Auto-refresh with --watch mode or cron scheduling
  • Beautiful TUI — browse, filter, and match interactively in the terminal
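The cache-and-diff feature listed above can be pictured with a minimal sketch. It assumes each run is stored as a JSON mapping from issue URL to issue data; the function names and on-disk format here are hypothetical and are not taken from the project's code.

```python
import json
from pathlib import Path


def load_snapshot(path):
    """Return the cached {issue_url: issue} mapping, or {} on the first run."""
    p = Path(path)
    if not p.exists():
        return {}
    return json.loads(p.read_text(encoding="utf-8"))


def save_snapshot(path, issues):
    """Write the current run to disk so the next run can diff against it."""
    p = Path(path)
    p.parent.mkdir(parents=True, exist_ok=True)
    p.write_text(json.dumps(issues, indent=2), encoding="utf-8")


def diff_snapshots(previous, current):
    """Return (new_urls, gone_urls) between two runs, each sorted."""
    new_urls = sorted(set(current) - set(previous))
    gone_urls = sorted(set(previous) - set(current))
    return new_urls, gone_urls
```

Keying on the issue URL keeps the diff a plain set difference, so "what's new" is simply every URL present in the current run but absent from the cached one.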

Quick Start

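Prerequisites

  • Python 3
  • GitHub CLI (gh), authenticated with your GitHub token (used for scraping)
  • An OpenAI API key in OPENAI_API_KEY (needed for LLM matching)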

Install

pip install .

# Or for development (editable install with test dependencies)
pip install -e ".[dev]"

Run

# 1. Scrape and rank all issues
gfi-scrape

# 2. Interactive TUI (browse, filter, match)
export OPENAI_API_KEY='sk-...'
gfi-tui

# 3. Or use the headless matcher
gfi-match

Project Structure

├── pyproject.toml
├── README.md
├── docs/
│   ├── architecture.md        # System design & data flow
│   └── scoring.md             # Ranking heuristic explained
├── src/gfi_scraper/
│   ├── __init__.py
│   ├── scrape_good_first_issues.py   # Scraper + ranker + cache
│   ├── match_issues.py               # LLM-powered matcher
│   └── tui.py                        # Interactive terminal UI
├── tests/
│   └── test_all.py            # 96 unit tests
├── .cache/                    # Run-to-run diff cache (gitignored)
└── good_first_issues.csv      # Latest scraped results

Usage

Scraper

# Basic run
gfi-scrape

# Custom org
gfi-scrape --org ubuntu

# Auto-refresh every 4 hours
gfi-scrape --watch --interval 4

# Generate crontab entry
gfi-scrape --cron
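
For intuition, the --watch behaviour shown above amounts to a run-then-sleep loop. The sketch below is an illustration only: it assumes --interval is given in hours, as the "every 4 hours" comment suggests, and the watch function with its max_runs and sleep parameters is hypothetical rather than the tool's actual implementation.

```python
import time


def watch(task, interval_hours, max_runs=None, sleep=time.sleep):
    """Call task() repeatedly, sleeping interval_hours between runs.

    max_runs and sleep exist only to make the loop testable; with the
    defaults it runs until interrupted.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        task()
        runs += 1
        if max_runs is not None and runs >= max_runs:
            break  # no need to sleep after the final run
        sleep(interval_hours * 3600)  # hours -> seconds
    return runs
```

For unattended use, a cron entry (as generated by --cron) avoids keeping a long-lived process around; the loop form is handy when you want refreshes only while a terminal session is open.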

TUI

gfi-tui

Key  Action
b    Browse all issues (paginated)
n    What's new (since last run)
f    Filter by keyword
d    Detail view of a specific issue
m    Match to your profile (LLM)
s    Stats overview
q    Quit

Matcher (headless)

gfi-match --top 15

How Scoring Works

Each issue is scored 0–100 using a weighted composite:

Signal        Weight  Logic
Freshness     25%     Exponential decay (half-life: 180 days)
Competition   25%     Fewer comments = higher score (cap: 10)
Availability  20%     No assignees = 100, decays per assignee
Popularity    15%     Repo stars, log-scaled
Activity      10%     Staleness gate (updated within 1 year?)
PR Status     5%      Open PR = competition penalty
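
The weighted composite can be sketched as follows. The weights, the 180-day half-life, the comment cap of 10, and the one-year staleness gate come from the table; everything else is an assumption made for illustration (linear competition falloff, halving per assignee, a 10,000-star normalization ceiling, and an all-or-nothing PR penalty). The project's real formulas are described in docs/scoring.md and may differ.

```python
import math

# Weights from the scoring table; they sum to 1.0.
WEIGHTS = {
    "freshness": 0.25,
    "competition": 0.25,
    "availability": 0.20,
    "popularity": 0.15,
    "activity": 0.10,
    "pr_status": 0.05,
}


def freshness(age_days, half_life=180.0):
    """Exponential decay: 100 at age 0, 50 after one half-life."""
    return 100.0 * 0.5 ** (age_days / half_life)


def competition(comments, cap=10):
    """Fewer comments score higher; assumed linear down to 0 at the cap."""
    return 100.0 * (1 - min(comments, cap) / cap)


def availability(assignees):
    """No assignees = 100; assumed to halve with each assignee."""
    return 100.0 / (2 ** assignees)


def popularity(stars, max_stars=10_000):
    """Log-scaled stars; max_stars is an assumed normalization ceiling."""
    return 100.0 * min(1.0, math.log1p(stars) / math.log1p(max_stars))


def activity(days_since_update):
    """Staleness gate: 100 if updated within a year, otherwise 0."""
    return 100.0 if days_since_update <= 365 else 0.0


def pr_status(has_open_pr):
    """An open linked PR counts as a penalty (assumed all-or-nothing)."""
    return 0.0 if has_open_pr else 100.0


def score_issue(age_days, comments, assignees, stars, days_since_update, has_open_pr):
    """Combine the six 0-100 sub-scores into one 0-100 composite."""
    parts = {
        "freshness": freshness(age_days),
        "competition": competition(comments),
        "availability": availability(assignees),
        "popularity": popularity(stars),
        "activity": activity(days_since_update),
        "pr_status": pr_status(has_open_pr),
    }
    return sum(WEIGHTS[name] * parts[name] for name in WEIGHTS)
```

Because every sub-score lies in 0 to 100 and the weights sum to 1, the composite also stays in 0 to 100, so no final rescaling step is needed.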

Testing

python3 -m pytest tests/ -v

96 tests covering: scoring functions, body extraction, CSV round-trips, caching/diffing, GraphQL parsing, LLM prompt building, TUI helpers, integration, and edge cases.

Cost

  • Scraping: Free (uses gh CLI with your GitHub token)
  • LLM matching: ~$0.002 per run (single GPT-5.5 call, ~10k tokens)
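
As a rough check on the figures above, the arithmetic below derives the blended per-token price implied by "~$0.002 per run, ~10k tokens" and projects the cost of repeated runs. The helper names are hypothetical, and actual API pricing is set by the provider and may differ from this back-of-the-envelope figure.

```python
def implied_price_per_million(cost_per_run_usd, tokens_per_run):
    """Blended USD price per million tokens implied by one run's cost."""
    return cost_per_run_usd / tokens_per_run * 1_000_000


def total_cost(cost_per_run_usd, runs):
    """Total USD cost of a given number of matcher runs."""
    return cost_per_run_usd * runs


# Figures quoted above: ~$0.002 per run at ~10k tokens.
print(f"{implied_price_per_million(0.002, 10_000):.2f}")  # 0.20
print(f"{total_cost(0.002, 30):.2f}")                     # 0.06
```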

License

MIT

Download files

Source Distribution

gfi_scraper-0.1.1.tar.gz (31.1 kB)

Uploaded Source

Built Distribution

gfi_scraper-0.1.1-py3-none-any.whl (21.4 kB)

Uploaded Python 3

File details

Details for the file gfi_scraper-0.1.1.tar.gz.

File metadata

  • Download URL: gfi_scraper-0.1.1.tar.gz
  • Upload date:
  • Size: 31.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for gfi_scraper-0.1.1.tar.gz
Algorithm    Hash digest
SHA256       3d0966c4747c0b3504e55de6b3949efb2747661afd684588d8f5a4dc89221c80
MD5          d17a1f1f44f1cfb297f77c0d00cf7dc1
BLAKE2b-256  c57d50808ffde63562ae71e0cdfecdb839b2afb23d93681b9714e493814408ea

Provenance

The following attestation bundles were made for gfi_scraper-0.1.1.tar.gz:

Publisher: publish.yml on iamsharduld/gfi-scraper

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file gfi_scraper-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: gfi_scraper-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 21.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for gfi_scraper-0.1.1-py3-none-any.whl
Algorithm    Hash digest
SHA256       e38153aeccf85bc66bdabe2da89110f0172d86791440eb4f7b8a814e3948467d
MD5          97b46930b011646e8f6dff06779a29d8
BLAKE2b-256  1d56892c0f4859f055b6f2ced646979dba851d9b415fec6f29da7fb275443b92

Provenance

The following attestation bundles were made for gfi_scraper-0.1.1-py3-none-any.whl:

Publisher: publish.yml on iamsharduld/gfi-scraper
