🎯 Good First Issue Finder

Discover, rank, and get personalized recommendations for "good first issue" contributions in Canonical's GitHub repositories.

Features

  • Scrape all open "good first issues" across 76+ Canonical repos (206 issues)
  • Rank using a weighted heuristic (freshness, competition, availability, popularity, activity, linked PRs)
  • Match issues to your developer profile using GPT-5.5 (single API call, ~$0.002)
  • Cache & diff between runs — see what's new since last time
  • Auto-refresh with --watch mode or cron scheduling
  • Beautiful TUI — browse, filter, and match interactively in the terminal

Quick Start

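Prerequisites

  • Python 3 with pip
  • GitHub CLI (gh), authenticated with your GitHub token (used for scraping)
  • An OpenAI API key exported as OPENAI_API_KEY (used for LLM matching)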

Install

pip install .

# Or for development (editable install with test dependencies)
pip install -e ".[dev]"

Run

# 1. Scrape and rank all issues
gfi-scrape

# 2. Interactive TUI (browse, filter, match)
export OPENAI_API_KEY='sk-...'
gfi-tui

# 3. Or use the headless matcher
gfi-match

Project Structure

├── pyproject.toml
├── README.md
├── docs/
│   ├── architecture.md        # System design & data flow
│   └── scoring.md             # Ranking heuristic explained
├── src/gfi_scraper/
│   ├── __init__.py
│   ├── scrape_good_first_issues.py   # Scraper + ranker + cache
│   ├── match_issues.py               # LLM-powered matcher
│   └── tui.py                        # Interactive terminal UI
├── tests/
│   └── test_all.py            # 96 unit tests
├── .cache/                    # Run-to-run diff cache (gitignored)
└── good_first_issues.csv      # Latest scraped results

Usage

Scraper

# Basic run
gfi-scrape

# Custom org
gfi-scrape --org ubuntu

# Auto-refresh every 4 hours
gfi-scrape --watch --interval 4

# Generate crontab entry
gfi-scrape --cron
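
For readers curious what a gh-based scrape of one repository might look like, here is a minimal sketch. It is not the project's implementation (the test list mentions GraphQL parsing, so the real scraper likely queries differently); it uses the plain `gh issue list` command instead. The helper names, the field list, and the repository name `canonical/example-repo` are hypothetical.

```python
import json
import subprocess

# Assumption: these are the fields a ranker would want; the real tool may request others.
FIELDS = "number,title,url,assignees,comments,createdAt,updatedAt"


def build_gh_command(repo: str, label: str = "good first issue", limit: int = 100) -> list[str]:
    """Build a `gh issue list` command for open issues carrying `label`."""
    return [
        "gh", "issue", "list",
        "--repo", repo,
        "--label", label,
        "--state", "open",
        "--limit", str(limit),
        "--json", FIELDS,
    ]


def fetch_issues(repo: str) -> list[dict]:
    """Run the command and parse its JSON output (requires an authenticated gh)."""
    result = subprocess.run(
        build_gh_command(repo), capture_output=True, text=True, check=True
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    # Only prints the command; call fetch_issues() to actually hit GitHub.
    print(" ".join(build_gh_command("canonical/example-repo")))
```

Building the command as a list (rather than a shell string) avoids quoting problems with the label, which contains spaces.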

TUI

gfi-tui

Key  Action
b    Browse all issues (paginated)
n    What's new (since last run)
f    Filter by keyword
d    Detail view of a specific issue
m    Match to your profile (LLM)
s    Stats overview
q    Quit
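
The "what's new" view depends on comparing the current scrape with a cached previous one. A minimal sketch of that idea follows, assuming issues are keyed by URL and the cache is a single JSON file; the function names and the cache layout are hypothetical and may differ from what the tool stores under .cache/.

```python
import json
from pathlib import Path


def load_previous(cache_file: Path) -> dict[str, dict]:
    """Return the previous run's issues keyed by URL, or {} on the first run."""
    if not cache_file.exists():
        return {}
    return json.loads(cache_file.read_text(encoding="utf-8"))


def diff_runs(previous: dict[str, dict], current: dict[str, dict]) -> tuple[list[str], list[str]]:
    """Return (new_urls, gone_urls) between two runs, each sorted."""
    new_urls = sorted(current.keys() - previous.keys())
    gone_urls = sorted(previous.keys() - current.keys())
    return new_urls, gone_urls


def save_current(cache_file: Path, current: dict[str, dict]) -> None:
    """Persist the current run so the next run can diff against it."""
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(current, indent=2), encoding="utf-8")
```

Keying by URL keeps the diff a pair of set differences, so it stays cheap even with a few hundred issues.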

Matcher (headless)

gfi-match --top 15
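
Matching in a single API call implies that the profile and all candidate issues travel in one prompt. The sketch below shows one way such a prompt could be assembled; the wording, the `build_prompt` name, and the issue fields (`repo`, `title`, `score`) are assumptions for illustration, not the tool's actual prompt.

```python
def build_prompt(profile: str, issues: list[dict], top: int = 15) -> str:
    """Assemble one prompt containing the profile and every candidate issue."""
    lines = [
        "You are matching open-source issues to a developer.",
        f"Developer profile: {profile}",
        f"Pick the {top} best-fitting issues from the list below and explain each choice briefly.",
        "",
    ]
    # One numbered line per issue keeps the prompt compact and easy to reference.
    for index, issue in enumerate(issues, start=1):
        lines.append(
            f"{index}. [{issue['repo']}] {issue['title']} (score {issue['score']:.1f})"
        )
    return "\n".join(lines)
```

The returned string would then be sent as the single request; no API call is made here.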

How Scoring Works

Each issue is scored 0–100 using a weighted composite:

Signal        Weight  Logic
Freshness     25%     Exponential decay (half-life: 180 days)
Competition   25%     Fewer comments = higher score (cap: 10)
Availability  20%     No assignees = 100, decays per assignee
Popularity    15%     Repo stars, log-scaled
Activity      10%     Staleness gate (updated within 1 year?)
PR Status     5%      Open PR = competition penalty
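
The table above can be sketched as a weighted sum of six sub-scores, each on a 0–100 scale. The weights, the 180-day half-life, the comment cap of 10, and the one-year gate come from the table; the per-assignee halving, the 10,000-star cap for log scaling, and the all-or-nothing PR penalty are assumptions, since the exact formulas live in docs/scoring.md.

```python
import math

WEIGHTS = {
    "freshness": 0.25,
    "competition": 0.25,
    "availability": 0.20,
    "popularity": 0.15,
    "activity": 0.10,
    "pr_status": 0.05,
}


def freshness(age_days: float) -> float:
    """Exponential decay with a 180-day half-life."""
    return 100.0 * 0.5 ** (age_days / 180.0)


def competition(comments: int) -> float:
    """Fewer comments score higher; 10 or more comments score 0."""
    return 100.0 * (1.0 - min(comments, 10) / 10.0)


def availability(assignees: int) -> float:
    """100 with no assignees; assumption: halves for each assignee."""
    return 100.0 * 0.5 ** assignees


def popularity(stars: int, cap: int = 10_000) -> float:
    """Log-scaled stars; assumption: saturates at `cap` stars."""
    return 100.0 * min(1.0, math.log10(1 + stars) / math.log10(1 + cap))


def activity(days_since_update: float) -> float:
    """Staleness gate: full marks if updated within a year, else 0."""
    return 100.0 if days_since_update <= 365 else 0.0


def pr_status(has_open_pr: bool) -> float:
    """Assumption: an open linked PR zeroes this signal."""
    return 0.0 if has_open_pr else 100.0


def composite(age_days, comments, assignees, stars, days_since_update, has_open_pr) -> float:
    """Weighted 0-100 composite of the six signals."""
    parts = {
        "freshness": freshness(age_days),
        "competition": competition(comments),
        "availability": availability(assignees),
        "popularity": popularity(stars),
        "activity": activity(days_since_update),
        "pr_status": pr_status(has_open_pr),
    }
    return sum(WEIGHTS[name] * value for name, value in parts.items())
```

Because the weights sum to 1 and each sub-score is bounded by 100, the composite stays within 0–100.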

Testing

python3 -m pytest tests/ -v

96 tests covering: scoring functions, body extraction, CSV round-trips, caching/diffing, GraphQL parsing, LLM prompt building, TUI helpers, integration, and edge cases.

Cost

  • Scraping: Free (uses gh CLI with your GitHub token)
  • LLM matching: ~$0.002 per run (single GPT-5.5 call, ~10k tokens)
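
As a quick sanity check on the matching figure, the quoted cost and token count imply a blended price per million tokens. The sketch below only restates the two numbers from the list above; it makes no claim about the provider's actual price sheet, and the function name is hypothetical.

```python
def implied_price_per_million(cost_per_run_usd: float, tokens_per_run: int) -> float:
    """Blended USD price per one million tokens implied by a per-run cost."""
    return cost_per_run_usd / tokens_per_run * 1_000_000


# ~$0.002 per run at ~10k tokens per run
print(f"${implied_price_per_million(0.002, 10_000):.2f} per million tokens")
# prints "$0.20 per million tokens"
```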

License

MIT

Download files

Source Distribution

gfi_scraper-0.1.0.tar.gz (30.9 kB)

Uploaded Source

Built Distribution

gfi_scraper-0.1.0-py3-none-any.whl (21.3 kB)

Uploaded Python 3

File details

Details for the file gfi_scraper-0.1.0.tar.gz.

File metadata

  • Download URL: gfi_scraper-0.1.0.tar.gz
  • Upload date:
  • Size: 30.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for gfi_scraper-0.1.0.tar.gz
Algorithm    Hash digest
SHA256       f1051a6f9bd6d1464b2c6d35d4ac766e77ae20c4474e47642f270aae0b396022
MD5          450a1b7c1858f638dba35c9f2adaeb05
BLAKE2b-256  2335410861470c1559990add55797af6df5093b152110507f12ed19c29a12cdd

Provenance

The following attestation bundles were made for gfi_scraper-0.1.0.tar.gz:

Publisher: publish.yml on iamsharduld/gfi-scraper

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file gfi_scraper-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: gfi_scraper-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 21.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for gfi_scraper-0.1.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       7837c0abeea6d4817a0d7cacb225bbfdf7916da26ad0ae7fc4b1b1811e653b8a
MD5          b78958f71cd34437f9fe326b0c2a3926
BLAKE2b-256  8f8d97cbc473513dedf222435315e3e41155340c921f74fa96fd5778b239652c

Provenance

The following attestation bundles were made for gfi_scraper-0.1.0-py3-none-any.whl:

Publisher: publish.yml on iamsharduld/gfi-scraper

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
