OSS Issue Analyzer

A CLI tool that helps first-time open source contributors analyze GitHub issues against their local cloned repositories. It indexes code plus selected project text assets, estimates difficulty using AI or heuristics, and helps contributors pick issues they can realistically solve.

Features

Mixed Repository Indexing - Parse code and index selected config, workflow, and documentation files
GitHub Issue Integration - Fetch issues directly from GitHub
Bulk Issue Scanning - Quick heuristic scoring (~80% accurate) for ALL issues using parallel processing
AI-Powered Scoring - Supports multiple LLM providers (OpenAI, Anthropic, Google, Azure OpenAI) for intelligent difficulty estimation and suggestions
Heuristic Fallback - Rule-based scoring when AI is unavailable
Hybrid Retrieval - Semantic + keyword search against indexed code
Contributing Signals - Identifies test files, documentation, and isolated changes
Issue Comments Context - Includes GitHub issue comments (prioritized by maintainer input and popularity) to understand expected practices
Smart Caching - Minimizes API calls and costs (98% reduction in AI costs)

Installation

pip install oss-issue-analyzer

Or install in development mode:

pip install -e .

Quick Start

# 1. Index your repository
cd /path/to/repo
oss-issue-analyzer index .

# 2. (Optional) Set up AI provider for smarter analysis
oss-issue-analyzer setup

# 3. Bulk scan issues (FREE - uses quick heuristics)
oss-issue-analyzer list-issues

# 4. Deep analyze selected issue (1 AI call only)
oss-issue-analyzer analyze 123

Usage

1. Index a Repository

cd /path/to/repo
oss-issue-analyzer index .

This creates a .oss-index/ folder in the repository root containing vector embeddings for code and selected project text assets.

Options:

oss-issue-analyzer index <repo_path> [OPTIONS]

Options:
  --embedder    Embedding model (nomic, minilm) [default: minilm]
  --index-mode  Index mode (mixed, code-only) [default: mixed]
  --force       Force re-index from scratch

2. Set Up AI Provider (Optional but Recommended)

Configure an AI provider to get smarter difficulty analysis and suggestions:

# List available providers based on your .env
oss-issue-analyzer setup --list

# Interactive setup
oss-issue-analyzer setup

# Direct setup with provider and API key
oss-issue-analyzer setup --provider openai --api-key sk-... --test

# Clear saved configuration
oss-issue-analyzer setup --clear

Supported Providers:

| Provider | Environment Variable | Default Model |
|---|---|---|
| OpenAI | `OPENAI_API_KEY` | gpt-4o-mini |
| Anthropic (Claude) | `ANTHROPIC_API_KEY` | claude-3-haiku-20240307 |
| Google (Gemini) | `GOOGLE_API_KEY` | gemini-1.5-flash |
| Azure OpenAI | `AZURE_OPENAI_API_KEY` + `AZURE_OPENAI_ENDPOINT` | (deployment name) |
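
As a rough illustration of how provider availability could be derived from the variables in the table, here is a minimal sketch. The function name `available_providers` and the mapping are hypothetical; the tool's actual detection logic for `setup --list` is not documented here.

```python
import os

# Required environment variables per provider, taken from the table above.
# The provider keys mirror the values accepted by --ai-provider.
PROVIDER_ENV_VARS = {
    "openai": ["OPENAI_API_KEY"],
    "anthropic": ["ANTHROPIC_API_KEY"],
    "google": ["GOOGLE_API_KEY"],
    "azure_openai": ["AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT"],
}


def available_providers(env=None):
    """Return providers whose required variables are all set and non-empty."""
    env = os.environ if env is None else env
    return [
        name
        for name, required in PROVIDER_ENV_VARS.items()
        if all(env.get(var) for var in required)
    ]


if __name__ == "__main__":
    print(available_providers())
```

Note that Azure OpenAI only counts as available when both of its variables are present.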

3. List and Analyze Issues (Bulk Scan)

Scan ALL open issues with quick heuristic scoring (FREE, ~80% accurate), then deep-analyze only the ones you're interested in:

# Bulk scan (uses quick heuristics, NO AI calls)
oss-issue-analyzer list-issues

# Filter and sort
oss-issue-analyzer list-issues --filter-difficulty easy
oss-issue-analyzer list-issues --sort difficulty
oss-issue-analyzer list-issues --filter-label "good first issue"

# Interactive mode (select and analyze immediately)
oss-issue-analyzer list-issues --interactive

# Deep analysis (1 AI call for selected issue)
oss-issue-analyzer analyze 123

Cost Comparison (example: 50 open issues):

| Approach | GitHub API Calls | AI API Calls | Cost |
|---|---|---|---|
| Analyze each issue | 50 + comments | 50 | $$$ |
| Bulk scan + select | 1-2 + 1 (selected) | 1 | $ |
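
The AI-call column can be checked with simple arithmetic. The sketch below assumes the scenario in the table (50 issues scanned, 1 analyzed in depth); the helper name is hypothetical. The result is consistent with the 98% figure in the feature list, if that figure refers to this scenario.

```python
def ai_call_reduction(issues_scanned, issues_deep_analyzed):
    """Fraction of AI calls avoided by scanning with heuristics first."""
    if issues_scanned <= 0:
        raise ValueError("issues_scanned must be positive")
    return 1 - issues_deep_analyzed / issues_scanned


# 50 AI calls when analyzing every issue vs. 1 call after a bulk scan.
reduction = ai_call_reduction(50, 1)
print(f"{reduction:.0%}")  # 98%
```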

Options:

oss-issue-analyzer list-issues [OPTIONS]

Options:
  --repo OWNER/REPO                     # GitHub repo (auto-detected from git)
  --state open|all|closed               # Filter by state [default: open]
  --sort difficulty|number|created      # Sort results
  --filter-difficulty easy|medium|hard  # Filter by estimated difficulty
  --filter-label TEXT                   # e.g., "good first issue"
  --limit N                             # Max issues to show [default: 0=all]
  --cache-ttl HOURS                     # Cache duration [default: 1]
  --no-cache                            # Force re-fetch
  --workers N                           # Parallel workers [default: auto]
  --json                                # JSON output
  --interactive                         # Select and analyze immediately

Output Example:

╭─────────── List of Issues (repo: owner/repo, 47 open) ────────────╮
│ #    Title                    Difficulty  Conf  Labels            │
│ 123  Fix parser crash         EASY        82%   good-first-issue  │
│ 124  Add new feature          HARD        75%   enhancement       │
│ 125  Update README            EASY        90%   docs              │
╰───────────────────────────────────────────────────────────────────╯

Tip: Run 'oss-issue-analyzer analyze <number>' for detailed AI analysis
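
The bulk scan described in this section (quick heuristics applied to every issue in parallel) could look roughly like the sketch below. The label sets and the `quick_score` rules are illustrative assumptions, not the tool's actual scoring rules; only the labels themselves come from the output example above.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical label hints; the real scorer's rules are not documented here.
EASY_LABELS = {"good-first-issue", "good first issue", "docs"}
HARD_LABELS = {"enhancement"}


def quick_score(issue):
    """Toy heuristic: classify an issue from its labels alone."""
    labels = {label.lower() for label in issue.get("labels", [])}
    if labels & EASY_LABELS:
        return "easy"
    if labels & HARD_LABELS:
        return "hard"
    return "medium"


def score_all(issues, workers=4):
    """Score every issue in parallel, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        difficulties = list(pool.map(quick_score, issues))
    return [
        {"number": issue["number"], "difficulty": difficulty}
        for issue, difficulty in zip(issues, difficulties)
    ]
```

Because no AI provider is involved at this stage, the scan costs nothing beyond the GitHub API requests needed to fetch the issue list.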

4. Analyze an Issue

# Using issue number (run from the cloned repo directory)
oss-issue-analyzer analyze 123

# Using a GitHub URL
oss-issue-analyzer analyze https://github.com/owner/repo/issues/123

# Force a specific AI provider
oss-issue-analyzer analyze 123 --ai-provider openai

# Disable AI and use heuristics only
oss-issue-analyzer analyze 123 --no-ai

The tool automatically detects the GitHub remote from the local git repository.
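
One plausible way to do this detection is to read the remote URL from git and extract `owner/repo` from it. The sketch below is an assumption about the approach, not the tool's confirmed implementation; the function names and the choice of the `origin` remote are hypothetical.

```python
import re
import subprocess

# Matches both HTTPS (github.com/owner/repo) and SSH (github.com:owner/repo) forms.
_REMOTE_PATTERN = re.compile(
    r"github\.com[:/](?P<owner>[^/\s]+)/(?P<repo>[^/\s]+?)(?:\.git)?/?$"
)


def parse_github_remote(url):
    """Extract 'owner/repo' from a GitHub remote URL, or None if it is not one."""
    match = _REMOTE_PATTERN.search(url.strip())
    if match is None:
        return None
    return f"{match['owner']}/{match['repo']}"


def detect_repo(path=".", remote="origin"):
    """Ask git for the remote URL and parse it; None if anything fails."""
    try:
        result = subprocess.run(
            ["git", "-C", path, "remote", "get-url", remote],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_github_remote(result.stdout)
```

When detection fails (for example, the remote is not hosted on GitHub), passing `--gh-repo owner/repo` explicitly is the documented alternative.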

Options:

oss-issue-analyzer analyze <issue_ref> [OPTIONS]

Arguments:
  issue_ref        Issue number, URL, or path to local markdown file

Options:
  --repo         Path to indexed repository
  --db-path      Path to index database
  --embedder     Embedding model [default: minilm]
  --limit        Number of indexed units to retrieve [default: 10]
  --gh-repo      GitHub repo (owner/repo), auto-detected if not provided
  --ai-provider  AI provider to use (openai, anthropic, google, azure_openai)
  --no-ai        Disable AI scoring, use heuristics only
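
The `--limit` option caps how many indexed units the hybrid (semantic + keyword) retrieval returns. As a rough sketch of what such a ranking might look like, the example below blends cosine similarity with keyword overlap. The weighting `alpha`, the unit fields, and the scoring formula are all assumptions for illustration; the tool's real retrieval may differ.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_overlap(query, text):
    """Fraction of distinct query terms that also appear in the text."""
    query_terms = set(query.lower().split())
    text_terms = set(text.lower().split())
    return len(query_terms & text_terms) / len(query_terms) if query_terms else 0.0


def hybrid_search(query, query_vec, units, limit=10, alpha=0.5):
    """Rank units by alpha * semantic score + (1 - alpha) * keyword score."""
    scored = [
        (
            alpha * cosine(query_vec, unit["vector"])
            + (1 - alpha) * keyword_overlap(query, unit["text"]),
            unit["name"],
        )
        for unit in units
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:limit]]
```

Combining the two signals lets exact identifier matches surface even when the embedding similarity is weak, and vice versa.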

5. Use Local Issue File

oss-issue-analyzer analyze ./issue.md

The markdown file should start with a # Title heading.
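
To make the expected file shape concrete, here is a minimal sketch of how such a file could be split into a title and body. The function name and the error message are hypothetical; only the `# Title` requirement comes from this README.

```python
def parse_issue_markdown(text):
    """Split a markdown issue into (title, body) using the leading '# ' heading."""
    lines = text.lstrip().splitlines()
    if not lines or not lines[0].startswith("# "):
        raise ValueError("issue file should start with a '# Title' heading")
    title = lines[0][2:].strip()
    body = "\n".join(lines[1:]).strip()
    return title, body


sample = "# Fix parser crash\n\nThe parser crashes on empty input.\n"
print(parse_issue_markdown(sample))
```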

How AI Scoring Works

When an AI provider is configured, the tool:

1. Fetches GitHub issue comments (up to 7, prioritized by maintainer input and reaction count)
2. Retrieves relevant code units using hybrid search (semantic + keyword)
3. Builds a context-rich prompt including:
   - Issue title, body, type, and error patterns
   - GitHub issue comments with community/maintainer insights
   - Retrieved code units with signatures and docstrings
   - Heuristic scoring results for reference
4. Sends the prompt to the LLM for analysis
5. Falls back to heuristics if AI is unavailable
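
The comment selection in the first step (up to 7 comments, maintainers first, then by reactions) could be sketched as below. The `author_association` field and its values follow the GitHub REST API; the flat `reaction_count` key is a simplification assumed for this example, and the exact ordering the tool uses is not documented here.

```python
# author_association values that indicate a maintainer in the GitHub REST API.
MAINTAINER_ROLES = {"OWNER", "MEMBER", "COLLABORATOR"}


def prioritize_comments(comments, limit=7):
    """Keep at most `limit` comments: maintainers first, then most reactions."""
    ranked = sorted(
        comments,
        key=lambda c: (
            c.get("author_association") in MAINTAINER_ROLES,
            c.get("reaction_count", 0),  # hypothetical flattened field
        ),
        reverse=True,
    )
    return ranked[:limit]
```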

Without AI, the tool uses rule-based heuristics to estimate difficulty based on code complexity, file types, and metadata.
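
The fallback behavior can be pictured as a thin wrapper that tries the AI scorer and degrades to heuristics on any failure. Everything in this sketch (function names, the toy rules based on file count and test presence, the result fields) is hypothetical and only illustrates the pattern.

```python
def heuristic_score(issue):
    """Toy rule-based estimate from touched-file count and test presence."""
    files = issue.get("files", [])
    has_tests = any("test" in path for path in files)
    if len(files) <= 2 and has_tests:
        return {"difficulty": "easy", "source": "heuristic"}
    if len(files) > 5:
        return {"difficulty": "hard", "source": "heuristic"}
    return {"difficulty": "medium", "source": "heuristic"}


def score_with_fallback(issue, ai_scorer=None):
    """Try the AI scorer first; use heuristics if it is missing or fails."""
    if ai_scorer is not None:
        try:
            result = dict(ai_scorer(issue))
            result["source"] = "ai"
            return result
        except Exception:
            # Timeouts, missing keys, rate limits: all degrade to heuristics.
            pass
    return heuristic_score(issue)
```

Passing `ai_scorer=None` corresponds to running with `--no-ai`.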

Output Example

AI-Powered Analysis

╭─────────────── Issue: Fix tokenizer performance ───────────────╮
│ Difficulty: EASY (conf: 88%) [AI]                              │
│ Relative: Easier than 75%                                      │
│                                                                │
│ Relevant files:                                                │
│   → src/tokenizer.py                                           │
│   → tests/test_tokenizer.py                                    │
│                                                                │
│ Suggested approach:                                            │
│   1. Start in src/tokenizer.py -> Tokenizer.encode             │
│   2. The batch processing logic needs optimization             │
│   3. Test: pytest tests/test_tokenizer.py                      │
│                                                                │
│ Contributor signals:                                           │
│  > Test file exists - changes are verifiable                   │
│  > Has documentation                                           │
│  > Isolated change possible                                    │
╰────────────────────────────────────────────────────────────────╯

Heuristic Analysis (No AI)

╭─────────────── Issue: Fix tokenizer performance ───────────────╮
│ Difficulty: EASY (conf: 88%)                                   │
│ Relative: Easier than 75%                                      │
│                                                                │
│ Relevant files:                                                │
│   → src/tokenizer.py                                           │
│   → tests/test_tokenizer.py                                    │
│                                                                │
│ Suggested approach:                                            │
│   1. Start in src/tokenizer.py -> Tokenizer.encode             │
│   2. Bug is in the batch processing logic                      │
│   3. Test: pytest tests/test_tokenizer.py                      │
│                                                                │
│ Contributor signals:                                           │
│  > Test file exists - changes are verifiable                   │
│  > Has documentation                                           │
│  > Isolated change possible                                    │
╰────────────────────────────────────────────────────────────────╯

Configuration

Environment Variables

Create a .env file in your project root (see .env.example for template):

| Variable | Description |
|---|---|
| `GITHUB_TOKEN` | GitHub personal access token (raises API rate limits) |
| `HF_TOKEN` | Hugging Face token for faster embedding downloads |
| `OPENAI_API_KEY` | OpenAI API key |
| `OPENAI_MODEL` | OpenAI model (default: gpt-4o-mini) |
| `ANTHROPIC_API_KEY` | Anthropic API key |
| `ANTHROPIC_MODEL` | Anthropic model (default: claude-3-haiku-20240307) |
| `GOOGLE_API_KEY` | Google Gemini API key |
| `AZURE_OPENAI_API_KEY` | Azure OpenAI API key |
| `AZURE_OPENAI_ENDPOINT` | Azure OpenAI endpoint URL |
| `AZURE_OPENAI_DEPLOYMENT` | Azure OpenAI deployment name |
| `AI_ENABLED` | Enable/disable AI scoring (true/false) |
| `AI_TIMEOUT_SECONDS` | AI request timeout in seconds (default: 30) |
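
For readers who want to see how these variables and their defaults fit together, here is a small sketch that reads them into a settings object. The `AISettings` class and `load_ai_settings` function are hypothetical, the default of `true` for `AI_ENABLED` is an assumption (the table does not state one), and the sketch assumes the .env file has already been loaded into the process environment.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AISettings:
    enabled: bool
    timeout_seconds: float
    openai_model: str
    anthropic_model: str


def load_ai_settings(env=None):
    """Read AI-related settings, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return AISettings(
        # Assumed default: AI scoring is on unless explicitly set to "false".
        enabled=env.get("AI_ENABLED", "true").strip().lower() == "true",
        timeout_seconds=float(env.get("AI_TIMEOUT_SECONDS", "30")),
        openai_model=env.get("OPENAI_MODEL", "gpt-4o-mini"),
        anthropic_model=env.get("ANTHROPIC_MODEL", "claude-3-haiku-20240307"),
    )
```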

Configuration File

Provider preferences are saved to ~/.config/oss-issue-analyzer/config.json.

Cache Storage

Analysis results are cached in .oss-issue-analyzer-cache/ in the repository root:

issues/ - Issue lists with quick scores (fresh for 1 hour by default)
analysis/ - Full AI analysis for individual issues (cached indefinitely)
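
The two expiry policies above (a time-limited issue list and an analysis cache that never expires) reduce to one freshness check. This is a minimal sketch; the function name and the use of `None` to mean "no expiry" are assumptions, not the tool's actual cache code.

```python
import time


def is_fresh(cached_at, ttl_hours, now=None):
    """True if an entry cached at `cached_at` (epoch seconds) is still usable.

    ttl_hours=None means the entry never expires.
    """
    if ttl_hours is None:
        return True
    now = time.time() if now is None else now
    return (now - cached_at) < ttl_hours * 3600


# Issue lists: 1-hour default, matching --cache-ttl. Analyses: no expiry.
print(is_fresh(cached_at=0, ttl_hours=1, now=1800))        # True
print(is_fresh(cached_at=0, ttl_hours=1, now=7200))        # False
print(is_fresh(cached_at=0, ttl_hours=None, now=10**9))    # True
```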

Development

# Install with dev dependencies
pip install -e ".[dev]"

# Run all tests
pytest

# Run specific test files
pytest tests/test_quick_scorer.py
pytest tests/test_cache.py
pytest tests/test_bulk_processor.py
pytest tests/test_ai_scorer.py

License

MIT
