OSS Issue Analyzer
A CLI tool that helps first-time open source contributors analyze issues from GitHub, GitLab, and Bitbucket against their locally cloned repositories. It indexes code and selected project text assets, estimates issue difficulty using AI or heuristics, and helps contributors pick issues they can realistically solve.
Features
- Multi-Platform Support - Works with GitHub, GitLab, and Bitbucket repositories
- Mixed Repository Indexing - Parse code and index selected config, workflow, and documentation files
- Expanded Language Support - Index Python, JavaScript, TypeScript, Go, Rust, Java, C, and C++
- Issue Integration - Fetch issues directly from GitHub, GitLab, or Bitbucket
- Bulk Issue Scanning - Quick heuristic scoring (~80% accurate) for ALL issues using parallel processing
- AI-Powered Scoring - Supports multiple LLM providers (OpenAI, Anthropic, Google, Azure OpenAI) for intelligent difficulty estimation and suggestions
- Heuristic Fallback - Rule-based scoring when AI is unavailable
- Hybrid Retrieval - Semantic + keyword search against indexed code
- Contributing Signals - Identifies test files, documentation, and isolated changes
- Dependency-Aware Scoring - Parses core dependency manifests and flags dependency-hell risk factors
- Issue Comments Context - Includes issue comments (prioritized by maintainer input and popularity) to understand expected practices
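- Smart Caching - Caches issue lists and analysis results to minimize API calls and costs (about 98% fewer AI calls when you bulk scan 50 issues and deep-analyze only one, as in the Cost Comparison table)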
Installation
pip install oss-issue-analyzer
Or install in development mode:
pip install -e .
Quick Start
# 1. Index your repository
cd /path/to/repo
oss-issue-analyzer index .
# 2. (Optional) Set up AI provider for smarter analysis
oss-issue-analyzer setup
# 3. Bulk scan issues (FREE - uses quick heuristics)
oss-issue-analyzer list-issues
# 4. Deep analyze selected issue (1 AI call only)
oss-issue-analyzer analyze 123
Usage
1. Index a Repository
cd /path/to/repo
oss-issue-analyzer index .
This creates a .oss-index/ folder in the repository root containing vector embeddings for code and selected project text assets.
Supported code languages: Python, JavaScript, TypeScript, Go, Rust, Java, C, and C++.
When using mixed indexing, the tool also indexes dependency and build manifests such as pyproject.toml, requirements.txt, package.json, Cargo.toml, go.mod, pom.xml, Gradle files, CMakeLists.txt, Conan manifests, and vcpkg.json.
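As a rough illustration of how an indexer might decide what to do with each file, the sketch below maps file extensions to the supported languages and recognizes some of the manifests listed above. The function name `classify_path`, the extension table, and the returned labels are assumptions made for this example, not the tool's actual implementation. Documentation and workflow files are left out to keep the sketch short.

```python
from pathlib import PurePosixPath
from typing import Optional

# Extension-to-language table for the supported code languages.
# The exact extensions the tool recognizes are an assumption.
CODE_EXTENSIONS = {
    ".py": "python", ".js": "javascript", ".ts": "typescript",
    ".go": "go", ".rs": "rust", ".java": "java",
    ".c": "c", ".h": "c", ".cpp": "cpp", ".hpp": "cpp",
}

# A few of the dependency and build manifests named above.
MANIFEST_NAMES = {
    "pyproject.toml", "requirements.txt", "package.json", "Cargo.toml",
    "go.mod", "pom.xml", "CMakeLists.txt", "vcpkg.json",
}


def classify_path(path: str, index_mode: str = "mixed") -> Optional[str]:
    """Return a language name, "manifest", or None if the file is skipped."""
    p = PurePosixPath(path)
    language = CODE_EXTENSIONS.get(p.suffix.lower())
    if language:
        return language
    # Manifests are only considered in mixed mode, not in code-only mode.
    if index_mode == "mixed" and p.name in MANIFEST_NAMES:
        return "manifest"
    return None
```

With this sketch, `classify_path("src/tokenizer.py")` yields `"python"`, while `classify_path("Cargo.toml", index_mode="code-only")` yields `None`.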
Options:
oss-issue-analyzer index <repo_path> [OPTIONS]
Options:
--embedder Embedding model (nomic, minilm) [default: minilm]
--index-mode Index mode (mixed, code-only) [default: mixed]
--force Force re-index from scratch
2. Set Up AI Provider (Optional but Recommended)
Configure an AI provider to get smarter difficulty analysis and suggestions:
# List available providers based on your .env
oss-issue-analyzer setup --list
# Interactive setup
oss-issue-analyzer setup
# Direct setup with provider and API key
oss-issue-analyzer setup --provider openai --api-key sk-... --test
# Clear saved configuration
oss-issue-analyzer setup --clear
Supported Providers:
| Provider | Environment Variable | Default Model |
|---|---|---|
| OpenAI | OPENAI_API_KEY | gpt-4o-mini |
| Anthropic (Claude) | ANTHROPIC_API_KEY | claude-3-haiku-20240307 |
| Google (Gemini) | GOOGLE_API_KEY | gemini-1.5-flash |
| Azure OpenAI | AZURE_OPENAI_API_KEY + AZURE_OPENAI_ENDPOINT | (deployment name) |
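To make the provider table concrete, here is a minimal sketch of how available providers could be detected from environment variables, roughly what `setup --list` is described as doing. The function `available_providers` and the dictionary layout are hypothetical; only the variable names and provider identifiers come from this document.

```python
import os
from typing import Mapping

# Environment variables each provider needs, taken from the table above.
PROVIDER_ENV_VARS = {
    "openai": ("OPENAI_API_KEY",),
    "anthropic": ("ANTHROPIC_API_KEY",),
    "google": ("GOOGLE_API_KEY",),
    "azure_openai": ("AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT"),
}


def available_providers(env: Mapping[str, str] = os.environ) -> list[str]:
    """List providers whose required variables are all set and non-empty."""
    return [
        name
        for name, required in PROVIDER_ENV_VARS.items()
        if all(env.get(var) for var in required)
    ]
```

Note that Azure OpenAI only counts as available when both of its variables are present, which mirrors the `+` in the table.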
3. List and Analyze Issues (Bulk Scan)
Scan ALL open issues with quick heuristic scoring (FREE, ~80% accurate), then deep-analyze only the ones you're interested in:
# Bulk scan (uses quick heuristics, NO AI calls)
oss-issue-analyzer list-issues
# Filter and sort
oss-issue-analyzer list-issues --filter-difficulty easy
oss-issue-analyzer list-issues --sort difficulty
oss-issue-analyzer list-issues --filter-label "good first issue"
# Interactive mode (select and analyze immediately)
oss-issue-analyzer list-issues --interactive
# Deep analysis (1 AI call for selected issue)
oss-issue-analyzer analyze 123
# Specify platform explicitly
oss-issue-analyzer list-issues --platform gitlab
oss-issue-analyzer analyze 123 --platform bitbucket
Cost Comparison:
| Approach | Platform API Calls | AI API Calls | Cost |
|---|---|---|---|
| Analyze each issue | 50 + comments | 50 | $$$ |
| Bulk scan + select | 1-2 + 1 (selected) | 1 | $ |
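The arithmetic behind the table can be checked in a few lines. This sketch assumes the 50-issue scenario from the table; the function name `ai_call_reduction` is made up for the example.

```python
def ai_call_reduction(calls_per_issue_approach: int, calls_bulk_approach: int) -> float:
    """Percentage of AI calls saved by the bulk-scan-then-select workflow."""
    saved = calls_per_issue_approach - calls_bulk_approach
    return 100.0 * saved / calls_per_issue_approach


# 50 issues analyzed one by one versus one deep analysis after a bulk scan.
print(f"{ai_call_reduction(50, 1):.0f}% fewer AI calls")  # prints "98% fewer AI calls"
```

The saving grows with the number of open issues, since the bulk scan itself makes no AI calls.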
Options:
oss-issue-analyzer list-issues [OPTIONS]
Options:
--repo OWNER/REPO # Repository (auto-detected from git)
--platform github|gitlab|bitbucket # Platform [default: auto-detect]
--state open|all|closed # Filter by state [default: open]
--sort difficulty|number|created # Sort results
--filter-difficulty easy|medium|hard
--filter-label TEXT # e.g., "good first issue"
--limit N # Max issues to show [default: 0=all]
--cache-ttl HOURS # Cache duration [default: 1]
--no-cache # Force re-fetch
--workers N # Parallel workers [default: auto]
--json # JSON output
--interactive # Select and analyze immediately
Output Example:
╭──────── List of Issues (repo: owner/repo, 47 open) ────────╮
│ #    Title              Difficulty  Conf  Labels           │
│ 123  Fix parser crash   EASY        82%   good-first-issue │
│ 124  Add new feature    HARD        75%   enhancement      │
│ 125  Update README      EASY        90%   docs             │
╰────────────────────────────────────────────────────────────╯
Tip: Run 'oss-issue-analyzer analyze <number>' for detailed AI analysis
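The bulk scan described in this section can be pictured as a cheap scoring function applied to every issue in parallel. The sketch below is only an illustration: the label sets, the function names `quick_score` and `bulk_scan`, and the label-only rule are assumptions, and the tool's real heuristics are likely richer.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical label hints; the tool's real rules are not documented here.
EASY_LABELS = {"good first issue", "good-first-issue", "docs", "documentation"}
HARD_LABELS = {"enhancement", "refactor", "breaking change"}


def quick_score(issue: dict) -> dict:
    """Assign a coarse difficulty from labels alone (no AI call)."""
    labels = {label.lower() for label in issue.get("labels", [])}
    if labels & EASY_LABELS:
        difficulty = "easy"
    elif labels & HARD_LABELS:
        difficulty = "hard"
    else:
        difficulty = "medium"
    return {"number": issue["number"], "difficulty": difficulty}


def bulk_scan(issues: list[dict], workers: int = 4) -> list[dict]:
    """Score all issues in parallel; map() preserves the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(quick_score, issues))
```

The `workers` parameter plays the same role as the `--workers` option, although how the tool chooses its default is not specified here.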
4. Analyze an Issue
# Using issue number (run from the cloned repo directory)
oss-issue-analyzer analyze 123
# Using platform URLs
oss-issue-analyzer analyze https://github.com/owner/repo/issues/123
oss-issue-analyzer analyze https://gitlab.com/owner/repo/-/issues/123
oss-issue-analyzer analyze https://bitbucket.org/owner/repo/issues/123
# Using platform prefix
oss-issue-analyzer analyze github:owner/repo#123
oss-issue-analyzer analyze gitlab:owner/repo#123
oss-issue-analyzer analyze bitbucket:owner/repo#123
# Force AI provider
oss-issue-analyzer analyze 123 --ai-provider openai
# Disable AI and use heuristics only
oss-issue-analyzer analyze 123 --no-ai
# Specify platform explicitly
oss-issue-analyzer analyze 123 --platform gitlab
The tool automatically detects the platform from the git remote URL.
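A minimal sketch of such detection, assuming only the three public hosts and the common HTTPS and SSH remote formats. The names `detect_platform` and `REMOTE_RE` are hypothetical, and self-hosted instances are not handled.

```python
import re
from typing import Optional

# Public hosts for the three supported platforms.
PLATFORM_HOSTS = {
    "github.com": "github",
    "gitlab.com": "gitlab",
    "bitbucket.org": "bitbucket",
}

# Matches https://host/owner/repo(.git) and git@host:owner/repo(.git).
REMOTE_RE = re.compile(
    r"^(?:https?://|ssh://)?(?:[^@/]+@)?(?P<host>[^/:]+)[/:](?P<path>.+?)(?:\.git)?/?$"
)


def detect_platform(remote_url: str) -> Optional[tuple[str, str]]:
    """Return (platform, "owner/repo"), or None for an unknown host."""
    match = REMOTE_RE.match(remote_url.strip())
    if not match:
        return None
    platform = PLATFORM_HOSTS.get(match.group("host").lower())
    if platform is None:
        return None
    return platform, match.group("path")
```

In practice the input would come from something like `git remote get-url origin`. When detection returns nothing, passing `--platform` explicitly is the documented way out.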
Options:
oss-issue-analyzer analyze <issue_ref> [OPTIONS]
Arguments:
issue_ref Issue number, URL, or path to local markdown file
Options:
--repo Path to indexed repository
--db-path Path to index database
--embedder Embedding model [default: minilm]
--limit Number of indexed units to retrieve [default: 10]
--gh-repo Repository (owner/repo) - auto-detected if not provided
--platform Platform: github, gitlab, bitbucket [default: auto-detect]
--ai-provider AI provider to use (openai, anthropic, google, azure_openai)
--no-ai Disable AI scoring, use heuristics only
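Since `issue_ref` accepts several shapes, a small parser helps show how they differ. The sketch below covers the forms used in the examples above (bare number, platform URL, platform prefix, local file). The function `parse_issue_ref`, its return format, and the regular expressions are assumptions for illustration. GitLab subgroups, for instance, are not handled.

```python
import re

URL_RE = re.compile(
    r"^https://(?P<host>github\.com|gitlab\.com|bitbucket\.org)/"
    r"(?P<repo>[^/]+/[^/]+)/(?:-/)?issues/(?P<number>\d+)/?$"
)
PREFIX_RE = re.compile(
    r"^(?P<platform>github|gitlab|bitbucket):(?P<repo>[^#]+)#(?P<number>\d+)$"
)
HOST_TO_PLATFORM = {
    "github.com": "github", "gitlab.com": "gitlab", "bitbucket.org": "bitbucket",
}


def parse_issue_ref(ref: str) -> dict:
    """Classify an issue reference as a number, URL, prefix, or local file."""
    ref = ref.strip()
    if ref.isdigit():
        # Bare number: platform and repo would come from the git remote.
        return {"kind": "number", "number": int(ref)}
    url = URL_RE.match(ref)
    if url:
        return {
            "kind": "url",
            "platform": HOST_TO_PLATFORM[url.group("host")],
            "repo": url.group("repo"),
            "number": int(url.group("number")),
        }
    prefixed = PREFIX_RE.match(ref)
    if prefixed:
        return {
            "kind": "prefix",
            "platform": prefixed.group("platform"),
            "repo": prefixed.group("repo"),
            "number": int(prefixed.group("number")),
        }
    # Anything else is treated as a path to a local markdown file.
    return {"kind": "file", "path": ref}
```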
5. Use Local Issue File
oss-issue-analyzer analyze ./issue.md
The markdown file should start with a # Title heading.
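For illustration, a file in this shape could be split into a title and a body as follows. The function name `parse_issue_markdown` and the error message are assumptions. How the tool itself reads the file is not documented here.

```python
def parse_issue_markdown(text: str) -> tuple[str, str]:
    """Split markdown into (title, body); the first line must be "# Title"."""
    lines = text.lstrip().splitlines()
    if not lines or not lines[0].startswith("# "):
        raise ValueError("issue file must start with a '# Title' heading")
    title = lines[0][2:].strip()
    body = "\n".join(lines[1:]).strip()
    return title, body


title, body = parse_issue_markdown(
    "# Fix parser crash\n\nThe parser fails on empty input.\n"
)
print(title)  # prints "Fix parser crash"
```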
How AI Scoring Works
When an AI provider is configured, the tool:
- Fetches GitHub issue comments (up to 7, prioritized by maintainer input and reaction count)
- Retrieves relevant code units using hybrid search (semantic + keyword)
- Builds a context-rich prompt including:
  - Issue title, body, type, and error patterns
  - GitHub issue comments with community/maintainer insights
  - Retrieved code units with signatures and docstrings
  - Heuristic scoring results for reference
- Sends to LLM for intelligent analysis
- Falls back to heuristics if AI is unavailable
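The comment-selection step in the list above (up to 7 comments, maintainer input and reaction count first) could look roughly like this. The `Comment` dataclass, its fields, and the exact ordering rule are assumptions. Only the cap of 7 and the two priorities come from this document.

```python
from dataclasses import dataclass

MAX_COMMENTS = 7  # the documented cap


@dataclass
class Comment:
    author: str
    body: str
    reactions: int = 0
    is_maintainer: bool = False


def select_comments(comments: list[Comment], limit: int = MAX_COMMENTS) -> list[Comment]:
    """Keep maintainer comments first, then the most-reacted ones."""
    # False sorts before True, so maintainers come first; within each
    # group, a higher reaction count ranks earlier.
    ranked = sorted(comments, key=lambda c: (not c.is_maintainer, -c.reactions))
    return ranked[:limit]
```

Capping the number of comments keeps the prompt small, which matters when each deep analysis is a paid AI call.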
Without AI, the tool uses rule-based heuristics to estimate difficulty based on code complexity, file types, dependency complexity, and issue metadata.
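The fallback behavior can be sketched as a wrapper that tries an AI scorer and drops back to rules on any failure. Everything here is hypothetical: the signal names, the weights, the thresholds, and the functions `heuristic_score` and `score_issue` are invented to show the control flow, not the tool's scoring model.

```python
from typing import Callable, Optional


def heuristic_score(files_touched: int, has_tests: bool, new_dependencies: int) -> dict:
    """Toy rule-based estimate; the weights and thresholds are made up."""
    points = files_touched + 2 * new_dependencies - (1 if has_tests else 0)
    if points <= 2:
        difficulty = "easy"
    elif points <= 5:
        difficulty = "medium"
    else:
        difficulty = "hard"
    return {"difficulty": difficulty, "source": "heuristic"}


def score_issue(signals: dict, ai_scorer: Optional[Callable[[dict], dict]] = None) -> dict:
    """Try the AI scorer first and fall back to heuristics on any failure."""
    if ai_scorer is not None:
        try:
            return {**ai_scorer(signals), "source": "ai"}
        except Exception:
            pass  # provider unavailable, timed out, or misconfigured
    return heuristic_score(**signals)
```

Passing `ai_scorer=None` corresponds to running with `--no-ai`, while a scorer that raises corresponds to an unavailable provider.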
Output Example
AI-Powered Analysis
╭─────────────── Issue: Fix tokenizer performance ────────────────╮
│ Difficulty: EASY (conf: 88%) [AI] │
│ Relative: Easier than 75% │
│ │
│ Relevant files: │
│ → src/tokenizer.py │
│ → tests/test_tokenizer.py │
│ │
│ Suggested approach: │
│ 1. Start in src/tokenizer.py -> Tokenizer.encode │
│ 2. The batch processing logic needs optimization │
│ 3. Test: pytest tests/test_tokenizer.py │
│ │
│ Contributor signals: │
│ > Test file exists - changes are verifiable │
│ > Has documentation │
│ > Isolated change possible │
╰────────────────────────────────────────────────────────────────╯
Heuristic Analysis (No AI)
╭─────────────── Issue: Fix tokenizer performance ────────────────╮
│ Difficulty: EASY (conf: 88%) │
│ Relative: Easier than 75% │
│ │
│ Relevant files: │
│ → src/tokenizer.py │
│ → tests/test_tokenizer.py │
│ │
│ Suggested approach: │
│ 1. Start in src/tokenizer.py -> Tokenizer.encode │
│ 2. Bug is in the batch processing logic │
│ 3. Test: pytest tests/test_tokenizer.py │
│ │
│ Contributor signals: │
│ > Test file exists - changes are verifiable │
│ > Has documentation │
│ > Isolated change possible │
╰────────────────────────────────────────────────────────────────╯
Configuration
Environment Variables
Create a .env file in your project root (see .env.example for template):
| Variable | Description |
|---|---|
| GITHUB_TOKEN | GitHub personal access token for API rate limits |
| GITLAB_TOKEN | GitLab personal access token for API access |
| BITBUCKET_USERNAME | Bitbucket username |
| BITBUCKET_APP_PASSWORD | Bitbucket app password for API access |
| HF_TOKEN | Hugging Face token for faster embedding downloads |
| OPENAI_API_KEY | OpenAI API key |
| OPENAI_MODEL | OpenAI model (default: gpt-4o-mini) |
| ANTHROPIC_API_KEY | Anthropic API key |
| ANTHROPIC_MODEL | Anthropic model (default: claude-3-haiku-20240307) |
| GOOGLE_API_KEY | Google Gemini API key |
| AZURE_OPENAI_API_KEY | Azure OpenAI API key |
| AZURE_OPENAI_ENDPOINT | Azure OpenAI endpoint URL |
| AZURE_OPENAI_DEPLOYMENT | Azure OpenAI deployment name |
| AI_ENABLED | Enable/disable AI scoring (true/false) |
| AI_TIMEOUT_SECONDS | AI request timeout (default: 30) |
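As a sketch of how these settings might be read with their documented defaults, consider the helper below. The function `load_ai_settings` and its return format are hypothetical, and treating AI as enabled when `AI_ENABLED` is unset is an assumption, since the default for that variable is not stated.

```python
import os
from typing import Mapping


def load_ai_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Read AI-related settings, applying the documented defaults."""
    # Assumption: AI scoring is on unless AI_ENABLED is set to something
    # other than "true" (case-insensitive).
    enabled = env.get("AI_ENABLED", "true").strip().lower() == "true"
    return {
        "enabled": enabled,
        "timeout_seconds": float(env.get("AI_TIMEOUT_SECONDS", "30")),
        "openai_model": env.get("OPENAI_MODEL", "gpt-4o-mini"),
        "anthropic_model": env.get("ANTHROPIC_MODEL", "claude-3-haiku-20240307"),
    }
```

Taking the environment as a parameter keeps the function easy to test without touching the real process environment.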
Configuration File
Provider preferences are saved to ~/.config/oss-issue-analyzer/config.json.
Cache Storage
Analysis results are cached in .oss-issue-analyzer-cache/ in the repository root:
- issues/ - Issue lists with quick scores (fresh for 1 hour by default)
- analysis/ - Full AI analysis for individual issues (cached indefinitely)
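The two freshness policies can be expressed with a single check. This is a minimal sketch under the assumption that each cache entry records when it was saved. The function `is_fresh` and its parameters are invented for the example.

```python
import time
from typing import Optional

DEFAULT_TTL_HOURS = 1.0  # matches the --cache-ttl default


def is_fresh(
    cached_at: float,
    ttl_hours: Optional[float] = DEFAULT_TTL_HOURS,
    now: Optional[float] = None,
) -> bool:
    """Return True if an entry saved at `cached_at` (epoch seconds) is usable.

    ttl_hours=None models entries that never expire, such as full analyses.
    """
    if ttl_hours is None:
        return True
    now = time.time() if now is None else now
    return (now - cached_at) < ttl_hours * 3600
```

Under this model, issue lists would use the one-hour default and full analyses would pass `ttl_hours=None`. Running with `--no-cache` would bypass the check entirely.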
Development
# Install with dev dependencies
pip install -e ".[dev]"
# Run all tests
pytest
# Run specific test files
pytest tests/test_quick_scorer.py
pytest tests/test_cache.py
pytest tests/test_bulk_processor.py
pytest tests/test_ai_scorer.py
License
MIT