OSS Issue Analyzer

A CLI tool that helps first-time open source contributors analyze GitHub issues against their locally cloned repositories. It indexes code, estimates issue difficulty, and helps contributors pick issues they can realistically solve.

Features

  • Local Code Indexing - Parse and index Python, JavaScript, and TypeScript code
  • GitHub Issue Integration - Fetch issues directly from GitHub
  • Difficulty Estimation - Heuristic-based scoring for issue complexity
  • Hybrid Retrieval - Semantic + keyword search against indexed code
  • Contributing Signals - Identifies test files, documentation, and isolated changes
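
To make the "Hybrid Retrieval" feature more concrete, here is a minimal sketch of how a semantic score and a keyword score might be blended into one ranking value. It is not taken from the tool's source: the function names, the `alpha` weight, and the simple token-overlap measure are all assumptions chosen for illustration.

```python
import math
import re


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_overlap(query, text):
    """Fraction of distinct query tokens that also appear in the code text."""
    query_tokens = set(re.findall(r"[a-z_]+", query.lower()))
    text_tokens = set(re.findall(r"[a-z_]+", text.lower()))
    return len(query_tokens & text_tokens) / len(query_tokens) if query_tokens else 0.0


def hybrid_score(query, query_vec, unit_text, unit_vec, alpha=0.5):
    """Blend semantic and keyword scores; alpha is a hypothetical weighting knob."""
    semantic = cosine(query_vec, unit_vec)
    lexical = keyword_overlap(query, unit_text)
    return alpha * semantic + (1 - alpha) * lexical
```

With `alpha=0.5`, both signals count equally. A real implementation would likely use a stronger keyword scorer and tune the weight.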

Installation

pip install oss-issue-analyzer

Or install in development mode:

pip install -e .

Usage

1. Index a Repository

cd /path/to/repo
oss-issue-analyzer index .

This creates a .oss-index/ folder in the repository root containing vector embeddings.

2. Analyze an Issue

# Using an issue number (run from the cloned repo directory)
oss-issue-analyzer analyze 123

# Using a GitHub URL
oss-issue-analyzer analyze https://github.com/owner/repo/issues/123

The tool automatically detects the GitHub remote from the local git repository.
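
Remote detection of this kind can be sketched as follows. This is an assumption about how such detection might work, not the tool's actual code: `parse_github_remote` and `detect_gh_repo` are hypothetical helpers, and the sketch assumes the remote is named `origin`.

```python
import re
import subprocess

# Matches both SSH (git@github.com:owner/repo.git) and HTTPS remote URLs.
_REMOTE_RE = re.compile(
    r"github\.com[:/](?P<owner>[^/]+)/(?P<repo>[^/]+?)(?:\.git)?/?$"
)


def parse_github_remote(url):
    """Return 'owner/repo' for a GitHub remote URL, or None if it is not one."""
    match = _REMOTE_RE.search(url.strip())
    return f"{match['owner']}/{match['repo']}" if match else None


def detect_gh_repo(repo_path=".", remote="origin"):
    """Ask git for the remote URL of a local clone and parse it (hypothetical helper)."""
    result = subprocess.run(
        ["git", "-C", repo_path, "remote", "get-url", remote],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None  # not a git repository, or the remote does not exist
    return parse_github_remote(result.stdout)
```

If detection fails, for example because the clone points at a fork or a non-GitHub host, the repository can be passed explicitly with `--gh-repo owner/repo`.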

3. Use a Local Issue File

oss-issue-analyzer analyze ./issue.md
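
The three forms of issue reference shown above (number, URL, local file) could be told apart with a small dispatcher like the one below. This is an illustrative sketch only. `classify_issue_ref` is a hypothetical name, and the rule that anything else is treated as a file path is an assumption.

```python
import re
from pathlib import Path

_ISSUE_URL_RE = re.compile(
    r"^https?://github\.com/([^/]+)/([^/]+)/issues/(\d+)/?$"
)


def classify_issue_ref(ref):
    """Classify an issue reference as ('number' | 'url' | 'file', parsed value)."""
    if ref.isdigit():
        return ("number", int(ref))
    match = _ISSUE_URL_RE.match(ref)
    if match:
        owner, repo, number = match.groups()
        return ("url", (f"{owner}/{repo}", int(number)))
    # Assumption: any other reference is treated as a path to a local markdown file.
    return ("file", Path(ref))
```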

Commands

index

Index a local repository for code analysis.

oss-issue-analyzer index <repo_path> [OPTIONS]

Options:
  --embedder  Embedding model (nomic, minilm) [default: minilm]
  --force     Force re-index from scratch

analyze

Analyze a GitHub issue against the indexed codebase.

oss-issue-analyzer analyze <issue_ref> [OPTIONS]

Arguments:
  issue_ref        Issue number, URL, or path to local markdown file

Options:
  --repo           Path to indexed repository
  --db-path        Path to index database
  --embedder       Embedding model [default: minilm]
  --limit          Number of code units to retrieve [default: 10]
  --gh-repo        GitHub repo (owner/repo) - auto-detected if not provided

Output Example

╭─────────────── Issue: Fix tokenizer performance ───────────────╮
│ Difficulty: EASY (conf: 88%)                                   │
│ Relative: Easier than 75%                                      │
│                                                                │
│ Files involved:                                                │
│   → src/tokenizer.py                                           │
│   → tests/test_tokenizer.py                                    │
│                                                                │
│ Suggested approach:                                            │
│   1. Start in src/tokenizer.py -> Tokenizer.encode             │
│   2. Bug is in the batch processing logic                      │
│   3. Test: pytest tests/test_tokenizer.py                      │
│                                                                │
│ Contributor signals:                                           │
│  > Test file exists - changes are verifiable                   │
│  > Has documentation                                           │
│  > Isolated change possible                                    │
└────────────────────────────────────────────────────────────────╯

Configuration

Environment Variables

  • GITHUB_TOKEN - GitHub personal access token, used to authenticate API requests and raise rate limits
  • HF_TOKEN - Hugging Face token for faster embedding model downloads
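
As a rough illustration of how `GITHUB_TOKEN` might be used, the sketch below builds request headers for the GitHub REST API and adds authentication only when the variable is set. `github_headers` is a hypothetical helper, not part of the tool's documented interface.

```python
import os


def github_headers(env=None):
    """Build GitHub REST API headers, authenticating only if GITHUB_TOKEN is set."""
    env = os.environ if env is None else env
    headers = {"Accept": "application/vnd.github+json"}
    token = env.get("GITHUB_TOKEN")
    if token:
        # Authenticated requests get a higher rate limit than anonymous ones.
        headers["Authorization"] = f"Bearer {token}"
    return headers
```

Set the variable in the shell before running the tool, for example `export GITHUB_TOKEN=<your token>`.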

Data Storage

Index data is stored in the .oss-index/ folder in the repository root:

  • index.lance/code_units.lance - Vector embeddings
  • index.lance/repositories.lance - Repository metadata
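
Given the layout above, a quick check for whether a repository has been indexed might look like this. It only tests that the listed paths exist and does not open or validate them. `missing_index_tables` is a hypothetical helper.

```python
from pathlib import Path

INDEX_DIR = ".oss-index"
EXPECTED_TABLES = (
    "index.lance/code_units.lance",
    "index.lance/repositories.lance",
)


def missing_index_tables(repo_root):
    """Return the expected index paths that are absent under <repo_root>/.oss-index/."""
    base = Path(repo_root) / INDEX_DIR
    return [table for table in EXPECTED_TABLES if not (base / table).exists()]
```

An empty list suggests the index is present. Otherwise, running `oss-issue-analyzer index . --force` rebuilds it from scratch.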

Development

# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

License

MIT
