repoindex

A collection-aware metadata index for git repositories.

repoindex provides a unified view across all your repositories, enabling queries, organization, and integration with LLM tools like Claude Code.

Philosophy

repoindex knows about your repos (metadata, tags, status), while tools like Claude Code work inside them (editing, generating). Together they provide full portfolio awareness.

Claude Code (deep work on ONE repo)
         |
         |  "What else do I have?"
         |  "Which repos need X?"
         v
    repoindex (collection awareness)
         |
         +-- repo://...     -> what exists
         +-- tags://...     -> organization
         +-- stats://...    -> aggregations
         +-- events://...   -> what happened

Core Capabilities

  • Repository Discovery - Find and track repos across directories
  • Tag-Based Organization - Hierarchical tags for categorization
  • Registry Awareness - PyPI, CRAN publication status
  • Event Tracking - New tags, releases, publishes
  • Statistics - Aggregations across the collection
  • Query Language - Filter and search with expressions
  • MCP Server - LLM integration endpoint for Claude Code

Installation

pip install repoindex

Or from source:

git clone https://github.com/queelius/repoindex.git
cd repoindex
make install

Quick Start

# Configure repository directories
repoindex config generate

# List all repositories
repoindex list

# Check repository status
repoindex status -r --pretty

# Tag repositories for organization
repoindex tag add myproject topic:ml work/active

# Query repositories
repoindex query "language == 'Python' and 'ml' in tags"

# Scan for recent events (releases, tags)
repoindex events --since 7d --pretty

# View statistics
repoindex stats --groupby language

Output Format

All commands output JSONL (newline-delimited JSON) by default, one record per line, so they compose directly with Unix pipeline tools such as jq:

# Find repos with uncommitted changes
repoindex status | jq 'select(.status.clean == false)'

# Count repos by language
repoindex list | jq -s 'group_by(.language) | map({lang: .[0].language, count: length})'

# Get all Python repos with ML tags
repoindex query "language == 'Python'" | jq 'select(.tags | contains(["topic:ml"]))'

Use --pretty for human-readable table output.
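
The same JSONL stream can also be consumed from a script. The following is a minimal Python sketch, assuming records shaped like the ones the jq examples above rely on (`language`, `tags`, `status.clean`). The sample lines are invented for illustration and are not real repoindex output.

```python
import json
from collections import Counter

# Hypothetical sample of JSONL output; real records may carry other fields.
SAMPLE = """\
{"name": "alpha", "language": "Python", "tags": ["topic:ml"], "status": {"clean": false}}
{"name": "beta", "language": "Rust", "tags": [], "status": {"clean": true}}
{"name": "gamma", "language": "Python", "tags": ["work/active"], "status": {"clean": true}}
"""


def parse_jsonl(text):
    """Yield one dict per non-empty line of newline-delimited JSON."""
    for line in text.splitlines():
        if line.strip():
            yield json.loads(line)


records = list(parse_jsonl(SAMPLE))

# Repos with uncommitted changes (mirrors the first jq example).
dirty = [r["name"] for r in records if not r["status"]["clean"]]

# Count repos by language (mirrors the second jq example).
by_language = Counter(r["language"] for r in records)

print(dirty)
print(dict(by_language))
```

In practice the text would come from the command itself, for example `subprocess.run(["repoindex", "status"], capture_output=True, text=True).stdout`.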

MCP Server (Claude Code Integration)

repoindex includes an MCP (Model Context Protocol) server for integration with LLM tools:

# Start MCP server
repoindex mcp serve

Resources (read-only data)

  • repo://list - All repositories with basic metadata
  • repo://{name} - Full metadata for one repository
  • repo://{name}/status - Git status for one repository
  • tags://list - All tags
  • tags://tree - Hierarchical tag view
  • stats://summary - Overall statistics
  • events://recent - Recent events
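
The resource names above follow a `scheme://target/sub-resource` shape. As a rough illustration of how a client might split them, here is a sketch using only the Python standard library. The function `parse_resource` is hypothetical and is not part of repoindex.

```python
from urllib.parse import urlsplit


def parse_resource(uri):
    """Split a resource URI into (scheme, target, sub_resource).

    Hypothetical helper: "repo://myrepo/status" becomes
    ("repo", "myrepo", "status"); a missing sub-resource is None.
    """
    parts = urlsplit(uri)
    sub = parts.path.lstrip("/") or None
    return parts.scheme, parts.netloc, sub


print(parse_resource("repo://list"))
print(parse_resource("repo://myrepo/status"))
print(parse_resource("tags://tree"))
```

Note that under this scheme a repository literally named `list` would collide with `repo://list`. How repoindex resolves that case is not stated here.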

Tools (actions)

  • repoindex_tag(repo, tag) - Add tag to repository
  • repoindex_untag(repo, tag) - Remove tag from repository
  • repoindex_query(expression) - Query repositories
  • repoindex_refresh(repo?) - Refresh metadata
  • repoindex_stats(groupby) - Get statistics

Tag System

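Tags are either explicit (assigned by you) or implicit (derived from repository metadata), and may be hierarchical, with `/` separating levels: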

# Explicit tags (user-assigned)
repoindex tag add myrepo topic:ml/research work/client/acme

# Implicit tags (auto-generated)
# - repo:name, dir:parent, lang:python, owner:username
# - status:clean/dirty, visibility:public/private
# - stars:10+, stars:100+, stars:1000+

# Query with tags
repoindex list -t "lang:python" -t "topic:ml/*"

# Tag tree view
repoindex tag tree
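
To make the wildcard filter concrete, the sketch below matches tags against patterns such as `topic:ml/*` using shell-style globbing. It assumes that multiple `-t` filters are combined with AND and that `*` matches any suffix. Both are assumptions for illustration, and repoindex's actual matching rules may differ.

```python
from fnmatch import fnmatchcase


def matches(tag, pattern):
    """True if the tag matches the shell-style pattern (case-sensitive)."""
    return fnmatchcase(tag, pattern)


def filter_repos(repos, patterns):
    """Keep repos for which every pattern matches at least one tag."""
    return [
        name
        for name, tags in repos.items()
        if all(any(matches(t, p) for t in tags) for p in patterns)
    ]


# Hypothetical collection: repo name -> explicit and implicit tags.
repos = {
    "myrepo": ["lang:python", "topic:ml/research", "work/client/acme"],
    "other": ["lang:python", "topic:web"],
    "rusty": ["lang:rust", "topic:ml/inference"],
}

print(filter_repos(repos, ["lang:python", "topic:ml/*"]))
```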

Query Language

Query expressions support exact and fuzzy matching, numeric comparisons, boolean combinations, and list membership:

# Exact match
repoindex query "language == 'Python'"

# Fuzzy match (handles typos)
repoindex query "language ~= 'pyton'"

# Comparisons
repoindex query "stars > 100"

# Boolean combinations
repoindex query "language == 'Python' and 'ml' in tags"

# List membership
repoindex query "'machine-learning' in topics"
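
The fuzzy operator is the least obvious of these. One common way to get typo tolerance is a similarity ratio with a cutoff, sketched below with Python's `difflib`. This illustrates the general idea only: the algorithm and threshold repoindex uses for `~=` are not documented here, and `fuzzy_equal` and the 0.8 cutoff are assumptions.

```python
from difflib import SequenceMatcher


def fuzzy_equal(value, target, threshold=0.8):
    """Case-insensitive approximate match based on a similarity ratio."""
    ratio = SequenceMatcher(None, value.lower(), target.lower()).ratio()
    return ratio >= threshold


# Hypothetical records, shaped like the query examples above.
repos = [
    {"name": "a", "language": "Python"},
    {"name": "b", "language": "Rust"},
]

# Equivalent in spirit to: language ~= 'pyton'
hits = [r["name"] for r in repos if fuzzy_equal(r["language"], "pyton")]
print(hits)
```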

Event Scanning

Track activity across your repositories:

# Recent events (default: last 7 days)
repoindex events --pretty

# Events since specific time
repoindex events --since 24h
repoindex events --since 2024-01-15

# Filter by type
repoindex events --type git_tag
repoindex events --type commit

# Continuous monitoring
repoindex events --watch --interval 300
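
The `--since` values shown above mix relative durations (`7d`, `24h`) and absolute dates (`2024-01-15`). A parser for exactly those forms might look like the following sketch. It handles only the forms that appear in the examples; whether repoindex accepts other units or formats is not known from this document, and `parse_since` is a hypothetical name.

```python
import re
from datetime import datetime, timedelta

# Only the units that appear in the examples above.
_UNITS = {"h": "hours", "d": "days"}


def parse_since(value, now):
    """Turn '7d', '24h', or 'YYYY-MM-DD' into a cutoff datetime."""
    m = re.fullmatch(r"(\d+)([hd])", value)
    if m:
        amount, unit = int(m.group(1)), m.group(2)
        return now - timedelta(**{_UNITS[unit]: amount})
    # Fall back to an absolute date; raises ValueError on anything else.
    return datetime.strptime(value, "%Y-%m-%d")


now = datetime(2024, 1, 20, 12, 0)
print(parse_since("7d", now))
print(parse_since("24h", now))
print(parse_since("2024-01-15", now))
```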

Configuration

Configuration file: ~/.repoindex/config.json

{
  "general": {
    "repository_directories": [
      "~/github",
      "~/projects/*/repos"
    ]
  },
  "github": {
    "token": "ghp_..."
  },
  "repository_tags": {
    "/path/to/repo": ["topic:ml", "work/active"]
  }
}

Environment variables:

  • REPOINDEX_CONFIG - Custom config file path
  • REPOINDEX_GITHUB_TOKEN - GitHub API token
  • REPOINDEX_METADATA_PATH - Custom metadata store path
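
The sketch below shows one plausible way the file and the environment variables could interact. It assumes that environment variables take precedence over the config file and that entries in `repository_directories` are expanded as `~` paths and glob patterns. Neither rule is confirmed by this document, and all function names are hypothetical.

```python
import glob
import os
from pathlib import Path

DEFAULT_CONFIG = "~/.repoindex/config.json"


def config_path(env=None):
    """Resolve the config file path, honoring REPOINDEX_CONFIG if set."""
    env = os.environ if env is None else env
    return Path(env.get("REPOINDEX_CONFIG", DEFAULT_CONFIG)).expanduser()


def github_token(config, env=None):
    """Prefer REPOINDEX_GITHUB_TOKEN over the file (assumed precedence)."""
    env = os.environ if env is None else env
    return env.get("REPOINDEX_GITHUB_TOKEN") or config.get("github", {}).get("token")


def expand_directories(config):
    """Expand '~' and glob patterns in repository_directories."""
    patterns = config.get("general", {}).get("repository_directories", [])
    found = []
    for pattern in patterns:
        found.extend(sorted(glob.glob(os.path.expanduser(pattern))))
    return found


config = {"github": {"token": "from-file"}}
print(github_token(config, env={"REPOINDEX_GITHUB_TOKEN": "from-env"}))
print(github_token(config, env={}))
```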

Architecture

repoindex follows a clean layered architecture:

Commands (CLI)   ->   Services            ->   Domain Objects
      |                   |                         |
Parse args            Business logic           Pure data
Format output         No side effects          Immutable
Handle I/O            Return generators        Consistent schema

Layers

  • Domain - Immutable data objects (Repository, Tag, Event)
  • Infrastructure - External system clients (GitClient, GitHubClient, FileStore)
  • Services - Business logic (RepositoryService, TagService, EventService)
  • Commands - Thin CLI wrappers that use services
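
As a rough illustration of this layering, the sketch below wires an immutable domain object, a generator-returning service, and a thin command that formats JSONL. The class names `Repository` and `RepositoryService` come from the list above, but their fields, methods, and the `list_command` function are hypothetical and do not reflect repoindex's real signatures.

```python
import json
from dataclasses import asdict, dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Repository:
    """Domain layer: immutable data with a fixed schema (fields assumed)."""
    name: str
    language: str
    tags: tuple = ()


class RepositoryService:
    """Service layer: business logic that yields domain objects lazily."""

    def __init__(self, repos: Iterable[Repository]):
        self._repos = list(repos)

    def by_language(self, language: str) -> Iterator[Repository]:
        for repo in self._repos:
            if repo.language == language:
                yield repo


def list_command(service: RepositoryService, language: str) -> Iterator[str]:
    """Command layer: thin wrapper that formats service results as JSONL."""
    for repo in service.by_language(language):
        yield json.dumps(asdict(repo))


service = RepositoryService([
    Repository("alpha", "Python", ("topic:ml",)),
    Repository("beta", "Rust"),
])
lines = list(list_command(service, "Python"))
for line in lines:
    print(line)
```

Keeping formatting in the command layer means the same service generator could feed both the JSONL output and a `--pretty` table without changes.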

Development

# Setup
make install

# Run tests
make test

# Run with coverage
pytest --cov=repoindex --cov-report=html

# Build docs
make docs

604 tests, 86% coverage.

License

MIT License - see LICENSE for details.


(c) 2025 Alex Towell
