
RepoCards

Evidence-based repository summarizer that works on any GitHub project.

RepoCards automatically analyzes GitHub repositories and generates comprehensive documentation cards in both Markdown and JSON formats. Perfect for understanding new projects, building developer tools, or creating automated documentation systems.


Quick Start

Installation

pip install repocards

Basic Usage

Command Line

Analyze any GitHub repository:

repocards summarize https://github.com/owner/repo

Save outputs to files:

repocards summarize https://github.com/owner/repo --out-dir _out
# Creates: _out/card.md and _out/card.json

Customize output filenames:

repocards summarize https://github.com/owner/repo --out-dir _out --out-stem myproject
# Creates: _out/myproject.md and _out/myproject.json

Python API

Use programmatically in your code:

import repocards

# Get markdown string
markdown = repocards.get_repo_info("https://github.com/owner/repo")

# Get pydantic object
card = repocards.get_repo_info("https://github.com/owner/repo", mode="pydantic")
print(card.title)

# Save to files
path = repocards.get_repo_info(
    "https://github.com/owner/repo",
    mode="markdown_file",
    out_dir="./output"
)

Command Options

  • --out-dir PATH – Target directory for output files (auto-creates if needed)
  • --out-stem NAME – Base filename without extension (e.g., myproject → myproject.md + myproject.json)
  • --out-md PATH – Exact path for Markdown output
  • --out-json PATH – Exact path for JSON output
  • --max-files N – Maximum number of files to fetch (default: 160)

What RepoCards Extracts

RepoCards analyzes your repository and automatically extracts:

📊 Quick Facts

  • Primary programming languages and their usage
  • Detected ecosystems (Python, Node.js, CMake, etc.)
  • License and topics

🔧 Capabilities

  • Package names extracted from installation commands
  • Entry points (CLI commands defined in manifests)
  • API/CLI availability detection
  • Dockerfile presence and containerization support
  • OS support inferred from commands (Linux/macOS/Windows)
  • Model weights and dataset links (for ML projects)

📝 Commands by Category

Auto-discovered shell commands organized by:

  • Install – Package managers and dependencies
  • Setup – Environment configuration
  • Build – Compilation and build steps
  • Run – Execution commands
  • Test – Testing frameworks
  • Lint – Code quality tools

All commands are categorized by OS (Linux/macOS/Windows/Generic) with source attribution.

🚀 Canonical Quickstart

Auto-generated step-by-step quickstart guide per OS, intelligently selecting the most relevant commands from documentation and CI workflows.

🔗 Additional Information

  • Overview from README
  • Python API usage examples
  • Helpful links (documentation, wikis, releases)
  • Notable files and directories
  • Imaging-specific signals (for medical/scientific imaging projects)

Output Format

Markdown Card

The generated Markdown file includes:

  • Repository metadata (license, topics, languages)
  • Overview extracted from README
  • Quick facts about languages and ecosystems
  • Capability facts (packages, entry points, OS support, etc.)
  • Canonical quickstart commands organized by OS
  • Python API examples (if found)
  • Helpful links with source attribution
  • Notable files and directories

JSON Card Structure

{
  "repo_url": "https://github.com/owner/repo",
  "ref": "main",
  "title": "owner/repo",
  "meta": {
    "license": "MIT",
    "topics": ["python", "data-science"],
    "languages": {"Python": 50000, "JavaScript": 10000}
  },
  "markdown": "...", // Full markdown content
  "extras": {
    "ecosystems": ["python", "node"],
    "capabilities": {
      "entrypoints": ["myapp = mypackage.cli:main"],
      "provides_api": true,
      "provides_cli": true,
      "dockerfile_present": true,
      "package_names": ["numpy", "pandas"],
      "os_support": ["linux", "macos"],
      "model_weight_links": ["https://huggingface.co/..."],
      "dataset_links": ["https://zenodo.org/..."],
      "buckets_by_os": {
        "install": {
          "linux": [{"cmd": "apt install...", "source": ".github/..."}],
          "macos": [...],
          "windows": [...],
          "generic": [...]
        },
        "build": {...},
        "run": {...},
        "test": {...},
        "lint": {...}
      }
    },
    "quickstart": {
      "linux": [{"cmd": "pip install .", "source": "README.md"}],
      "macos": [...],
      "windows": [...],
      "generic": [...]
    },
    "imaging": {
      "imaging_score": 0.80,
      "python_libs": ["pydicom", "nibabel"],
      "file_types": [".dcm", ".nii"],
      "tasks": ["segmentation", "registration"],
      "modalities": ["CT", "MRI"]
    }
  }
}

Note: All commands and links include provenance (source file path) for transparency.
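Once a card has been generated, the structure above can be navigated like any JSON document. The sketch below builds a minimal card that mirrors the documented fields (the values themselves are illustrative) and prints install commands with their provenance:

```python
import json

# A minimal card mirroring the documented JSON structure; values are illustrative.
card = json.loads("""
{
  "title": "owner/repo",
  "meta": {"license": "MIT"},
  "extras": {
    "capabilities": {
      "buckets_by_os": {
        "install": {
          "linux": [{"cmd": "apt install build-essential",
                     "source": ".github/workflows/ci.yml"}]
        }
      }
    }
  }
}
""")

# Every command entry carries its source file for provenance.
for entry in card["extras"]["capabilities"]["buckets_by_os"]["install"]["linux"]:
    print(f"{entry['cmd']}  (from {entry['source']})")
```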


Programmatic Usage

Simple API

RepoCards provides a simple API for programmatic access:

import repocards

# Get markdown string (default) - token auto-loaded from .env
markdown = repocards.get_repo_info("https://github.com/owner/repo")
print(markdown)

# Get JSON string
json_str = repocards.get_repo_info("https://github.com/owner/repo", mode="json")

# Get pydantic object for structured access
card = repocards.get_repo_info("https://github.com/owner/repo", mode="pydantic")
print(card.title)
print(card.meta["license"])
print(card.extras["ecosystems"])

# Write to markdown file
path = repocards.get_repo_info(
    "https://github.com/owner/repo",
    mode="markdown_file",
    out_dir="./output"
)
print(f"Wrote to: {path}")

# Write to JSON file
path = repocards.get_repo_info(
    "https://github.com/owner/repo",
    mode="json_file",
    out_dir="./output"
)
print(f"Wrote to: {path}")

# Control file fetching limit
card = repocards.get_repo_info(
    "https://github.com/owner/repo",
    mode="pydantic",
    max_files=100
)

Available Modes

Mode              Returns     Description
"markdown"        str         Markdown content (default)
"json"            str         JSON string
"pydantic"        RepoCard    Pydantic model object
"markdown_file"   str         Writes file, returns path
"json_file"       str         Writes file, returns path

GitHub Authentication

Authentication is automatic! Just create a .env file in your project root:

# .env file
GITHUB_TOKEN=ghp_your_token_here

The token is automatically loaded from .env or environment variables.

Rate Limits:

  • Without token: 60 requests/hour
  • With token: 5,000 requests/hour

Get a GitHub token:

  1. Go to https://github.com/settings/tokens
  2. Generate a new token (classic) with repo scope
  3. Add it to your .env file

Alternative: Export as environment variable

export GITHUB_TOKEN="ghp_your_token_here"
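The documented lookup order (environment variable first, then a .env file) can be sketched as follows. Note that load_github_token is a hypothetical helper for illustration, not part of the repocards API:

```python
import os

def load_github_token(env_path=".env"):
    """Return a GitHub token from GITHUB_TOKEN, falling back to a .env file."""
    token = os.environ.get("GITHUB_TOKEN")
    if token:
        return token
    if os.path.exists(env_path):
        with open(env_path) as f:
            for line in f:
                line = line.strip()
                if line.startswith("GITHUB_TOKEN="):
                    # Accept both quoted and unquoted values.
                    return line.split("=", 1)[1].strip().strip('"')
    return None
```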

How It Works

Intelligent File Selection

RepoCards fetches a curated subset of repository files:

  • Documentation (README, docs/, etc.)
  • Package manifests (pyproject.toml, package.json, CMakeLists.txt, etc.)
  • CI workflows (.github/workflows/)
  • Example scripts and demos
  • Docker configurations

This selective approach keeps analysis fast while gathering comprehensive information.
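The selection step amounts to filtering a repository's file listing against a curated set of patterns. A minimal sketch, with illustrative patterns rather than RepoCards' actual rules:

```python
from fnmatch import fnmatch

# Illustrative include patterns, not RepoCards' actual selection rules.
PATTERNS = [
    "README*", "docs/*", "pyproject.toml", "package.json",
    "CMakeLists.txt", ".github/workflows/*", "Dockerfile*", "examples/*",
]

def select_files(paths):
    """Keep only paths matching the curated documentation/manifest patterns."""
    return [p for p in paths if any(fnmatch(p, pat) for pat in PATTERNS)]
```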

Command Harvesting

Commands are extracted from:

  • Fenced shell blocks in Markdown (bash, sh, etc.)
  • Shell prompts ($-prefixed lines in documentation)
  • CI workflows (run: steps in GitHub Actions)
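The first two sources can be sketched as follows (CI workflow parsing is omitted here, and these regexes are a simplification of whatever RepoCards does internally):

```python
import re

FENCE = "`" * 3  # triple backtick, spelled out to keep this snippet readable

def harvest_shell_commands(markdown_text):
    """Collect commands from fenced shell blocks and $-prefixed prompt lines."""
    commands = []
    # Fenced blocks tagged bash/sh/shell.
    block_re = re.compile(FENCE + r"(?:bash|sh|shell)\n(.*?)" + FENCE, re.DOTALL)
    for block in block_re.findall(markdown_text):
        commands.extend(line.strip() for line in block.splitlines() if line.strip())
    # $-prefixed prompt lines outside fenced blocks.
    for line in markdown_text.splitlines():
        if line.strip().startswith("$ "):
            commands.append(line.strip()[2:])
    return commands
```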

OS Classification

Commands are automatically classified by operating system:

  • Linux: apt, dnf, pacman package managers
  • macOS: brew, CMake OSX flags
  • Windows: choco, winget, msbuild, PowerShell
  • Generic: Cross-platform commands
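A first-token heuristic along these lines captures the idea (the rules shown are illustrative, not RepoCards' exact classifier):

```python
LINUX_TOOLS = {"apt", "apt-get", "dnf", "pacman"}
MACOS_TOOLS = {"brew"}
WINDOWS_TOOLS = {"choco", "winget", "msbuild", "powershell"}

def classify_os(cmd):
    """Heuristically classify a command by target OS from its first token."""
    tokens = cmd.strip().split()
    if not tokens:
        return "generic"
    first = tokens[0].lower()
    if first == "sudo" and len(tokens) > 1:  # look past sudo
        first = tokens[1].lower()
    if first in LINUX_TOOLS:
        return "linux"
    if first in MACOS_TOOLS:
        return "macos"
    if first in WINDOWS_TOOLS:
        return "windows"
    return "generic"
```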

Package Name Extraction

Intelligently parses installation commands to extract package names:

  • Filters out -r requirements.txt and similar flags
  • Removes URLs, local paths, and version specifiers
  • Strips extras (e.g., package[dev] → package)
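The filtering steps above can be sketched for the pip case (a simplified illustration, not RepoCards' actual parser):

```python
import re

def extract_package_names(install_cmd):
    """Pull package names from a pip install command."""
    tokens = install_cmd.split()[2:]  # drop "pip install"
    names, skip_next = [], False
    for tok in tokens:
        if skip_next:                  # argument of -r/--requirement
            skip_next = False
            continue
        if tok in ("-r", "--requirement"):
            skip_next = True
            continue
        if tok.startswith("-") or "://" in tok or tok.startswith((".", "/")):
            continue                   # flags, URLs, local paths
        name = re.split(r"[\[<>=!~;]", tok)[0]  # strip extras and version specifiers
        if name:
            names.append(name)
    return names
```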

Python Code Detection

Extracts Python API examples from fenced code blocks:

  • Validates code contains real Python (imports/definitions/calls)
  • Limits to relevant, instructive snippets
  • Filters out empty or trivial examples
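Validation of this kind can be done with the standard-library ast module; a minimal sketch of the "contains real Python" check:

```python
import ast

def is_instructive_python(snippet):
    """True if a code block parses as Python and contains an import, definition, or call."""
    try:
        tree = ast.parse(snippet)
    except SyntaxError:
        return False
    interesting = (ast.Import, ast.ImportFrom, ast.FunctionDef, ast.ClassDef, ast.Call)
    return any(isinstance(node, interesting) for node in ast.walk(tree))
```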

Domain-Specific Analysis

Imaging Analyzer (optional, gated by relevance):

  • Detects medical/scientific imaging projects
  • Identifies Python libraries (pydicom, nibabel, SimpleITK, etc.)
  • Recognizes file formats (.dcm, .nii, .mha, etc.)
  • Classifies tasks (segmentation, registration, etc.)
  • Identifies modalities (MRI, CT, PET, etc.)
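The gating idea — only emit imaging output when imaging signals are actually present — can be sketched like this. The signal sets and the relevance rule here are illustrative assumptions, not RepoCards' actual scoring:

```python
IMAGING_LIBS = {"pydicom", "nibabel", "SimpleITK", "itk", "monai"}
IMAGING_EXTS = {".dcm", ".nii", ".mha", ".nrrd"}

def imaging_signals(python_libs, file_types):
    """Collect imaging signals from a repo's libraries and file types."""
    lib_hits = sorted(set(python_libs) & IMAGING_LIBS)
    ext_hits = sorted(set(file_types) & IMAGING_EXTS)
    return {"python_libs": lib_hits, "file_types": ext_hits,
            "relevant": bool(lib_hits or ext_hits)}
```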

Development Setup

Clone and install in editable mode:

git clone https://github.com/qchapp/repocards
cd repocards
pip install -e .

Run tests:

pytest tests/

Design Philosophy

General-Purpose

Works on any GitHub repository without per-project configuration or YAML rules.

Evidence-Based

Every extracted command and fact includes source file attribution. No invented or assumed information.

Agent-Ready

Structured JSON output with machine-readable facts enables:

  • Automated documentation systems
  • Developer tools and IDE integrations
  • AI agents that need to understand codebases
  • Repository analysis pipelines

Reliable

  • Verbatim commands from actual documentation
  • No hallucination or inference beyond what's in the repository
  • Clear provenance for all extracted information

Use Cases

  • 📚 Documentation Generation: Automatically create comprehensive repo cards
  • 🤖 AI/Agent Tools: Provide structured repository information to AI systems
  • 🔍 Code Discovery: Quickly understand unfamiliar projects
  • 📊 Repository Analysis: Batch analyze multiple repositories
  • 🛠️ Developer Tooling: Build IDE extensions or CLI tools that need repo metadata
  • 🏥 Domain Analysis: Identify imaging, ML, or other domain-specific projects

License

MIT


Contributing

Contributions welcome! Please feel free to submit issues or pull requests.
