
GitHub Repository Catalog: Fetching, indexing, and organizing READMEs.


hyperdex

CAUTION: This project is currently for personal use only. A public-facing version with improved documentation, examples, and user-friendly interfaces is planned for future development. See the Package Transformation Plan for details on how this will evolve.

For now, just don't use it!


Your GitHub Repository Catalog: A comprehensive tool for fetching, indexing, and organizing READMEs from GitHub repositories.

Scripts

This repository contains several Python scripts for managing repository information:

README Fetching & Indexing:

  • gh_repo_fetch_index_shane.py: Fetches READMEs from your personal GitHub repositories (public and private, excluding forks) and generates an index file (github-project-readmes-shane/README.md) with repository metadata and statistics.
  • gh_repo_fetch_forks_shane.py: Fetches READMEs from your personal forked GitHub repositories and generates an index file (github-project-forks-shane/README.md) with repository metadata and statistics.
  • gh_repo_fetch_index_cello.py: Fetches READMEs from the CelloCommunications organization repositories authored by you (based on the first commit) and generates a similar index file (github-project-readmes-cello/README.md).

Release Management & Maintenance (personal and organization repos):

  • gh_repo_release_latest_shane.py: Reads github-project-readmes-shane/repositories.csv, finds the latest semantic version tag for each personal repository, and creates a GitHub Release with auto-generated notes if one doesn't already exist for that tag.
  • gh_repo_release_initial_shane.py: Reads github-project-readmes-shane/repositories.csv and creates an initial v0.1.0 tag and release for any personal repository that currently has no tags.
  • gh_repo_release_latest_cello.py: Fetches repositories authored by you in the CelloCommunications org, finds the latest semantic version tag for each, and creates a GitHub Release with auto-generated notes if one doesn't already exist. Includes a --dry-run option.
  • gh_repo_release_initial_cello.py: Fetches repositories authored by you in the CelloCommunications org and creates an initial v0.1.0 tag and release for any that currently have no tags. Includes a --dry-run option.
  • gh_repo_update_readmes.py: Reads repository names from a CSV file (e.g., github-project-readmes-shane/repositories.csv), finds corresponding local README files, and updates the remote READMEs using the GitHub API.
  • gh_repo_setup_secret.py: Sets up a GitHub Personal Access Token as a repository secret (e.g., GH_PAT for workflows).
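The release scripts above pick the latest semantic version tag for each repository. As a minimal sketch of that selection step, the following uses only the standard library and a simple regular expression. The project itself lists the packaging library for its release scripts, so the helper names (parse_tag, latest_semver_tag) and the accepted tag shapes here are assumptions for illustration, not the scripts' actual logic.

```python
from __future__ import annotations

import re

# Matches plain tags such as "v1.2.3" or "1.2.3". Pre-release and build
# suffixes are deliberately not handled in this sketch.
SEMVER_TAG = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def parse_tag(tag: str) -> tuple[int, int, int] | None:
    """Return (major, minor, patch) for a plain semantic version tag, else None."""
    match = SEMVER_TAG.match(tag)
    if match is None:
        return None
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def latest_semver_tag(tags: list[str]) -> str | None:
    """Pick the highest semantic version tag, ignoring non-version tags."""
    candidates = [(parse_tag(tag), tag) for tag in tags]
    versioned = [(version, tag) for version, tag in candidates if version is not None]
    if not versioned:
        # No version tags at all: the case the "initial release" scripts target.
        return None
    # Tuples compare numerically, so (0, 10, 1) sorts above (0, 9, 0).
    return max(versioned)[1]
```

Comparing parsed integer tuples rather than raw strings matters because a string comparison would rank "v0.9.0" above "v0.10.1". Calling latest_semver_tag(["v0.9.0", "v0.10.1", "nightly"]) returns "v0.10.1", and an empty or unversioned tag list returns None.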

Features

  • Fetches READMEs from GitHub repositories
  • Filters by authorship (first commit author)
  • Generates a Markdown index with links to all fetched READMEs
  • Includes creation and update dates for each repository

Performance Optimization

A significant optimization has been implemented using GitHub's GraphQL API to reduce API calls and improve efficiency. See optimization.md for details on:

  • The original vs. optimized approach
  • Technical implementation with GraphQL
  • Performance benefits
  • Implementation considerations

Development Approach

Terminal Testing First

IMPORTANT: Before implementing any scripting solution, ALWAYS test your approach directly in the terminal first. This principle was critical to discovering the GraphQL optimization in this project.

For example, before implementing the GraphQL solution in Python:

  1. A basic GraphQL query was tested directly with the GitHub CLI:

    gh api graphql -f query='query { organization(login: "CelloCommunications") { ... } }'
    
  2. Once the query worked, it was refined interactively:

    gh api graphql -f query='...' | jq '.data.organization.repositories.nodes[] | select(...)'
    
  3. Only after confirming the approach worked in the terminal was it implemented in Python.

This terminal-first approach allows you to:

  • Verify API responses without writing complex code
  • Iterate quickly on query structure
  • Identify potential issues early
  • Understand exactly what data you're working with
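Once a query has been verified in the terminal, moving it into Python can be as small as wrapping the same gh invocation. The sketch below is one way that step might look, not the code used by the hyperdex scripts: the query text, the selected fields (name, isFork), and the function names are all assumptions, and the fork filter merely stands in for whatever select(...) expression was refined with jq.

```python
from __future__ import annotations

import json
import subprocess

# Illustrative query only: the field selection is an assumption, not the
# query used by the hyperdex scripts.
QUERY = """
query($org: String!) {
  organization(login: $org) {
    repositories(first: 100) {
      nodes { name isFork }
    }
  }
}
"""


def build_gh_command(org: str) -> list[str]:
    """Build the same `gh api graphql` call that would be typed in the terminal."""
    # With `gh api graphql`, fields other than `query` are sent as variables.
    return ["gh", "api", "graphql", "-f", f"query={QUERY}", "-f", f"org={org}"]


def non_fork_names(response: dict) -> list[str]:
    """Mirror a jq-style select: keep the names of repositories that are not forks."""
    nodes = response["data"]["organization"]["repositories"]["nodes"]
    return [node["name"] for node in nodes if not node["isFork"]]


def fetch_repositories(org: str) -> list[str]:
    """Run the query through the authenticated GitHub CLI and filter the result."""
    completed = subprocess.run(
        build_gh_command(org), capture_output=True, text=True, check=True
    )
    return non_fork_names(json.loads(completed.stdout))
```

Keeping the filter in its own function means it can be checked against a saved terminal response without calling the API again.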

Installation

You can install hyperdex directly from PyPI:

# Using UV (the only supported method)
uv tool install hyperdex

IMPORTANT: This project strictly follows a UV-native workflow. The use of pip is strictly prohibited. See UV Workflow for details.

Usage

As a Command-Line Tool

After installation, you can use the command-line tools directly:

# Fetch/Index personal READMEs (excluding forks)
hyperdex-fetch-shane

# Fetch/Index personal FORKED READMEs
hyperdex-fetch-forks-shane

# Fetch/Index organization READMEs authored by you
hyperdex-fetch-cello

# Create releases for latest tags on personal repos (if missing)
hyperdex-release-latest-shane

# Create initial v0.1.0 release for untagged personal repos
hyperdex-release-initial-shane

# Create latest releases for organization repos (Dry Run)
hyperdex-release-latest-cello --dry-run

# Create initial v0.1.0 release for untagged organization repos (Dry Run)
hyperdex-release-initial-cello --dry-run

# Update README files across multiple repositories using a CSV list
hyperdex-update-readmes -c path/to/your/repositories.csv -d path/to/local/readmes/

# Setup GitHub PAT as a repository secret
hyperdex-setup-secret
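
The hyperdex-update-readmes command above takes a CSV of repository names and a directory of local READMEs. A minimal sketch of how that pairing might be done follows. It assumes a CSV column called name and local files named <repository>.md; neither detail is confirmed by this document, and the function names are hypothetical.

```python
from __future__ import annotations

import csv
from pathlib import Path


def read_repo_names(csv_path: Path, column: str = "name") -> list[str]:
    """Read repository names from one CSV column (the column name is assumed)."""
    with csv_path.open(newline="", encoding="utf-8") as handle:
        return [row[column] for row in csv.DictReader(handle) if row.get(column)]


def match_local_readmes(names: list[str], readme_dir: Path) -> dict[str, Path]:
    """Map each repository to <readme_dir>/<name>.md when that file exists."""
    matches: dict[str, Path] = {}
    for name in names:
        candidate = readme_dir / f"{name}.md"
        if candidate.is_file():
            matches[name] = candidate
    return matches
```

Repositories without a matching local file are simply left out of the result, so a caller can report them before any remote update is attempted.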

From Source

Clone this repository:

git clone https://github.com/shaneholloman/hyperdex.git
cd hyperdex

Run the desired script from the tools/ directory:

# Fetch/Index personal READMEs (excluding forks)
python3 tools/gh_repo_fetch_index_shane.py

# Fetch/Index personal FORKED READMEs
python3 tools/gh_repo_fetch_forks_shane.py

# Fetch/Index Cello READMEs authored by you
python3 tools/gh_repo_fetch_index_cello.py

# Create releases for latest tags on personal repos (if missing)
python3 tools/gh_repo_release_latest_shane.py

# Create initial v0.1.0 release for untagged personal repos
python3 tools/gh_repo_release_initial_shane.py

# Create latest releases for Cello repos (Dry Run)
python3 tools/gh_repo_release_latest_cello.py --dry-run

# Create initial v0.1.0 release for untagged Cello repos (Dry Run)
python3 tools/gh_repo_release_initial_cello.py --dry-run

# Update README files across multiple repositories using a CSV list
python3 tools/gh_repo_update_readmes.py -c path/to/your/repositories.csv -d path/to/local/readmes/

# Setup GitHub PAT as a repository secret
python3 tools/gh_repo_setup_secret.py

The fetcher scripts will:

  1. Fetch repositories from GitHub
  2. Filter by authorship (for the Cello script)
  3. Download READMEs from each repository
  4. Generate an index file with links to all fetched READMEs
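
The last step above, generating the index, can be sketched as a pure function over repository metadata. This is an illustration under stated assumptions rather than the scripts' implementation: the RepoEntry record, its field names, and the table layout are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class RepoEntry:
    """Hypothetical record for one fetched README; field names are assumptions."""

    name: str
    readme_path: str
    created: str
    updated: str


def render_index(title: str, entries: list[RepoEntry]) -> str:
    """Render a Markdown index with one linked table row per repository."""
    lines = [
        f"# {title}",
        "",
        f"Total repositories: {len(entries)}",
        "",
        "| Repository | Created | Updated |",
        "| --- | --- | --- |",
    ]
    # Sort case-insensitively so the index order is stable between runs.
    for entry in sorted(entries, key=lambda e: e.name.lower()):
        lines.append(
            f"| [{entry.name}]({entry.readme_path}) | {entry.created} | {entry.updated} |"
        )
    return "\n".join(lines) + "\n"
```

Returning a string instead of writing a file keeps the rendering easy to check in isolation; the caller decides where the index is saved.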

Requirements

  • Python 3.10+ (required for the X | Y union type hint syntax)
  • GitHub CLI (gh) installed and authenticated
  • packaging library for Python (add it with uv add packaging), needed by the release scripts

Output

The scripts will create subdirectories with fetched READMEs and an index file:

  • github-project-readmes-shane/: For personal repositories (excluding forks)
  • github-project-forks-shane/: For personal forked repositories
  • github-project-readmes-cello/: For organization repositories

Each directory contains:

  • Individual README files renamed according to repository
  • An index file (README.md) with links to all READMEs and metadata

License

MIT
