Local-first repo behavior map generator

hypergumbo

hypergumbo is a local-first CLI that generates behavior maps and sketches from source code. The goal of this project is to efficiently help developers and LLMs understand a codebase.

pip install hypergumbo

Requires Python 3.10+. For optional extras (embeddings, gitleaks, grammars), run hypergumbo add-extras after installing.

Intel Mac users: Some tree-sitter packages lack x86_64 wheels. See docs/INTEL_MAC.md for a Docker-based workaround.

git clone https://codeberg.org/iterabloom/hypergumbo
hypergumbo hypergumbo/

Output:

# hypergumbo

hypergumbo is a local-first CLI that generates behavior maps and sketches from source code. The goal of this project is to efficiently help developers and LLMs understand a codebase. > Requires Python 3.10+. For optional extras (embeddings, gitleaks, grammars), run `hypergumbo add-extras` after installing. > Intel Mac users:

## Overview
Python (91%), Markdown (4%), Yaml (3%)
728 files    (383 non-test + 345 test)
~320,798 LOC (~129,172 non-test + ~191,626 test)

## Structure

```
hypergumbo/
├── .agent
│   └── [and 6 other items]
├── .gitea
│   ├── SQUASH_TEMPLATE.md
│   └── [and 1 other items]
├── .githooks
│   ├── commit-msg
│   └── [and 9 other items]
├── docs
│   ├── CACHE.md
│   └── [and 22 other items]
├── packages
│   ├── hypergumbo-core
│   │   ├── src
│   │   │   └── hypergumbo_core
│   │   │       ├── analyze
│   │   │       │   ├── base.py
│   │   │       │   └── [and 3 other items]
│   │   │       ├── __main__.py
│   │   │       ├── cli.py
│   │   │       ├── ir.py
│   │   │       └── [and 26 other items]
│   │   ├── tests
│   │   │   ├── test_framework_patterns.py
│   │   │   └── [and 94 other items]
│   │   └── [and 2 other items]
│   ├── hypergumbo-tracker
│   │   ├── src
│   │   │   └── hypergumbo_tracker
│   │   │       ├── cli.py
│   │   │       └── [and 13 other items]
│   │   └── [and 5 other items]
│   └── [and 4 other items]
├── scripts
│   ├── lib
│   │   └── forgejo-api.sh
│   └── [and 33 other items]
├── tests
│   ├── test_bakeoff_deep_reflect.py
│   └── [and 2 other items]
├── conftest.py
├── pyproject.toml
├── setup.py
└── [and 21 other items]
```

## Frameworks

- pytest
- pytorch
- transformers

## Tests

345 test files · cargo test, pytest, unittest

*~95% estimated coverage (2693/2847 functions called by tests)*

## Configuration
[...]

See full example output

Use -t to control the token budget:

hypergumbo . -t 1000   # brief overview (structure only)
hypergumbo . -t 4000   # good balance for most LLMs
hypergumbo . -t 8000   # detailed with many symbols
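
One way to picture a token budget is as a greedy selection over prioritized sections. The sketch below is only an illustration, not hypergumbo's actual algorithm: the four-characters-per-token heuristic and the names `estimate_tokens` and `fit_to_budget` are assumptions made for this example.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic (an assumption): about four characters per token."""
    return max(1, len(text) // 4)


def fit_to_budget(sections: list[tuple[int, str]], budget: int) -> list[str]:
    """Keep the highest-priority sections that fit inside the token budget.

    Each section is (priority, text); lower numbers are more important.
    """
    kept: list[str] = []
    used = 0
    for _, text in sorted(sections, key=lambda s: s[0]):
        cost = estimate_tokens(text)
        if used + cost > budget:
            continue  # skip any section that would overflow the budget
        kept.append(text)
        used += cost
    return kept


sections = [
    (0, "## Overview\nPython (91%)"),
    (1, "## Structure\n" + "x" * 400),
    (2, "## Tests\n345 test files"),
]
# With a small budget the large Structure section is dropped.
print(fit_to_budget(sections, budget=20))
```

A larger budget keeps every section, which mirrors how the higher `-t` values above yield more detail.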

Two Outputs

Sketch (hypergumbo .) — Token-budgeted Markdown sized for LLM context windows. Ranks symbols by graph centrality (★ = most connected).

Behavior map (hypergumbo run) — Full JSON with all symbols, edges, and provenance tracking. Use this for programmatic analysis.
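
The README does not say which centrality measure the sketch uses. As a minimal illustration of the idea, the example below ranks symbols by degree centrality (incoming plus outgoing edges); the measure, the function name `rank_by_degree`, and the sample edges are all assumptions.

```python
from collections import Counter


def rank_by_degree(edges: list[tuple[str, str]]) -> list[tuple[str, int]]:
    """Rank symbols by degree centrality (incoming plus outgoing edges)."""
    degree: Counter = Counter()
    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1
    # Sort by descending degree, then by name for a stable order.
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical call edges between symbols.
edges = [
    ("cli.main", "sketch.generate"),
    ("cli.main", "ir.Symbol"),
    ("sketch.generate", "ir.Symbol"),
    ("slice.extract", "ir.Symbol"),
]
ranking = rank_by_degree(edges)
top_score = ranking[0][1]
for name, score in ranking:
    marker = "★" if score == top_score else " "
    print(f"{marker} {name} ({score})")
```

Here `ir.Symbol` touches three edges, so it is the only symbol marked with a star.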

CLI Commands

hypergumbo [path]              # Markdown sketch (default)
hypergumbo run [path]          # Full JSON behavior map
hypergumbo slice --entry X     # Subgraph from entry point
hypergumbo io-boundaries       # Find all I/O (filesystem, network, subprocess, env, IPC, browser storage)
hypergumbo verify-claims ...   # Verify security claims against analysis
hypergumbo routes [path]       # List HTTP routes
hypergumbo search <query>      # Search symbols
hypergumbo symbols [path]      # Browse symbols with connectivity
hypergumbo explain <symbol>    # Detailed symbol info
hypergumbo test-coverage       # Analyze test coverage (transitive)
hypergumbo catalog             # List analysis passes
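
Conceptually, extracting a subgraph from an entry point is a reachability walk over the edge list. The sketch below shows that idea with a breadth-first search; it is not the `slice` command's implementation, and `slice_from` and the sample symbols are hypothetical.

```python
from collections import deque


def slice_from(entry: str, edges: list[tuple[str, str]]) -> set[str]:
    """Return every symbol reachable from `entry` by following edges forward."""
    adjacency: dict[str, list[str]] = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)

    seen = {entry}
    queue = deque([entry])
    while queue:
        current = queue.popleft()
        for neighbor in adjacency.get(current, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


edges = [("handler", "validate"), ("validate", "db.save"), ("cron", "db.save")]
print(sorted(slice_from("handler", edges)))
```

The `cron` symbol is left out because nothing reachable from `handler` calls it.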

Useful flags:

hypergumbo . -x                # exclude test files (cleaner output)
hypergumbo . --no-source       # omit source code (included by default)
hypergumbo . --no-progress     # hide progress indicator (on by default)
hypergumbo --help --all        # comprehensive help for all commands

Project-local taint catalogs

verify-claims ships with paranoid defaults auto-derived from the built-in IO primitive catalog. Projects can supply their own trust zones, sanitizers, and label maps:

hypergumbo verify-claims claims.yaml \
    --taint-sources    myrepo/taint/sources.yaml \
    --taint-sinks      myrepo/taint/sinks/ \
    --taint-sanitizers myrepo/taint/sanitizers.yaml

Each flag accepts a YAML file or a directory (globbed as *.yaml) and is repeatable. The same paths can be declared inside the claims YAML under extra_catalogs: {sources, sinks, sanitizers}; relative paths resolve against the directory that contains the claims file. A user entry whose (module, name, kind) triple matches a built-in entry replaces it, while sanitizers are concatenated rather than replaced.
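
The merge rule can be sketched in a few lines of Python. This is an illustration of the stated semantics only: the entry layout (including the `label` field) and the function names `merge_entries` and `merge_sanitizers` are assumptions, not the real catalog schema.

```python
def _key(entry: dict) -> tuple:
    """Identity of a catalog entry: its (module, name, kind) triple."""
    return (entry["module"], entry["name"], entry["kind"])


def merge_entries(builtin: list[dict], user: list[dict]) -> list[dict]:
    """A user entry replaces a built-in entry with the same triple."""
    merged = {_key(e): e for e in builtin}
    merged.update({_key(e): e for e in user})  # user entries win on a match
    return list(merged.values())


def merge_sanitizers(builtin: list[dict], user: list[dict]) -> list[dict]:
    """Sanitizers are concatenated rather than replaced."""
    return builtin + user


# Hypothetical entries; the "label" field is made up for this example.
builtin_sources = [{"module": "os", "name": "getenv", "kind": "call", "label": "env"}]
user_sources = [
    {"module": "os", "name": "getenv", "kind": "call", "label": "secret"},
    {"module": "myrepo.config", "name": "load_token", "kind": "call", "label": "key_material"},
]
merged = merge_entries(builtin_sources, user_sources)
print([entry["label"] for entry in merged])
```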

Results are automatically cached in ~/.cache/hypergumbo/. Just run:

hypergumbo .    # auto-runs analysis if no cache exists, then generates sketch

The cache auto-invalidates when source files change. See docs/CACHE.md for details.
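
docs/CACHE.md describes the real mechanism. As a general illustration of content-based invalidation, the sketch below derives a fingerprint from file paths and contents, so any edit produces a different cache key. The function name `repo_fingerprint` and the choice of SHA-256 are assumptions.

```python
import hashlib
import tempfile
from pathlib import Path


def repo_fingerprint(root: Path, suffixes: tuple[str, ...] = (".py",)) -> str:
    """Hash the relative paths and contents of matching files under `root`."""
    digest = hashlib.sha256()
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            digest.update(str(path.relative_to(root)).encode())
            digest.update(path.read_bytes())
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "app.py").write_text("x = 1\n")
    before = repo_fingerprint(root)
    (root / "app.py").write_text("x = 2\n")  # simulate a source edit
    after = repo_fingerprint(root)
    print("cache still valid:", before == after)
```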

See hypergumbo --help for all options.

What It Understands

- Language analyzers: Python, JS/TS, Java, Rust, Go, C/C++, and many more
- Linkers: Tier 2 edge-recovery passes across four subcategories: Protocol (HTTP, WebSocket, message queues, SQL), Bridge (JNI, wasm_bindgen, Tauri IPC, language-pair FFI), Framework (gRPC, GraphQL, React components, DI resolution, ORM), and Infrastructure (containment, inheritance, module imports). Full catalogue.
- Framework patterns: FastAPI, Django, Rails, Spring Boot, Phoenix, Express, and many more
- I/O boundary detection: Maps every call chain that reaches the filesystem, network, subprocess, environment, IPC, or browser-local storage, including across FFI boundaries
- Taint-flow analysis: Traces data from sensitive sources (environment variables, received network input, crypto outputs, key material) to sinks in six trust zones (host_fs, network, host_env, ipc, browser_storage, relay), with sanitizer awareness
- Supply chain tiers: Classifies code as first-party, internal, external, or derived for dependency-aware analysis
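
Sanitizer-aware taint flow can be pictured as a path search from sources to sinks that stops at any sanitizer. The sketch below is a simplified illustration, not hypergumbo's analysis; `tainted_paths` and the sample call graph are hypothetical.

```python
def tainted_paths(
    graph: dict[str, list[str]],
    sources: set[str],
    sinks: set[str],
    sanitizers: set[str],
) -> list[list[str]]:
    """Return source-to-sink paths that do not pass through a sanitizer."""
    found: list[list[str]] = []

    def walk(node: str, path: list[str]) -> None:
        if node in sanitizers:
            return  # data is treated as clean past this point
        if node in sinks:
            found.append(path)
            return
        for nxt in graph.get(node, []):
            if nxt not in path:  # avoid looping on cycles
                walk(nxt, path + [nxt])

    for source in sorted(sources):
        walk(source, [source])
    return found


# Hypothetical call graph: one flow is sanitized, the other is not.
graph = {
    "read_env": ["build_request", "redact"],
    "build_request": ["send_http"],
    "redact": ["write_log"],
}
paths = tainted_paths(graph, {"read_env"}, {"send_http", "write_log"}, {"redact"})
print(paths)
```

Only the flow into `send_http` is reported, because the flow into `write_log` passes through `redact` first.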

How It Works

1. Profile: Scan the repo for languages, file counts, LOC
2. Analyze: Run language-specific analyzers to extract symbols and edges
3. Link: Connect symbols across language boundaries (JS fetch → Python route)
4. Enrich: Detect frameworks via YAML pattern matching
5. Classify: Assign supply chain tiers (first-party, internal, external, derived)
6. Trace I/O: Map call chains to I/O boundaries; run taint-flow analysis
7. Output: Generate Markdown sketch or JSON behavior map
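
The Link step's example (JS fetch → Python route) can be approximated by matching URL strings on both sides. The sketch below uses regular expressions as a stand-in for real parsing; the patterns, the `@app.get`/`@app.post` decorator shape, and the name `link_fetch_to_routes` are assumptions for illustration only.

```python
import re

# Matches fetch("/path") or fetch('/path') in JavaScript source.
FETCH_CALL = re.compile(r"""fetch\(\s*['"]([^'"]+)['"]""")
# Matches a decorator such as @app.get("/path") followed by the handler name.
ROUTE_DECORATOR = re.compile(
    r"""@app\.(?:get|post)\(\s*['"]([^'"]+)['"]\s*\)\s*\ndef\s+(\w+)"""
)


def link_fetch_to_routes(js_source: str, py_source: str) -> list[tuple[str, str]]:
    """Pair each fetch() URL in the JS source with the Python handler for that path."""
    handlers = {path: name for path, name in ROUTE_DECORATOR.findall(py_source)}
    return [
        (url, handlers[url])
        for url in FETCH_CALL.findall(js_source)
        if url in handlers
    ]


js_source = 'fetch("/api/users").then(r => r.json());\nfetch("/api/missing");'
py_source = '@app.get("/api/users")\ndef list_users():\n    return []\n'
links = link_fetch_to_routes(js_source, py_source)
print(links)
```

Each resulting pair corresponds to a cross-language edge; the unmatched `/api/missing` call produces none.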

The Internal Representation

All analyzers produce the same IR types:

- Symbol: A code element (function, class, method) with name, location, and stable ID
- Edge: A relationship between symbols (calls, imports, extends, implements)
- Span: Source location (file, line, column)

This uniform IR is what allows all language analyzers and linkers (Protocol / Bridge / Framework / Infrastructure — see ADR-0003-ext) to work together coherently.
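
A minimal sketch of what such IR types might look like as Python dataclasses follows. The field names are assumptions based on the descriptions above and on the provenance note elsewhere in this README; the actual definitions live in ir.py and may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Source location."""
    file: str
    line: int
    column: int


@dataclass(frozen=True)
class Symbol:
    """A code element with a stable ID."""
    id: str
    name: str
    kind: str  # e.g. "function", "class", "method"
    span: Span


@dataclass(frozen=True)
class Edge:
    """A relationship between two symbols, referenced by ID."""
    src: str
    dst: str
    kind: str  # e.g. "calls", "imports", "extends", "implements"
    origin: str  # hypothetical provenance field: which pass created the edge


main = Symbol("py:cli.main", "main", "function", Span("cli.py", 10, 0))
sketch = Symbol("py:sketch.generate", "generate", "function", Span("sketch.py", 42, 0))
call = Edge(main.id, sketch.id, "calls", origin="python-analyzer")
print(call)
```

Because every analyzer would emit the same three types, a linker can add an `Edge` between symbols from different languages without knowing how either was parsed.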

Architecture

packages/
├── hypergumbo-core/           # CLI, IR, slice, sketch, linkers
│   └── src/hypergumbo_core/
│       ├── cli.py             # Entry point
│       ├── ir.py              # Symbol, Edge, Span
│       ├── sketch.py          # Token-budgeted Markdown
│       ├── slice.py           # Subgraph extraction
│       ├── linkers/           # Tier 2 edge-recovery passes (Protocol/Bridge/Framework/Infrastructure)
│       └── frameworks/        # Framework detection (YAML patterns)
├── hypergumbo-lang-mainstream/  # Python, JS, Java, Go, Rust, etc.
├── hypergumbo-lang-common/      # Haskell, Elixir, GraphQL, etc.
├── hypergumbo-lang-extended1/   # Zig, Solidity, Agda, etc.
├── hypergumbo-tracker/           # Structured work tracker for agent governance (MPL-2.0)
└── hypergumbo/                  # Meta-package (installs all above)

Key design choices:

- Registry pattern: Analyzers and linkers self-register via decorators
- Two-pass analysis: First collect symbols, then resolve edges (enables cross-file references)
- Provenance tracking: Every edge records which analyzer/linker created it
- YAML-driven patterns: Framework detection is declarative, not hardcoded
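
The registry pattern can be illustrated with a small decorator that records each analyzer under its language name. This is a generic sketch of the technique, not hypergumbo's registry; `ANALYZERS`, `register_analyzer`, and the toy analyzer are hypothetical.

```python
from typing import Callable

# Registry mapping a language name to its analyzer function.
ANALYZERS: dict[str, Callable[[str], list[str]]] = {}


def register_analyzer(language: str):
    """Decorator that records an analyzer function under its language name."""
    def decorator(func: Callable[[str], list[str]]):
        ANALYZERS[language] = func
        return func
    return decorator


@register_analyzer("python")
def analyze_python(source: str) -> list[str]:
    """Toy analyzer: return the names of top-level functions."""
    return [
        line.split("(")[0][4:]
        for line in source.splitlines()
        if line.startswith("def ")
    ]


source = "def foo():\n    pass\nx = 1\ndef bar(a):\n    pass\n"
print(ANALYZERS["python"](source))
```

The caller never imports `analyze_python` directly; defining the decorated function is enough to make it discoverable through the registry.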

Development

git clone https://codeberg.org/iterabloom/hypergumbo.git
cd hypergumbo
python3 -m venv .venv && source .venv/bin/activate
./scripts/dev-install
source .venv/bin/activate  # reload to enable pytest alias
pytest                      # runs smart-test (affected tests only)

dev-install installs all packages, git hooks, and the pytest/smart-test wrapper. The project requires 100% test coverage.

See CONTRIBUTING.md for PR workflow (including fork-based workflow for external contributors), smart test selection setup, and coverage requirements. Agent instructions live in AGENTS.md.

License

AGPL-3.0-or-later

Release history

Version	Release date
5.0.1	May 10, 2026
5.0.0	May 10, 2026
4.1.0	May 8, 2026 (this version)
4.0.0	May 3, 2026
3.0.0	Apr 29, 2026
2.7.0	Apr 22, 2026
2.6.0	Apr 12, 2026
2.5.1	Apr 5, 2026
2.4.0	Mar 21, 2026
2.3.0	Mar 17, 2026
2.2.1	Mar 16, 2026
2.2.0	Mar 12, 2026
2.1.0	Mar 1, 2026
2.0.2	Feb 1, 2026
2.0.0	Jan 31, 2026
1.2.1	Jan 29, 2026
0.9.1	Jan 9, 2026
0.9.0	Jan 9, 2026
0.6.9	Jan 8, 2026
0.6.0	Dec 29, 2025

Download files

Source Distribution

hypergumbo-4.1.0.tar.gz (7.2 kB)

Uploaded May 8, 2026 Source

Built Distribution

hypergumbo-4.1.0-py3-none-any.whl (6.7 kB)

Uploaded May 8, 2026 Python 3

File details

Details for the file hypergumbo-4.1.0.tar.gz.

File metadata

Download URL: hypergumbo-4.1.0.tar.gz
Upload date: May 8, 2026
Size: 7.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for hypergumbo-4.1.0.tar.gz
Algorithm	Hash digest
SHA256	`599a79f3665e70ed6a5457d8d4bfc731b600f468b0039ee5187184ab82c635e2`
MD5	`6bc17ca71bedeb4fcfa2c4972203efb0`
BLAKE2b-256	`ba92cb3e3e6a1f70aad418f2f2b5ec39efef77815ea425cfa6939888e242ed78`

File details

Details for the file hypergumbo-4.1.0-py3-none-any.whl.

File metadata

Download URL: hypergumbo-4.1.0-py3-none-any.whl
Upload date: May 8, 2026
Size: 6.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for hypergumbo-4.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d08867169279d9287d0a47acca910813c1e7268fc8ffecd58d1823603056f67f`
MD5	`84385851a88bee53186e4fd013694ced`
BLAKE2b-256	`a2bdc0bba52a6da3cee69dbe4b05bbcd63ea965419610e38b11745e91f2d4493`
