
🦅 Hawkeye

Architectural intelligence for AI coding agents. One call gives your AI editor full context about a file (dependencies, blast radius, cycles, health) before it writes a single line.



The Problem

AI coding agents edit files without knowing the architecture. They:

  • Break imports they didn't know existed
  • Refactor classes used by 20 other modules
  • Create circular dependencies
  • Miss that a "simple change" cascades through 34 files

Hawkeye fixes this. It gives the AI the same architectural awareness a senior engineer has, in one deterministic, token-efficient JSON call.


Setup for AI Editors (MCP)

Hawkeye exposes 12 tools via Model Context Protocol. Install and configure in under 60 seconds:

1. Install

pip install hawkeye-analyzer

This installs everything: MCP server, Python/JavaScript/TypeScript analysis.

2. Add to your editor's MCP config

Antigravity / Gemini (~/.gemini/antigravity/mcp_config.json):

{
  "mcpServers": {
    "hawkeye": {
      "command": "hawkeye-mcp",
      "args": []
    }
  }
}

No --project needed: the agent calls hawkeye_analyze(project_path) dynamically with whatever workspace is active. Works for any project without config changes.

Gemini CLI (~/.gemini/settings.json):

{
  "mcpServers": {
    "hawkeye": {
      "command": "hawkeye-mcp",
      "args": []
    }
  }
}

Claude Code (~/.claude/claude_desktop_config.json):

{
  "mcpServers": {
    "hawkeye": {
      "command": "hawkeye-mcp",
      "args": ["--project", "/path/to/your/project"]
    }
  }
}

Cursor (.cursor/mcp.json in project root):

{
  "mcpServers": {
    "hawkeye": {
      "command": "hawkeye-mcp",
      "args": ["--project", "."]
    }
  }
}

Windsurf / other MCP clients follow the same pattern. The server uses stdio transport.

Two modes: pass --project /path to pre-analyze on startup (faster first query, locked to one project), or pass no args and call hawkeye_analyze() on demand (works for any project, ~5s on first use). Multi-project caching is supported; switching projects doesn't require re-analysis.

3. Done

The AI editor now has access to 12 architectural intelligence tools. The most important one:

hawkeye_file_context("src/core/engine.py")

Returns everything the agent needs in one call:

{
  "module": "myapp.core.engine",
  "file": "core/engine.py",
  "loc": 340,
  "dependency_count": 3,
  "dependent_count": 8,
  "dependencies": [
    {"module": "myapp.core.scanner", "file": "core/scanner.py"},
    {"module": "myapp.core.analyzer", "file": "core/analyzer.py"}
  ],
  "dependents": [
    {"module": "myapp.cli", "file": "cli.py"},
    {"module": "myapp.server.mcp", "file": "server/mcp.py"}
  ],
  "impact": {"direct": 8, "transitive": 12},
  "metrics": {
    "ca": 8, "ce": 3, "instability": 0.273,
    "health": "critical",
    "cyclomatic_complexity": 45,
    "cognitive_complexity": 32
  },
  "insights": ["extreme_cyclomatic", "critical_blast_radius"],
  "risk": "hub",
  "cycles": [],
  "git": {
    "commits": 8,
    "lines_changed": 420,
    "days_since_change": 2,
    "churn": "hot"
  }
}
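As a sanity check, the instability value in the example follows Robert C. Martin's formula I = Ce / (Ca + Ce). A quick sketch, using the coupling numbers from the JSON above:

```python
# Instability per Robert C. Martin: I = Ce / (Ca + Ce).
# Values copied from the hawkeye_file_context example output above.
ca = 8   # afferent coupling: modules that depend on this one
ce = 3   # efferent coupling: modules this one depends on

instability = ce / (ca + ce)
print(round(instability, 3))  # 0.273, matching the "instability" field
```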

Design Principles

Hawkeye is built specifically for AI agent consumption:

Principle How
Deterministic Same code → same output. No randomness, no LLM in the loop. Pure AST + graph algorithms.
Token-efficient Compact mode (default) strips verbose fields. A healthy module adds ~5 tokens. A problematic one adds ~30. Zero wasted tokens on modules with no issues.
One-call context hawkeye_file_context replaces 5+ separate queries. One tool call = full architectural picture.
Fast Single-pass AST parsing. 281 modules analyzed in ~5 seconds. Results cached for the session.
Lightweight Pure Python AST for Python, tree-sitter for JS/TS. Minimal dependencies, fast install.
Machine-readable Every output is structured JSON. Insight codes are enumerated strings, not natural language. Risk profiles are single-token labels.

Token Budget

Scenario Tokens added to context
Healthy module, no issues ~80 tokens
Module with warnings ~150 tokens
Critical module with cycles ~250 tokens
Batch context (3 files) ~400 tokens
Git block in file context ~17 tokens
Hotspot ranking (5 files) ~262 tokens

Compare this to dumping raw import statements or grep results: Hawkeye gives the AI structured, pre-analyzed architectural data at a fraction of the token cost.


MCP Tools Reference

After calling hawkeye_analyze(project_path) once, all other tools are available:

Tool Purpose When to use
hawkeye_file_context(file) Everything about a file: deps, dependents, impact, cycles, health, insights, risk. Supports min_severity filter. Before editing any file
hawkeye_context(files) Combined context for multi-file edits: shared deps, combined blast radius Before editing 2+ related files
hawkeye_impact(file, symbol) Symbol-level blast radius, hotspots, or unused detection (framework-aware) Before renaming/refactoring a class or function
hawkeye_symbols(file) List all classes/functions with usage counts and decorators Understanding what a module exports
hawkeye_find(pattern) Search modules by name Discovering module names
hawkeye_cycles() All import cycles with severity, kind, and break suggestions Checking for circular dependencies
hawkeye_metrics(sort_by, limit) Coupling + complexity table for all modules Finding the riskiest modules
hawkeye_path(source, target) Shortest dependency path between two modules Understanding how modules are connected
hawkeye_hotspots(limit, days) Rank files by complexity × git churn, the real risk Finding files that are both complex AND actively changing
hawkeye_graph(max_depth) Full dependency graph as JSON (auto-caps at 80+ modules) Structural overview

Recommended Agent Workflow

1. hawkeye_analyze("/path/to/project")     # scan once on startup
2. hawkeye_file_context("file_to_edit.py") # before every edit
3. hawkeye_impact("file.py", "ClassName")  # before refactoring a symbol
4. hawkeye_cycles()                        # after creating new imports

Interpreting the Output

Insight Codes

Machine-readable labels derived deterministically from metrics. No natural language, no ambiguity:

Code Severity What it means
high_instability warning Many outgoing deps, few incoming; volatile
highly_stable info Many incoming deps; changes here propagate widely
high_efferent warning Depends on too many modules
high_afferent warning Too many modules depend on this
extreme_cyclomatic critical Very high branching complexity (CC ≥ 50)
extreme_cognitive critical Deeply nested control flow (Cog ≥ 50)
high_cyclomatic warning Elevated branching complexity (CC ≥ 20)
high_cognitive warning Moderately nested control flow (Cog ≥ 25)
critical_blast_radius critical ≥10 modules directly depend on this
high_blast_radius warning ≥5 modules directly depend on this
very_large_module warning ≥500 LOC
in_cycle critical/warning Involved in an import cycle
zone_of_pain warning Concrete + stable = rigid, hard to extend
zone_of_uselessness warning Abstract + unstable = possibly unused abstractions
well_balanced info On the main sequence (good A/I balance)
isolated info No internal dependencies or dependents
high_fan_out info Imports many modules (high coordination surface)
wide_transitive_reach info Transitive impact much wider than direct
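Because the codes are derived deterministically, the mapping from metrics to insights can be modeled as a pure function. A hypothetical sketch, not Hawkeye's actual implementation; the thresholds are the documented defaults and the function name is illustrative:

```python
# Hypothetical sketch of deterministic insight derivation. Thresholds match
# the documented default profile; derive_insights is NOT Hawkeye's API.
def derive_insights(cc: int, dependents: int, loc: int) -> list[str]:
    insights = []
    if cc >= 50:
        insights.append("extreme_cyclomatic")     # critical
    elif cc >= 20:
        insights.append("high_cyclomatic")        # warning
    if dependents >= 10:
        insights.append("critical_blast_radius")  # critical
    elif dependents >= 5:
        insights.append("high_blast_radius")      # warning
    if loc >= 500:
        insights.append("very_large_module")      # warning
    return insights

print(derive_insights(cc=60, dependents=12, loc=340))
# ['extreme_cyclomatic', 'critical_blast_radius']
```

The same metrics always produce the same codes, which is what makes the output safe to cache and diff.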

Risk Profiles

Single-token classification of a module's structural role:

Label Meaning Agent should...
hub High dependents + high complexity Edit with extreme care; many things break
tangled Involved in import cycles Fix the cycle before adding more imports
fragile High complexity + high instability Likely to break; add tests first
volatile High instability + many outgoing deps Unstable foundation; minimize changes
amplifier Changes cascade widely (transitive >> direct) Check transitive dependents before editing
null No structural risk Safe to edit freely

Health Labels

Composite assessment with five monotonic severity levels, plus unknown for files that fail to parse:

Label Emoji Meaning
healthy ✅ No coupling or complexity concerns
moderate 🟡 Mild elevation in one dimension
elevated 🟠 Notable complexity or coupling
high 🔴 High risk in multiple dimensions
critical 🔥 Extreme values; needs decomposition
unknown ❓ File could not be parsed (syntax error)

CLI for Humans

Hawkeye also works as a standalone CLI:

pip install hawkeye-analyzer

# Full project analysis
hawkeye analyze ./myproject

# Interactive dependency graph in your browser
hawkeye show ./myproject

# Metrics deep-dive with per-function complexity
hawkeye metrics ./myproject --sort health --functions

# Symbol blast radius
hawkeye impact ./myproject src/engine.py -s Engine

# CI gate: fails on rule violations or import cycles
hawkeye check ./myproject --no-cycles

# AI-ready JSON context
hawkeye context ./myproject src/engine.py

# Git hotspots: complexity × churn
hawkeye hotspots ./myproject
hawkeye hotspots ./myproject --days 30 --limit 10

Output Formats

Command Formats
hawkeye analyze --format text (default), json, html, dot
hawkeye metrics text (default), --json, --functions
hawkeye impact text (default), --json, --hotspots, --unused (framework-aware)
hawkeye context JSON only (designed for machine consumption)
hawkeye hotspots text (default), --json, --days N, --limit N

Configuration

Place a hawkeye.toml in your project root. Hawkeye auto-discovers it by walking up from the project directory.

.hawkeyeignore

For quick exclusions without editing TOML, create a .hawkeyeignore file in your project root:

# Tests and fixtures
*.tests.*
*.test_*
conftest

# Generated code
*.generated.*
*.pb2

Each non-blank, non-comment line is treated as a glob exclude pattern. Patterns are merged with any exclude_patterns from hawkeye.toml.
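The parsing rule above can be sketched in Python. This is a simplified stand-in assuming fnmatch-style glob semantics; Hawkeye's real matcher may differ in details:

```python
import fnmatch

def load_ignore_patterns(text: str) -> list[str]:
    """Parse .hawkeyeignore: keep non-blank lines that aren't comments."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns

hawkeyeignore = """
# Tests and fixtures
*.tests.*
conftest
"""
patterns = load_ignore_patterns(hawkeyeignore)
excluded = any(fnmatch.fnmatch("myapp.tests.helpers", p) for p in patterns)
print(patterns, excluded)  # ['*.tests.*', 'conftest'] True
```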

Minimal hawkeye.toml

[project]
name = "MyProject"

[scan]
exclude_patterns = ["*.tests.*", "*.test_*"]  # Keep test modules out of coupling analysis

Architecture Rules

# Enforce layered architecture
[rules.layers]
order = ["models", "services", "api", "cli"]
direction = "downward"

# Block specific imports
[[rules.forbidden]]
from = "api.*"
to = ["cli.*", "scripts.*"]

# Module groups must be independent (transitive: catches indirect paths too)
[[rules.independence]]
modules = ["auth", "billing", "notifications"]

# Only auth may import secrets
[[rules.protected]]
modules = ["core.secrets", "core.tokens"]
allowed_importers = ["auth.*"]

# Sibling services must not form cycles
[[rules.acyclic_siblings]]
ancestor = "services"

Framework Detection

Hawkeye automatically detects framework entry points, i.e. symbols decorated with @app.get(), @pytest.fixture, @celery_app.task, etc. These are excluded from unused-symbol detection to eliminate false positives.

The built-in registry covers pytest, FastAPI, Flask, Django, Celery, Click, SQLAlchemy, and standard library decorators. Add project-specific patterns in your TOML config:

[scan.framework_decorators]
add = ["my_framework.endpoint", "register_handler"]  # merged with defaults
# replace = true    # set true to fully override defaults
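The merge semantics of add vs. replace = true can be illustrated as follows. Names like DEFAULTS and build_registry are hypothetical, chosen for this sketch, and the decorator list is a tiny subset of the documented built-ins:

```python
# Hypothetical sketch of the [scan.framework_decorators] merge semantics.
# DEFAULTS stands in for Hawkeye's built-in registry (pytest, FastAPI, ...).
DEFAULTS = {"pytest.fixture", "app.get", "celery_app.task"}

def build_registry(add: set[str], replace: bool = False) -> set[str]:
    """replace=True fully overrides defaults; otherwise patterns are merged."""
    return set(add) if replace else DEFAULTS | set(add)

registry = build_registry({"my_framework.endpoint"})
print(sorted(registry & {"pytest.fixture", "my_framework.endpoint"}))
# ['my_framework.endpoint', 'pytest.fixture']
```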

Threshold Tuning

All 18 thresholds are configurable. Choose a profile, then override individual values:

[thresholds]
profile = "strict"    # "default", "strict", or "relaxed"
cc_critical = 40      # Override: relax cyclomatic critical for this project
loc_critical = 600    # Override: allow larger modules

Profile CC warn/crit Cog warn/crit LOC warn/crit Dependents warn/crit
default 20 / 50 25 / 50 300 / 500 5 / 10
strict 10 / 30 15 / 30 200 / 300 3 / 5
relaxed 30 / 80 40 / 80 500 / 1000 10 / 20

The active profile is embedded in JSON output (threshold_profile field) for reproducibility.
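The profile-plus-override behavior can be modeled as a simple dict merge. A sketch of the documented semantics, not the actual config loader, with only two of the 18 keys shown:

```python
# Sketch of profile resolution: named profile first, then per-key overrides.
# Only cc_critical and loc_critical are modeled here for brevity.
PROFILES = {
    "default": {"cc_critical": 50, "loc_critical": 500},
    "strict":  {"cc_critical": 30, "loc_critical": 300},
    "relaxed": {"cc_critical": 80, "loc_critical": 1000},
}

def resolve_thresholds(profile: str, **overrides) -> dict:
    """Start from the named profile, then apply individual overrides."""
    return {**PROFILES[profile], **overrides}

# Matches the [thresholds] TOML example: strict profile, two overrides.
print(resolve_thresholds("strict", cc_critical=40, loc_critical=600))
# {'cc_critical': 40, 'loc_critical': 600}
```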

All 18 threshold keys
Key Default Controls
instability_high 0.8 high_instability insight trigger
instability_low 0.2 highly_stable insight trigger
ce_high 8 Efferent coupling warning
ca_high 8 Afferent coupling warning
cc_high 20 Cyclomatic → warning
cc_critical 50 Cyclomatic → critical
cog_high 25 Cognitive → warning
cog_critical 50 Cognitive → critical
loc_high 300 large_module insight
loc_critical 500 very_large_module insight
dependents_high 5 Blast radius → warning
dependents_critical 10 Blast radius → critical
dependencies_high 6 high_fan_out insight
cycle_size_high 4 Cycle → critical severity
distance_high 0.5 Zone of pain / uselessness trigger
distance_low 0.2 well_balanced trigger
abstract_high 0.8 Highly abstract classification
abstract_low 0.2 Concrete classification
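The instability, abstractness, and distance values these thresholds gate come from Robert C. Martin's standard package-metric formulas (I = Ce/(Ca+Ce), D = |A + I - 1|). A minimal sketch of those formulas; the function name and inputs are illustrative:

```python
def martin_metrics(ca: int, ce: int, abstract_classes: int, total_classes: int):
    """Robert C. Martin's package metrics: Instability, Abstractness,
    and normalized Distance from the main sequence."""
    instability = ce / (ca + ce) if (ca + ce) else 0.0
    abstractness = abstract_classes / total_classes if total_classes else 0.0
    distance = abs(abstractness + instability - 1)
    return instability, abstractness, distance

# A module with few dependents, many dependencies, one abstract class in five:
i, a, d = martin_metrics(ca=2, ce=8, abstract_classes=1, total_classes=5)
print(i, a, d)  # 0.8 0.2 0.0
```

Here I = 0.8 hits the instability_high trigger, while D = 0.0 sits below distance_low, i.e. on the main sequence (well_balanced).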

How It Works

Source files (Py/JS/TS) → Language-specific parsing → Import resolution → Dependency graph
                                                                      ↓
                    Symbol registry ← Symbol extraction     Graph algorithms
                         ↓                                        ↓
                  Symbol-level impact              Coupling metrics (Ca/Ce/I/A/D)
                  Hotspot detection                Complexity metrics (CC/Cog)
                  Dead code detection (fw-aware)   Cycle detection (Tarjan's SCC)
                                                   Import classification
                                                   Health classification
                                                   Insight derivation
                                                         ↓
                                              Deterministic JSON output
  • Single AST pass per file: no re-parsing, no multiple traversals
  • Tarjan's SCC for cycle detection: O(V+E), asymptotically optimal
  • Import classification: distinguishes runtime, TYPE_CHECKING, and deferred imports for intelligent cycle triage
  • BFS reachability for transitive impact, cached per session
  • Robert C. Martin's metrics: Ca, Ce, Instability, Abstractness, Distance
  • SonarSource spec for cognitive complexity: nesting-weighted, not just branch counting
  • LOC counts code lines only: blank lines and # comment lines are excluded. A file with 1,800 raw lines may report ~1,400 LOC, which is the more useful measure for complexity assessment
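Cycle detection via Tarjan's strongly connected components can be sketched as follows. This is a textbook recursive version for illustration, not Hawkeye's source:

```python
def tarjan_scc(graph: dict[str, list[str]]) -> list[list[str]]:
    """Tarjan's algorithm: every SCC with more than one node is an import cycle."""
    index, low, on_stack = {}, {}, set()
    stack, sccs, counter = [], [], [0]

    def strongconnect(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:                  # tree edge: recurse
                strongconnect(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:                 # back edge into current SCC
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:                  # v is the root of an SCC
            scc = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                scc.append(w)
                if w == v:
                    break
            sccs.append(scc)

    for v in graph:
        if v not in index:
            strongconnect(v)
    return [s for s in sccs if len(s) > 1]      # singletons are not cycles

# a -> b -> c -> a forms a 3-module import cycle; d is acyclic.
print(tarjan_scc({"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": []}))
# [['c', 'b', 'a']]
```

Each node and edge is visited once, which is where the O(V+E) bound comes from.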

Data Storage

All analysis data lives in RAM only. There is no database, no cache file, no .hawkeye/ directory. The MCP server holds the dependency graph, metrics, and symbol registry in-process for the duration of the session. When the server stops (editor closes), all data is discarded. The next session re-analyzes from scratch, which takes ~5 seconds for a 300-module project.


Performance

Metric Value
281 modules, 58K LOC ~5 seconds full analysis
Incremental queries after analysis <10ms per call
Memory Graph + metrics cached in-process
Install time ~5 seconds
MCP server startup with pre-analysis ~6 seconds

Project Structure

src/hawkeye/
├── engine.py           # Central orchestrator (CC=36, 261 LOC)
├── context.py          # AI context builder (stateless, pure functions)
├── config.py           # TOML config with walk-up discovery
├── cli/                # CLI subpackage
│   ├── __init__.py     # Parser + main() entry point
│   ├── commands.py     # 7 command handlers
│   ├── _helpers.py     # Engine creation + UTF-8 setup
│   └── __main__.py     # python -m support
├── core/
│   ├── models.py       # Leaf: ModuleInfo + utilities (I=0.125)
│   ├── scanner.py      # File discovery
│   ├── analyzer.py     # AST imports + symbols + complexity
│   ├── graph.py        # Directed graph + algorithms
│   ├── metrics.py      # Ca/Ce/I/A/D + health scoring
│   ├── cycles.py       # Tarjan's SCC + severity + kind
│   ├── rules.py        # 5 architecture rule types
│   ├── insights.py     # Deterministic insight derivation
│   ├── git_history.py  # Git churn, hotspots, rename tracking
│   └── symbols.py      # Cross-file symbol resolution
├── languages/          # Multi-language support
│   ├── base.py         # Adapter protocol
│   ├── registry.py     # Adapter factory
│   ├── python/         # Python adapter
│   ├── javascript/     # JavaScript adapter
│   ├── typescript/     # TypeScript adapter
│   └── shared/         # Tree-sitter JS/TS parsing engine
├── server/
│   └── mcp.py          # 12 MCP tools
└── visualizer/
    ├── html_renderer.py    # Interactive D3.js graph
    ├── dot_renderer.py     # Graphviz DOT
    ├── text_renderer.py    # Terminal tables
    └── json_renderer.py    # Structured JSON

62 modules, 9,640 LOC, 0 import cycles. 350 tests across 12 test files. Python 3.10+.

License

MIT. See LICENSE for details.
