AI-powered project mapping for debugging, tracing, and execution analysis

Context Stream

AI-powered project mapping that provides full details and intents for your local LLMs. It reads your codebase, understands what every file actually does, and writes the truth back into your source, so both humans and machines stop guessing.


Overview

Context Stream is a local, offline-first code intelligence layer that walks a Python project, parses every file with Python's built-in ast module, and uses a local GGUF model (Gemma by default) to generate a one-sentence "intent" for every file, function, class, and method. The result is a single project_summary.json that becomes the nervous system DebugFlow's surgeon and logger pull from when something breaks.

Built for two audiences:

  • Developers who want their codebase to self-document and want a machine-readable map of intent + dependencies for every module.
  • DebugFlow / ML pipelines that need a global "what is this project" context to make crash diagnosis and auto-repair surgical instead of guesswork.

Everything runs locally. No code ever leaves your machine.


How it works — process model

Context Stream runs as a completely separate process from the project it analyzes. It never imports, executes, or links against any of your project's code. It only:

  1. Walks your directory tree with os.walk.
  2. Reads each .py file as plain text.
  3. Parses it with Python's built-in ast module (static analysis only).
  4. Passes code snippets to a local GGUF model for summarization.
  5. Writes JSON output to <project>/context/.

This means you can safely run it against any project — broken, partially installed, or with conflicting dependencies — without any interference in either direction.
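Steps 1–3 can be sketched in a few lines. This is an illustration of the read-only approach described above, not the package's actual API: walk_python_files is a hypothetical name, and the skip set mirrors the directories the CLI is documented to ignore.

```python
import ast
import os

def walk_python_files(root: str):
    """Yield (path, AST) for every .py file under root, read-only.

    Static analysis only: files are read as text and parsed with ast,
    never imported or executed.
    """
    skip = {"__pycache__", ".git", "venv", "models", "context"}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune ignored directories in place so os.walk never descends.
        dirnames[:] = [d for d in dirnames if d not in skip]
        for name in filenames:
            if name.endswith(".py"):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8") as f:
                    source = f.read()
                yield path, ast.parse(source)
```

Because nothing is imported, a file with broken dependencies still parses cleanly as long as its syntax is valid.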


Quick Demo

What it looks like in flight

$ context-stream .

═══════════════════════════════════════════════════════
📂 TARGET:      /home/you/projects/my_app
🧠 AI LOGS:     ENABLED
═══════════════════════════════════════════════════════

🔍 Scanning files in: /home/you/projects/my_app
📂 Found 24 Python nodes.
🧠 Cache Loaded: 18 file hashes recognized.
🧬 Synthesizing context: core.py
🧠 LLM (file): Orchestrates the request lifecycle and dispatches to handlers.
✍️ Injected AI Intent: core.py
🧠 Analyzing Project: 100%|████████████| 24/24 [00:42<00:00,  1.75s/file]
💾 State Physically Synchronized: 24 keys.
🏁 Neural Mapping Complete (42.18s).

───────────────────────────────────────────────────────
🏁 SCAN COMPLETE
⏱️  Duration:     42.18s
📄 Total Files:   24 Cached:        18
🧠 AI Analyzed:   6
───────────────────────────────────────────────────────

After the run, you get a context/project_summary.json at the project root with the full neural map of your code — files, intents, dependencies, classes, methods, the works.


Installation

pip install context_stream

Dependencies (auto-installed):

  • tqdm — progress bar for the scan loop
  • llama-cpp-python — runs the local GGUF model
  • debugflow — sibling package; provides the logger and SpineLink telemetry

Requires Python 3.10+.

You also need a GGUF model on disk. The stream is tuned around google_gemma-3-4b-it-Q5_K_M.gguf, but any chat-tuned GGUF that llama-cpp-python can load will work.

After install, link your model once:

context-stream model-path
# 🎯 Enter absolute path to your GGUF model: /home/you/models/gemma-3-4b-it-Q5_K_M.gguf
# ✨ Configuration saved successfully.

The path is persisted to ~/.context_stream/config.json and reused across every project.


Usage

Option 1 — CLI (recommended)

From the root of any Python project:

context-stream .

The stream will:

  1. Walk every .py file (skipping __pycache__, .git, venv, models, context).
  2. Hash each file and skip anything already cached.
  3. Send only the changed files to the local LLM for re-summarization.
  4. Inject a """File summary: ...""" docstring at the top of any file that doesn't already have one.
  5. Write the full project map to ./context/project_summary.json.
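Step 2, the hash-based skip, can be sketched as follows. needs_reanalysis is a hypothetical helper name, not the package's actual API; it shows the general shape of a SHA-256 content cache.

```python
import hashlib
from pathlib import Path

def needs_reanalysis(path: Path, cache: dict) -> bool:
    """Return True if the file changed since its hash was last cached."""
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    if cache.get(str(path)) == digest:
        return False           # cache hit: skip the LLM entirely
    cache[str(path)] = digest  # record the new hash for the next run
    return True
```

Only files for which this returns True are sent to the local model, which is why re-runs over a mostly unchanged repo are fast.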

Option 2 — Python module (embed in your own tooling)

from context_stream import ContextStream

stream = ContextStream(project_path=".", logs_on=True, context_logs_on=True)
project_map, stats = stream.run(auto_inject=True)

print(f"Mapped {stats['total_files']} files in {stats['time_taken']:.2f}s")
print(f"Cache hits: {stats['cache_hits']}, AI analyses: {stats['new_analyses']}")

project_map is the same dict written to disk — use it directly without round-tripping through JSON.

The stream instance is just a regular Python object. Creating it inside your own script does not affect your process's imports or environment in any way — it only touches the filesystem paths you give it.

Stopping it

The scan is cooperative. Ctrl+C at any time; the cache is fsync'd after each file so the next run picks up exactly where you left off.
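Crash-safe persistence of this kind is typically done by writing to a temporary file, fsync'ing it, and atomically renaming it over the old one. A minimal sketch under that assumption (save_cache_atomic is a hypothetical helper; the package's internals may differ):

```python
import json
import os
from pathlib import Path

def save_cache_atomic(cache: dict, path: Path) -> None:
    """Write the cache to a temp file, fsync, then atomically replace.

    If the process dies mid-write, the previous cache file is untouched,
    so the next run resumes from the last completed file.
    """
    tmp = path.with_suffix(".tmp")
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(cache, f, indent=2)
        f.flush()
        os.fsync(f.fileno())   # force the bytes to disk before renaming
    os.replace(tmp, path)      # atomic on both POSIX and Windows
```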


Problem + Motivation

Every non-trivial codebase suffers from the same rot:

  • Docstrings drift, lie, or never get written.
  • File names imply one thing while the code does another.
  • When something crashes deep inside an ML pipeline, the only "context" your debugger has is the traceback — no idea what the surrounding files were supposed to do.

Context Stream fixes this at the root by treating the project itself as the source of truth. Instead of trusting names or stale docstrings, it reads the actual logic of every function and asks a local LLM to summarize what it executes, not what it claims to do. That summary then becomes:

  1. A real, injected docstring at the top of the file.
  2. A node in the global project_summary.json map.
  3. The "neighborhood context" that DebugFlow's surgeon uses when proposing a fix for a crashing file.

The whole pipeline is local, cached, and incremental — so re-running it across a 500-file repo is cheap.


Key Features

  • Local-first. Runs entirely offline through llama-cpp-python. No API keys, no telemetry, no code leaves the machine.
  • Process-isolated. Only reads files; never imports or executes your project's code.
  • Skeptic prompting. The LLM is instructed to ignore misleading names and summarize what the code actually executes.
  • Hash-based incremental cache. One SHA-256 digest per file; only changed files get re-analyzed.
  • AST-level extraction. Functions, classes, methods, signatures, and a 50-line logic preview per symbol — not just names.
  • Auto-injection. Files without a module docstring get a real one written in, derived from the model's intent.
  • Dependency graph. Every file's imports are mapped into a global graph, exposing the project's nervous system.
  • DebugFlow integration. Logs route through debugflow.logger_system; the resulting map is consumed by the DebugFlow surgeon for crash repair.
  • Toggleable AI chatter. Mute the LLM's status logs without touching the rest of DebugFlow's logging.
  • Crash-safe persistence. State is fsync'd to disk after each scan; partial runs survive interruption.
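The dependency graph can be built purely statically with ast, without importing anything. A minimal sketch of that idea (extract_imports is an illustrative name, not the package's API):

```python
import ast

def extract_imports(source: str) -> list[str]:
    """Collect module names imported by a file, via static AST walk only."""
    deps = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            deps.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            deps.append(node.module)
    return sorted(set(deps))
```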

API Usage / Examples

Mapping a single project

from context_stream import ContextStream

stream = ContextStream("/path/to/project")
project_map, stats = stream.run()

Ignoring framework / boilerplate directories

stream = ContextStream(
    project_path=".",
    ignore_list=["migrations", "tests", "conftest.py"]
)
project_map, stats = stream.run()

Entries in ignore_list are matched against both directory names and file names.
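One plausible way such matching could work is a per-component check against the path (the package's actual matching rule may differ):

```python
from pathlib import Path

def is_ignored(path: str, ignore_list: list[str]) -> bool:
    """True if any directory or file component matches an ignore entry."""
    return any(part in ignore_list for part in Path(path).parts)
```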

Reading a previously generated map

import json
from pathlib import Path

summary = json.loads(Path("context/project_summary.json").read_text())

print("Project:", summary["project_name"])
for entry in summary["map"]:
    print(f"  {entry['file']}: {entry['intent']}")

Inspecting the dependency graph

import json
from pathlib import Path

summary = json.loads(Path("context/project_summary.json").read_text())

for file, deps in summary["dependencies"].items():
    print(f"{file}")
    for d in deps:
        print(f"   └─ {d}")

Running silently (no AI logs)

stream = ContextStream(
    project_path=".",
    logs_on=True,           # keep DebugFlow's master pipe alive
    context_logs_on=False,  # silence the stream's own chatter
)
stream.run()

Mapping without auto-injecting docstrings

If you want a read-only pass (no file mutations), disable injection:

stream = ContextStream(".")
project_map, stats = stream.run(auto_inject=False)
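When injection is enabled, only files without an existing module docstring are touched. A minimal sketch of that check (inject_file_docstring is a hypothetical name, and the real injected format shown earlier is richer):

```python
import ast

def inject_file_docstring(source: str, intent: str) -> str:
    """Prepend a summary docstring if the module doesn't already have one."""
    if ast.get_docstring(ast.parse(source)) is not None:
        return source  # existing docstring wins; never overwrite
    return f'"""File summary: {intent}"""\n{source}'
```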

Switching the model at runtime

from context_stream import set_model_path, get_model_path

set_model_path("/new/path/to/another-model.gguf")
print("Active model:", get_model_path())

Configuration via environment variables

  • MODEL_PATH: overrides the GGUF model path; takes precedence over the saved config. Falls back to ~/.context_stream/config.json, then to ./models/google_gemma-3-4b-it-Q5_K_M.gguf.
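The resolution order described above can be sketched as follows. Note the "model_path" config key is an assumption for illustration; check your own ~/.context_stream/config.json for the actual key.

```python
import json
import os
from pathlib import Path

def resolve_model_path() -> str:
    """Resolve the GGUF path: env var, then saved config, then default."""
    if "MODEL_PATH" in os.environ:
        return os.environ["MODEL_PATH"]
    config = Path.home() / ".context_stream" / "config.json"
    if config.exists():
        # "model_path" is an assumed key name for this sketch.
        saved = json.loads(config.read_text()).get("model_path")
        if saved:
            return saved
    return "./models/google_gemma-3-4b-it-Q5_K_M.gguf"
```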

Persistent config is stored in:

  • ~/.context_stream/config.json — global model path.
  • <project>/context/cache.json — per-project hash cache.
  • <project>/context/project_summary.json — per-project neural map.
  • <project>/.context/stream_flow.log — runtime log piped through DebugFlow.

The stream's chatter toggle is persisted at:

  • <install>/context_stream/.context_log_state (contains ON or OFF).

Console scripts

Command                      What it does
context-stream <path>        Run a full scan over the given project path (use . for cwd).
context-stream model-path    Interactive prompt to link / re-link your GGUF model.
context-logs                 Toggle the stream's AI chatter ON ↔ OFF (state persists).
context-logs-on              Force AI chatter ON.
context-logs-off             Force AI chatter OFF (silenced).

You can also override the chatter state inline for a single run:

context-stream . context-logs off

Project map schema

The context/project_summary.json written after each scan has this shape:

{
  "project_name": "my_app",
  "tree": {
    "root": "my_app",
    "structure": [
      { "folder": "", "files": ["main.py", "utils.py"] },
      { "folder": "models", "files": ["data.py"] }
    ]
  },
  "dependencies": {
    "main.py": ["os", "utils", "models.data"],
    "utils.py": ["re", "pathlib"]
  },
  "map": [
    {
      "file": "main.py",
      "intent": "Orchestrates the request lifecycle and dispatches to handlers.",
      "index": {
        "run(args: list) -> None": "Parses CLI args and delegates to the appropriate handler."
      },
      "classes": {
        "App": {
          "intent": "Holds application state and routes incoming requests.",
          "methods": {
            "start(self) -> None": "Initialises the event loop and binds the socket."
          }
        }
      },
      "dependencies": ["os", "utils"],
      "docstring": "-------- main --------\nOrchestrates the request lifecycle.\n-------- main --------"
    }
  ]
}
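Since the in-memory project_map has the same shape, you can walk every summarized symbol directly. iter_symbols is an illustrative helper, not part of the package:

```python
def iter_symbols(summary: dict):
    """Yield (file, symbol, intent) for each function and method in the map."""
    for entry in summary["map"]:
        # Top-level functions live under "index".
        for sig, intent in entry.get("index", {}).items():
            yield entry["file"], sig, intent
        # Methods are nested under each class entry.
        for cls, body in entry.get("classes", {}).items():
            for sig, intent in body.get("methods", {}).items():
                yield entry["file"], f"{cls}.{sig}", intent
```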

Project Status

Stable:

  • AST parsing of functions, classes, methods (signature + docstring + 50-line logic preview).
  • Local LLM summarization via llama-cpp-python with the skeptic prompt.
  • Hash-based incremental cache and crash-safe fsync persistence.
  • Auto-injection of file-level docstrings.
  • Dependency graph extraction.
  • CLI (context-stream, model-path) and persistent log-state toggles.
  • DebugFlow logger integration (debugflow.logger_system, child-logger naming).
  • ignore_list support for filtering framework noise.

In progress / experimental:

  • surgeon.operate() — pulls the latest crash from DebugFlow's SpineLink, locates the offending file in the project map, and asks the LLM to propose a patch. Functional end-to-end but treated as experimental until the patching step is hardened.
  • Richer class-level intent (currently uses the class docstring as the prompt; logic-preview-based class intent is on the bench).

License

MIT © 2026 ProfessionalMario. See LICENSE for the full text.
