Turn shell sessions into production runbooks using a swarm of AI agents.

SHELLSTORY

Autonomous Runbook Synthesis from Live Terminal Sessions

Architecture | Installation | Usage | Agent Swarm | Security | Configuration | Contributing


The Problem

Every deployment, migration, and infrastructure task begins in the terminal. Engineers run dozens of commands, hit errors, fix them, and eventually arrive at a working state. The knowledge of what worked and why lives only in scroll-back history: ephemeral, unstructured, and lost the moment the terminal closes.

Teams compensate by writing runbooks after the fact. These documents are invariably incomplete, out of date within weeks, and missing the critical troubleshooting steps that made the procedure actually work.

The Solution

ShellStory eliminates the gap between execution and documentation. It captures every command, exit code, and output in real time, then deploys a coordinated swarm of specialized LLM agents to synthesize the raw session into a structured, production-grade runbook, complete with prerequisites, logical step grouping, error analysis, and remediation guidance.

The operator simply works. ShellStory writes the manual.


Architecture

ShellStory is built on a three-tier architecture designed for zero data loss and minimal processing latency.

                              CAPTURE TIER
  +-----------+        +------------------+       +----------------+
  |  Terminal  | -----> |  Hook Script     | ----> |  .ndjson Log   |
  |  (User)    |        |  (PowerShell/    |       |  (Append-only  |
  |            |        |   Bash/Zsh)      |       |   event stream)|
  +-----------+        +------------------+       +-------+--------+
                                                          |
                              DAEMON TIER                  |
                       +----------------------+            |
                       |  Background Daemon   | <----------+
                       |  - 30s rolling window|
                       |  - PII pre-scan      |
                       |  - Signal extraction  |
                       +----------+-----------+
                                  |
                                  v
                       +----------+-----------+
                       |  State Checkpoint    |
                       |  (.state.json)       |
                       +----------+-----------+
                                  |
                              SWARM TIER                   
                       +----------+-----------+
                       |  Orchestrator        |
                       |  - Tail catch-up     |
                       |  - Agent lifecycle   |
                       |  - Checkpoint mgmt   |
                       +----------+-----------+
                                  |
              +-------------------+-------------------+
              |                   |                   |
     +--------v------+  +--------v------+  +---------v-----+
     | Signal Agent   |  | Failure Agent  |  | Prereq Agent  |
     | (Parallel)     |  | (Parallel)     |  | (Parallel)    |
     +--------+------+  +--------+------+  +---------+-----+
              |                   |                   |
              +-------------------+-------------------+
                                  |
                       +----------v-----------+
                       |  Sequence Agent      |
                       |  (Serial, depends    |
                       |   on Signal output)  |
                       +----------+-----------+
                                  |
                       +----------v-----------+
                       |  Merger Agent         |
                       |  (Final assembly)     |
                       +----------+-----------+
                                  |
                       +----------v-----------+
                       |  Validation Layer     |
                       |  (Repair loop)        |
                       +----------+-----------+
                                  |
                       +----------v-----------+
                       |  Markdown Export      |
                       |  - Runbook (.md)      |
                       |  - Raw Transcript     |
                       +----------------------+

Capture Tier

A lightweight shell hook (generated per-session for PowerShell, Bash, or Zsh) intercepts the PreCommandAction and PostCommandAction lifecycle events. Each command, its exit code, working directory, and timing data are serialized as NDJSON and appended to an immutable event log. The hook uses low-level file I/O ([System.IO.File]::AppendAllText on Windows) to avoid file-lock contention with the daemon tier.
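
The append-only NDJSON log described above can be illustrated with a minimal Python sketch. The field names (`command`, `exit_code`, `cwd`, `duration_ms`, `ts`) and the helper names `append_event` and `read_events` are assumptions made for illustration; they are not taken from ShellStory's actual event schema or hook scripts.

```python
import json
import time
from pathlib import Path


def append_event(log_path: Path, command: str, exit_code: int,
                 cwd: str, duration_ms: int) -> None:
    """Append one command event to the log as a single NDJSON line."""
    event = {
        "command": command,
        "exit_code": exit_code,
        "cwd": cwd,
        "duration_ms": duration_ms,
        "ts": time.time(),  # hypothetical timestamp field
    }
    # Append mode only ever adds to the end of the file, so earlier
    # events are never rewritten.
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")


def read_events(log_path: Path) -> list[dict]:
    """Read all events back, one JSON object per non-blank line."""
    with log_path.open("r", encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Because each event occupies exactly one line, a reader can consume the file incrementally while the hook keeps appending.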

Daemon Tier

A detached background process (DETACHED_PROCESS | CREATE_NEW_PROCESS_GROUP on Windows, start_new_session=True on POSIX) monitors the session log in 30-second rolling windows. During each window, it performs:

  • PII pre-scan using the regex redaction engine
  • Signal extraction via the Signal Agent (identifying "happy path" commands vs. noise)
  • Failure analysis via the Failure Agent

Results are checkpointed to .state.json, enabling the Swarm Tier to skip redundant computation.
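
As a rough illustration of the platform-specific detachment mentioned above, the sketch below builds the keyword arguments that Python's `subprocess.Popen` would need on each platform. The function name `detached_popen_kwargs` is hypothetical, and this is not necessarily how ShellStory's daemon launcher is written.

```python
import subprocess
import sys


def detached_popen_kwargs(platform: str = sys.platform) -> dict:
    """Return Popen keyword arguments that detach a child from the terminal."""
    if platform == "win32":
        # The subprocess module only defines these constants on Windows,
        # so fall back to the Win32 values when running elsewhere.
        detached = getattr(subprocess, "DETACHED_PROCESS", 0x00000008)
        new_group = getattr(subprocess, "CREATE_NEW_PROCESS_GROUP", 0x00000200)
        return {"creationflags": detached | new_group}
    # On POSIX, start_new_session=True runs the child in a new session,
    # so it does not receive the terminal's hangup signal.
    return {"start_new_session": True}
```

A caller would pass the result straight through, for example `subprocess.Popen(cmd, **detached_popen_kwargs())`.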

Swarm Tier

When the operator finalizes a session, the Orchestrator performs a tail catch-up: it compares the daemon's _processed_count against the total event count and processes any unhandled tail events inline. This closes the race between the daemon's last processing window and the user's exit command, so no captured event is dropped from the runbook.
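
The tail catch-up amounts to slicing the event list at the daemon's processed count. A minimal sketch, assuming events are held as a list of dictionaries and using a hypothetical helper name `tail_events`:

```python
def tail_events(events: list[dict], processed_count: int) -> list[dict]:
    """Return the events the daemon has not yet processed."""
    # Clamp the checkpoint so a stale or corrupt count cannot skip
    # events or index past the end of the log.
    processed_count = max(0, min(processed_count, len(events)))
    return events[processed_count:]
```

If the daemon checkpointed after three of five events, the Orchestrator would process only the last two inline before launching the swarm.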


Agent Swarm

The swarm consists of six specialized agents, each with a single responsibility. Agents that are data-independent run in parallel with staggered launch times (3-second intervals) to avoid rate-limit bursts. Sequential agents consume the outputs of the parallel tier.

Agent      | Execution | Responsibility
-----------|-----------|---------------
Signal     | Parallel  | Classifies commands as signal (essential) or noise (debugging, navigation, typos). Only signal commands propagate to the Sequence Agent.
Failure    | Parallel  | Identifies failed commands (non-zero exit codes), correlates them with subsequent recovery attempts, and extracts the verified fix.
Prereq     | Parallel  | Infers environmental prerequisites: runtime versions, system packages, required ports, and environment variables.
Sequence   | Serial    | Receives the filtered signal commands and groups them into logical, ordered runbook steps with dependency-aware sequencing.
Annotation | Serial    | Reviews draft steps for destructive operations, race conditions, or practical pitfalls and attaches contextual warnings.
Merger     | Serial    | Assembles outputs from all upstream agents into a unified runbook structure with professional titles, descriptions, and troubleshooting sections.
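
The staggered parallel launch followed by serial agents can be sketched with `asyncio`. This is an illustrative model only: `run_agent` is a stand-in for a real LLM call, and the function names and result strings are assumptions rather than ShellStory's orchestrator code.

```python
import asyncio


async def run_agent(name: str, start_delay: float) -> str:
    """Stand-in for one agent: wait for its launch slot, then 'call' the LLM."""
    await asyncio.sleep(start_delay)  # staggered launch
    await asyncio.sleep(0)            # placeholder for the real LLM request
    return f"{name}-result"


async def run_swarm(stagger: float = 3.0) -> dict[str, str]:
    """Run the parallel tier with staggered starts, then the serial tier."""
    parallel = ["signal", "failure", "prereq"]
    # Agent i starts i * stagger seconds after the first, which spreads
    # requests out instead of sending them in one burst.
    results = await asyncio.gather(
        *(run_agent(name, i * stagger) for i, name in enumerate(parallel))
    )
    outputs = dict(zip(parallel, results))
    # Serial agents run one at a time, after the parallel tier finishes.
    for name in ["sequence", "annotation", "merger"]:
        outputs[name] = await run_agent(name, 0.0)
    return outputs
```

Calling `asyncio.run(run_swarm())` uses the 3-second interval from the text; a smaller `stagger` is convenient for local experiments.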

Anti-Hallucination Constraints

Every agent prompt includes explicit negative constraints:

"You MUST NOT invent, hallucinate, or add any commands that are not in the raw list provided. Use EXACTLY the commands given."

The Merger Agent operates under a strict assembly-only mandate: it formats and refines language but cannot introduce new commands. If any agent returns an empty or malformed response, the Orchestrator falls back to a deterministic, programmatic merge using chronological ordering.
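
Both ideas in this section, rejecting commands that were never captured and falling back to a chronological merge, can be expressed without an LLM. The sketch below is an assumption about how such checks might look; the function names and the `ts` field are hypothetical.

```python
def find_invented_commands(raw_commands: list[str],
                           runbook_commands: list[str]) -> list[str]:
    """Return runbook commands that do not appear in the captured session."""
    allowed = set(raw_commands)
    return [cmd for cmd in runbook_commands if cmd not in allowed]


def deterministic_merge(events: list[dict]) -> list[dict]:
    """Fallback merge: one numbered step per event, in chronological order."""
    ordered = sorted(events, key=lambda event: event["ts"])
    return [
        {"step": number, "command": event["command"]}
        for number, event in enumerate(ordered, start=1)
    ]
```

An empty list from `find_invented_commands` means the draft only uses commands that were actually run; anything else is a candidate for repair or rejection.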

Graceful Degradation

Each agent implements a _fallback_result() method. If an LLM call fails after exhausting all retry attempts and fallback models, the agent returns a safe default (e.g., "all commands are signal") rather than crashing the pipeline. The Resilient LLM Client automatically rotates through a fallback chain of free-tier models:

google/gemma-4-31b-it:free -> deepseek/deepseek-v4-flash:free -> ...
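
A fallback chain of this kind can be sketched as a loop over model names that ends in a safe default. The helper `call_with_fallback` and its parameters are hypothetical; the real Resilient LLM Client may also retry each model before moving on.

```python
from typing import Callable


def call_with_fallback(models: list[str],
                       call: Callable[[str], dict],
                       fallback_result: Callable[[], dict]) -> dict:
    """Try each model in order; return the safe default if all of them fail."""
    for model in models:
        try:
            return call(model)
        except Exception:
            # This model failed; move on to the next one in the chain.
            continue
    # Every model failed, so degrade gracefully instead of raising.
    return fallback_result()
```

For the Signal Agent, `fallback_result` might return a structure that marks every command as signal, matching the safe default described above.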

Security and PII Redaction

ShellStory treats credential exposure as a first-class failure mode. The redaction engine operates in two layers, both executing before any data reaches the LLM swarm.

Layer 1: Deterministic Regex Engine

A curated set of 13 pattern matchers targets known credential formats, including:

Pattern                     | Example Match
----------------------------|--------------
AWS Access Key              | AKIAIOSFODNN7EXAMPLE
AWS Secret Key              | 40-character base64 strings
OpenRouter / OpenAI Keys    | sk-or-v1-..., sk-...
GitHub / GitLab Tokens      | ghp_..., glpat-...
Bearer Tokens               | Authorization: Bearer ...
Database Connection Strings | postgres://user:pass@host/db
SSH Private Key Headers     | -----BEGIN RSA PRIVATE KEY-----
Embedded URL Credentials    | https://admin:secret@host
CLI Password Arguments      | --password=mysecret

Each match is replaced with a deterministic variable ($AWS_ACCESS_KEY, $OPENAI_KEY, etc.) and recorded in a VariableDefinition that appears in the final runbook's "Environment Variables" section.
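
A minimal sketch of deterministic regex redaction, covering only two of the formats listed above. The regular expressions, the `$GITHUB_TOKEN` variable name, and the `redact` function are assumptions for illustration; ShellStory's actual patterns and its VariableDefinition model are not reproduced here.

```python
import re

# Hypothetical patterns: (variable name, compiled regex).
PATTERNS = [
    ("AWS_ACCESS_KEY", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    ("GITHUB_TOKEN", re.compile(r"\bghp_[A-Za-z0-9]{36}\b")),
]


def redact(text: str) -> tuple[str, list[str]]:
    """Replace known credential formats with $VARIABLE placeholders.

    Returns the redacted text and the names of the variables introduced,
    so they can be listed in an "Environment Variables" section.
    """
    variables: list[str] = []
    for name, pattern in PATTERNS:
        text, count = pattern.subn(f"${name}", text)
        if count:
            variables.append(name)
    return text, variables
```

The same input always yields the same placeholder, which keeps the redacted transcript stable across repeated runs.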

Layer 2: AI-Assisted Scanner

A dedicated PII Scanner Agent reviews the full session transcript for context-dependent secrets that evade regex: internal hostnames, non-standard token formats, database names revealing business logic, and file paths containing personal information.


Installation

Requirements: Python 3.11 or higher.

pip install shellstory

For development:

git clone https://github.com/Ayushpani/shellstory.git
cd shellstory
pip install -e ".[dev]"

This registers the shellstory command globally.


Quickstart

1. Configure

Option A: Environment Variable (Recommended)

Set your API key once and start immediately. No config file required.

# PowerShell
$env:OPENROUTER_API_KEY="sk-or-v1-..."

# Bash / Zsh
export OPENROUTER_API_KEY="sk-or-v1-..."

Option B: Interactive Configuration

Run the setup wizard to persist settings to ~/.shellstory/config.yaml.

shellstory configure

2. Record a Session

Start a new capture session. ShellStory spawns a hooked sub-shell automatically, so no manual hook activation is required.

shellstory start "Kubernetes Cluster Migration"

3. Work Normally

Execute commands as you normally would. Every command, its exit code, timing, and working directory are captured transparently.

4. Finalize

Type exit to close the recording shell. The CLI confirms the session ID and prompts you to process.

5. Generate Documentation

Launch the Swarm Orchestrator to synthesize the runbook. The Swarm Matrix dashboard provides real-time visibility into agent execution, model selection, and processing state.

shellstory process

Two files are generated:

  • [title].md: The structured, AI-synthesized runbook with prerequisites, logical steps, and troubleshooting.
  • [title]-raw.md: A verbatim command transcript with pass/fail status markers.
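
To make the raw transcript concrete, the sketch below renders captured events with a text status marker per command. The `[PASS]` / `[FAIL]` markers, the heading format, and the function name are assumptions; the real export may be formatted differently.

```python
def render_raw_transcript(title: str, events: list[dict]) -> str:
    """Render captured events as a plain transcript with pass/fail markers."""
    lines = [f"# {title} (raw transcript)", ""]
    for event in events:
        # A zero exit code counts as a pass; anything else is a failure.
        status = "PASS" if event["exit_code"] == 0 else "FAIL"
        lines.append(f"[{status}] {event['command']}")
    return "\n".join(lines) + "\n"
```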

Additional Commands

shellstory list              # Display all recorded sessions
shellstory status            # Show the active session and event count
shellstory export <id>       # Re-export a previously processed session
shellstory stop              # Manually stop an active session (alternative to exit)

Configuration

Configuration is stored at ~/.shellstory/config.yaml. See .shellstory.example.yaml for the full schema.

llm:
  provider: "openrouter"
  model: "anthropic/claude-sonnet-4"
  api_key: "YOUR_OPENROUTER_API_KEY"

default_connector: markdown

sessions_dir: "~/.shellstory/sessions"

connectors:
  markdown:
    output_dir: "~/runbooks"

Supported LLM Providers

Provider   | Configuration
-----------|--------------
OpenRouter | provider: openrouter with an OpenRouter API key
NVIDIA NIM | provider: nvidia with an NVIDIA Build API key

The Resilient LLM Client handles rate limiting, exponential backoff, and automatic model fallback across provider-specific error codes.
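
Exponential backoff can be sketched as a capped doubling delay between attempts. The base, cap, and retry count below are illustrative defaults rather than ShellStory's actual settings, and both function names are hypothetical.

```python
import time
from typing import Callable


def backoff_delays(retries: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Delays of base, 2*base, 4*base, ... seconds, each limited to cap."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]


def retry_with_backoff(call: Callable[[], str], retries: int = 4,
                       base: float = 1.0, cap: float = 30.0,
                       sleep: Callable[[float], None] = time.sleep) -> str:
    """Call `call` until it succeeds, sleeping longer after each failure."""
    delays = backoff_delays(retries, base, cap)
    for attempt, delay in enumerate(delays):
        try:
            return call()
        except Exception:
            if attempt == len(delays) - 1:
                raise  # out of attempts: surface the last error
            sleep(delay)
    raise ValueError("retries must be at least 1")
```

Passing a custom `sleep` makes the schedule easy to inspect without waiting in real time.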


Project Structure

shellstory/
  agents/
    base.py           # Abstract agent with LLM call and JSON parsing
    specialists.py    # Signal, Failure, Prereq, Sequence, Annotation, Merger
    swarm.py          # Orchestrator with parallel execution and checkpointing
  llm/
    base.py           # Provider-agnostic LLM interface
    resilient.py      # Fallback chain with retry logic
    openrouter.py     # OpenRouter HTTP client
    nvidia.py         # NVIDIA NIM HTTP client
  utils/
    ndjson.py         # Event log serialization and deserialization
    retry.py          # Exponential backoff decorator and JSON response parser
  capture.py          # Shell hook generation (PowerShell, Bash, Zsh)
  cli.py              # Click-based CLI with all commands
  config.py           # YAML configuration management
  connectors.py       # Export connectors (Markdown, extensible)
  daemon.py           # Background processing daemon
  db.py               # SQLite session and runbook persistence
  models.py           # Pydantic models for the entire data pipeline
  redact.py           # Dual-layer PII redaction engine
  ui.py               # Rich-based Swarm Matrix TUI dashboard
  validate.py         # Post-synthesis validation and repair loop

Contributing

Contributions are welcome. Areas of particular interest:

  • Additional shell hooks (Fish, Nushell, cmd.exe)
  • Export connectors (Notion, Confluence, Obsidian)
  • Agent prompt engineering (improving step grouping accuracy, reducing LLM token usage)
  • Provider integrations (Gemini, Groq, Anthropic direct, local models via Ollama)

Please ensure all contributions maintain zero-emoji aesthetics in user-facing output and adhere to the existing code conventions.


License

MIT License. See LICENSE for details.


Built by Ayush Pani
