
Quell — an on-call engineer that never sleeps


Website · Getting started · Commands · Architecture · Extend · Community


[!NOTE] Quell watches your production logs, investigates incidents via LLM-backed agents running inside a Docker sandbox, and produces a structured root-cause report with a proposed fix. Your engineers wake up to a draft PR, not a page.

Why Quell?

Before Quell

  • 3am alert, bleary-eyed engineer.
  • 10 minutes finding the right log file.
  • Another 15 grepping for the stack trace.
  • Another 20 tracing through five services.
  • By the time root cause is clear, the incident is an hour old.
  • The draft PR gets written tomorrow (if at all).

With Quell

  • Alert fires, autonomous investigation starts.
  • IncidentCommander reads logs, greps code, traces git history.
  • Specialist subagents work in parallel via asyncio.Queue.
  • A root-cause report lands in ~30 seconds.
  • Draft PR proposal waits for human review when you sign on.
  • You approve, merge, go back to sleep.

[!IMPORTANT] Quell never auto-merges. Every proposed fix is a draft PR that requires a human to review and merge. Safety over speed.


See it in action

Quell watching a log file, detecting an incident, and investigating it autonomously

No media yet? Here is the textual storyboard of the same run.
$ quell watch
10:02:45  INFO  monitor: tailing /var/log/my-app/error.log
10:02:47  ERROR TypeError: Cannot read properties of null (reading 'id')
                  at processOrder (src/checkout.ts:42:18)
10:02:47  INFO  detector: new signature 7a9e42f8 -- severity=high
10:02:47  INFO  commander: spawning incident_commander (5 skills matched)
10:02:49  INFO  tool: code_read src/checkout.ts lines 40-50
10:02:52  INFO  tool: git_blame src/checkout.ts:42
10:02:58  INFO  agent: finish_incident -- null-deref on order.user
incident inc_a1b2c3 resolved in 13s -- see `quell show inc_a1b2c3`

Install

Pick the channel that fits your environment. All five channels install the same quell CLI.

curl

curl -fsSL https://raw.githubusercontent.com/bhartiyaanshul/quell/main/install.sh | bash

Any POSIX shell. Probes for a prebuilt binary first, falls back to pipx + source automatically. Works today.

npm

npm i -g quell

For JavaScript-oriented developers. The postinstall hook downloads the native binary, so nothing on the Python side is needed.

Homebrew

brew install bhartiyaanshul/quell/quell

For macOS and Linux brew users. Installs into /opt/homebrew/bin (Apple silicon) or /usr/local/bin (Intel / Linux).

pipx

pipx install quell

For Python users who already have pipx. Creates an isolated venv under ~/.local.

Standalone binary

curl -sSL https://github.com/bhartiyaanshul/quell/releases/latest/download/quell-$(uname -s)-$(uname -m).tar.gz \
  | tar xz -C /usr/local/bin

No runtime at all. The archive bundles CPython and every dependency. Available for macOS arm64, Linux x86_64, and Windows x86_64.

[!TIP] All five install paths are wired through the release pipeline; the curl one-liner is the most universally exercised. Full cascade in packaging/README.md.


Quick start

Four commands, about two minutes.

cd ~/src/my-app   # the repo you want Quell to watch
quell init        # interactive wizard -- stores API key in OS keychain
quell doctor      # verify Python, git, Docker, and your API key
quell watch       # start monitor -> detector -> agent loop

quell init — interactive setup wizard

quell init interactive wizard

Detects your project type, asks for an LLM provider, stores the API key in your OS keychain, writes .quell/config.toml.
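A minimal .quell/config.toml might look like the sketch below. The key names are assumptions inferred from the features described in this README, not the canonical schema — see docs/configuration.md for the real one.

```toml
# Illustrative sketch only -- key names are assumptions, not the documented schema.
[llm]
model = "claude-haiku-4-5"   # any LiteLLM-supported model string
max_cost_usd = 0.50          # halt a runaway investigation past this spend

[[monitors]]
type = "local-file"
path = "/var/log/my-app/error.log"
```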

quell doctor — environment check

quell doctor output with every check green

Verifies Python 3.12+, git, Docker daemon, config parse, and LLM key. One coloured table with exit code 0/1 for CI use.

quell history — recent incidents

quell history table output

Reverse-chronological list of every incident Quell has investigated. Filter with --limit.

quell show <id> — incident detail

quell show detailed incident view

Root cause, evidence, proposed fix, occurrence count, first/last seen timestamps, linked PR URL.

quell dashboard — local web UI v0.2

Boots a Next.js + FastAPI dashboard at http://127.0.0.1:7777 with incident list, run timelines, findings, and aggregate stats. Auto-opens your browser; pass --no-open for CI.

quell replay <id> — terminal timeline v0.2

Prints the full event stream for a past investigation — every LLM call, tool call, latency, cost, and error — as a chronological timeline. Same data the dashboard renders interactively.

quell test-notifier <channel> — verify webhooks v0.2

Fires a synthetic incident through Slack / Discord / Telegram so you can confirm webhook URLs and bot tokens are wired up before real traffic hits.

 

Full walkthrough in docs/getting-started.md.


How it works

Quell pipeline: monitor -> detector -> incident commander -> sandbox -> report

flowchart LR
  subgraph Sources
    LF[local-file]
    HP[http-poll]
    VC[vercel]
    SN[sentry]
  end
  LF --> M[Monitor]
  HP --> M
  VC --> M
  SN --> M
  M -->|RawEvent| D[Detector<br/>signature + 24h baseline]
  D -->|Incident| IC{IncidentCommander}
  IC -->|spawns| SA1[log-analyst]
  IC -->|spawns| SA2[code-detective]
  IC -->|spawns| SA3[git-historian]
  SA1 & SA2 & SA3 -->|tool calls| SB[Docker sandbox<br/>FastAPI tool server]
  SB -->|ToolResult| IC
  IC -->|finish_incident| R[Structured report<br/>Draft PR]

  style IC fill:#fb923c20,stroke:#fb923c
  style R fill:#a78bfa20,stroke:#a78bfa
  style SB fill:#12121a,stroke:#27272a
| Stage | What it does | Code |
| --- | --- | --- |
| 1. Monitor | Emits a RawEvent per log line / HTTP probe / Vercel or Sentry payload | quell/monitors/ |
| 2. Detector | Fingerprints events (normalised 16-char hex); fires on new / spike / high-severity | quell/detector/ |
| 3. Commander | Root IncidentCommander reads logs, greps code, reasons, optionally spawns specialist subagents | quell/agents/ |
| 4. Sandbox | Every code-touching tool runs inside a Docker container with your workspace mounted read-only | quell/runtime/ · quell/tool_server/ |
| 5. Report | Structured {root_cause, evidence, proposed_fix, status}; wraps a draft PR for human review | quell/tools/reporting/ |
| 6. Persist | One AgentRun row per investigation plus per-iteration Event and structured Finding rows for the dashboard + replay | quell/memory/ |
| 7. Notify | Fans the result out to Slack / Discord / Telegram in parallel | quell/notifiers/ |
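The Detector's fingerprinting step can be sketched as follows. The normalisation rules and the `signature` function are assumptions for illustration, not Quell's actual implementation:

```python
import hashlib
import re

def signature(log_line: str) -> str:
    """Collapse the variable parts of a log line, then hash to a 16-char hex id."""
    norm = re.sub(r"0x[0-9a-fA-F]+", "<hex>", log_line)   # pointer-like values
    norm = re.sub(r"\d+", "<num>", norm)                  # line numbers, ids, counts
    return hashlib.sha256(norm.encode()).hexdigest()[:16]

# Two occurrences of the same error at different line numbers share a signature,
# so the detector counts them as one incident rather than two.
a = signature("TypeError: Cannot read properties of null at checkout.ts:42:18")
b = signature("TypeError: Cannot read properties of null at checkout.ts:97:3")
assert a == b and len(a) == 16
```

Normalising before hashing is what lets the detector build a 24-hour baseline per signature instead of treating every log line as unique.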

Features

Sandboxed by default

Every tool that touches code runs inside Docker with your workspace mounted read-only. Per-sandbox bearer-token auth on the FastAPI tool server.

docs/architecture.md

Multi-agent coordination

IncidentCommander spawns specialist subagents through an AgentGraph; they exchange messages through an asyncio.Queue broker. Parallel investigations, sequential reasoning.

docs/architecture.md
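The asyncio.Queue broker pattern can be sketched like this. The agent names match the pipeline diagram, but the message shapes and function names are invented for illustration:

```python
import asyncio

async def subagent(name: str, inbox: asyncio.Queue, outbox: asyncio.Queue) -> None:
    """A specialist worker: pull a task off the broker, report a finding back."""
    task = await inbox.get()
    await outbox.put(f"{name}: finding for {task!r}")

async def commander() -> list[str]:
    inbox: asyncio.Queue = asyncio.Queue()
    outbox: asyncio.Queue = asyncio.Queue()
    workers = [asyncio.create_task(subagent(n, inbox, outbox))
               for n in ("log-analyst", "code-detective", "git-historian")]
    for _ in workers:                       # fan the incident out to each specialist
        await inbox.put("null-deref on order.user")
    await asyncio.gather(*workers)          # specialists run concurrently
    return [outbox.get_nowait() for _ in workers]

findings = asyncio.run(commander())
assert len(findings) == 3
```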

Bring your own model

Built on LiteLLM — OpenAI, Anthropic, Google Gemini, Ollama, or any custom endpoint. One line of TOML to switch. No lock-in.

docs/configuration.md

Skill runbooks

Markdown + YAML frontmatter runbooks get auto-injected into the agent's system prompt when their triggers match an incident. Nineteen come bundled — Stripe, OpenAI, null-deref, DNS, SSL, memory, disk, deadlock, Django/Flask/Rails/Spring/Express, Postgres, Redis, Docker, Kubernetes.

docs/extending.md
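A runbook is a Markdown file whose YAML frontmatter declares when it should be injected. The frontmatter field names below are assumptions inferred from the skills table, not a documented schema:

```markdown
---
slug: unhandled-null
category: incidents
severity: medium
triggers:
  - "NoneType"
  - "null is not an object"
  - "Cannot read properties"
---

# Unhandled null / undefined dereference

1. Locate the failing line with `code_read` and confirm which value is null.
2. Use `git_blame` to find when the null path was introduced.
3. Propose a guard or an upstream fix in the draft PR.
```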

Draft PRs, never auto-merge

Every proposed fix is a draft PR. Humans review, humans merge. No silent changes, no 3am surprises, no trust-me-bro commits.

SECURITY.md

No telemetry by default

Your code, your logs, your infrastructure. Nothing leaves your machine unless you explicitly configure a remote monitor or LLM endpoint. Telemetry is opt-in only.

docs/configuration.md

Notify your team

Slack, Discord, and Telegram channels fan out in parallel once an investigation completes. Configure once in TOML, verify with quell test-notifier <channel>; a transient failure in one channel never blocks the others.

docs/configuration.md
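The fan-out-with-isolated-failures behaviour can be sketched with asyncio.gather(..., return_exceptions=True). The notifier API here is hypothetical:

```python
import asyncio

async def send(channel: str, summary: str) -> str:
    """Stand-in for a real Slack/Discord/Telegram webhook call."""
    if channel == "discord":
        raise ConnectionError("webhook timed out")  # simulate a transient failure
    return f"{channel}: delivered {summary!r}"

async def fan_out(summary: str) -> list:
    channels = ["slack", "discord", "telegram"]
    # return_exceptions=True keeps one channel's failure from blocking the rest
    return await asyncio.gather(*(send(c, summary) for c in channels),
                                return_exceptions=True)

results = asyncio.run(fan_out("inc_a1b2c3: null-deref on order.user"))
assert isinstance(results[1], ConnectionError)                   # discord failed
assert "delivered" in results[0] and "delivered" in results[2]   # the rest landed
```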

Web dashboard + replay

quell dashboard boots a local Next.js + FastAPI UI for incidents, runs, findings, and aggregate stats. quell replay <id> prints the same event stream as a terminal timeline. Read-only, no Cloud required.

docs/commands.md

Cost tracking + budgets

Per-model rate card across Anthropic / OpenAI / Google / Ollama. Every run records input + output tokens and a USD estimate; max_cost_usd in .quell/config.toml halts a runaway investigation before it lights money on fire.

docs/configuration.md
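The budget check reduces to simple arithmetic over a per-model rate card. The rates below are placeholders for illustration, not Quell's real card:

```python
# Placeholder rates in USD per million tokens -- not Quell's actual rate card.
RATE_CARD = {"claude-haiku-4-5": {"input": 1.00, "output": 5.00}}

def run_cost(model: str, tokens_in: int, tokens_out: int) -> float:
    """Estimate the USD cost of one agent run from its token counts."""
    rates = RATE_CARD[model]
    return (tokens_in * rates["input"] + tokens_out * rates["output"]) / 1_000_000

def within_budget(spent: float, max_cost_usd: float) -> bool:
    """The shape of the max_cost_usd halt check."""
    return spent < max_cost_usd

cost = run_cost("claude-haiku-4-5", tokens_in=25_000, tokens_out=2_000)
assert abs(cost - 0.035) < 1e-9          # 25k * $1/M + 2k * $5/M
assert within_budget(cost, max_cost_usd=0.50)
```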


What is in the box

11 built-in tools

| Category | Tool | What it does |
| --- | --- | --- |
| Code | code_read | Read a file (optionally with a line range) |
| Code | code_grep | ripgrep-backed content search with path-traversal guard |
| Git | git_log | Recent commits with author and timestamp |
| Git | git_blame | Line-level authorship |
| Git | git_diff | Diff between refs or working tree |
| Monitoring | http_probe | Hit an HTTP endpoint, return status + headers + body |
| Monitoring | logs_query | Tail a local log with substring filter |
| Reporting | create_incident_report | Structured incident summary |
| Reporting | create_postmortem | Blameless postmortem in Markdown |
| Coordination | agent_finish | Subagent signals completion |
| Coordination | finish_incident | Root agent closes the investigation |

Plus four inter-agent tools (create_agent, send_message, wait_for_message, view_graph) added in Phase 13.
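The path-traversal guard mentioned for code_grep can be sketched with pathlib; the `guard` function is hypothetical, not Quell's implementation:

```python
from pathlib import Path

WORKSPACE = Path("/workspace").resolve()  # the read-only mount inside the sandbox

def guard(user_path: str) -> Path:
    """Resolve a user-supplied path and refuse anything outside the workspace."""
    candidate = (WORKSPACE / user_path).resolve()
    if not candidate.is_relative_to(WORKSPACE):
        raise PermissionError(f"path escapes workspace: {user_path}")
    return candidate

assert guard("src/checkout.ts") == Path("/workspace/src/checkout.ts")
try:
    guard("../../etc/passwd")   # classic traversal attempt
except PermissionError:
    pass
else:
    raise AssertionError("traversal was not blocked")
```

Resolving before comparing is the important part: `..` segments and symlinks are collapsed first, so the containment check cannot be tricked by a crafted relative path.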

19 bundled skill runbooks

| Slug | Category | Severity | Triggers when |
| --- | --- | --- | --- |
| stripe-webhook-timeout | incidents | high | error mentions stripe-signature or webhook timeout |
| unhandled-null | incidents | medium | error mentions NoneType, null is not an object, Cannot read properties |
| openai-rate-limit | incidents | high | error mentions rate_limit_exceeded, 429, tokens per minute |
| dns-resolution-failure | incidents | high | error mentions EAI_AGAIN, getaddrinfo, unknown host |
| ssl-certificate-expired | incidents | high | error mentions certificate expired, CERT_HAS_EXPIRED |
| memory-leak | incidents | high | RSS climbs without GC drop, OOMKilled, heap snapshots diverge |
| disk-full | incidents | critical | error mentions ENOSPC, no space left on device, write failures |
| database-deadlock | incidents | high | error mentions deadlock detected, lock wait timeout |
| fastapi / nextjs-app-router | frameworks | medium | framework matches or stack trace fingerprints |
| django / flask / rails / spring-boot / express | frameworks | medium | framework matches or stack trace fingerprints |
| postgres / redis | technologies | high | tech stack matches, error mentions known signals |
| docker / kubernetes | technologies | high | container/pod-level failures, OOM, CrashLoopBackOff |

Add your own by dropping a .md file in quell/skills/<category>/. See docs/extending.md.


Architecture

graph TB
  subgraph host["Host (your dev machine or on-call server)"]
    CFG[("Config<br/>TOML + Pydantic")]
    MEM[("Memory<br/>AgentRun · Event · Finding")]
    MON[Monitors] --> DET[Detector]
    DET --> CMD[IncidentCommander]
    CMD <--> LLM[LiteLLM<br/>cost + budgets]
    CMD <--> SKL[(19 Skills)]
    CMD -->|spawn| SUB[Subagents]
    SUB --> CMD
    CMD -->|fan-out| NOT[Slack · Discord · Telegram]
    CFG -.-> CMD
    CMD -.-> MEM
    MEM -.-> DASH[Dashboard + replay]
  end

  subgraph sandbox["Docker sandbox (per-agent)"]
    TS[FastAPI tool server]
    TOOLS[11 built-in tools]
    WS[/workspace/<br/>mounted read-only/]
    TS --> TOOLS
    TOOLS --> WS
  end

  CMD <-->|bearer-token HTTP| TS

  style CMD fill:#fb923c20,stroke:#fb923c,color:#fafafa
  style SUB fill:#fb923c15,stroke:#fb923c80,color:#fafafa
  style NOT fill:#a78bfa15,stroke:#a78bfa80,color:#fafafa
  style DASH fill:#a78bfa15,stroke:#a78bfa80,color:#fafafa
  style sandbox fill:#12121a,stroke:#a78bfa,color:#fafafa
  style host fill:#0a0a0f,stroke:#27272a,color:#fafafa

Eleven subsystems, every boundary typed.

| Subsystem | Lines of Python | Test coverage |
| --- | --- | --- |
| quell/config/ | ~400 | 24 tests |
| quell/memory/ | ~770 | 40 tests |
| quell/monitors/ | ~600 | 24 tests |
| quell/llm/ | ~530 | 41 tests |
| quell/tools/ | ~700 | 42 tests |
| quell/agents/ | ~1,100 | 33 tests |
| quell/skills/ | ~360 | 30 tests |
| quell/runtime/ + quell/tool_server/ | ~400 | 16 tests |
| quell/notifiers/ | ~410 | 20 tests |
| quell/dashboard/ + quell/replay/ | ~570 | 16 tests |
| Total | ~5,800 LoC | 302 tests |

Deep dive in docs/architecture.md.


Documentation

Start here

Reference

Build on it


FAQ

Does Quell send my code to an LLM provider?

Only the fragments the agent explicitly reads through its tools. Those reads are gated by the sandbox, which sees only the workspace you mount. If you do not want any code leaving your machine at all, point LiteLLM at a local model such as Ollama.

Will Quell modify my code?

No. Every tool that touches the filesystem runs inside a Docker sandbox with the workspace mounted read-only. Fixes are proposed as draft PRs only, never pushed, never merged, and require a human to approve.

What LLMs are supported?

Anything LiteLLM supports. OpenAI (GPT-4o, GPT-5), Anthropic (Claude Haiku / Sonnet / Opus), Google Gemini, Ollama local models, any OpenAI-compatible endpoint. Swap via one line of TOML.

How expensive is a typical investigation?

A typical incident runs 3–7 agent iterations and consumes 15–40k input tokens plus 1–3k output tokens. On claude-haiku-4-5 that is roughly $0.01–0.03 per incident. As of v0.2, every run records its own token + USD cost and a hard max_cost_usd cap in .quell/config.toml halts a runaway investigation before it lights money on fire. quell stats shows the rolling per-incident total.

Does it work without Docker?

The unit tests and the "dry run" walkthrough work without Docker. Real production investigations of untrusted code should run under Docker for the read-only workspace and network isolation guarantees.

Is it production-ready?

Quell is v0.2.0 alpha. Core flow works end-to-end (302 tests), v0.1.x configs are forward-compatible, and the new persistence, notifier, and dashboard layers are in active use. Expect rough edges around non-English logs, long stack traces, and rare LLM failure modes. Run Quell against staging first. File issues and we respond fast.

Can I self-host the dashboard?

Yes — quell dashboard boots a read-only Next.js + FastAPI UI on http://127.0.0.1:7777 with incident list, run timelines, findings, and aggregate stats. The compiled SPA ships inside the Python wheel, so no separate Node runtime is needed at install time. Bind it to a different host with --host 0.0.0.0 if you want it on a shared on-call box.

Where do alerts go?

quell.notifiers ships Slack, Discord, and Telegram channels. Add one (or all) under [[notifiers]] in .quell/config.toml, run quell test-notifier slack (or discord, telegram) to verify the wiring, then quell watch will fan a structured incident summary out to every configured channel in parallel as soon as the agent finishes.
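An illustrative [[notifiers]] fragment (the key names are assumptions; check docs/configuration.md for the real schema):

```toml
[[notifiers]]
type = "slack"
webhook_url = "https://hooks.slack.com/services/T000/B000/XXXX"

[[notifiers]]
type = "telegram"
bot_token = "123456:ABC-DEF"
chat_id = "-1001234567890"
```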

How does Quell compare to existing tools?

Quell is not a monitoring tool (Sentry / Datadog already do that). Not a chatbot (Quell is autonomous, not interactive). Not an auto-merger (humans always review). It sits on top of your existing monitoring: Quell consumes Sentry / Vercel / log events, investigates the underlying cause, and produces a report.


Community

Contribute

Read CONTRIBUTING.md for the dev loop and the stop-gate (ruff format, ruff check, mypy strict, pytest — all four must pass before a merge).

Good first issues are labelled on GitHub. Drop into Discussions if you are not sure where to start.

Report a bug or vulnerability


Roadmap

Quell is built across 22 phases documented in BUILD_PLAN.md.

  • v0.1 — phases 1–16 (shipped). Config, memory, monitors, LLM, tools, agents, skills, detector, Docker runtime, tool server, built-in tools, agent graph, end-to-end integration, polish, public launch.
  • v0.2 — phases 17–22 (shipped). Slack / Discord / Telegram notifiers, expanded 19-skill library, AgentRun + Event + Finding persistence, per-model cost tracking with max_cost_usd budgets, local web dashboard, terminal quell replay.
  • v0.3 (next). Multi-repo coordination, cross-incident learning, richer dashboard filters.
  • v1.0 (aspirational). Production-ready, typed per-incident cost budgets, hosted Cloud option (opt-in only — CLI stays self-hostable forever).

Development

# One-time editable install with dev deps
curl -fsSL https://raw.githubusercontent.com/bhartiyaanshul/quell/main/install.sh | bash -s -- --dev

# Stop-gate — all four must pass before merging
poetry run ruff format quell/ tests/ --check
poetry run ruff check  quell/ tests/
poetry run mypy        quell/
poetry run pytest      tests/ -q

Full dev loop in CONTRIBUTING.md.

Landing page

The marketing site at quell.anshulbuilds.xyz lives in landing/.

cd landing
npm install
npm run dev      # http://localhost:3000 with hot reload
npm run build    # produces ./out/ — deploy anywhere static

Next.js 14 + TailwindCSS + Framer Motion. See landing/README.md for the component map.


Credits

Built by Anshul Bhartiya (@bhartiyaanshul)

Apache 2.0 · Open source · No telemetry · Your code stays on your machine.
