Explain it like you'd tell grandma.
Your AI agent just wrote 2,000 words. You need the next step. grandma extracts it.
Pipe any LLM output through grandma and get a clean terminal card: what happened, the bottom line, what to do next. No API key needed if you use Claude Code.
pipx install grandma
grandma --demo
Try it in 30 seconds
# zero-config if you have Claude Code installed
cat some-agent-output.txt | grandma
# local model via Ollama (no cloud key needed)
GRANDMA_MODEL_BACKEND=ollama GRANDMA_MODEL=llama3.1 grandma --demo
# any provider
GRANDMA_MODEL_BACKEND=openai GRANDMA_MODEL=gpt-4o-mini OPENAI_API_KEY=sk-... grandma --demo
Before / After
Raw agent output (~90 words, ~120 tokens):
I inspected the repository and found that the authentication flow now routes through
the new async session adapter. I updated three files, added one regression test, and
confirmed that the login path still returns the expected token shape. There is one
compatibility consideration: the adapter relies on asyncio.TaskGroup, so Python 3.11+
is required. Overall, this should reduce request latency by ~70%, but the deployment
notes should mention the runtime floor bump.
After grandma (default mode, ~30 words):
👵 grandma
────────────────────────────────────────
Second turn reviewing an auth refactor PR.
What happened: Auth moved to the async session adapter.
Bottom line: Faster login path, but Python ≥3.11 required.
Do next: Update deployment docs before shipping.
────────────────────────────────────────
After grandma (deep mode, full impact table):
👵 grandma
┌──────────────┬──────────────────────────────────────────────┐
│ story arc    │ Second turn reviewing an auth refactor PR.  │
│ happened     │ Auth moved to async session adapter.        │
│ 🧶 changed   │ 3 files, 1 regression test added.            │
│ ✅ positive  │ Login latency reduced ~70%; token shape OK.  │
│ ⚠️ negative  │ Requires Python ≥3.11 (asyncio.TaskGroup).   │
│ neutral      │ API surface unchanged.                       │
│ 💡 net gain  │ Win: ship it with a runtime floor note.      │
│ actions      │ - Bump requires-python to >=3.11             │
│              │ - Update deployment docs                     │
└──────────────┴──────────────────────────────────────────────┘
Install
pip install grandma # pip
pipx install grandma # isolated (recommended)
uv tool install grandma # uv
pip install git+https://github.com/ypollak2/Grandma.git # from source
Requires Python ≥ 3.10.
Usage
echo "agent output..." | grandma # default 3-line card
cat output.txt | grandma --mode deep # full impact table
cat output.txt | grandma --json # raw JSON verdict
cat output.txt | grandma --mode off # passthrough (for hooks)
grandma --demo # try it with no input
grandma --demo --mode deep
export GRANDMA_MODE=deep # set default for session
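If you want the `--json` verdict inside a Python script rather than a shell pipeline, a thin wrapper along these lines may be enough. This is a minimal sketch, not part of grandma itself: the function name `grandma_verdict` is hypothetical, and the keys of the returned JSON are not documented here, so the code parses the output without assuming any particular field.

```python
import json
import subprocess
from typing import Sequence


def grandma_verdict(text: str, cmd: Sequence[str] = ("grandma", "--json")):
    """Pipe `text` to a command on stdin and parse its stdout as JSON.

    The default command mirrors `cat output.txt | grandma --json`.
    The shape of the parsed result is whatever the command prints;
    no field names are assumed in this sketch.
    """
    result = subprocess.run(
        list(cmd),
        input=text,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

Passing a different `cmd` makes the wrapper easy to exercise without a configured backend, for example by pointing it at a small script that echoes JSON.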
The modes
| Mode | Output | Best for |
|---|---|---|
| default | 3-line card: happened / bottom line / do next | Most agent output |
| deep | Full impact table with positive/negative/neutral | PRs, arch decisions, refactors |
| off | Passthrough: original text unchanged | Hook pass-through control |
💸 Why this matters: token savings
Every intermediate agent response fed into the next prompt costs tokens. Grandma compresses the signal.
| | Words | Tokens (approx) |
|---|---|---|
| Raw LLM response | ~400 | ~530 |
| grandma card | ~40 | ~55 |
| Saving | ~90% | ~90% |
Across a 20-turn session: 8,000–12,000 context tokens saved.
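The arithmetic behind those figures can be checked in a few lines. The sketch below uses only the approximate numbers from the table; the function names are illustrative and not part of grandma.

```python
def saving_ratio(raw: float, compressed: float) -> float:
    """Fraction of the original size that is removed."""
    return 1 - compressed / raw


def tokens_saved(raw_tokens: int, card_tokens: int, turns: int) -> int:
    """Total tokens saved when every turn is replaced by its card."""
    return (raw_tokens - card_tokens) * turns


print(f"{saving_ratio(400, 40):.0%}")   # words:  90%
print(f"{saving_ratio(530, 55):.0%}")   # tokens: 90%
print(tokens_saved(530, 55, 20))        # 9500
```

At roughly 475 tokens saved per turn, 20 turns come to about 9,500 tokens, which sits inside the quoted 8,000–12,000 range.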
Model backends: bring your own
Grandma auto-detects the best available backend. No config needed to get started with Claude Code.
| Priority | When | Backend |
|---|---|---|
| 1 | GRANDMA_MODEL_BACKEND set | Explicit provider |
| 2 | GRANDMA_MODEL / GRANDMA_API_KEY / GRANDMA_BASE_URL set | OpenAI-compatible |
| 3 | OPENAI_API_KEY set | OpenAI |
| 4 | GROQ_API_KEY set | Groq |
| 5 | GEMINI_API_KEY / GOOGLE_API_KEY set | Gemini |
| 6 | ANTHROPIC_API_KEY set | Anthropic SDK |
| 7 | (nothing set) | claude -p - (Claude Code subscription, no key) |
Model names are fully dynamic: grandma contains no hardcoded vendor strings. You choose the model; grandma uses whatever you set. For backends that don't require an explicit model (claude_cli, anthropic SDK), grandma lets the provider pick its own default.
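To make the priority order in the table concrete, here is a minimal sketch of how such a lookup could work. It is an illustration of the documented order only, not grandma's actual implementation: `resolve_backend` and `_KEY_RULES` are hypothetical names, and treating an empty variable as unset is an assumption.

```python
from typing import Mapping

# Priorities 2-6 from the table: (variables to look for, backend name).
_KEY_RULES = [
    (("GRANDMA_MODEL", "GRANDMA_API_KEY", "GRANDMA_BASE_URL"), "openai_compatible"),
    (("OPENAI_API_KEY",), "openai"),
    (("GROQ_API_KEY",), "groq"),
    (("GEMINI_API_KEY", "GOOGLE_API_KEY"), "gemini"),
    (("ANTHROPIC_API_KEY",), "anthropic"),
]


def resolve_backend(env: Mapping[str, str]) -> str:
    """Return the backend name implied by the environment, in table order."""
    # Priority 1: an explicit choice always wins.
    explicit = env.get("GRANDMA_MODEL_BACKEND")
    if explicit:
        return explicit
    # Priorities 2-6: the first rule with any non-empty variable wins.
    for names, backend in _KEY_RULES:
        if any(env.get(name) for name in names):
            return backend
    # Priority 7: nothing set, fall back to the Claude Code CLI.
    return "claude_cli"
```

Calling `resolve_backend(os.environ)` would apply the same order to the current shell; passing a plain dict keeps the function easy to test.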
Copy .env.example to .env and uncomment your provider:
cp .env.example .env
Ollama (local, no key):
GRANDMA_MODEL_BACKEND=ollama
GRANDMA_MODEL=llama3.1
GRANDMA_DEEP_MODEL=deepseek-r1
Groq (fast inference):
GRANDMA_MODEL_BACKEND=groq
GRANDMA_MODEL=llama-3.1-8b-instant
GRANDMA_DEEP_MODEL=llama-3.3-70b-versatile
GROQ_API_KEY=gsk_...
OpenAI:
GRANDMA_MODEL_BACKEND=openai
GRANDMA_MODEL=gpt-4o-mini
GRANDMA_DEEP_MODEL=gpt-4.1
OPENAI_API_KEY=sk-...
Gemini:
GRANDMA_MODEL_BACKEND=gemini
GRANDMA_MODEL=gemini-2.5-flash
GRANDMA_DEEP_MODEL=gemini-2.5-pro
GEMINI_API_KEY=AIza...
Any OpenAI-compatible provider (Together, Fireworks, LM Studio, etc.):
GRANDMA_MODEL_BACKEND=openai_compatible
GRANDMA_BASE_URL=https://your-provider.example.com/v1
GRANDMA_API_KEY=your-key
GRANDMA_MODEL=your-model-name
Custom subprocess (anything that reads stdin):
GRANDMA_MODEL_BACKEND=custom_command
GRANDMA_MODEL_COMMAND=ollama run llama3.1
IDE & agent integrations
Auto-install all detected tools:
./install.sh
| Tool | How | Effect |
|---|---|---|
| Claude Code | Stop hook | Auto-card after every long response |
| Codex CLI | Stop hook | Auto-card after every long response |
| Gemini CLI | AfterAgent hook | Auto-card after every agent turn |
| Cursor | MCP + rules | Agent calls grandma_summarize automatically |
| Cline | MCP + rules | Agent calls grandma_summarize after tasks |
| Continue | MCP + slash command | On demand or automatic |
| Windsurf | MCP + rules | Agent calls grandma_summarize automatically |
| Zed | MCP (context_servers) | Tool in Zed AI panel |
| Goose | MCP extension | Tool in Goose sessions |
| Aider | pipe wrapper | Pipe aider output through grandma |
| OpenHands | pipe wrapper | Pipe headless output through grandma |
MCP server (works with any MCP-capable IDE):
grandma serve
Add to .mcp.json:
{
"mcpServers": {
"grandma": { "command": "grandma", "args": ["serve"] }
}
}
Tools exposed: grandma_summarize(text, mode, story_context) and grandma_summarize_json(...).
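If a project already has a .mcp.json with other servers in it, you may prefer to merge the grandma entry in rather than overwrite the file. The following is a small sketch under that assumption; `add_grandma_server` is a hypothetical helper, not something grandma or install.sh provides.

```python
import json
from pathlib import Path


def add_grandma_server(config_path: Path) -> dict:
    """Add the grandma entry to an MCP config file, keeping other servers."""
    # Start from the existing config if there is one, else from an empty dict.
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    # Same entry as the snippet above: run `grandma serve`.
    servers["grandma"] = {"command": "grandma", "args": ["serve"]}
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Usage would look like `add_grandma_server(Path(".mcp.json"))`. The sketch assumes any existing file contains valid JSON; an empty or malformed file raises `json.JSONDecodeError`.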
Troubleshooting
"No model configured for backend X"
Set GRANDMA_MODEL=<model-name> in your .env. See .env.example for provider examples.
"claude CLI not found"
Install Claude Code or set a different backend via GRANDMA_MODEL_BACKEND.
Output is not JSON / parsing error
The model returned something unexpected. Try --mode off to see the raw response, then check your model/backend config.
Provider returns 401 / 403
Check that the right API key env var is set for your backend (e.g. OPENAI_API_KEY, GROQ_API_KEY).
How the mascot rotates
Each portrait in assets/grandmas/ was generated with Gemini Image.
A daily GitHub Action picks one at random and replaces assets/grandma.png with a [skip ci] commit.
Want to add your own grandma? Drop a PNG into assets/grandmas/ and open a PR.
Contributing
See CONTRIBUTING.md for full details. Quick start:
git clone https://github.com/ypollak2/Grandma.git
cd Grandma
pip install -e ".[dev]"
pytest tests/ # run tests
ruff check src # lint
Good first issues are tagged good first issue.
FAQ
Does grandma make it dumber? No dear. She keeps the facts. She removes the fog machine.
What model does it use?
Whatever you configure via GRANDMA_MODEL. There are no hardcoded model names. If you set nothing, claude_cli lets Claude Code pick, and anthropic lets the SDK pick its default. API-based backends (openai, groq, ollama, gemini) require GRANDMA_MODEL to be set.
Why is there an off mode?
Because hooks need a clean passthrough option. When GRANDMA_MODE=off, grandma becomes transparent, which is useful for temporarily disabling the Stop hook without unregistering it.
What's the story_so_far line?
Grandma reads the last 3 turns of the conversation to track where you are in the arc. Turn 8 of the same auth bug? She'll say so. This is the dementia prevention feature.
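One simple way to picture a "last 3 turns" window is a bounded queue that drops the oldest turn as new ones arrive. The class below is purely illustrative, with hypothetical names, and says nothing about how grandma actually stores or reads conversation history.

```python
from collections import deque


class StoryContext:
    """Keep only the most recent turns of a conversation."""

    def __init__(self, max_turns: int = 3) -> None:
        # A deque with maxlen discards the oldest item automatically.
        self._turns = deque(maxlen=max_turns)

    def add_turn(self, text: str) -> None:
        self._turns.append(text)

    def recent(self) -> list:
        """Return the retained turns, oldest first."""
        return list(self._turns)


story = StoryContext()
for n in range(1, 6):
    story.add_turn(f"turn {n}")
print(story.recent())  # ['turn 3', 'turn 4', 'turn 5']
```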
Why Python ≥ 3.10?
The mcp package (used for grandma serve) requires 3.10+.
How is this different from llm / fabric / shell_gpt?
llm gives you model access. fabric gives you prompt patterns. shell_gpt gives you a REPL. grandma does none of those things. It sits at the output end: it is the digest layer you add after any of those tools so you can stop reading 2,000-word agent responses.
Changelog
See CHANGELOG.md.
File details
Details for the file grandma-0.2.2.tar.gz.
File metadata
- Download URL: grandma-0.2.2.tar.gz
- Upload date:
- Size: 42.5 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | efeef483496c136e5e31b9c38bcdafc68814afe8f38394c53326a309af02f333 |
| MD5 | a6ad0727dff489f78f778d80b55a5a57 |
| BLAKE2b-256 | 1fab44d6269c95922075ea721c0e8f5a7151b13f6808f654506432f0532a5e98 |
File details
Details for the file grandma-0.2.2-py3-none-any.whl.
File metadata
- Download URL: grandma-0.2.2-py3-none-any.whl
- Upload date:
- Size: 22.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 626ffde0b0415b3e631ad6a043f8469d956c0431881a713b56d0f376fd3cc149 |
| MD5 | e9a4ae76a118120ff9a0a597c0a0372f |
| BLAKE2b-256 | 4ab581284892664c7bd1fc8d1abec58adaaba8fffbacb97ea2948337e3c0bc33 |