Maintain a local, LLM-queryable corpus of an IETF Working Group's public record (drafts, mailing list, GitHub issues, meetings), with an MCP server, semantic search, and NotebookLM export.
ietf-llm
Maintain a local, queryable corpus of an IETF Working Group's public record — charter, drafts, RFCs, meeting agendas, minutes, slides, transcripts, mailing list archives, and GitHub issues — for use with LLM-based tools.
Note: This package was previously published as ietf-notebook. That distribution is deprecated. See Migrating from ietf-notebook.
What it's for
A working group's history is spread across mailing list archives, Datatracker, GitHub, and meeting materials — too much to hold in your head, and too scattered to search well by hand. With the record gathered into one queryable corpus, an LLM can help you:
- Get up to date with the state of discussions — what's open, what was recently decided, where a debate currently stands.
- Summarise the arguments already made about an issue — every distinct position on a topic, who holds it, and how the chairs ruled.
- Formulate a new proposal — surface the objections raised against similar ideas before, so you can anticipate them.
- Fact-check assertions about what's happened so far — grounded in the actual list traffic and chair statements, not someone's recollection.
Two supported workflows:
- Use it as an MCP server: register ietf-llm-mcp with Claude, Codex, Gemini, Cursor, Zed, etc. and ask questions across any WG you've gathered.
- Use it with NotebookLM: export the gathered corpus as a directory of clean text files (or push directly to NotebookLM Enterprise) and ingest it as a notebook source set.
Also works with IRTF Research Groups. Pass the RG's shortname (e.g. cfrg, hrpc, pearg) anywhere this README says <wg>.
Table of contents
- What it's for
- Installation
- 1. Use as an MCP server
- 2. Use with NotebookLM
- Reference
- Migrating from ietf-notebook
Installation
pipx install ietf-llm
Behind a corporate firewall with TLS interception? Install with the
certs extra:
pipx install ietf-llm[certs]
Shell completion
Optional. Add the line for your shell to its rc file to tab-complete commands, flags, and cached WG names:
# bash — in ~/.bashrc
eval "$(ietf-llm --completion bash)"
# zsh — in ~/.zshrc
eval "$(ietf-llm --completion zsh)"
# fish — in ~/.config/fish/config.fish
ietf-llm --completion fish | source
1. Use as an MCP server
ietf-llm-mcp is a stdio Model Context Protocol
server that exposes the local corpus to any MCP-capable agent. Set up
once, gather each WG you care about once, then ask questions
indefinitely.
Register the server
Pick your client. The snippets below are correct as of writing — if your client has changed since, its own MCP docs are authoritative.
Gotcha (all clients): if ietf-llm-mcp was installed via pipx,
the binary is on your shell PATH but may not be on the PATH
inherited by a GUI app launched from Finder / Spotlight / Explorer.
Use the absolute path (which ietf-llm-mcp) if the client can't find
the command.
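As a minimal sketch of that workaround, the Python below resolves the absolute path from the current shell PATH and prints a config entry in the mcpServers shape that several of the clients below use. The helper name server_entry is hypothetical and is not part of ietf-llm.

```python
import json
import shutil


def server_entry(command="ietf-llm-mcp", resolved=None):
    """Build an mcpServers entry, preferring an absolute path.

    `resolved` lets a caller supply the path explicitly; otherwise the
    command is looked up on the current PATH, falling back to the bare
    command name if it cannot be found.
    """
    path = resolved or shutil.which(command) or command
    return {"mcpServers": {"ietf-llm": {"command": path}}}


# Run this from a terminal where `which ietf-llm-mcp` works,
# then paste the output into the client's config file.
print(json.dumps(server_entry(), indent=2))
```

Clients that use a different config shape (Codex, opencode, Zed) need the same absolute path placed in their own command field.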
Claude Code
claude mcp add ietf-llm -- ietf-llm-mcp
Also install the bundled skill so Claude knows how to drive the tools well (digests before raw reads, search before slurping mailing-list files, etc.):
ietf-llm --install-claude-skill
Re-run after upgrading the package to pick up improvements.
Claude Desktop
Edit claude_desktop_config.json (create it if missing):
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json
- Linux: ~/.config/Claude/claude_desktop_config.json
{
  "mcpServers": {
    "ietf-llm": {
      "command": "ietf-llm-mcp"
    }
  }
}
Quit and relaunch Claude Desktop — the config is only read at startup.
Codex CLI (OpenAI)
~/.codex/config.toml:
[mcp_servers.ietf-llm]
command = "ietf-llm-mcp"
Gemini CLI
~/.gemini/settings.json:
{
  "mcpServers": {
    "ietf-llm": {
      "command": "ietf-llm-mcp"
    }
  }
}
opencode
~/.config/opencode/opencode.json (or opencode.json in your project
root):
{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "ietf-llm": {
      "type": "local",
      "command": ["ietf-llm-mcp"],
      "enabled": true
    }
  }
}
Cursor
In-app MCP settings panel, or ~/.cursor/mcp.json (global) or
.cursor/mcp.json (per-project):
{
  "mcpServers": {
    "ietf-llm": {
      "command": "ietf-llm-mcp"
    }
  }
}
Zed
~/.config/zed/settings.json:
{
  "context_servers": {
    "ietf-llm": {
      "command": {
        "path": "ietf-llm-mcp",
        "args": []
      },
      "settings": {}
    }
  }
}
Tuning. Each tool call has a server-side deadline so a stuck call
fails fast with a clear message rather than hanging to the client's
timeout. It defaults to 120 seconds; override (or disable, with 0) by
setting IETF_LLM_TOOL_TIMEOUT in the server's environment — e.g. add
"env": {"IETF_LLM_TOOL_TIMEOUT": "180"} to the JSON config above.
Gather a corpus
Gather from the CLI, once per corpus. Settings persist, so refreshing
is a bare re-run (ietf-llm httpbis), and the semantic index updates
incrementally each time.
ietf-llm httpbis --github httpwg/http-core --github httpwg/http-extensions
First run: gathering a corpus also builds a local semantic-search
index, which downloads an embedding model (~130 MB) from
Hugging Face.
Subsequent gathers reuse the cached model. Pass --no-embed to skip
the index (and the download) — useful for
NotebookLM export or offline gathers.
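If you script your gathers, the first run and the refresh differ only in their arguments. The sketch below builds both command lines from the flags shown above; gather_command is a hypothetical helper, not something the package provides.

```python
import shlex


def gather_command(name, github=(), no_embed=False):
    """Return the argv for a gather run of corpus `name`."""
    argv = ["ietf-llm", name]
    for repo in github:
        argv += ["--github", repo]  # repeatable flag, one per repo
    if no_embed:
        argv.append("--no-embed")  # skip the index and the model download
    return argv


first = gather_command(
    "httpbis", github=["httpwg/http-core", "httpwg/http-extensions"]
)
refresh = gather_command("httpbis")  # settings persist, so no flags needed

print(shlex.join(first))
print(shlex.join(refresh))
# To actually run one: subprocess.run(first, check=True)
```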
A corpus doesn't have to be a Working Group — the name is classified automatically:
| Command | Corpus |
|---|---|
| ietf-llm httpbis | a WG / RG / editorial WG / BoF: charter, drafts, meetings, ballots, list |
| ietf-llm last-call | a standalone mailing list (any archived at mailarchive.ietf.org: IETF, IRTF, or RFC-Editor) |
| ietf-llm rfced --mailing-list rswg@rfc-editor.org | a named list corpus (the address domain is optional) |
| ietf-llm new-ids --new-drafts --months 1 | new Internet-Drafts in a rolling window |
| ietf-llm mnot --author mnot@mnot.net | every draft a person has authored |
Everything lands in ~/.cache/ietf-llm/<name>/, which the MCP server
reads. See Gather options for the full flag set.
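For a quick look at what is on disk, a short Python sketch can list the cache's subdirectories. It assumes every directory directly under the cache root is one corpus, which this README implies but does not guarantee; ietf-llm --list remains the supported way to see cached corpora. The function name cached_corpora is hypothetical.

```python
from pathlib import Path


def cached_corpora(cache_root=None):
    """Return the names of directories directly under the cache root.

    Assumption: each subdirectory of ~/.cache/ietf-llm/ is one corpus.
    """
    root = Path(cache_root) if cache_root else Path.home() / ".cache" / "ietf-llm"
    if not root.is_dir():
        return []  # nothing gathered yet
    return sorted(p.name for p in root.iterdir() if p.is_dir())


print(cached_corpora())
```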
2. Use with NotebookLM
NotebookLM ingests a corpus as a set of source files. ietf-llm-export
turns the gathered cache into an upload-ready directory, or pushes
straight to a NotebookLM Enterprise notebook.
Workflow note: export always produces a complete fresh dump. Create a new notebook on each refresh rather than trying to merge updates into an existing one.
Gather a corpus
Same as the MCP path; add --no-embed to skip the local index
(NotebookLM does its own):
ietf-llm httpbis --no-embed \
--github httpwg/http-core --github httpwg/http-extensions
Export to a local directory
ietf-llm-export httpbis --destination ~/notebooklm/httpbis
Drag the directory's contents into NotebookLM as sources. Per-thread mailing list conversations and per-issue GitHub records are bundled by year / repo to stay under NotebookLM's 50-source free / 300-source Plus limit.
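To illustrate why bundling keeps the source count down, here is a minimal sketch of grouping per-thread records into one file per year. It is not the exporter's actual code: the (year, text) record shape and the output file names are assumptions made for the example.

```python
from collections import defaultdict


def bundle_by_year(threads):
    """Group (year, text) thread records into one source per year.

    Returns a dict mapping a hypothetical file name to the joined text.
    """
    bundles = defaultdict(list)
    for year, text in threads:
        bundles[year].append(text)
    return {
        f"mailing-list-{year}.txt": "\n\n".join(texts)
        for year, texts in sorted(bundles.items())
    }


threads = [(2023, "thread A"), (2024, "thread B"), (2023, "thread C")]
sources = bundle_by_year(threads)
print(len(threads), "threads ->", len(sources), "sources")
```

Three threads collapse into two sources here; a busy list with thousands of threads per year collapses the same way, which is what keeps an export within the source limits.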
Export to NotebookLM Enterprise
If you have Google Workspace Enterprise with NotebookLM enabled,
ietf-llm-export can create the notebook and upload sources directly:
ietf-llm-export httpbis --create my-gcp-project-id
One-time setup:
- Google Cloud Project with the Discovery Engine API enabled.
- OAuth credentials: create an "OAuth 2.0 Client ID" (Desktop App) in the Cloud Console.
- Save the JSON as client_secrets.json in ~/.config/ietf-llm/ (or pass --credentials-file PATH).
First run opens a browser to authorise; the token is cached at
~/.config/ietf-llm/token.json.
Per-WG export settings are persisted at
~/.config/ietf-llm/<wg>/export.json — subsequent runs of the same
mode need only ietf-llm-export <wg>.
Reference
Commands
| Command | Job | Reads | Writes |
|---|---|---|---|
| ietf-llm | Gather / refresh a corpus | network | cache |
| ietf-llm-export | Mirror cache to dir, or push to NotebookLM Enterprise | cache | dir / NotebookLM |
| ietf-llm-search | Semantic search over the cache | cache | stdout |
| ietf-llm-mcp | Expose the cache to MCP clients | cache | stdio (MCP) |
All four are independent. The cache (~/.cache/ietf-llm/<wg>/) is
the single source of truth; everything else reads from it.
Gather options
ietf-llm [OPTIONS] <name>
<name> is the corpus to gather, classified automatically:
- a Working Group / Research Group / editorial WG / BoF shortname (httpbis, cfrg, rswg): gathered in full (charter, drafts, meetings, ballots, mailing list);
- a mailing list archived at mailarchive.ietf.org, whether IETF, IRTF, or RFC-Editor (last-call, irtf-discuss, rfc-interest): that list on its own;
- any other label given explicit sources (--draft / --mailing-list / --github / --new-drafts / --author);
- prefix with x- to skip the Datatracker group lookup entirely (a fully manual corpus).
A name that is none of these and has no configured sources is rejected as a likely typo.
Sources (what to gather; all repeatable / persisted):
- --github OWNER/REPO: a GitHub repo whose issues to include.
- --draft DRAFT-NAME: an extra Internet-Draft to track, beyond a WG's own documents. Version suffix stripped; every revision gathered.
- --mailing-list LIST: an extra list to sync (any archived at mailarchive.ietf.org). A bare name or a full address; the domain is optional and ignored (rswg, rswg@rfc-editor.org).
- --new-drafts: subscribe to new Internet-Drafts, that is, every -00 submitted within --months (rolling window; drafts age out).
- --author PERSON: every draft PERSON authored. PERSON is an email (mnot@mnot.net, recommended), a Datatracker person id, or an exact full name. Drafts only.
- --add-mentioned-drafts: also pull drafts the corpus's threads/issues mention but don't already include. Sticky.
Scope & filtering:
- --months N: months of mailing list / meeting / new-draft history (default 12).
- --github-label LABEL / --exclude-github-label LABEL: include / exclude issues by label.
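The rolling window can be pictured as a cutoff date that moves forward with each gather. The sketch below assumes calendar months counted back from today; this README does not say how ietf-llm computes the boundary, so treat the arithmetic and the names months_ago and in_window as illustrative only.

```python
import calendar
from datetime import date


def months_ago(today, months):
    """Return the date `months` calendar months before `today`.

    The day is clamped to the last day of the target month
    (31 March minus one month becomes the end of February).
    """
    index = today.year * 12 + (today.month - 1) - months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(today.day, last_day))


def in_window(item_date, today, months=12):
    """True if `item_date` falls inside the rolling window."""
    return item_date >= months_ago(today, months)


today = date(2024, 1, 15)
print(in_window(date(2023, 6, 1), today))    # inside a 12-month window
print(in_window(date(2022, 12, 31), today))  # has aged out
```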
Digests & search index:
- --summarize / --summarize-model MODEL: add LLM-generated one-liners to digests via the llm package.
- --no-embed: skip the semantic search index (it backs ietf-llm-search and the MCP search_corpus tool). The index is built by default and updated incrementally.
- --embed-model MODEL: embedding model id (default: a small local model).
- --rebuild-embeddings: drop and re-embed everything instead of the incremental update.
Cache & config:
- --list: list cached corpora (name, kind, status, last-gathered, and a one-line subject: the group name, list, or tracked author), then exit.
- --clear-cache: wipe this corpus's cache and re-download.
- --clear-config: clear this corpus's persisted config.
- --quiet / --verbose.
Per-corpus settings are persisted at
~/.config/ietf-llm/<name>/gather.json.
GitHub auth. Set GITHUB_TOKEN on the gather invocation (a fine-
scoped read-only token is plenty); without one you'll hit anonymous
API rate limits quickly on large WGs. Prefer inline-passing over
exporting in your shell rc so the token doesn't leak into every other
subprocess:
GITHUB_TOKEN=ghp_... ietf-llm httpbis
# or, from a secret manager:
GITHUB_TOKEN=$(security find-generic-password -s github-readonly -w) \
ietf-llm httpbis
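The same idea carries over if you launch the gather from a Python script: put the token in the child's environment only, never in your own. This is a sketch under that assumption; gather_env is a hypothetical helper, and the token should come from your secret manager rather than being written into the script.

```python
import os


def gather_env(token, base=None):
    """Return a copy of the environment with GITHUB_TOKEN added.

    The copy is handed to one subprocess only; the caller's own
    environment (or `base`) is left untouched.
    """
    env = dict(os.environ if base is None else base)
    env["GITHUB_TOKEN"] = token
    return env


# Example (not executed here):
#   subprocess.run(["ietf-llm", "httpbis"], env=gather_env(token), check=True)
```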
Semantic search from the CLI
ietf-llm-search httpbis "skepticism about cookie partitioning" -k 8
Chunks are content-aware: one chunk per mailing list message, one per
issue comment, and a windowed slice of drafts/RFCs/transcripts. The
index lives at ~/.cache/ietf-llm/<wg>/embeddings.db and updates
incrementally on each gather.
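A "windowed slice" of a long document usually means fixed-size chunks that overlap, so a passage cut at one boundary still appears whole in a neighbouring chunk. The sketch below shows the general technique; the window size, the overlap, and the word-level splitting are assumptions for illustration, not the values or method ietf-llm uses.

```python
def window_chunks(words, size=200, overlap=50):
    """Split a list of words into overlapping windows.

    Each window holds up to `size` words and shares `overlap` words
    with the previous one. Sizes here are illustrative only.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + size])
        if start + size >= len(words):
            break  # the last window already reaches the end
    return chunks


text = "one two three four five six seven eight nine ten"
for chunk in window_chunks(text.split(), size=4, overlap=2):
    print(" ".join(chunk))
```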
Default model: sentence-transformers/BAAI/bge-small-en-v1.5 —
~130 MB on disk (~33M params), MPS-accelerated, runs entirely on your
machine. Downloaded from
Hugging Face on first
use and cached. Skip with --no-embed at gather time. Override with
--embed-model <id> for any model the llm package recognises.
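For readers new to embedding search, the -k flag in the example above asks for the top k matches. A common way to rank matches is cosine similarity between the query vector and each chunk vector, sketched below with tiny hand-made vectors. This README does not document how ietf-llm-search scores results, so this shows the general technique rather than its implementation.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunk_vecs, k=8):
    """Return the ids of the k chunks most similar to the query."""
    scored = sorted(
        chunk_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [chunk_id for chunk_id, _ in scored[:k]]


# Toy 2-dimensional "embeddings"; real ones have hundreds of dimensions.
vecs = {"msg-1": [1.0, 0.0], "msg-2": [0.0, 1.0], "issue-7": [1.0, 1.0]}
print(top_k([1.0, 0.0], vecs, k=2))
```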
Migrating from ietf-notebook
If you previously used the ietf-notebook distribution:
pipx uninstall ietf-notebook
pipx install ietf-llm
Cache and config directories changed names. To preserve a gathered cache, move it by hand:
mv ~/.cache/ietf-notebook ~/.cache/ietf-llm
mv ~/.config/ietf-notebook ~/.config/ietf-llm
Otherwise the old directories are simply ignored.
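If you would rather not overwrite anything by accident, the same move can be done with a guard. The sketch below moves each old directory only when it exists and the new one does not; migrate_dir and migrate_all are hypothetical helpers, not part of either package.

```python
import shutil
from pathlib import Path


def migrate_dir(old, new):
    """Move `old` to `new` unless `old` is missing or `new` already exists."""
    old, new = Path(old), Path(new)
    if not old.is_dir() or new.exists():
        return False
    new.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(old), str(new))
    return True


def migrate_all(home):
    """Migrate the cache and config directories under `home`."""
    home = Path(home)
    return {
        base: migrate_dir(home / base / "ietf-notebook", home / base / "ietf-llm")
        for base in (".cache", ".config")
    }


# Example (not executed here):
#   print(migrate_all(Path.home()))
```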
Command renames
| Before | After |
|---|---|
| ietf-notebook <wg> | ietf-llm <wg> |
| (no equivalent) | ietf-llm-export <wg> (split out) |
| (no equivalent) | ietf-llm-search <wg> <query> (new) |
| (no equivalent) | ietf-llm-mcp (new) |
Flags moved off the gather CLI
These now live on ietf-llm-export:
| Old: ietf-notebook <wg> ... | New |
|---|---|
| --destination DIR | ietf-llm-export <wg> --destination DIR |
| --create GCP_PROJECT | ietf-llm-export <wg> --create GCP_PROJECT |
| --credentials-file PATH | ietf-llm-export <wg> --credentials-file PATH |
| --token-file PATH | ietf-llm-export <wg> --token-file PATH |
If you pass any of these to ietf-llm, you'll get a redirect error.
--update is gone
The gather CLI is now idempotent — re-run it whenever you want fresh data. The export CLI always produces a complete fresh dump; for NotebookLM, create a new notebook each refresh rather than trying to merge updates.