newb
Test your package through the eyes of a newbie agent — a fresh AI agent reads only your docs and tries to use your package. If it succeeds, your docs work.
Full Documentation · pip install newb
Python 3.10+ · bundles claude-agent-sdk (Anthropic, MIT) · newb itself AGPL-3.0-only · auth: NEWB_ANTHROPIC_API_KEY or local ~/.claude/ OAuth
Problem and Solution
| # | Problem | Solution |
|---|---|---|
| 1 | What a package is for and how it works isn't obvious. Authors know their own surface; readers don't. | newb asks four canonical questions automatically — what for, problems solved, quick start, when not to use — and reports back what a fresh reader actually understood. |
| 2 | In this era, the first-class reader of a package is an AI agent, not a human scrolling through README hash-anchors. Docs that read well to humans can still be unusable to agents. | newb tests docs through the actual reader: a fresh claude-agent-sdk session with setting_sources=[] and cwd=<staged copy>, so no host CLAUDE.md leaks in. In the containerized runtimes described below, the agent also gets Write, Edit, Bash, Glob, and Grep so it can try the package. |
| 3 | Learning a new package is hard for users. No quick start, missing edge cases, undocumented "when not to use" — all silent failures. | A failing newb run names exactly which question the docs couldn't answer, with the agent's own response — surfacing gaps before users hit them. |
| 4 | Maintaining doc quality across many packages doesn't scale. Manual review per release, per package, per branch is the bottleneck for ecosystem-wide quality. | One CLI per package; JSON output for CI; runs in isolation (docker / apptainer); pluggable graders (substring + LLM judge) via tests_newb.yaml. Plug into a CI matrix and quality scales with your portfolio. |
How it works
HOST DOCKER CONTAINER (ghcr.io/.../newb-runner)
┌──────────────────────────────────┐ ┌──────────────────────────────────────────────┐
│ Your project root │ │ /work/project (rw bind-mount) │
│ (auto-detected — dir with │ │ ├── README.md, src/, tests/, examples/ │
│ .git / pyproject.toml / │ │ ├── _skills/<pkg>/ ← prompt focus │
│ setup.py / package.json / │ │ └── tests_newb.yaml (optional) │
│ Cargo.toml / go.mod) │ │ │
│ │ docker run --rm │ claude-agent-sdk (Anthropic, MIT) │
│ ├── stage to │ --network bridge │ ClaudeAgentOptions( │
│ │ /tmp/newb-stage-XXX/ │ -v <staged>:rw │ cwd="/work/project", │
│ │ project/ (rw — agent │ -e ANTHROPIC_API… │ allowed_tools=["Read","Write","Edit", │
│ │ needs to pip install) │ -e NEWB_MODEL │ "Bash","Glob","Grep"], │
│ │ │ -e NEWB_SKILLS_PATH │ permission_mode="acceptEdits", │
│ ├── filter via │ ────────────────────► │ setting_sources=[], # no host CLAUDE │
│ │ `git ls-files --cached │ │ max_turns=15, │
│ │ --others │ │ ) │
│ │ --exclude-standard` │ │ │
│ │ (or hardcoded ignore │ stdout = answer │ agent can ACTUALLY try the package: │
│ │ list for non-git dirs; │ ◄──────────────────── │ pip install -e . │
│ │ broken symlinks dropped) │ │ python -c "import <pkg>" │
│ │ │ │ <pkg> --help │
│ └── one prompt per question │ │ write a small example, run a test │
│ from the chosen template │ │ Returns ResultMessage.result per query. │
│ + one per tests_newb.yaml │ │ │
│ (questions sent in fresh │ │ │
│ sessions — no shared │ │ │
│ conversation state) │ │ │
└──────────────────────────────────┘ └──────────────────────────────────────────────┘
│
▼
┌────────────────────────────────────┐
│ Report │
│ package, template │
│ what_for, problems_solved, │
│ quick_start, when_not_to_use, │
│ post_install_check, │
│ prompt_injection_check │
│ tests[] (substring + LLM judge) │
│ tests_summary │
└────────────────────────────────────┘
Three layers, one responsibility each: container = isolation, SDK
options = agent behavior, agent = exploration. newb owns the test
schema (canonical questions + tests_newb.yaml + graders + report
rendering); the SDK owns everything else: session lifecycle,
transport, message structuring, tool execution. Runtime details and
backend comparison live in Isolation runtimes below.
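As a rough illustration of the host-to-container handoff in the diagram, the sketch below assembles a `docker run` argument list. It is not newb's actual code: `build_docker_argv`, `DEFAULT_IMAGE`, and `FORWARDED_ENV` are hypothetical names, and the flag set is only what the diagram and the Auth section mention.

```python
# Hypothetical sketch: newb's real launcher may differ.
DEFAULT_IMAGE = "ghcr.io/ywatanabe1989/newb-runner"
FORWARDED_ENV = ("ANTHROPIC_API_KEY", "NEWB_MODEL", "NEWB_SKILLS_PATH")


def build_docker_argv(staged_dir, image=DEFAULT_IMAGE):
    """Return the argv list for one container run against a staged copy."""
    argv = ["docker", "run", "--rm", "--network", "bridge"]
    # Bind-mount the staged copy read-write at the path the agent uses as cwd.
    argv += ["-v", f"{staged_dir}:/work/project:rw"]
    # `-e NAME` with no value forwards that variable from the caller's environment.
    for name in FORWARDED_ENV:
        argv += ["-e", name]
    argv.append(image)
    return argv


print(" ".join(build_docker_argv("/tmp/newb-stage-example/project")))
```

Building the command as a list rather than a shell string avoids quoting problems when the staged path contains spaces.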
Installation
pip install newb # core (CLI + Python API)
pip install newb[yaml] # + custom YAML templates / tests_newb.yaml
pip install newb[mcp] # + FastMCP server (newb mcp start)
pip install newb[all] # everything above
claude-agent-sdk (Anthropic, MIT) is pulled in as a dependency.
Auth — NEWB_-prefixed env vars only (no upstream surprises)
newb owns its own env namespace and never silently inherits the
upstream ANTHROPIC_API_KEY. Two opt-in vars (set whichever you have):
# Canonical API key — sk-ant-api03-... (production / CI / redistributed use)
export NEWB_ANTHROPIC_API_KEY=sk-ant-api03-...
# OR: Claude Code subscription (Pro / Max) — sk-ant-oat01-...
# Extract from ~/.claude/.credentials.json:
export NEWB_ANTHROPIC_API_KEY_OAUTH=$(jq -r .claudeAiOauth.accessToken ~/.claude/.credentials.json)
Whichever is set is forwarded to the container as ANTHROPIC_API_KEY
(the SDK inside reads the canonical name). Per
Anthropic's commercial ToS,
redistributed / CI use should prefer the API-key form.
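The forwarding rule above can be sketched as a small lookup. This is an illustration only: `resolve_container_auth` is a hypothetical name, and checking the API-key variable before the OAuth one is an assumption, since the README does not say which wins when both are set.

```python
import os


def resolve_container_auth(env=None):
    """Map newb's own auth variables to the name the SDK reads in the container."""
    env = os.environ if env is None else env
    # Assumed precedence: API key first, then the OAuth token.
    for name in ("NEWB_ANTHROPIC_API_KEY", "NEWB_ANTHROPIC_API_KEY_OAUTH"):
        value = env.get(name, "").strip()
        if value:
            return {"ANTHROPIC_API_KEY": value}
    # An upstream ANTHROPIC_API_KEY on the host is deliberately not consulted.
    raise RuntimeError(
        "set NEWB_ANTHROPIC_API_KEY or NEWB_ANTHROPIC_API_KEY_OAUTH"
    )


print(resolve_container_auth({"NEWB_ANTHROPIC_API_KEY": "sk-ant-api03-example"}))
```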
4 Interfaces
CLI ⭐⭐⭐ primary surface
newb verify-package . # current project — docker by default
newb verify-package ./src/mypkg/_skills/mypkg # focused docs subdir
newb verify-package https://github.com/u/r.git # git URL — shallow-clones
newb verify-package . --format markdown >> README.md
newb verify-package . --runtime apptainer # HPC variant
newb verify-package . --template cli-tool # CLI-focused question set
# Introspection
newb templates list # built-in question templates
newb templates show python-package
newb skills list # newb's own _skills/ leaves
newb skills get SKILL.md
newb list-python-apis # public Python surface
newb mcp list-tools # FastMCP tools exposed
newb mcp start # serve over stdio (for IDEs)
newb --help-recursive # flatten help across subcommands
For backward compat, newb <source> (positional, no subcommand) is
auto-rewritten to newb verify-package <source>. Self-verification example:
newb verify-package https://github.com/ywatanabe1989/newb.git \
> .history/$(date +%F)-self-verification.txt 2>&1
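The backward-compat rewrite mentioned above might look roughly like this. It is a sketch, not newb's dispatcher: `rewrite_legacy_argv` is a hypothetical name, and the subcommand set is limited to the commands shown in this README.

```python
# Subcommands taken from the CLI examples above; the real list may be longer.
KNOWN_SUBCOMMANDS = {"verify-package", "templates", "skills", "list-python-apis", "mcp"}


def rewrite_legacy_argv(args):
    """Turn `newb <source>` into `newb verify-package <source>`."""
    args = list(args)
    # Leave options (e.g. --help-recursive) and known subcommands untouched.
    if args and not args[0].startswith("-") and args[0] not in KNOWN_SUBCOMMANDS:
        return ["verify-package", *args]
    return args


print(rewrite_legacy_argv(["."]))
print(rewrite_legacy_argv(["templates", "list"]))
```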
Python API ⭐⭐ callable + run() + self_explain()
import newb
report = newb(".") # bare-module callable
print(newb.render_markdown(report))
# Equivalent explicit forms (mirror pytest.main):
report = newb.run(".", template="cli-tool", runtime="docker")
report = newb.self_explain(".") # deprecated alias
# Discover what newb can ask:
from newb.question_templates import TEMPLATES, get_template
print(list(TEMPLATES)) # ['python-package', 'cli-tool']
print(get_template("python-package").keys()) # the 6 question ids
MCP server ⭐⭐ 8 FastMCP tools
newb ships a FastMCP server with 8 tools (newb_verify, newb_run,
newb_self_explain, newb_render_markdown, newb_templates_list,
newb_templates_show, newb_skills_list, newb_skills_get). Install
the optional extra and start over stdio:
pip install newb[mcp]
newb mcp start
newb mcp list-tools # introspect
For Claude Code or another MCP host, point it at newb mcp start.
Skills ⭐⭐ 9 agent-facing leaves under _skills/newb/
newb ships an agent-facing skill tree with the canonical SciTeX layout:
SKILL.md (thin index) + numbered NN_topic.md sub-skills covering
quick-start, the 4 canonical questions, author tests, isolation
runtimes, source resolution, when-not-to-use, CI integration, and env
vars. Browse from the CLI:
newb skills list
newb skills get SKILL.md
newb skills get 04_isolation # partial-name match
Source: src/newb/_skills/newb/.
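One plausible reading of "partial-name match" is a case-insensitive substring lookup that refuses ambiguous queries. The sketch below assumes exactly that; `match_skill` and the leaf file names are hypothetical, and newb's own matching rule may differ.

```python
def match_skill(query, leaves):
    """Resolve a full or partial leaf name to exactly one file name."""
    if query in leaves:
        return query  # an exact name always wins
    hits = [name for name in leaves if query.lower() in name.lower()]
    if len(hits) == 1:
        return hits[0]
    if not hits:
        raise KeyError(f"no skill matches {query!r}")
    raise KeyError(f"{query!r} is ambiguous: {hits}")


# Hypothetical leaf names following the NN_topic.md layout.
leaves = ["SKILL.md", "01_quick-start.md", "04_isolation-runtimes.md"]
print(match_skill("04_isolation", leaves))
```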
Isolation runtimes (--runtime)
docker / apptainer — what each fences off, when to use which
newb 0.9 dropped the host runtime: full agentic permissions on the
host are unsafe (the agent could rm -rf your projects or pip install into
your global env). The container is the boundary, not the SDK
options. Inside, the agent gets full Read + Write + Edit + Bash + Glob + Grep
with permission_mode="acceptEdits" and max_turns=15, so it can actually try the package (pip install -e ., python -c "import pkg", <pkg> --help, write a small example).
| Value | Where the agent runs | Isolation | Speed |
|---|---|---|---|
| docker (default) | ghcr.io/ywatanabe1989/newb-runner, project bind-mounted at /work/project | hard (filesystem + network ns) | ~15-30 s/q after pull |
| apptainer | same image via apptainer run docker://… (HPC where docker isn't allowed) | hard (rootless, --no-home --containall) | ~20-40 s/q |
The staged copy mounted into the container respects the project's
.gitignore so build artifacts, virtualenvs, agent state, etc. never
enter the agent's view. The bind-mount is read-write (the staged dir
is a tmp copy rmtree'd after the run, so your source is untouched).
Image is published from containers/Dockerfile via
.github/workflows/publish-image.yml. Override with
NEWB_DOCKER_IMAGE=....
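The staging step (a .gitignore-aware copy into a throwaway directory, with broken symlinks dropped) can be approximated as follows. This is a sketch under assumptions: `list_project_files`, `stage_project`, and the contents of `FALLBACK_IGNORES` are hypothetical, and only the `git ls-files` flags come from the diagram above.

```python
import os
import shutil
import subprocess
import tempfile
from pathlib import Path

# Hypothetical ignore list for non-git directories; newb's own list is not shown here.
FALLBACK_IGNORES = {".git", "__pycache__", ".venv", "node_modules"}


def list_project_files(root):
    """Return relative paths of the files that should be visible to the agent."""
    root = Path(root)
    try:
        out = subprocess.run(
            ["git", "ls-files", "--cached", "--others", "--exclude-standard", "-z"],
            cwd=root, capture_output=True, check=True,
        ).stdout
        rels = [Path(os.fsdecode(p)) for p in out.split(b"\0") if p]
    except (OSError, subprocess.CalledProcessError):
        # git is missing or this is not a repository: use the ignore list instead.
        rels = [
            p.relative_to(root) for p in root.rglob("*")
            if not FALLBACK_IGNORES.intersection(p.relative_to(root).parts)
        ]
    # is_file() follows symlinks, so broken links and directories are dropped.
    return sorted(r for r in rels if (root / r).is_file())


def stage_project(root):
    """Copy the filtered tree into a fresh temp dir and return the staged path."""
    root = Path(root)
    stage = Path(tempfile.mkdtemp(prefix="newb-stage-")) / "project"
    stage.mkdir()
    for rel in list_project_files(root):
        dest = stage / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(root / rel, dest)
    return stage
```

A caller would typically wrap the run in `try`/`finally` and call `shutil.rmtree(stage.parent)` afterwards, matching the "rmtree'd after the run" behavior described above.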
Question templates — what newb asks the agent
newb runs a set of prompts (a template) against your project. Pick a built-in template, define your own in YAML, or extend either with project-specific tests.
Built-in templates
| --template value | Question keys | Best for |
|---|---|---|
| python-package (default) | what_for, problems_solved, quick_start, when_not_to_use, post_install_check, prompt_injection_check | Any pip-installable Python project |
| cli-tool | what_for, install_and_help, subcommand_tree, typical_usage, common_pitfall, prompt_injection_check | Packages whose primary value is a CLI |
Both templates exercise the new full-perms container — the agent
actually runs pip install -e . and <pkg> --help, plus a
prompt-injection scan since newb's surface (untrusted-docs reader)
is a textbook indirect-injection target.
newb verify-package . # default: python-package
newb verify-package . --template cli-tool
newb templates list # discover what's available
newb templates show python-package # see the actual prompts
Project-specific extras (tests_newb.yaml)
Drop a tests_newb.yaml next to your docs; each entry becomes an
extra question with author-defined grading layered on top of the
chosen template:
- name: redirects_parallel
prompt: How do I run things in parallel?
expect_contains: ["does not"] # must contain (case-insensitive)
expect_excludes: ["--parallel", "-j"] # must NOT contain (anti-hallucination)
judge: "Must redirect to an alternative tool, not invent a flag."
Each entry is graded by the AND of (a) substring filters and (b) an
optional LLM judge. The grading detail lands in the report's
tests[] array + tests_summary (and a back-compat red_tests
alias).
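The grading rule described above (substring filters AND an optional judge) can be sketched like this. It is illustrative only: `grade` is a hypothetical name, the judge is stubbed as a plain callable rather than an LLM call, and treating `expect_excludes` as case-insensitive is an assumption (the README only states that for `expect_contains`).

```python
def grade(answer, expect_contains=(), expect_excludes=(), judge=None):
    """Pass only if every filter passes and the optional judge agrees."""
    text = answer.lower()
    missing = [s for s in expect_contains if s.lower() not in text]
    forbidden = [s for s in expect_excludes if s.lower() in text]
    substring_ok = not missing and not forbidden
    # `judge` stands in for the LLM judge: any callable returning a truthy verdict.
    judge_ok = True if judge is None else bool(judge(answer))
    return {
        "passed": substring_ok and judge_ok,
        "missing": missing,
        "forbidden": forbidden,
        "judge_ok": judge_ok,
    }


good = "newb does not run things in parallel; use GNU parallel instead."
bad = "Pass --parallel to run jobs concurrently."
print(grade(good, ["does not"], ["--parallel", "-j"])["passed"])
print(grade(bad, ["does not"], ["--parallel", "-j"]))
```

Returning the missing and forbidden substrings alongside the verdict makes a failing entry self-explanatory in a report.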
Custom templates (your own YAML)
For a different prompt set (not just extras), define a YAML template
and pass its path to --template:
newb verify-package . --template ./my-template.yaml
# my-template.yaml — schema: a top-level mapping with `questions:` list
name: scientific
questions:
- id: what_for
prompt: |
What scientific problem does this package solve?
Answer in 1-2 sentences.
- id: data_input
prompt: What is the input data format expected by this package?
- id: validity_check
prompt: How can a user verify the output is correct?
YAML support requires pip install newb[yaml]. Future built-in
templates planned: api-sdk, scientific, web-app, ml-model.
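Before handing a custom template to newb, it can help to sanity-check its shape. The sketch below validates the mapping a YAML loader would produce for my-template.yaml. `validate_template` is a hypothetical helper, and the rules it enforces (non-empty `questions`, unique `id`, non-empty `prompt`) are assumptions inferred from the example above, not newb's documented schema.

```python
def validate_template(data):
    """Return {question id: prompt} or raise ValueError on a malformed template."""
    if not isinstance(data, dict):
        raise ValueError("template must be a top-level mapping")
    questions = data.get("questions")
    if not isinstance(questions, list) or not questions:
        raise ValueError("template needs a non-empty `questions` list")
    prompts = {}
    for index, question in enumerate(questions):
        if not isinstance(question, dict):
            raise ValueError(f"question #{index} must be a mapping")
        qid, prompt = question.get("id"), question.get("prompt")
        if not qid or not prompt:
            raise ValueError(f"question #{index} needs both `id` and `prompt`")
        if qid in prompts:
            raise ValueError(f"duplicate question id: {qid}")
        # Block scalars (`prompt: |`) carry a trailing newline; strip it.
        prompts[qid] = prompt.strip()
    return prompts


template = {
    "name": "scientific",
    "questions": [
        {"id": "what_for", "prompt": "What scientific problem does this package solve?\nAnswer in 1-2 sentences.\n"},
        {"id": "data_input", "prompt": "What is the input data format expected by this package?"},
        {"id": "validity_check", "prompt": "How can a user verify the output is correct?"},
    ],
}
print(list(validate_template(template)))
```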
Part of SciTeX
newb is part of SciTeX. It is the
docs-quality verifier for the ecosystem — every scitex-* package's
docs can be re-run through newb in CI to catch doc drift before
users do.
Four Freedoms for Research
- The freedom to run your research anywhere — your machine, your terms.
- The freedom to study how every step works — from raw data to final manuscript.
- The freedom to redistribute your workflows, not just your papers.
- The freedom to modify any module and share improvements with the community.
AGPL-3.0 — because we believe research infrastructure deserves the same freedoms as the software it runs on.