A fresh AI agent tries to use your package — pytest-style. If it succeeds, your docs work.

Maintainers

ywatanabe1989

newb

Test your package through the eyes of a newbie agent — a fresh AI agent reads only your docs and tries to use your package. If it succeeds, your docs work.

Full Documentation · pip install newb

Python 3.10+ · bundles claude-agent-sdk (Anthropic, MIT) · newb itself AGPL-3.0-only · auth: NEWB_ANTHROPIC_API_KEY or local ~/.claude/ OAuth

Problem and Solution

#	Problem	Solution
1	What a package is for and how it works isn't obvious. Authors know their own surface; readers don't.	newb asks four canonical questions automatically — what for, problems solved, quick start, when not to use — and reports back what a fresh reader actually understood.
2	In this era, the first-class reader of a package is an AI agent, not a human scrolling through README hash-anchors. Docs that read well to humans can still be unusable to agents.	newb tests docs through the actual reader: a fresh `claude-agent-sdk` session with `setting_sources=[]` and `cwd` set to a staged copy of the project, run inside a container so that neither the host CLAUDE.md nor the host filesystem reaches the agent.
3	Learning a new package is hard for users. No quick start, missing edge cases, undocumented "when not to use" — all silent failures.	A failing newb run names exactly which question the docs couldn't answer, with the agent's own response — surfacing gaps before users hit them.
4	Maintaining doc quality across many packages doesn't scale. Manual review per release, per package, per branch is the bottleneck for ecosystem-wide quality.	One CLI per package; JSON output for CI; runs in an isolated container (`docker` / `apptainer`); pluggable graders (substring + LLM judge) via `tests_newb.yaml`. Plug into a CI matrix and quality scales with your portfolio.

How it works

HOST                                                       DOCKER CONTAINER (ghcr.io/.../newb-runner)
┌──────────────────────────────────┐                       ┌──────────────────────────────────────────────┐
│  Your project root               │                       │  /work/project   (rw bind-mount)             │
│  (auto-detected — dir with       │                       │    ├── README.md, src/, tests/, examples/    │
│   .git / pyproject.toml /        │                       │    ├── _skills/<pkg>/   ← prompt focus       │
│   setup.py / package.json /      │                       │    └── tests_newb.yaml   (optional)          │
│   Cargo.toml / go.mod)           │                       │                                              │
│                                  │   docker run --rm     │  claude-agent-sdk (Anthropic, MIT)           │
│  ├── stage to                    │   --network bridge    │    ClaudeAgentOptions(                       │
│  │   /tmp/newb-stage-XXX/        │   -v <staged>:rw      │      cwd="/work/project",                    │
│  │   project/   (rw — agent      │   -e ANTHROPIC_API…   │      allowed_tools=["Read","Write","Edit",   │
│  │   needs to pip install)       │   -e NEWB_MODEL       │                     "Bash","Glob","Grep"],   │
│  │                               │   -e NEWB_SKILLS_PATH │      permission_mode="acceptEdits",          │
│  ├── filter via                  │ ────────────────────► │      setting_sources=[],   # no host CLAUDE  │
│  │   `git ls-files --cached      │                       │      max_turns=15,                           │
│  │     --others                  │                       │    )                                         │
│  │     --exclude-standard`       │                       │                                              │
│  │   (or hardcoded ignore        │   stdout = answer     │  agent can ACTUALLY try the package:         │
│  │   list for non-git dirs;      │ ◄──────────────────── │    pip install -e .                          │
│  │   broken symlinks dropped)    │                       │    python -c "import <pkg>"                  │
│  │                               │                       │    <pkg> --help                              │
│  └── one prompt per question     │                       │    write a small example, run a test         │
│      from the chosen template    │                       │  Returns ResultMessage.result per query.     │
│      + one per tests_newb.yaml   │                       │                                              │
│      (questions sent in fresh    │                       │                                              │
│       sessions — no shared       │                       │                                              │
│       conversation state)        │                       │                                              │
└──────────────────────────────────┘                       └──────────────────────────────────────────────┘
                │
                ▼
        ┌────────────────────────────────────┐
        │  Report                            │
        │    package, template               │
        │    what_for, problems_solved,      │
        │    quick_start, when_not_to_use,   │
        │    post_install_check,             │
        │    prompt_injection_check          │
        │    tests[] (substring + LLM judge) │
        │    tests_summary                   │
        └────────────────────────────────────┘

Three layers, one responsibility each: container = isolation, SDK options = agent behavior, agent = exploration. newb owns the test schema (canonical questions + tests_newb.yaml + graders + report rendering); the SDK owns everything else: session lifecycle, transport, message structuring, tool execution. Runtime details and backend comparison live in Isolation runtimes below.
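The host-side steps in the diagram (project-root detection, then staging a filtered copy) can be sketched roughly as follows. This is a minimal illustration, not newb's actual code: `find_project_root` and `stage_files` are hypothetical names, the fallback when no marker is found is an assumption, and the file list is passed in by the caller rather than read from `git ls-files` so the sketch stays self-contained.

```python
import shutil
import tempfile
from pathlib import Path

# Marker entries named in the diagram; a directory holding any of them
# is treated as the project root.
ROOT_MARKERS = (".git", "pyproject.toml", "setup.py",
                "package.json", "Cargo.toml", "go.mod")


def find_project_root(start: Path) -> Path:
    """Walk upward from `start` until a directory contains a root marker."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if any((candidate / marker).exists() for marker in ROOT_MARKERS):
            return candidate
    return start  # assumption: fall back to the starting directory


def stage_files(root: Path, rel_paths: list[str]) -> Path:
    """Copy the listed files into a fresh temp dir, skipping broken symlinks."""
    stage = Path(tempfile.mkdtemp(prefix="newb-stage-")) / "project"
    stage.mkdir()
    for rel in rel_paths:
        src = root / rel
        if not src.exists():  # False for broken symlinks and missing files
            continue
        dst = stage / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
    return stage
```

Because the agent only ever sees the staged copy, anything it writes or installs there can be discarded with the temp directory after the run.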

Installation

pip install newb           # core (CLI + Python API)
pip install newb[yaml]     # + custom YAML templates / tests_newb.yaml
pip install newb[mcp]      # + FastMCP server (newb mcp start)
pip install newb[all]      # everything above

claude-agent-sdk (Anthropic, MIT) is pulled in as a dependency.

Auth — NEWB_-prefixed env vars only (no upstream surprises)

newb owns its own env namespace and never silently inherits the upstream ANTHROPIC_API_KEY. One opt-in var, opaque to newb:

# Real Anthropic API key (production / CI / redistributed use)
export NEWB_ANTHROPIC_API_KEY=sk-ant-api03-...

# OR: a Claude Code Pro / Max OAuth access token. Extract from
# ~/.claude/.credentials.json:
export NEWB_ANTHROPIC_API_KEY=$(jq -r .claudeAiOauth.accessToken ~/.claude/.credentials.json)

The Anthropic backend accepts both sk-ant-api* (API keys) and sk-ant-oat* (Claude Code OAuth access tokens) on the same Authorization header — newb forwards the value verbatim into the container, where the bundled CLI promotes it to ANTHROPIC_API_KEY. Per Anthropic's commercial ToS, redistributed / CI use should prefer the API-key form.
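As a rough sketch of the namespace rule described above, the lookup below reads only `NEWB_ANTHROPIC_API_KEY` and never falls back to `ANTHROPIC_API_KEY`. The function name `resolve_credential` and the `kind` labels are hypothetical; the prefix check is only for illustration, since newb is described as treating the value as opaque and forwarding it verbatim.

```python
import os
from typing import Mapping, Optional, Tuple

NEWB_KEY_VAR = "NEWB_ANTHROPIC_API_KEY"  # the one opt-in variable named above


def resolve_credential(
    env: Mapping[str, str] = os.environ,
) -> Optional[Tuple[str, str]]:
    """Return (kind, value) from the NEWB_ namespace only, or None if unset.

    The upstream ANTHROPIC_API_KEY is deliberately never consulted.
    """
    value = env.get(NEWB_KEY_VAR, "")
    if not value:
        return None
    if value.startswith("sk-ant-oat"):
        kind = "oauth-token"
    elif value.startswith("sk-ant-api"):
        kind = "api-key"
    else:
        kind = "unknown"  # assumption: still forwarded unchanged
    return kind, value
```

A CI job could use the `kind` label to warn when an OAuth token is used where the API-key form is preferred.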

4 Interfaces

CLI ⭐⭐⭐ (primary surface)

newb .                              # current project — docker by default
newb ./src/mypkg/_skills/mypkg      # focused docs subdir
newb https://github.com/u/r.git     # git URL — shallow-clones
newb . --format markdown >> README.md
newb . --runtime apptainer          # HPC variant
newb . --template cli-tool          # CLI-focused question set

# Introspection
newb templates list                        # built-in question templates
newb templates show python-package
newb skills list                           # newb's own _skills/ leaves
newb skills get SKILL.md
newb list-python-apis                      # public Python surface
newb mcp list-tools                        # FastMCP tools exposed
newb mcp start                             # serve over stdio (for IDEs)
newb --help-recursive                      # flatten help across subcommands

pytest-style: newb <target> is the canonical invocation — no verb in front. Subcommands (templates, skills, mcp, list-python-apis) are introspection-only.

Self-verification example:

newb https://github.com/ywatanabe1989/newb.git \
  > .history/$(date +%F)-self-verification.txt 2>&1

Python API ⭐⭐ (callable + run())

import newb
report = newb(".")                                       # bare-module callable
print(newb.render_markdown(report))

# Equivalent explicit form (mirrors pytest.main):
report = newb.run(".", template="cli-tool", runtime="docker")

# Discover what newb can ask:
from newb.question_templates import TEMPLATES, get_template
print(list(TEMPLATES))                                   # ['python-package', 'cli-tool']
print(get_template("python-package").keys())             # the 6 question ids

MCP server ⭐⭐ (7 FastMCP tools)

newb ships a FastMCP server with these tools (newb_verify, newb_run, newb_render_markdown, newb_templates_list, newb_templates_show, newb_skills_list, newb_skills_get). Install the optional extra and start over stdio:

pip install newb[mcp]
newb mcp start
newb mcp list-tools             # introspect

For Claude Code or another MCP host, point it at newb mcp start.

Skills ⭐⭐ (9 agent-facing leaves under _skills/newb/)

newb ships an agent-facing skill tree with the canonical SciTeX layout: SKILL.md (thin index) + numbered NN_topic.md sub-skills covering quick-start, the 4 canonical questions, author tests, isolation runtimes, source resolution, when-not-to-use, CI integration, and env vars. Browse from the CLI:

newb skills list
newb skills get SKILL.md
newb skills get 04_isolation        # partial-name match

Source: src/newb/_skills/newb/.
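The partial-name lookup used by `newb skills get 04_isolation` could work along these lines. This is a sketch under assumptions: `match_skill` is a hypothetical helper, the file names in the usage note are invented for illustration, and the real matching rules (case handling, tie-breaking) may differ.

```python
from typing import List


def match_skill(query: str, names: List[str]) -> str:
    """Resolve a full or partial skill name to exactly one file name."""
    if query in names:
        return query  # an exact match always wins
    hits = [n for n in names if query.lower() in n.lower()]
    if len(hits) == 1:
        return hits[0]
    if not hits:
        raise KeyError(f"no skill matches {query!r}")
    raise KeyError(f"ambiguous skill name {query!r}: {sorted(hits)}")
```

With a hypothetical listing such as `["SKILL.md", "01_quick_start.md", "04_isolation_runtimes.md"]`, the query `04_isolation` resolves to the third entry, while a query matching several files raises instead of guessing.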

Isolation runtimes (`--runtime`)

docker / apptainer — what each fences off, when to use which

newb 0.9 dropped the host runtime: granting an agent full permissions on the host is unsafe (it could rm -rf your projects or pip install into your global environment). The container is the boundary, not the SDK options. Inside it, the agent gets the full Read, Write, Edit, Bash, Glob, and Grep tool set with permission_mode="acceptEdits" and max_turns=15, so it can actually try the package (pip install -e ., python -c "import pkg", <pkg> --help, write a small example).

Value	Where the agent runs	Isolation	Speed
`docker` (default)	`ghcr.io/ywatanabe1989/newb-runner`, project bind-mounted at `/work/project`	hard (filesystem + network ns)	~15-30 s/q after pull
`apptainer`	same image via `apptainer run docker://…` (HPC where docker isn't allowed)	hard (rootless, `--no-home --containall`)	~20-40 s/q

The staged copy mounted into the container respects the project's .gitignore so build artifacts, virtualenvs, agent state, etc. never enter the agent's view. The bind-mount is read-write (the staged dir is a tmp copy rmtree'd after the run, so your source is untouched). Image is published from containers/Dockerfile via .github/workflows/publish-image.yml. Override with NEWB_DOCKER_IMAGE=....
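To make the container boundary concrete, here is a minimal sketch of how a `docker run` command line for the staged copy might be assembled. It is not newb's implementation: `docker_argv` is a hypothetical helper, the flag set is pieced together from the diagram in How it works and the hardening flags listed under Security disclaimer, and the names of the forwarded environment variables are left to the caller.

```python
from typing import List, Sequence

# Image name from the runtime table above.
DEFAULT_IMAGE = "ghcr.io/ywatanabe1989/newb-runner"


def docker_argv(
    staged_dir: str,
    env_names: Sequence[str],
    image: str = DEFAULT_IMAGE,
    network: str = "bridge",
) -> List[str]:
    """Build the argv for one containerized run against a staged copy."""
    argv = [
        "docker", "run", "--rm",
        "--network", network,
        "--cap-drop=ALL",
        "--security-opt=no-new-privileges",
        # Read-write bind-mount of the temporary staged copy only.
        "-v", f"{staged_dir}:/work/project:rw",
    ]
    for name in env_names:
        # Name only: docker copies the value from the caller's environment,
        # so secrets do not appear on the command line.
        argv += ["-e", name]
    argv.append(image)
    return argv
```

The resulting list can be handed to `subprocess.run(argv, env=...)`; passing `network="none"` corresponds to the stricter opt-in hardening mentioned in the security notes.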

Question templates — what newb asks the agent

newb runs a set of prompts (a template) against your project. Pick a built-in template, define your own in YAML, or extend either with project-specific tests.

Built-in templates

`--template` value	Question keys	Best for
`python-package` (default)	`what_for`, `problems_solved`, `quick_start`, `when_not_to_use`, `post_install_check`, `prompt_injection_check`	Any pip-installable Python project
`cli-tool`	`what_for`, `install_and_help`, `subcommand_tree`, `typical_usage`, `common_pitfall`, `prompt_injection_check`	Packages whose primary value is a CLI

Both templates exercise the new full-perms container — the agent actually runs pip install -e . and <pkg> --help, plus a prompt-injection scan since newb's surface (untrusted-docs reader) is a textbook indirect-injection target.

newb .                              # default: python-package
newb . --template cli-tool
newb templates list                        # discover what's available
newb templates show python-package         # see the actual prompts

Project-specific extras (tests_newb.yaml)

Drop a tests_newb.yaml next to your docs; each entry becomes an extra question with author-defined grading layered on top of the chosen template:

- name: redirects_parallel
  prompt: How do I run things in parallel?
  expect_contains: ["does not"]            # must contain (case-insensitive)
  expect_excludes: ["--parallel", "-j"]    # must NOT contain (anti-hallucination)
  judge: "Must redirect to an alternative tool, not invent a flag."

Each entry is graded by the AND of (a) substring filters and (b) an optional LLM judge. The grading detail lands in the report's tests[] array and tests_summary.
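The grading rule above (substring filters AND an optional LLM judge) can be sketched as a small pure function. The names `AuthorTest` and `grade` are hypothetical, the judge is modelled as a caller-supplied callable rather than a real model call, and treating `expect_excludes` as case-insensitive is an assumption (the example only states that for `expect_contains`).

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class AuthorTest:
    """One entry of tests_newb.yaml, using the keys shown in the example."""
    name: str
    prompt: str
    expect_contains: List[str] = field(default_factory=list)
    expect_excludes: List[str] = field(default_factory=list)
    judge: Optional[str] = None  # rubric text handed to the LLM judge


def grade(
    test: AuthorTest,
    answer: str,
    llm_judge: Optional[Callable[[str, str], bool]] = None,
) -> bool:
    """Pass only if every configured grader passes."""
    text = answer.lower()
    contains_ok = all(s.lower() in text for s in test.expect_contains)
    excludes_ok = not any(s.lower() in text for s in test.expect_excludes)
    substring_ok = contains_ok and excludes_ok
    if test.judge is None or llm_judge is None:
        return substring_ok  # no judge configured: substring filters decide
    return substring_ok and llm_judge(test.judge, answer)
```

Keeping the judge as a plain callable means the substring layer can be unit-tested offline, with the model-backed judge plugged in only for real runs.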

Custom templates (your own YAML)

For a different prompt set (not just extras), define a YAML template and pass its path to --template:

newb . --template ./my-template.yaml

# my-template.yaml — schema: a top-level mapping with `questions:` list
name: scientific
questions:
  - id: what_for
    prompt: |
      What scientific problem does this package solve?
      Answer in 1-2 sentences.
  - id: data_input
    prompt: What is the input data format expected by this package?
  - id: validity_check
    prompt: How can a user verify the output is correct?

YAML support requires pip install newb[yaml]. Future built-in templates planned: api-sdk, scientific, web-app, ml-model.
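A custom template can be sanity-checked before a run with something like the sketch below. It operates on the already-parsed mapping (for example the result of `yaml.safe_load`), and `validate_template` is a hypothetical helper: the requirement that ids be unique and that every question carry both `id` and `prompt` is inferred from the example above, not taken from newb's schema.

```python
from typing import Any, Dict, List


def validate_template(data: Dict[str, Any]) -> List[str]:
    """Return the question ids of a parsed template, or raise ValueError."""
    if not isinstance(data, dict):
        raise ValueError("template must be a top-level mapping")
    questions = data.get("questions")
    if not isinstance(questions, list) or not questions:
        raise ValueError("template needs a non-empty `questions:` list")
    ids: List[str] = []
    for i, q in enumerate(questions):
        if not isinstance(q, dict) or not q.get("id") or not q.get("prompt"):
            raise ValueError(f"question #{i} needs both `id` and `prompt`")
        if q["id"] in ids:
            raise ValueError(f"duplicate question id: {q['id']}")
        ids.append(q["id"])
    return ids
```

For the `scientific` template shown above, this would return `["what_for", "data_input", "validity_check"]`.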

Security disclaimer

newb runs an AI agent against arbitrary package documentation, an attack surface that remains unsolved by default. Read this before using.

Threats we recognize:

- Indirect prompt injection via package READMEs, docstrings, and tests_newb.yaml
- API key exfiltration via agent output (/proc/self/environ, encoded leaks)
- Container escape attempts (kernel CVEs, capability misconfiguration)
- Network exfiltration to attacker-controlled hosts
- Resource exhaustion (fork bombs, memory hogs)

What we implement:

- Container as the boundary: Docker / Apptainer with --cap-drop=ALL, --security-opt=no-new-privileges, default --network=bridge
- Configurable hardening: opt-in resource caps, --network=none, etc., via NEWB_HARDEN_* env vars or CLI flags
- Bundled CLI runs with setting_sources=[]: host ~/.claude/CLAUDE.md never reaches the agent
- Optional newb[security] extra: Protect AI's deberta-v3-base-prompt-injection-v2 for pre-flight scanning
- Self-check question: the agent reports any adversarial content it noticed

See docs/security/threat-model.md for the full Rule-of-Two analysis.

What we cannot promise:

- Prompt injection is unsolved at the model level (per Meta's Agents Rule of Two, OWASP LLM01); research consensus reports >85% attack success against state-of-the-art defenses with adaptive attacks
- Sophisticated, novel, or encoded injection attempts may bypass every layer above
- We cannot accept responsibility for any consequence of running newb against untrusted package documentation

Use at your own risk. Pin a specific newb version and image digest in CI, treat verdicts on adversarially-authored packages as heuristic only, and never run newb with credentials beyond what a single dev-loop verification needs.

Part of SciTeX

newb is part of SciTeX. It is the docs-quality verifier for the ecosystem — every scitex-* package's docs can be re-run through newb in CI to catch doc drift before users do.

Four Freedoms for Research

The freedom to run your research anywhere — your machine, your terms.

The freedom to study how every step works — from raw data to final manuscript.

The freedom to redistribute your workflows, not just your papers.

The freedom to modify any module and share improvements with the community.

AGPL-3.0 — because we believe research infrastructure deserves the same freedoms as the software it runs on.

Release history

Version	Released
0.25.0	May 3, 2026
0.23.0	May 3, 2026
0.22.2	May 3, 2026
0.22.1	May 2, 2026
0.22.0	May 2, 2026
0.21.0	May 2, 2026
0.20.0	May 2, 2026
0.19.1	May 2, 2026
0.19.0	May 2, 2026
0.18.1	May 2, 2026
0.18.0 (this version)	May 2, 2026
0.17.1	May 2, 2026
0.17.0	May 2, 2026
0.16.1	May 2, 2026
0.16.0	May 2, 2026
0.15.0	May 2, 2026
0.14.0	May 2, 2026
0.13.0	May 2, 2026
0.12.0	May 2, 2026
0.11.0	May 2, 2026
0.10.2	May 2, 2026
0.10.0	May 2, 2026
0.9.2	May 2, 2026
0.9.1	May 2, 2026
0.8.0	May 1, 2026
0.7.0	May 1, 2026
0.6.0	May 1, 2026
0.5.4	May 1, 2026
0.5.2	May 1, 2026
0.5.1	May 1, 2026
0.5.0	May 1, 2026
0.4.0	May 1, 2026
0.3.2	May 1, 2026
0.3.1	May 1, 2026
0.3.0	May 1, 2026
0.2.0	May 1, 2026
0.1.0	May 1, 2026
0.0.1	May 1, 2026

Download files

Source Distribution

newb-0.18.0.tar.gz (63.9 kB), uploaded May 2, 2026 (Source)

Built Distribution

newb-0.18.0-py3-none-any.whl (67.9 kB), uploaded May 2, 2026 (Python 3)

File details

Details for the file newb-0.18.0.tar.gz.

File metadata

Download URL: newb-0.18.0.tar.gz
Upload date: May 2, 2026
Size: 63.9 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for newb-0.18.0.tar.gz
Algorithm	Hash digest
SHA256	`84935693cd7f8344ea831363f7ec463337cbcf75c24ad8459d63fb52514ec88c`
MD5	`02b7ef8dfb8e6feef742048e0855c008`
BLAKE2b-256	`7f202cda5c5ce3c1bf0a8c5525502f7278d47c6ef8b4efb5415ac1f7dee45c7d`

Provenance

The following attestation bundles were made for newb-0.18.0.tar.gz:

Publisher: publish-pypi.yml on ywatanabe1989/newb

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: newb-0.18.0.tar.gz
- Subject digest: 84935693cd7f8344ea831363f7ec463337cbcf75c24ad8459d63fb52514ec88c
- Sigstore transparency entry: 1428825521
- Sigstore integration time: May 2, 2026
Source repository:
- Permalink: ywatanabe1989/newb@a15a7e4dc352b86335f84bbeb5cc11ce1fc291e7
- Branch / Tag: refs/tags/v0.18.0
- Owner: https://github.com/ywatanabe1989
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@a15a7e4dc352b86335f84bbeb5cc11ce1fc291e7
- Trigger Event: push

File details

Details for the file newb-0.18.0-py3-none-any.whl.

File metadata

Download URL: newb-0.18.0-py3-none-any.whl
Upload date: May 2, 2026
Size: 67.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for newb-0.18.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`00b6c569130044a74440c36bcdf6bfe89379e60a77f4c0cb263fd505d519303b`
MD5	`2058f709428165360eb6ca7b68dd7af1`
BLAKE2b-256	`b077db98631486f3b28f944d4a5957da4863ace815c51f1cdf2bb7f3d31b971d`

Provenance

The following attestation bundles were made for newb-0.18.0-py3-none-any.whl:

Publisher: publish-pypi.yml on ywatanabe1989/newb

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: newb-0.18.0-py3-none-any.whl
- Subject digest: 00b6c569130044a74440c36bcdf6bfe89379e60a77f4c0cb263fd505d519303b
- Sigstore transparency entry: 1428825547
- Sigstore integration time: May 2, 2026
Source repository:
- Permalink: ywatanabe1989/newb@a15a7e4dc352b86335f84bbeb5cc11ce1fc291e7
- Branch / Tag: refs/tags/v0.18.0
- Owner: https://github.com/ywatanabe1989
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@a15a7e4dc352b86335f84bbeb5cc11ce1fc291e7
- Trigger Event: push
