Test your package through the eyes of a newbie agent — a fresh AI agent reads only your docs/skills and tries to use your package.

newb

A fresh AI agent reads only your docs and tries to use your package. If it succeeds, your docs work.

How it works

┌──────────────────────┐    ┌────────────────────────────────────┐    ┌──────────────────────┐
│   Your package       │    │   claude-agent-sdk                 │    │   Report             │
│                      │    │   (Anthropic, official)            │    │                      │
│   ./docs/   or       │    │                                    │    │   what_for           │
│   ./_skills/<pkg>/   │ →  │   fresh Claude Code session,       │ →  │   problems_solved    │
│   tests_newb.yaml    │    │   setting_sources=[],              │    │   quick_start        │
│   (optional)         │    │   allowed_tools=["Read"]           │    │   when_not_to_use    │
│                      │    │   cwd=<staged copy of your docs>   │    │   tests[] (pass/fail)│
└──────────────────────┘    └────────────────────────────────────┘    └──────────────────────┘
            │                              ▲
            │                              │  newb sends N prompts via the SDK's
            └────── stages copy ───────────┘  ``query()`` async iterator
                                              (structured streaming, no --print)

newb owns the test schema (4 canonical questions + tests_newb.yaml + graders + report rendering). The SDK owns everything else: session lifecycle, transport, message structuring, tool execution. No docker, no multiplexer, no wire format: just a Python import.
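The flow in the diagram can be sketched directly against claude-agent-sdk. This is a minimal illustration under stated assumptions, not newb's actual source: `isolation_kwargs` and `ask` are hypothetical names, and the SDK import is deferred so the snippet loads even where the SDK is not installed.

```python
from pathlib import Path


def isolation_kwargs(staged_dir: Path) -> dict:
    """Options shown in the diagram: no host settings, Read-only tools, cwd at the staged copy."""
    return {
        "setting_sources": [],      # skip host CLAUDE.md / .claude settings
        "allowed_tools": ["Read"],  # file reads only
        "cwd": str(staged_dir),     # staged copy of the docs
    }


async def ask(prompt: str, staged_dir: Path) -> str:
    """Send one prompt through the SDK's query() async iterator and collect the reply text."""
    # Deferred import: assumes claude-agent-sdk is installed and auth is configured.
    from claude_agent_sdk import AssistantMessage, ClaudeAgentOptions, TextBlock, query

    options = ClaudeAgentOptions(**isolation_kwargs(staged_dir))
    chunks = []
    async for message in query(prompt=prompt, options=options):
        if isinstance(message, AssistantMessage):
            for block in message.content:
                if isinstance(block, TextBlock):
                    chunks.append(block.text)
    return "".join(chunks)


# Usage (needs the SDK and credentials; prompt and path are illustrative):
#   import asyncio
#   print(asyncio.run(ask("What is this package for?", Path("./docs"))))
```

Keeping the options in a plain dict makes the isolation settings easy to inspect and test without starting a session.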

Install

pip install newb

claude-agent-sdk (Anthropic, MIT) is pulled in as a dependency.

Use

newb ./docs                                # any dir of .md files
newb ./src/mypkg/_skills/mypkg             # standard SciTeX layout
newb https://github.com/user/repo.git      # git URL — shallow-clones
newb ./docs --format markdown >> README.md
newb ./docs --runtime docker               # hard isolation in container
newb ./docs --runtime apptainer            # HPC variant

Isolation runtimes (--runtime)

| Value | Where the agent runs | Isolation | Speed |
|---|---|---|---|
| host | host subprocess via claude-agent-sdk | soft (Read tool can technically reach host fs) | ~10-15s/q |
| docker | ghcr.io/ywatanabe1989/newb-runner container, only <staged> mounted ro | hard (real fs + network ns) | ~15-20s/q |
| apptainer | same image via apptainer run docker://... (HPC) | hard (rootless) | ~20-30s/q |

The container image is published from containers/Dockerfile in this repo via .github/workflows/publish-image.yml. Override the image with NEWB_DOCKER_IMAGE=....
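To make "only <staged> mounted ro" concrete, here is a hedged sketch of how a docker command line with a read-only bind mount could be assembled. It is not newb's actual invocation: `docker_argv` and the `/work` mount point are hypothetical, and only the image name and the NEWB_DOCKER_IMAGE override come from this README.

```python
from __future__ import annotations

import os
from pathlib import Path
from typing import Mapping

DEFAULT_IMAGE = "ghcr.io/ywatanabe1989/newb-runner"  # image named in this README


def docker_argv(staged_dir: Path, env: Mapping[str, str] | None = None) -> list[str]:
    """Build a `docker run` argv that mounts only the staged docs, read-only."""
    env = os.environ if env is None else env
    image = env.get("NEWB_DOCKER_IMAGE", DEFAULT_IMAGE)  # env override wins
    # ":ro" makes the bind mount read-only; /work is a hypothetical mount point.
    mount = f"{staged_dir.resolve()}:/work:ro"
    return ["docker", "run", "--rm", "-v", mount, "-w", "/work", image]


print(docker_argv(Path("./docs"), env={}))
```

A real invocation would also have to pass credentials into the container; that part is left out here.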

newb asks a fresh Claude agent four canonical questions (what for, problems solved, quick start, when not to use), plus any author-defined tests in tests_newb.yaml. Output is JSON (for CI) or markdown.

Author tests (tests_newb.yaml)

- name: redirects_parallel
  prompt: How do I run things in parallel?
  expect_contains: ["does not"]
  judge: "Must redirect to an alternative tool, not hallucinate."

Each test can combine substring grading (expect_contains) with an LLM judge (judge); both are optional.
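The substring half of grading can be illustrated with a few lines of Python. This is a sketch, not newb's grader: `substring_grade` is a hypothetical name, and it assumes that every listed substring must appear and that matching is case-insensitive. The LLM judge is not modelled.

```python
def substring_grade(answer: str, expect_contains: list[str]) -> bool:
    """Pass only if every expected substring occurs in the answer (case-insensitive)."""
    lowered = answer.lower()
    return all(s.lower() in lowered for s in expect_contains)


# Mirrors the tests_newb.yaml entry above, written as a plain dict to avoid a YAML dependency.
test = {
    "name": "redirects_parallel",
    "prompt": "How do I run things in parallel?",
    "expect_contains": ["does not"],
}

# Two made-up agent answers: one that redirects, one that hallucinates an API.
good = "This package does not handle parallelism; use a job scheduler instead."
bad = "Call run_parallel() with n_jobs=4."

print(substring_grade(good, test["expect_contains"]))  # True
print(substring_grade(bad, test["expect_contains"]))   # False
```

An empty expect_contains list passes trivially, which matches the idea that substring grading is optional.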

Auth

export ANTHROPIC_API_KEY=sk-ant-api03-...     # canonical, ToS-clean

Per Anthropic's commercial ToS, products built on the Claude Agent SDK should use API key auth. On a personal machine where you've run claude to authenticate, an existing ~/.claude/ OAuth login also works (the SDK's bundled CLI inherits it), but Anthropic doesn't sanction this for redistributed products.
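A pre-flight check for the two auth paths could look like the sketch below. `detect_auth` is a hypothetical helper, not part of newb, and the presence of a ~/.claude/ directory is only a hint that a local login exists, not proof of one.

```python
from __future__ import annotations

import os
from pathlib import Path
from typing import Mapping


def detect_auth(env: Mapping[str, str] | None = None, home: Path | None = None) -> str:
    """Return 'api_key', 'local_login', or 'none' based on the environment."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else home
    if env.get("ANTHROPIC_API_KEY"):
        return "api_key"      # canonical path described above
    if (home / ".claude").is_dir():
        return "local_login"  # assumption: directory presence suggests a prior login
    return "none"


print(detect_auth())
```

Checking the API key first reflects the preference stated above: the key is the canonical path, the local login a personal-machine fallback.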

Isolation

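setting_sources=[] skips your host CLAUDE.md / .claude/ settings. allowed_tools=["Read"] limits the agent to file reads. cwd is set to a temporary copy of your skills dir, so the agent starts with only the package's own content in view. No Bash, no Write, no WebFetch. Under the host runtime this isolation is soft: the Read tool can technically still reach the host filesystem, so use --runtime docker or --runtime apptainer when you need a hard boundary.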
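The staging step can be sketched with the standard library alone. `stage_docs` is a hypothetical name, and copying the whole tree is an assumption; newb may filter which files it stages.

```python
import shutil
import tempfile
from pathlib import Path


def stage_docs(src: Path) -> Path:
    """Copy src into a fresh temporary directory and return the copy's path."""
    staged = Path(tempfile.mkdtemp(prefix="newb-staged-"))
    # mkdtemp already created the directory, so allow copying into it.
    shutil.copytree(src, staged, dirs_exist_ok=True)
    return staged


# Tiny demonstration with a throwaway source tree.
demo_src = Path(tempfile.mkdtemp(prefix="newb-demo-"))
(demo_src / "README.md").write_text("# demo\n")
demo_staged = stage_docs(demo_src)
print(sorted(p.name for p in demo_staged.iterdir()))  # ['README.md']
```

Pointing the agent's cwd at the copy rather than the original keeps the session away from the working tree; it does not by itself stop absolute-path reads.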

Library

import newb

# Run newb against ./docs and collect the report.
report = newb("./docs")
# Render the report as markdown.
print(newb.render_markdown(report))

Requirements

  • Python 3.10+
  • claude-agent-sdk (auto-installed; bundles a Claude CLI)
  • $ANTHROPIC_API_KEY (or local claude login)

License

newb itself: AGPL-3.0-only. The bundled claude-agent-sdk: MIT, governed by Anthropic's Commercial Terms of Service.
