Skip to main content

Puras offline runner — run a skill's agent loop locally on your own LLM key.

Project description

Puras — local skill runner

Run Puras AI skills on your own machine, on your own LLM key — no account.

License: MIT PyPI Python

Docs · Cloud · Examples · Build a skill


A skill is a small folder — a prompt, an input/output schema, optional Python tools, and evals — that an agent runs end to end. This is the open-source runner that executes one entirely on your laptop: no Postgres, no bucket, no platform API, no sign-up. It's the same agent loop the hosted platform runs (one loop, two environments), so a skill behaves identically locally and in prod.

pip install "puras[local]"
puras run --local greeter --dir ./examples/hello-world -i name=Ada
  • 🧑‍💻 Local-first — run and iterate on a skill before deploying anything.
  • 🔌 Local APIpuras serve exposes the hosted job API on localhost, so you build and test your app offline before you deploy.
  • 🔑 Bring your own key — your provider, your bill, nothing billed by a platform.
  • 🪶 Dependency-light — the offline path needs no DB/bucket/openai stack.
  • 🔁 Prod parity — the same loop and contracts as Puras Cloud.

Getting started

Run it locally

The whole point of this repo — a skill's agent loop on your machine, on your key:

pip install "puras[local]"        # the puras CLI + the offline runner

# the bundled "hello world" skillpack has two skills: greeter + formatter
puras run --local greeter --dir ./examples/hello-world -i name="the Puras team"

It's BYO key: you call the provider directly and pay your own bill. The CLI reads your key from $ANTHROPIC_API_KEY (or the --api-key flag) and, if neither is set, prompts for it in the terminal.

From a checkout of this repo instead of PyPI:

pip install -e .             # puras-runner (the runtime)
pip install -e worker/sdk    # the puras CLI + SDK

…or use Puras Cloud (recommended for production)

The fastest way to run a skill with the full tool surface — media generation, web search/fetch, shared memory, persistent storage, durable resume, and eval suites at scale — is to sign up free at puras.co. No infrastructure to manage, automatic scaling, and a one-line MCP connect for Claude Code. The local runner here gives you the free, offline core; Cloud adds the managed, hosted surface for when you ship. The comparison below spells out exactly which is which.

Run a skill

puras run --local <skill> --dir <skillpack> -i KEY=VALUE [-i KEY2=VALUE2 ...]
  • <skill> — the skill to run; omit it when the bundle has exactly one.
  • --dir — the skillpack bundle root (a folder of <skill>/skill.yaml). Defaults to ..
  • -i KEY=VALUE — an input, repeatable; validated against the skill's input_schema.
  • --model claude/sonnet-4-6 — override the skill's model for this run.
  • --api-key sk-... — your LLM key, if it isn't already in the environment.

Events stream to your console as the agent works; the final JSON output and a token tally (informational — you paid your provider, not Puras) print at the end.

Programmatic use is the same loop:

from worker.local_run import run_local

res = run_local("./examples/hello-world", {"name": "Ada"}, skill="greeter")
print(res["output"])

Run a skill's evals

Evals are to a skill what unit tests are to code. If a skill declares an evals: block, run its suite locally and gate on it:

puras eval --local content-repurposer --dir ./examples/content-studio --threshold 80

check / exact_match / schema graders run free; a rubric (LLM-as-judge) grader runs on your BYO key. --threshold N is a CI gate — non-zero exit if the pass-rate is below N.

Build your app against a local API

puras run --local answers "does my skill work?". When you're building the app that calls the skill, you want the other half: a local server that speaks the same API your app will hit in production. That's puras serve:

puras serve --dir ./examples/hello-world          # → http://127.0.0.1:8787

It mirrors the hosted job API (POST /v1/jobs, GET /v1/jobs/{id}, …/events, …/spans) backed by the offline runner — in-memory, zero extra dependencies. Point any Puras SDK at it by changing one thing — the base URL — and your app runs unchanged, offline, on your own key:

import puras

# api_base is the only thing that differs between local and prod
client = puras.Client(api_key="local", api_base="http://127.0.0.1:8787", skillpack="local")
print(client.run("greeter", {"name": "Ada"}))
import { Puras } from "puras";
const puras = new Puras({ apiKey: "local", apiBase: "http://127.0.0.1:8787", skillpack: "local" });
console.log(await puras.run("greeter", { name: "Ada" }));

…or just curl it:

curl -s "http://127.0.0.1:8787/v1/jobs?wait=true" \
  -H "content-type: application/json" \
  -d '{"skill": "greeter", "inputs": {"name": "Ada"}}'

When you ship, change the base URL to https://api.puras.co and puras deploy — the same app code now runs against the managed platform. Auth is open locally; --require-key <token> emulates API-key auth, and --host / --port change where it binds. (The Python and React-Native SDKs poll, so they work as-is; live SSE streaming is a Cloud feature.)

Build your own skill

cp -r examples/skillpack-template my-skillpack
$EDITOR my-skillpack/my-skill/SKILL.md      # the prompt
$EDITOR my-skillpack/my-skill/skill.yaml    # schema, model, tools, evals
puras run --local --dir ./my-skillpack -i topic=otters
my-skillpack/
  my-skill/
    SKILL.md          # the agent's instructions (system prompt)
    skill.yaml        # input/output schema + model + tools + evals
    tools/...         # optional deterministic Python tools the agent can call

When you're happy, the same bundle deploys to Puras Cloud unchanged. See the docs to deploy and call skills over the API.

Open-source vs Cloud

Puras is open-core: the runner — the agent loop and the local tool surface — is MIT-licensed and runs fully offline, forever. The hosted platform at puras.co is how the project is sustainably funded, and it adds the managed surface that can't exist on a single laptop. Premium isn't a crippled core — it's the capabilities that need real infrastructure.

Local runner (this repo, MIT) Puras Cloud (hosted)
Setup & maintenance pip install, you run it Fully managed, nothing to install
LLM key & billing Bring your own key, you pay the provider Managed, usage-based, transparent pricing
Agent loop & local tools ✓ text, bash, file tools, your Python tools, in-process subagents ✓ same loop
Job API for your app puras serve — the job API on localhost ✓ api.puras.co — managed, scaled, durable
Evals (check/schema/rubric) ✓ per run + offline suites ✓ + suites at scale, CI gating, version diffs
Media (image/video/audio) ✓ generation + persistence
Web search / fetch / browser
Shared memory ✓ persistent, local SQLite ✓ persistent, workspace-scoped + semantic (pgvector)
Persistent storage ✓ bucket-backed drive
Durable resume ✓ checkpointed, survives worker restarts
Budgets, tracing, dashboard console events + a token tally ✓ spend budgets, OTel spans, run timelines
Marketplace & sharing
Support Issues & Discussions priority / SLA

The hosted-only tools (worker/agent_runner.py:PLATFORM_ONLY_TOOLS) are simply not offered to the model offline, so a skill that needs them still runs — it just won't see those tools locally. The included examples (hello-world, skillpack-template, content-studio) use only the local surface and run end-to-end offline.

Workspace memory works offline too. The memory_search / memory_get / memory_put / memory_forget tools and the job-start memory injection are backed by a local SQLite file (worker/memory_store_sqlite.py) — the same agent-facing contract and the same hybrid ranking (exact identity + lexical, RRF-fused and shaped by recency × importance) as the hosted Postgres store, just without the pgvector semantic arm. So a skill's "shared brain" persists across puras run --local invocations. The DB lives next to the local drive root by default; point it anywhere with PURAS_LOCAL_MEMORY_PATH. Hosted memory adds semantic (embedding) retrieval and cross-workspace scope on top.

How it works

The runner runs the agent on a LocalRunContext with platform_enabled=False (but memory_enabled=True — its memory_backend() returns the SQLite store). That RunContext seam is the whole trick: one agent loop, two environments. The offline import path is kept dependency-light — no Postgres/bucket/openai at import time — and that's enforced by tests/dry/test_local_import_isolation.py.

Path What
worker/ The puras-runner runtime — the agent loop (agent_runner.py), the RunContext seam, the local entrypoint (local_run.py), skill loading, the eval runner.
worker/sdk/ The puras package — the CLI and the SDK skills import at runtime. Ships as its own wheel.
examples/ Runnable, offline-capable skillpacks.
tests/ The dependency-light import-isolation guard, a CLI smoke test, and the puras serve API tests.

Community & contributing

Questions, bugs, and skill ideas are welcome in Issues and Discussions. PRs that improve the runner, the docs, or the examples are appreciated.

License

MIT. The hosted platform's server-side code is separate and commercial — this runner, the SDK, and the examples are MIT and yours to use, modify, and self-run.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

puras_runner-0.2.0.tar.gz (196.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

puras_runner-0.2.0-py3-none-any.whl (222.3 kB view details)

Uploaded Python 3

File details

Details for the file puras_runner-0.2.0.tar.gz.

File metadata

  • Download URL: puras_runner-0.2.0.tar.gz
  • Upload date:
  • Size: 196.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for puras_runner-0.2.0.tar.gz
Algorithm Hash digest
SHA256 3ff7122158874bad772d48c389d636f23c1663579d96d195a01892910a326761
MD5 a436eb0fb21fc8bb9378b5e5f5a9779f
BLAKE2b-256 deba854339829c457224a8bbd126bda91096e462f5bb43a3cab8704bbadf262a

See more details on using hashes here.

Provenance

The following attestation bundles were made for puras_runner-0.2.0.tar.gz:

Publisher: publish.yml on PurasAI/puras

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file puras_runner-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: puras_runner-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 222.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for puras_runner-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 7e7c772504e3ce7fdadedb2c50cca43540d155c003dbf90fbc9ba19d957174a4
MD5 661738aa0226f9f8ce4ea253b756c3cf
BLAKE2b-256 410677c797039f7fcd9996a26f3bb3aa82cf4653638ff54453744d7460c58932

See more details on using hashes here.

Provenance

The following attestation bundles were made for puras_runner-0.2.0-py3-none-any.whl:

Publisher: publish.yml on PurasAI/puras

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page