doer — one-file pipe-native agent. strands-agents + ollama only.

DOER

`stdin → agent → stdout`

A Unix citizen that thinks. In text, images, audio, and video.

install

# pipx — isolated, auto-updatable (recommended)
pipx install doer-cli

# pip — any venv
pip install doer-cli

# optional extras
pip install 'doer-cli[mlx]'   # local inference + LoRA training (Apple Silicon)
pip install 'doer-cli[vlm]'   # vision/audio/video + VLM LoRA
pip install 'doer-cli[hf]'    # huggingface dataset upload
pip install 'doer-cli[gr00t]' # Isaac GR00T policy-server client (ZMQ)
pip install 'doer-cli[all]'   # everything

Two binaries land on $PATH: do (short) and doer (long).

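do is a reserved word in bash and zsh (the loop keyword in for/while ... do ... done), so an unquoted do at the start of a command line can be parsed as the keyword instead of the binary. If the shell reports a syntax error, or tab-completion misbehaves, call doer directly or alias it:
echo 'alias do="doer"' >> ~/.zshrc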

run

# text
do "find files larger than 100MB"

# pipe
cat error.log | do "what broke"
git log -20   | do "write release notes"
curl -s api.io | do "summarize" | tee out.md

# multimodal (auto-routes to mlx-vlm on Apple Silicon)
do --img screenshot.png "what's in this UI?"
do --audio meeting.wav   "transcribe and bullet the action items"
do --video clip.mp4      "what's happening here?"
do --img a.png --audio b.wav "..."   # omni model (auto-picked)

what it is

Agent(
    model=<auto: bedrock | ollama | mlx | mlx-vlm>,
    tools=[shell] + hot_reload("./tools"),
    system_prompt=SOUL.md + AGENTS.md + ~/.doer_history
                + ~/.bash_history + ~/.zsh_history + own_source,
)(stdin + argv + images + audio + video)

One file. doer/__init__.py — ~730 lines. Reads your shell like a person reads a room. Trains on its own transcripts. Swaps its own brain.
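
The `stdin + argv` part of the call above can be pictured with a small sketch. This is not doer's actual source: the helper names `build_prompt` and `read_stdin_if_piped` are hypothetical, and placing the piped text before the question is an assumption made for illustration.

```python
import sys


def build_prompt(argv, stdin_text):
    """Join the command-line question and any piped text into one prompt."""
    question = " ".join(argv).strip()
    piped = stdin_text.strip()
    # Assumption: piped input goes first so the question reads as an
    # instruction about it. doer may order or label these differently.
    parts = [p for p in (piped, question) if p]
    return "\n\n".join(parts)


def read_stdin_if_piped(stream=sys.stdin):
    """Return piped text, or "" when stdin is an interactive terminal."""
    return "" if stream.isatty() else stream.read()
```

With this shape, `cat error.log | do "what broke"` would hand the model the log followed by the question, while a bare `do "hi"` would send only the question.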

context it sees every call

source	what
`SOUL.md` (cwd)	who it is in this project
`AGENTS.md` (cwd)	rules for this project
`~/.doer_history`	last N Q/A (`DOER_HISTORY=10`)
`~/.bash_history` + `~/.zsh_history`	last N commands (`DOER_SHELL_HISTORY=20`)
`./tools/*.py`	hot-reloaded `@tool` functions
own source	full self-awareness
`--img / --audio / --video`	raw media sent to a VLM (routed automatically)

No database. No config file. The filesystem is the memory.
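
As a rough illustration of "the filesystem is the memory", the sketch below assembles a prompt from the files in the table. It is a guess at the shape, not doer's code: `tail_lines`, `build_context`, the section headings, and applying the shell-history limit per file are all assumptions.

```python
from pathlib import Path


def tail_lines(path, n):
    """Return the last n lines of a text file, or [] if it is missing."""
    p = Path(path)
    if n <= 0 or not p.is_file():
        return []
    # errors="replace" because shell history files can contain stray bytes.
    return p.read_text(encoding="utf-8", errors="replace").splitlines()[-n:]


def build_context(cwd, home, n_history=10, n_shell=20):
    """Concatenate project files and history tails into one prompt string."""
    cwd, home = Path(cwd), Path(home)
    sections = []
    # Project-level files are read from the current working directory.
    for name in ("SOUL.md", "AGENTS.md"):
        f = cwd / name
        if f.is_file():
            sections.append(f"## {name}\n{f.read_text(encoding='utf-8')}")
    qa = tail_lines(home / ".doer_history", n_history)
    if qa:
        sections.append("## recent Q/A\n" + "\n".join(qa))
    # Assumption: the limit applies to each history file separately.
    shell = []
    for name in (".bash_history", ".zsh_history"):
        shell += tail_lines(home / name, n_shell)
    if shell:
        sections.append("## recent shell commands\n" + "\n".join(shell))
    return "\n\n".join(sections)
```

Missing files are skipped silently, which matches the "no config file" idea: a project without SOUL.md simply contributes nothing to the prompt.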

providers

Auto-picked from what's on your machine. Override with DOER_PROVIDER.

provider	when	model default
`bedrock`	AWS creds present (`AWS_BEARER_TOKEN_BEDROCK` / STS / SSO)	`global.anthropic.claude-opus-4-7` (1M ctx)
`mlx-vlm`	Apple Silicon + `[vlm]` extra + `--img/--audio/--video`	`Qwen2.5-VL-3B` / `gemma-3n` / `Qwen3-Omni`
`mlx`	Apple Silicon + `[mlx]` extra (text-only, trained adapter)	`mlx-community/Qwen3-1.7B-4bit`
`ollama`	fallback — local, private, no keys	`qwen3:1.7b`

# force a provider
DOER_PROVIDER=ollama do "quick ping"
DOER_PROVIDER=mlx DOER_ADAPTER=~/.doer_adapter do "use my trained self"
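
One way to read the provider table is as a chain of checks. The sketch below walks the rows top to bottom; that precedence is an assumption (the table does not state which row wins when several match), and `pick_provider` and its boolean arguments are hypothetical names, not doer's API.

```python
def pick_provider(env, has_aws_creds, apple_silicon, extras, has_media):
    """Return a provider name, checking rows in the order the table lists them.

    Assumption: precedence follows table order; the real rules may differ.
    """
    forced = env.get("DOER_PROVIDER", "")
    if forced:
        return forced  # an explicit DOER_PROVIDER override wins
    if has_aws_creds:
        return "bedrock"
    if apple_silicon and "vlm" in extras and has_media:
        return "mlx-vlm"
    if apple_silicon and "mlx" in extras:
        return "mlx"
    return "ollama"  # fallback: local, no keys
```

For example, `pick_provider({}, False, True, {"mlx"}, False)` yields `"mlx"`, and any machine with no credentials and no extras falls through to `"ollama"`.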

env knobs

# provider selection
DOER_PROVIDER=                   # "" (auto) | bedrock | ollama | mlx | mlx-vlm

# bedrock (defaults tuned for Claude Opus 4.7)
DOER_BEDROCK_MODEL=global.anthropic.claude-opus-4-7
DOER_BEDROCK_REGION=us-west-2
DOER_MAX_TOKENS=128000           # Opus 4.7 native max
DOER_TEMPERATURE=                # unset on Opus 4.7+ (returns 400 otherwise)
DOER_TOP_P=                      # unset on Opus 4.7+
DOER_CACHE_PROMPT=               # "1" / "true" → prompt caching
DOER_ANTHROPIC_BETA=context-1m-2025-08-07   # csv; auto on Claude — "" to disable
DOER_ADDITIONAL_REQUEST_FIELDS=  # raw JSON escape hatch
DOER_BEDROCK_GUARDRAIL_ID=       # optional Bedrock guardrail
DOER_BEDROCK_GUARDRAIL_VERSION=

# ollama
DOER_MODEL=qwen3:1.7b
OLLAMA_HOST=http://localhost:11434

# mlx (Apple Silicon)
DOER_MLX_MODEL=mlx-community/Qwen3-1.7B-4bit
DOER_ADAPTER=                    # path to trained text LoRA
DOER_MLX_VLM_MODEL=mlx-community/Qwen2.5-VL-3B-Instruct-4bit
DOER_MLX_AUDIO_MODEL=mlx-community/gemma-3n-E2B-it-4bit
DOER_MLX_OMNI_MODEL=mlx-community/Qwen3-Omni-30B-A3B-Instruct-4bit
DOER_VLM_ADAPTER=                # path to trained VLM LoRA

# context
DOER_HISTORY=10                  # Q/A rows in prompt
DOER_SHELL_HISTORY=20            # shell rows in prompt
DOER_DEBUG=                      # "1" → verbose errors
DOER_MAX_SEQ_LEN=16384           # LoRA training max seq length

# huggingface upload
DOER_HF_REPO=<user>/doer-training   # override target
HF_TOKEN=                           # or `huggingface-cli login`
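
The knobs above come in three flavors: integers with a default, on/off flags, and values that should stay unset unless given. A minimal sketch of parsing each kind follows; the helper names are hypothetical, and treating flag values case-insensitively is an assumption beyond the documented "1" / "true".

```python
import os


def env_int(name, default, env=os.environ):
    """Integer knob such as DOER_HISTORY; empty or unset falls back to default."""
    raw = env.get(name, "").strip()
    return int(raw) if raw else default


def env_flag(name, env=os.environ):
    """Boolean knob such as DOER_CACHE_PROMPT; "1" or "true" switches it on."""
    return env.get(name, "").strip().lower() in ("1", "true")


def env_optional_float(name, env=os.environ):
    """Knob such as DOER_TEMPERATURE; None means "do not send this field"."""
    raw = env.get(name, "").strip()
    return float(raw) if raw else None
```

Returning `None` for an unset temperature keeps the field out of the request entirely, which is the behavior the comments on DOER_TEMPERATURE and DOER_TOP_P describe for Opus 4.7+.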

extend in 60 seconds

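# ./tools/weather.py
from urllib.parse import quote
from urllib.request import urlopen

from strands import tool


@tool
def weather(city: str) -> str:
    """Weather for a city."""
    # Quote the city so names with spaces or non-ASCII characters form a valid URL.
    url = f"https://wttr.in/{quote(city)}?format=3"
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode()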

Next call: do "istanbul weather?" — hot-reloaded, no restart.

the loop: collect → train → swap

doer closes its own training loop. Every call appends a dense, self-contained record to ~/.doer_training.jsonl — full system prompt, all messages, tool specs, native <tool_call> tokens preserved.

# 1. collect (automatic — just use doer)
do "fix this stacktrace" < err.log
do --img ui.png "label the bugs in this screenshot"
# ... 100+ real turns across text/image/audio/video

# 2. inspect
do --train-status
# → 127 turns | 2453.1KB | sha256:250c406b | ~/.doer_training.jsonl
#     text:102  image:18  audio:3  video:4
#     hf:    cagataydev/doer-training | in sync

# 3. train — in-process LoRA (no trainer indirection, ~50 lines calling mlx_lm.tuner)
do --train 200                # text LoRA     → ~/.doer_adapter
do --train-vlm 300            # vision LoRA   → ~/.doer_vlm_adapter

# 4. use your trained self
DOER_PROVIDER=mlx     DOER_ADAPTER=~/.doer_adapter         do "fix this" < err.log
DOER_PROVIDER=mlx-vlm DOER_VLM_ADAPTER=~/.doer_vlm_adapter do --img x.png "what's this?"

Training preserves native tool-call tokens via the tokenizer's chat template — your adapter learns real tool-use, not string mimicry.
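
The `--train-status` line shown in step 2 can be approximated from the log file alone. The sketch below is an illustration under assumptions: one JSON record per non-empty line, and a digest taken over the raw file bytes and truncated to eight hex characters. `train_status` is a hypothetical helper, and the per-modality counts and HF sync line are left out.

```python
import hashlib
from pathlib import Path


def train_status(path):
    """Summarise a JSONL training log: record count, size in KB, short sha256."""
    data = Path(path).read_bytes()
    # Assumption: each non-empty line is one recorded turn.
    turns = sum(1 for line in data.splitlines() if line.strip())
    # Assumption: the displayed digest is the first 8 hex chars of the file hash.
    digest = hashlib.sha256(data).hexdigest()[:8]
    return f"{turns} turns | {len(data) / 1024:.1f}KB | sha256:{digest} | {path}"
```

Because the digest depends only on file contents, comparing it with a stored remote value is enough to tell whether a local log and an uploaded copy have diverged.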

train in the cloud (HuggingFace Jobs)

Laptop LoRA is great for 500-turn datasets on Qwen3-1.7B. When you want to scale up — bigger models, full fine-tunes, VLM, Omni — burn HF credits instead of your battery.

# one-shot dispatchers (UV scripts, zero setup)
doer --hf-jobs gen                     # Generate dataset (CPU, ~$0.60/500 prompts)
doer --hf-jobs text                    # Qwen3-1.7B LoRA, T4, ~$0.30
doer --hf-jobs vlm                     # Qwen2.5-VL-3B LoRA, A100, ~$5
doer --hf-jobs omni                    # Qwen2.5-Omni-7B, H200, ~$10

# generate dataset from your own prompts (one per line, dedupes against existing)
doer --hf-jobs gen my_prompts.txt --iters 500
doer --hf-jobs gen hf://Anthropic/hh-rlhf:chosen --iters 1000

# override anything via env or flags
MODEL=Qwen/Qwen3-4B FLAVOR=a10g-large doer --hf-jobs text --iters 1000
PROVIDER=ollama CONCURRENCY=16 doer --hf-jobs gen prompts.txt

# monitor
doer --hf-jobs ps
doer --hf-jobs logs <job_id>
doer --hf-jobs hw          # list hardware + cost/hour
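
The gen dispatcher above is described as reading one prompt per line and deduplicating against what already exists. A minimal sketch of that filtering step, assuming exact-match comparison after trimming whitespace (the real script may normalize differently), with `new_prompts` as a hypothetical name:

```python
def new_prompts(candidate_lines, existing_prompts):
    """Return prompts not yet seen, preserving first-seen order."""
    seen = {p.strip() for p in existing_prompts}
    fresh = []
    for line in candidate_lines:
        prompt = line.strip()
        # Skip blank lines and anything already in the dataset or this batch.
        if prompt and prompt not in seen:
            seen.add(prompt)
            fresh.append(prompt)
    return fresh
```

Adding each accepted prompt to `seen` also removes duplicates inside the input file itself, so rerunning with the same prompts file generates nothing new.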

Under the hood each dispatcher is one self-contained UV script bundled inside doer/hf_jobs/ — no repo cloning, no Dockerfile. The script pulls cagataydev/doer-training (your dataset), runs SFT LoRA with trl + peft, merges the adapter, and pushes the full merged model to cagataydev/doer-<model-short> automatically.

Validated end-to-end on T4-medium ($0.60/hr):

522 turns → 468 train / 53 eval
Qwen3-1.7B, LoRA r=16 (17.4M params, 1% of base)
50 steps, 33 min → eval_loss 0.149, token accuracy 97.6%
3.44 GB merged model auto-pushed to private HF repo
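
The figures above can be cross-checked with simple arithmetic. The sketch assumes billing is proportional to wall-clock time and takes the base size as the nominal 1.7B from the model name; both are assumptions, and the function names are illustrative.

```python
def job_cost(minutes, dollars_per_hour):
    """Cost of a job billed pro rata by the hour."""
    return dollars_per_hour * minutes / 60


def lora_fraction(adapter_params, base_params):
    """Share of the base model's parameters that the adapter trains."""
    return adapter_params / base_params


cost = job_cost(33, 0.60)            # 33 min on a $0.60/hr T4-medium
frac = lora_fraction(17.4e6, 1.7e9)  # 17.4M adapter params, nominal 1.7B base
print(f"${cost:.2f}, {frac:.1%}")
```

Under these assumptions the run comes to about $0.33, in line with the ~$0.30 quoted for the text dispatcher, and the adapter is about 1.0% of the base model.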

Use the trained model anywhere:

DOER_PROVIDER=transformers DOER_MODEL=cagataydev/doer-qwen3-17b do "what is doer"

See the bundled hf_jobs/README.md (also accessible at $(doer --hf-jobs)/README.md after install) for full details, the generator (gen_dataset.py), and the three trainers (train_text_lora.py, train_vlm.py, train_omni.py).

share the dataset (HuggingFace)

pip install 'doer-cli[hf]'

do --upload-hf                       # → <user>/doer-training (private)
do --upload-hf cagataydev/my-data    # custom repo
do --upload-hf-public                # public dataset

Idempotent — one atomic commit per run (train.jsonl + README with schema/stats/sha). Reuses huggingface-cli login or HF_TOKEN. --train-status shows local sha vs last remote commit.

Round-trip anywhere:

hf download cagataydev/doer-training --repo-type dataset --local-dir /tmp/d
cp /tmp/d/data/train.jsonl ~/.doer_training.jsonl
do --train 200

robotics: Isaac GR00T (v0.8.0+)

doer speaks the Isaac GR00T policy-server protocol (ZMQ REQ/REP, msgpack) natively. Use it as the brain that plans robot actions, or as a pipe that converts observation JSON into action JSON — no LLM, no tokens, just stdin → action.

pip install 'doer-cli[gr00t]'   # adds pyzmq + msgpack + numpy + pillow

pipe mode — raw ZMQ client, LLM bypass:

echo '{"state.joint_pos":[0,0,0,0,0,0,0]}' | doer --gr00t "pick up cube"
# → {"action":{"action.joint_pos":[[0.01,...]]}, "info":{"inference_time_ms":42}}
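
Pipe mode is, in shape, a JSON-in, JSON-out filter around a policy call. The sketch below shows only that shape: the ZMQ and msgpack transport is replaced by a plain callable, `run_pipe` and `zero_policy` are hypothetical names, and timing the call locally to fill `inference_time_ms` is an assumption (the real value may come from the server).

```python
import json
import time


def run_pipe(policy, instruction, stdin, stdout):
    """Read one observation JSON from stdin, write one action JSON to stdout."""
    observation = json.load(stdin)
    start = time.perf_counter()
    # In doer this step would be a request to the GR00T policy server.
    action = policy(observation, instruction)
    elapsed_ms = round((time.perf_counter() - start) * 1000)
    json.dump({"action": action, "info": {"inference_time_ms": elapsed_ms}}, stdout)
    stdout.write("\n")


def zero_policy(observation, instruction):
    """Stand-in policy: echoes joint positions as a one-step action chunk."""
    return {"action.joint_pos": [observation["state.joint_pos"]]}
```

Keeping the policy behind a callable makes the filter testable without a CUDA box; a real client would swap `zero_policy` for a function that talks to the server.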

brain mode — LLM calls gr00t_action as a tool:

export DOER_GR00T_HOST=thor.local
do "snap /tmp/cam.jpg with libcamera-still, then tell gr00t to stack red on blue"

auto-spawn the server (on a CUDA box):

doer --gr00t-serve /path/to/gr00t-n1.7-ckpt --embodiment-tag new_embodiment

Helpers: doer --gr00t-ping · doer --gr00t-schema · doer --gr00t-reset. Env knobs: DOER_GR00T_HOST / PORT / EMBODIMENT / TIMEOUT_MS / API_TOKEN. Every gr00t_action call lands in ~/.doer_training.jsonl as a native tool-use record — doer --train learns when to ask the policy server from the log.

philosophy

┌─────┐       ┌──────┐       ┌──────┐
│stdin│──────▶│  do  │──────▶│stdout│
└─────┘       └──────┘       └──────┘

grep with a brain. Chain it. Script it. Cron it.

Read SOUL.md for the manifesto. Read AGENTS.md for the rules.

family

project	size	purpose
doer	~730 LOC	one pipe, one shell, one file, one loop (collect→train→swap)
DevDuck	60+ tools	every protocol, every edge

uninstall

pipx uninstall doer-cli    # or: pip uninstall doer-cli
rm /usr/local/bin/do       # if installed via binary

license

Apache-2.0 · built in New York · 2026

do one thing and do it well — Doug McIlroy, 1978

Release history

version	date
0.8.0 (this version)	May 2, 2026
0.7.1	Apr 20, 2026
0.7.0	Apr 20, 2026
0.6.1	Apr 20, 2026
0.6.0	Apr 20, 2026
0.4.0	Apr 20, 2026
0.3.0	Apr 19, 2026
0.2.1	Apr 19, 2026
0.2.0	Apr 19, 2026

Download files

Source Distribution

doer_cli-0.8.0.tar.gz (49.9 kB)

Uploaded May 2, 2026 Source

Built Distribution

doer_cli-0.8.0-py3-none-any.whl (52.2 kB)

Uploaded May 2, 2026 Python 3

File details

Details for the file doer_cli-0.8.0.tar.gz.

File metadata

Download URL: doer_cli-0.8.0.tar.gz
Upload date: May 2, 2026
Size: 49.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.13

File hashes

Hashes for doer_cli-0.8.0.tar.gz
Algorithm	Hash digest
SHA256	`c45704ecb49c9297c6bc26a6ad01399b2182c0e2f50264a02565e37c59760933`
MD5	`6b3c2ba3544c33855371d34ac5571842`
BLAKE2b-256	`4f7c070cc7da093748df50698e016c68250c7b6e5753a8fdf95d7f7a2e5bc864`

File details

Details for the file doer_cli-0.8.0-py3-none-any.whl.

File metadata

Download URL: doer_cli-0.8.0-py3-none-any.whl
Upload date: May 2, 2026
Size: 52.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.13

File hashes

Hashes for doer_cli-0.8.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`89e933c55c12b9116ef726d38d3fadbaef2a9cbae42138d8de2b3bfa8e0a9697`
MD5	`8e8391d0e909b50d7f1d157c18d21cb0`
BLAKE2b-256	`ad12da07bb0c9a9ab5ef786e6313e5fd2ef20b93627f211431240edf7531a622`
