A backend-agnostic sensorimotor protocol — eyes and hands for cognitive agents driving a computer.
afferent
A cognitive layer (a brain) plans; an embodiment layer (a body) acts.
afferent is the conduit between them. It carries afferent signals up
(eyes — observe / locate / verify / read_text) and efferent
signals down (hands — click / type_text / key / scroll), as typed,
safety-gated calls over a pluggable backend.
┌─────────┐   afferent (eyes) ↑    ┌────────────┐   actions    ┌──────────┐
│  brain  │ ◀───────────────────── │  afferent  │ ──────────▶  │   body   │
│ (plans) │ ──────────────────────▶│ (protocol) │ ◀──────────  │ (backend)│
└─────────┘   efferent (hands) ↓   └────────────┘ observations └──────────┘
The core is dependency-free (stdlib only). It ships three backends —
FakeBackend (scripted, hardware-free), MacOSBackend (drives the host Mac
via screencapture + cliclick), and PiHidBackend (drives a remote
machine through a Bluetooth-HID gateway) — plus a Backend ABC you subclass
to drive any other body. The protocol doesn't care which.
Why it exists
Most computer-use agents fuse perception, planning, and action into one
monolith. afferent deliberately splits the body from the mind with a
narrow, typed seam, so:
- the planner stays free to be anything (an LLM loop, a cognitive architecture, a script);
- the body stays free to be anything (a real desktop, a browser, a VM, a fake);
- and the whole loop is unit-testable offline via the scripted fake backend — no hardware, no network, no API keys.
Install
pip install afferent
That's it: no dependencies. (Dev tooling: pip install "afferent[dev]"; the quotes keep shells such as zsh from treating the brackets as a glob pattern.)
Quickstart — offline, scripted body (works immediately)
from afferent import Embodiment
from afferent.types import Observation, VisualElement

screen0 = Observation(
    ts=0.0, frontmost_app="Firefox",
    elements=[VisualElement("Run", (0.80, 0.20, 0.10, 0.04), kind="button")],
)
screen1 = Observation(ts=1.0, frontmost_app="Firefox", ocr_text="running…")

em = Embodiment.fake(script=[screen0, screen1])      # read_only=False for the demo
print(em.observe().render_text())                    # afferent: see the screen
res = em.click("Run")                                # efferent: locate + click
print(res.ok, res.steps, res.state_after.ocr_text)   # grounded outcome
Quickstart — live, your Mac
from afferent import Embodiment
# Eyes only by default (read_only=True) — zero blast radius.
em = Embodiment.macos()
print(em.capabilities()) # {'pixels','click','type','key'} if cliclick installed
print(em.observe().render_text()) # frontmost app + screenshot frame
# Opt into hands, gated by a confirm callback you control:
em = Embodiment.macos(read_only=False, confirm=lambda d: input(f"{d}? [y/N] ") == "y")
em.click_at(0.5, 0.5)
Eyes use the built-in screencapture (grant Screen Recording); hands use
cliclick (brew install cliclick, grant
Accessibility). Missing tools degrade gracefully — capabilities() reflects
what's actually available.
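The graceful-degradation idea can be sketched with the standard library alone. This is not afferent's actual detection code: `detect_capabilities` and the tool-to-capability mapping are assumptions made for illustration, built only on `shutil.which`.

```python
import shutil

# Hypothetical mapping from an external tool to the capabilities it unlocks.
TOOL_CAPABILITIES = {
    "screencapture": {"pixels"},
    "cliclick": {"click", "type", "key"},
}


def detect_capabilities(which=shutil.which):
    """Return the capabilities whose backing tool is found on PATH."""
    caps = set()
    for tool, provided in TOOL_CAPABILITIES.items():
        if which(tool) is not None:   # tool missing: silently skip its capabilities
            caps |= provided
    return caps


if __name__ == "__main__":
    print(sorted(detect_capabilities()))
```

Passing `which` as a parameter keeps the probe testable without touching the real PATH.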
Driving a remote machine — the BT HID gateway
MacOSBackend drives the host it runs on. To drive a different computer —
one you can't run code on — afferent ships a Bluetooth-HID body: a Raspberry
Pi bonded to one or more targets like a multi-device keyboard/mouse, exposing a
REST API. The consumer side is stdlib-only:
from afferent import Embodiment, PiHidBackend
# Pin to one target by its Bluetooth MAC; several can stay connected at once
# and only the addressed machine receives input.
be = PiHidBackend(base_url="http://10.0.0.2:8080",
                  host_mac="84:2F:57:7D:85:21")
em = Embodiment(be, read_only=False)
em.key("cmd+tab") # app switch on that machine
em.type_text("hello\n") # types only on that host
be.client.set_active_host(...) # or route unaddressed calls
A gateway is hands without eyes — it sends relative motion and key/text
reports, so type_text / key / scroll work directly, but absolute
click_at(x_pct, y_pct) needs a homer= (a visual servo that watches the
screen and drives the cursor to the target). Inject one if your consumer has
eyes; otherwise pct clicks return ok=False with a clear reason.
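To make the visual-servo idea concrete, here is a minimal closed-loop sketch. It does not reproduce afferent's `homer=` interface, which is not described here; `SimulatedPointer` and `home_to` are hypothetical names, and the pointer is simulated so the example runs without hardware.

```python
from dataclasses import dataclass


@dataclass
class SimulatedPointer:
    """Stand-in for a relative-motion body plus an eye that reports the cursor."""
    x: float = 0.1
    y: float = 0.1

    def move_relative(self, dx: float, dy: float) -> None:
        # Clamp to the screen edges, as a real cursor would be.
        self.x = min(1.0, max(0.0, self.x + dx))
        self.y = min(1.0, max(0.0, self.y + dy))

    def locate_cursor(self) -> tuple[float, float]:
        return (self.x, self.y)


def home_to(pointer, target, *, gain=0.6, tolerance=0.005, max_steps=50):
    """Drive the cursor toward target (pct coords); return (ok, steps)."""
    tx, ty = target
    for step in range(max_steps):
        cx, cy = pointer.locate_cursor()             # eyes: where is the cursor now?
        ex, ey = tx - cx, ty - cy                    # remaining error
        if abs(ex) <= tolerance and abs(ey) <= tolerance:
            return True, step
        pointer.move_relative(gain * ex, gain * ey)  # hands: partial correction
    return False, max_steps
```

A gain below 1 makes each correction deliberately partial, which tolerates a body whose relative motion does not map one-to-one onto screen distance (pointer acceleration, for example).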
Pi side (pip install afferent[gateway], runs the L2CAP multi-host HID
server + REST gateway):
afferent-gateway # serves http://0.0.0.0:8080
See scripts/afferent-gateway.service for a systemd unit and
scripts/macos-devmouse-autoconnect.sh for a macOS agent that keeps a target
auto-reconnected like a real Bluetooth mouse (--install, --pause, --status).
The protocol
All coordinates are pct — fractions in [0, 1], top-left origin,
resolution-independent (so they're stable world-model keys across machines).
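As an illustration of why pct coordinates travel across machines, the sketch below maps them to and from pixels. The helper names, the clamping convention, and the reading of an element box as `(x, y, w, h)` are assumptions, not confirmed details of afferent.

```python
def pct_to_pixels(x_pct, y_pct, width, height):
    """Map a pct point to integer pixel coordinates on a width x height screen."""
    if not (0.0 <= x_pct <= 1.0 and 0.0 <= y_pct <= 1.0):
        raise ValueError("pct coordinates must lie in [0, 1]")
    # Clamp so that 1.0 maps to the last valid pixel rather than one past it.
    px = min(width - 1, int(x_pct * width))
    py = min(height - 1, int(y_pct * height))
    return px, py


def pixels_to_pct(px, py, width, height):
    """Map a pixel to the pct coordinates of its center."""
    return (px + 0.5) / width, (py + 0.5) / height


def rect_center(rect):
    """Center of a box, assuming the layout (x, y, w, h) in pct units."""
    x, y, w, h = rect
    return x + w / 2.0, y + h / 2.0
```

The same pct point (0.5, 0.5) resolves to the middle of a 1920x1080 screen and of a 2560x1440 screen alike, which is what lets it serve as a machine-independent key.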
Typed results (afferent.types): Frame, VisualElement, Observation,
LocateResult, VerifyResult, ActionResult.
Observation.render_text() is a stable, compact, embeddable one-screen
string — feed it to an embedding model and use it as a key in a learned world
model. Determinism is guaranteed (same observation → byte-identical string).
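Because the rendered string is byte-identical for identical observations, it can be hashed into a dictionary key. The following sketch shows one way a consumer might do that; `world_key` and `TransitionTable` are hypothetical, and any strings passed in stand for whatever `render_text()` actually returns.

```python
import hashlib


def world_key(rendered: str) -> str:
    """Stable key for a rendered observation (SHA-256 of its UTF-8 bytes)."""
    return hashlib.sha256(rendered.encode("utf-8")).hexdigest()


class TransitionTable:
    """Counts (state, action) -> next-state outcomes, keyed by rendered text."""

    def __init__(self):
        self._counts = {}

    def record(self, before: str, action: str, after: str) -> None:
        outcomes = self._counts.setdefault((world_key(before), action), {})
        nxt = world_key(after)
        outcomes[nxt] = outcomes.get(nxt, 0) + 1

    def predict(self, before: str, action: str):
        """Return the most frequently seen next-state key, or None if unseen."""
        outcomes = self._counts.get((world_key(before), action))
        if not outcomes:
            return None
        return max(outcomes, key=outcomes.get)
```

A table like this only works if rendering is deterministic: any run-to-run jitter in the string would scatter one logical state across many keys.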
ActionResult carries grounding for predictive-coding / world-model
consumers: steps (e.g. visual-servo iterations), duration_ms,
final_cursor_pct, frame_before / frame_after, and a state_after
observation bracketing the action.
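The bracketing idea can be sketched independently of the library. `BracketedOutcome`, `bracket`, and `surprise` below are hypothetical; the field names echo those listed for ActionResult, but this is not that class and makes no claim about how afferent fills it in.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class BracketedOutcome:
    """Hypothetical record of one action and the states around it."""
    ok: bool
    duration_ms: float
    state_before: Any
    state_after: Any


def bracket(observe: Callable[[], Any], act: Callable[[], bool]) -> BracketedOutcome:
    """Observe, act, observe again, and time the action itself."""
    before = observe()
    start = time.perf_counter()
    ok = act()
    duration_ms = (time.perf_counter() - start) * 1000.0
    after = observe()
    return BracketedOutcome(ok, duration_ms, before, after)


def surprise(predicted: Any, outcome: BracketedOutcome) -> bool:
    """True when the observed post-action state differs from the prediction."""
    return outcome.state_after != predicted
```

A predictive-coding consumer would compare its forecast with `state_after` and update its model only when `surprise` is true.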
Safety
SafetyGate sits in front of every efferent action (eyes are never gated):
- read_only=True is the default: hands refuse until you opt in.
- confirm(desc) -> bool: a per-action veto your planner drives.
- allowed_apps: refuse when the frontmost app isn't allowed.
- max_actions_per_min: rate limit against runaway loops.
- panic(): latch into a permanent refusing state.
This is additive to whatever gates a backend enforces internally. Both must pass.
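The gate's checks compose as a simple chain of refusals. The class below is an illustrative sketch, not afferent's SafetyGate: the name `GateSketch`, the `check` method, the order of checks, and the 60-second sliding window are all assumptions.

```python
import time
from collections import deque


class GateSketch:
    """Illustrative gate; not afferent's SafetyGate implementation."""

    def __init__(self, *, read_only=True, confirm=None, allowed_apps=None,
                 max_actions_per_min=None, clock=time.monotonic):
        self.read_only = read_only
        self.confirm = confirm
        self.allowed_apps = set(allowed_apps) if allowed_apps is not None else None
        self.max_actions_per_min = max_actions_per_min
        self._clock = clock
        self._recent = deque()    # timestamps of actions allowed in the last minute
        self._panicked = False

    def panic(self):
        self._panicked = True     # latched: nothing in this class un-sets it

    def check(self, desc, frontmost_app):
        """Return (allowed, reason) for one proposed efferent action."""
        if self._panicked:
            return False, "panic latched"
        if self.read_only:
            return False, "read-only"
        if self.allowed_apps is not None and frontmost_app not in self.allowed_apps:
            return False, f"app not allowed: {frontmost_app}"
        now = self._clock()
        while self._recent and now - self._recent[0] >= 60.0:
            self._recent.popleft()                  # forget actions older than a minute
        if (self.max_actions_per_min is not None
                and len(self._recent) >= self.max_actions_per_min):
            return False, "rate limit"
        if self.confirm is not None and not self.confirm(desc):
            return False, "vetoed by confirm"
        self._recent.append(now)
        return True, "ok"
```

Running the cheap, non-interactive checks first means the `confirm` callback is only consulted for actions that would otherwise go through.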
Writing a backend
Subclass afferent.Backend, implement the eyes (observe, optionally
locate / verify / read_text) and the raw hands (do_click_at,
do_type_text, do_key, optionally do_move_to / do_scroll), and declare
capabilities(). Embodiment applies the SafetyGate and the post-action
observation for you — a backend only answers "how do I see / move", never
"should I".
from afferent import Backend, Embodiment
from afferent.types import Observation, ActionResult


class MyBackend(Backend):
    name = "mybody"

    def capabilities(self):
        return {"pixels", "click", "type", "key"}

    def observe(self, *, ocr=False, locate=None) -> Observation:
        ...  # capture your screen → Observation

    def do_click_at(self, x_pct, y_pct, button, count) -> ActionResult:
        ...  # drive your mouse; return ActionResult(ok=True, ...)

    def do_type_text(self, text, secret, append_enter) -> ActionResult:
        ...

    def do_key(self, combo) -> ActionResult:
        ...


em = Embodiment(MyBackend(), read_only=False)
FakeBackend (in afferent/backends/fake.py) is a complete, readable
reference implementation of the contract.
Develop
pip install -e ".[dev]"
python -m unittest discover -s tests -v # fully offline, no deps
# or: pytest
Releasing
Publishing is automatic. Bump __version__ in afferent/__init__.py,
commit, and push to main — .github/workflows/publish.yml builds, tests,
and publishes to PyPI via Trusted Publishing (no tokens). Pushes that don't
change the version are a no-op (the workflow checks PyPI and skips).
One-time setup is in the workflow header (add a "pending publisher" on PyPI).
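The "skip when the version is already published" check can be approximated against PyPI's JSON endpoint. This sketch is an assumption about how such a check might look, not a copy of what publish.yml does; `fetch_json`, `published_versions`, and `should_publish` are hypothetical names.

```python
import json
import urllib.error
import urllib.request


def fetch_json(url):
    """Fetch and decode a JSON document over HTTP(S)."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def published_versions(name, fetch=fetch_json):
    """Return the set of versions PyPI lists for `name` (empty if unknown)."""
    try:
        data = fetch(f"https://pypi.org/pypi/{name}/json")
    except urllib.error.HTTPError as err:
        if err.code == 404:       # project has never been published
            return set()
        raise
    return set(data["releases"])


def should_publish(name, local_version, fetch=fetch_json):
    """True when `local_version` is not yet on PyPI."""
    return local_version not in published_versions(name, fetch)
```

Injecting `fetch` lets the decision logic be exercised offline with a canned response.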
For a manual / TestPyPI publish, use the local script:
scripts/release.sh --test # TestPyPI
scripts/release.sh # PyPI
License
MIT.