Research preview — the open-source cognition layer for goal-driven, proactive vision agents.

percept-vision

The open-source cognition layer for goal-driven, proactive vision agents. Install the package as percept-vision and import it as percept (the distribution name and the import name differ).

⚠️ Research preview — v0.6.0. Published for real-life testing and feedback, not for production. The cognition core (gate, entity graph, executor, events, scheduler, consent) runs and is tested offline with fakes. APIs may change between 0.6.x releases — pin a version. No benchmark numbers are published yet (the model-sweep phase comes later). Issues and feedback welcome.

You state a goal in plain language ("nudge me when I drink a coffee", "tell me when the kettle boils"), and percept turns a live audio-visual stream into an agent that reasons over entities across time, fires only on the rising edge of a condition becoming true, and refuses to guess: a three-state gate maps known → act · not → silent · unknown → ask.

The wedge is temporal cognition — entity memory + the three-state gate + reasoning over events — which a raw VLM-in-a-loop and a per-frame pipeline both lack. percept builds on frontier models for perception behind vendor-neutral seams.
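To make the gate concrete, here is a minimal sketch of a three-state, rising-edge gate. It is not percept's implementation: the names Verdict and RisingEdgeGate are hypothetical, and the choice to leave the remembered level untouched on an unknown verdict is an assumption made for illustration.

```python
from enum import Enum


class Verdict(Enum):
    YES = "yes"          # condition is known to hold
    NO = "no"            # condition is known not to hold
    UNKNOWN = "unknown"  # perception could not decide


class RisingEdgeGate:
    """Hypothetical gate: acts once when a condition becomes true."""

    def __init__(self) -> None:
        self._was_true = False  # level observed on the last decided frame

    def step(self, verdict: Verdict) -> str:
        if verdict is Verdict.YES:
            fired = not self._was_true  # act only on the False -> True edge
            self._was_true = True
            return "act" if fired else "silent"
        if verdict is Verdict.NO:
            self._was_true = False      # re-arm for the next rising edge
            return "silent"
        # UNKNOWN: ask instead of guessing; the level is left as it was
        # (an assumption of this sketch, not documented behaviour).
        return "ask"


gate = RisingEdgeGate()
stream = [Verdict.NO, Verdict.YES, Verdict.YES,
          Verdict.UNKNOWN, Verdict.NO, Verdict.YES]
print([gate.step(v) for v in stream])
# -> ['silent', 'act', 'silent', 'ask', 'silent', 'act']
```

A level-triggered gate would return "act" on every YES frame; the edge-triggered version stays silent until the condition has gone false and become true again.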

Install

pip install percept-vision        # core: PURE STDLIB — runs offline with fake backends, no keys

The core has zero dependencies and runs with deterministic fake backends, so it works straight after pip install with no API keys. Requires Python ≥ 3.10. Frontier backends are opt-in extras:

pip install "percept-vision[gemini]"     # GeminiVision
pip install "percept-vision[claude]"     # AnthropicVision (Claude)
pip install "percept-vision[deepgram]"   # Deepgram STT + TTS (voice)
pip install "percept-vision[runtime]"    # camera/video/RTSP runtime worker
pip install "percept-vision[bench]"      # benchmark + scorecard tooling
pip install "percept-vision[edge]"       # Python tier-0 gate + detector registry
pip install "percept-vision[all]"        # everything

60-second quickstart (no keys)

Fully offline — fake backends, deterministic, nothing to configure. This script runs as-is:

import asyncio
from percept import Percept, Goal

async def main():
    # Fake backends by default — offline, no keys. discover_plugins=False skips plugin lookup.
    agent = Percept.create(discover_plugins=False)

    agent.add_goal(Goal(
        id="caffeine",
        condition="the user is drinking coffee",
        say="Heads up — stepping back from caffeine?",
    ))

    # Each frame is judged; the gate fires ONCE on the rising edge.
    # ("sip-coffee" is a token the fake vision backend scripts as a confident YES.)
    fires = await agent.perceive_judged("sip-coffee")
    for ev in fires:                       # ev is a FireEvent
        print(ev.action, ev.goal_id, ev.text)   # -> fire caffeine Heads up — stepping back from caffeine?

    # The same frame again does NOT re-fire — rising-edge, not level-triggered.
    print(await agent.perceive_judged("sip-coffee"))   # -> []

asyncio.run(main())
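The script above prints every event the same way. Because an event's action can be "fire" or "ask", a caller will probably want to handle the two differently. The sketch below shows one way to do that. The FireEvent dataclass here is a stand-in so the example runs without percept installed; its field types and defaults are guesses, and route is a hypothetical helper, not part of the library.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class FireEvent:
    # Stand-in with the same field names as percept's FireEvent;
    # the types and defaults below are assumptions.
    goal_id: str
    action: str                      # "fire" or "ask"
    text: str
    entity_id: Optional[str] = None
    verdict: Any = None


def route(events: list[FireEvent]) -> tuple[list[str], list[str]]:
    """Split events into lines to say and questions to put to the user."""
    say: list[str] = []
    ask: list[str] = []
    for ev in events:
        if ev.action == "fire":
            say.append(ev.text)
        elif ev.action == "ask":
            ask.append(ev.text)
        else:
            raise ValueError(f"unexpected action: {ev.action!r}")
    return say, ask


events = [
    FireEvent("caffeine", "fire", "Heads up, stepping back from caffeine?"),
    FireEvent("kettle", "ask", "Is that the kettle?"),
]
print(route(events))
# -> (['Heads up, stepping back from caffeine?'], ['Is that the kettle?'])
```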

perceive_judged(frame) returns a list of FireEvent(goal_id, action, text, entity_id, verdict), where action is "fire" or "ask". Swap the fake for real eyes — the cognition layer is unchanged:

agent = Percept.create(vision="gemini")          # needs percept-vision[gemini] + GEMINI_API_KEY

Backends are selected by name ("fake" · "gemini" · "claude"), by adapter instance, or by env (PERCEPT_VISION_BACKEND).
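The three selection routes can be pictured as one small resolver. This is a sketch of the idea only: resolve_vision, REGISTRY and the FakeVision stand-in are hypothetical names, and the precedence shown (explicit argument first, then the environment variable, then "fake") is an assumption rather than documented behaviour.

```python
import os
from typing import Callable, Mapping, Optional, Union


class FakeVision:
    """Stand-in adapter so the sketch runs without percept installed."""


# Hypothetical registry seam: backend name -> zero-argument factory.
REGISTRY: dict[str, Callable[[], object]] = {"fake": FakeVision}


def resolve_vision(
    vision: Union[str, object, None] = None,
    env: Optional[Mapping[str, str]] = None,
) -> object:
    """Return an adapter from a name, an instance, or the environment."""
    env = os.environ if env is None else env
    if vision is None:
        # Nothing passed explicitly: fall back to the env var, then to "fake".
        vision = env.get("PERCEPT_VISION_BACKEND", "fake")
    if isinstance(vision, str):
        if vision not in REGISTRY:
            raise ValueError(f"unknown vision backend: {vision!r}")
        return REGISTRY[vision]()
    return vision  # already an adapter instance; use it as given


print(type(resolve_vision(env={})).__name__)                      # -> FakeVision
print(type(resolve_vision(env={"PERCEPT_VISION_BACKEND": "fake"})).__name__)  # -> FakeVision
```

Keeping the lookup behind a name-to-factory table is what lets the cognition code stay unaware of which vendor is answering.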

Architecture — two layers, four tiers

percept splits the eyes from the brain into two layers. System-1 cognition is cheap and stateful: it reasons over a stream of small structured facts (the three-state gate, the entity graph, the concern primitive) and is always in the loop. System-2 perception is expensive and stateless: it is summoned only when it must be (vision.judge(condition, frame) → Verdict) and sits behind vendor-neutral registry seams. A raw VLM-in-a-loop collapses the two; percept keeps them apart so the common case costs nothing.
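The cost argument can be illustrated with a toy loop in which a cheap check decides whether the expensive judge is called at all. Everything here is hypothetical: CountingJudge stands in for a real vision backend, and plain frame inequality stands in for whatever cheap measure (for example a motion gate) the real system uses.

```python
class CountingJudge:
    """Stand-in for an expensive, stateless perception call."""

    def __init__(self) -> None:
        self.calls = 0

    def judge(self, condition: str, frame: str) -> bool:
        self.calls += 1
        return condition in frame  # toy rule: substring match on a text "frame"


def run(frames: list[str], condition: str, judge: CountingJudge) -> list[bool]:
    """Call the judge only when the cheap check says the frame changed."""
    results: list[bool] = []
    last_frame = None
    last_verdict = False
    for frame in frames:
        if frame != last_frame:          # cheap, stateful check
            last_verdict = judge.judge(condition, frame)  # expensive call
            last_frame = frame
        results.append(last_verdict)     # unchanged frames reuse the verdict
    return results


judge = CountingJudge()
frames = ["empty", "empty", "coffee cup", "coffee cup", "coffee cup", "empty"]
print(run(frames, "coffee", judge))  # -> [False, False, True, True, True, False]
print(judge.calls)                   # -> 3
```

Six frames cost three judge calls here; a loop that asked the model about every frame would cost six.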

Full architecture — the four tiers (Feeds → Measures → Cognition → VLM), every claim anchored to code, plus the generated architecture diagram — is in the repo: Architecture · Concepts · docs index.

The envelope (refusals, stated proudly)

  • Assistant-class, never the safety mechanism. percept informs a human who stays responsible. It is never the thing standing between a person and harm on a clock it can't guarantee.
  • Watches the user's own world, with the user as beneficiary — never a non-consenting third party.

These refusals are a feature. Do not lead with surveillance demos.

Packages & versioning

  • percept-vision (this package): the Python SDK, comprising the cognition core, fakes, runtime worker, the Python tier-0 edge/harness, and benchmark tooling (extras). The tier-0 harness and the frozen wire-contract are bundled as the percept.harness and percept.contracts subpackages.
  • percept-edge (npm): the on-device reactive edge in JS/WASM, a VAD and a motion gate that share the Tier0Signal/detector wire-contract. The edge's detector and gate outputs are byte-for-byte identical to the Python harness (verified, 12/12). The WatchSpec it consumes is a documented subset: server-only fields like schedule/cadence are enforced in the cognition layer, not on-device.

Versioning. Package version percept-vision 0.6.0 (research preview). The wire-contract version is separate — CONTRACT_VERSION = "1.2.0" (additive-only). A 0.6.0 install exporting contract 1.2.0 is expected, not a bug.
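Since the two version numbers move independently, a consumer of the wire-contract would compare contract versions rather than package versions. The sketch below shows one possible check. It assumes "additive-only" means that, within the same major version, a newer contract can still be read by a consumer built against an older one; that reading and the helper names are assumptions, not something the package documents.

```python
PACKAGE_VERSION = "0.6.0"    # version of the percept-vision distribution
CONTRACT_VERSION = "1.2.0"   # version of the wire-contract it exports


def parse(version: str) -> tuple[int, int, int]:
    """Parse a plain MAJOR.MINOR.PATCH string into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def contract_compatible(exported: str, required: str) -> bool:
    """True if a consumer built against `required` can read `exported`.

    Assumed rule: same major version, and the exported contract is at
    least as new as the one the consumer was built against.
    """
    exp, req = parse(exported), parse(required)
    return exp[0] == req[0] and exp[1:] >= req[1:]


print(contract_compatible(CONTRACT_VERSION, "1.1.0"))  # -> True
print(contract_compatible(CONTRACT_VERSION, "1.3.0"))  # -> False
print(contract_compatible("2.0.0", CONTRACT_VERSION))  # -> False
```

The package version plays no part in the check, which is why a 0.6.0 install exporting contract 1.2.0 is unremarkable.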

Benchmark status

No Percept accuracy numbers are published yet. The benchmark is designed, but the reproducible model-sweep run is a separate, later phase (see the methodology). The scorecard tooling operates on a manifest you provide. The bundled golden set ships with the source repo, not the wheel, so the default-manifest commands must be run from a repo checkout:

# from a repo checkout (the golden set is repo data, not packaged):
percept-bench scorecard --manifest eval/golden-v1/MANIFEST.json --out out/scorecard.txt
# or point --manifest at your own labelled-clip manifest, anywhere.

Status

v0.6.0 — research preview. The cognition core runs and is tested offline with fakes (the L1 lane, no keys); the frontier backends and the edge packages are wired behind their seams. No benchmark numbers yet. We are in real-life testing — APIs may change. Issues and contributions welcome: github.com/divi-vijayakumar/Percept.

License

Apache-2.0.
