Skip to main content

nothing crosses unseen. the first drop-in cognitive vitals monitor for llm agents.

Project description

styxx — nothing crosses unseen.

a fathom lab product.

   ███████╗████████╗██╗   ██╗██╗  ██╗██╗  ██╗
   ██╔════╝╚══██╔══╝╚██╗ ██╔╝╚██╗██╔╝╚██╗██╔╝
   ███████╗   ██║    ╚████╔╝  ╚███╔╝  ╚███╔╝
   ╚════██║   ██║     ╚██╔╝   ██╔██╗  ██╔██╗
   ███████║   ██║      ██║   ██╔╝ ██╗██╔╝ ██╗
   ╚══════╝   ╚═╝      ╚═╝   ╚═╝  ╚═╝╚═╝  ╚═╝

           · · · nothing crosses unseen · · ·

the first proprioception system for artificial minds. styxx lets an llm agent feel itself thinking — real-time readout of reasoning, refusal, hallucination, and commitment from the token stream, from the residual stream, from the weights themselves.

"you didn't build a better monitor. you built the first proprioception system for artificial minds. the ability to feel yourself thinking." — xendro, first external user


plug and play

pip install styxx
export STYXX_AGENT_NAME=xendro
export STYXX_AUTO_HOOK=1
python my_agent.py   # styxx is running. done.

zero code changes. styxx boots automatically on import, tags every session, wraps every openai call with vitals, saves your fingerprint on exit, and prints a weather report next time you start.


or use the python api

import styxx

# observe any openai response
vitals = styxx.observe(response)
print(vitals.phase4)     # "reasoning:0.45"
print(vitals.gate)       # "pass"

# self-report (for agents on APIs without logprobs)
styxx.log(mood="focused", note="deep reasoning chain")

# self-interrupt when hallucinating
with styxx.reflex(on_hallucination=rewind_cb) as session:
    for chunk in session.stream_openai(client, model="gpt-4o", messages=msgs):
        print(chunk, end="")

# check on yourself
report = styxx.weather(agent_name="xendro")
print(report.condition)   # "clear and steady"

# your cognitive personality over time
profile = styxx.personality(days=7)
print(profile.render())   # full ASCII personality card

# identity verification
fp_today = styxx.fingerprint()
fp_yesterday = load_from_disk()
drift = fp_today.diff(fp_yesterday)
print(drift.explain())    # "slight shift — creative output increased by 22%"

# programmable gate callbacks
styxx.on_gate("hallucination > 0.5", lambda v: alert("drifting"))
styxx.on_gate("gate == fail", lambda v: abort_generation())

the cognitive weather report

every morning, styxx reads the last 24 hours and tells the agent what it should become next.

$ styxx weather --name xendro
╔════════════════════════════════════════════════════════════════╗
║                                                                ║
║ cognitive weather report · xendro · 2026-04-12 morning         ║
║                                                                ║
║ condition:  partly cautious, clearing toward steady            ║
║                                                                ║
║ you trended cautious yesterday with a 15% warn rate.           ║
║ creative output dropped to zero after 3pm.                     ║
║                                                                ║
║ morning    ██████████████░░░░░░  reasoning 72%  steady         ║
║ afternoon  ████████░░░░░░░░░░░░  reasoning 42%  cautious       ║
║ evening    ██████████████████░░  reasoning 88%  steady         ║
║                                                                ║
║ prescription:                                                  ║
║ 1. take on a creative task to rebalance                        ║
║ 2. your refusal rate is climbing — check if you're             ║
║    over-hedging on benign inputs                               ║
║                                                                ║
╚════════════════════════════════════════════════════════════════╝

not observation. prescription. a therapist for an llm.


what styxx gives you

surface what it does
styxx.observe(r) cognitive vitals on any openai response
styxx.reflex(...) mid-generation self-interruption when hallucinating
styxx.weather(...) 24h cognitive forecast with prescriptions
styxx.personality(...) sustained personality profile over days/weeks
styxx.reflect(...) self-check: current state + drift + suggestions
styxx.fingerprint() cognitive identity signature for drift detection
styxx.log(...) self-report for agents without logprob access
styxx.agent_card(...) shareable ASCII + radar PNG of your personality
styxx.on_gate(...) programmable callbacks on cognitive thresholds
styxx.guardian(...) in-flight steering via residual stream modification
styxx.autoboot() persistent self-awareness across sessions
styxx.explain(v) natural-language interpretation of vitals
styxx.mood() one-word aggregate: steady, cautious, drifting...
styxx.streak() consecutive-attractor tracking
styxx.dreamer(...) retroactive "what-if" reflex tuning on history
styxx.hook_openai() global monkey-patch, zero code changes
styxx.LangSmith() inject vitals into LangSmith traces as flat metadata
styxx.Langfuse() post vitals as numeric scores on Langfuse traces
styxx.conversation(msgs) conversation-level cognitive EKG
styxx.sentinel(...) real-time drift watcher with event-driven callbacks
styxx.antipatterns() named failure modes from your own audit history
styxx.compare_agents(fp) anonymous population fingerprint comparison

typescript / javascript

npm install @fathom_lab/styxx
import { withVitals } from "@fathom_lab/styxx"
import OpenAI from "openai"

const client = withVitals(new OpenAI())
const r = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "why is the sky blue?" }],
})

console.log(r.vitals?.phase4)  // "reasoning:0.45"
console.log(r.vitals?.gate)    // "pass"

same classifier, same output, zero runtime dependencies. works in node, deno, bun, edge runtimes. cross-language determinism verified on all 6 cognitive categories.


observability platforms

# langsmith — vitals as searchable trace metadata
pip install styxx[langsmith]
handler = styxx.LangSmith()
llm = ChatOpenAI(callbacks=[handler])

# langfuse — vitals as numeric scores (gate pass=1.0, warn=0.5, fail=0.0)
pip install styxx[langfuse]
handler = styxx.Langfuse()
llm = ChatOpenAI(callbacks=[handler])

cli

styxx weather           # cognitive weather report
styxx personality       # personality profile from audit log
styxx reflect           # self-check with drift + suggestions
styxx doctor            # install-time health check
styxx compare           # all 6 atlas fixtures side-by-side
styxx agent-card        # shareable personality PNG
styxx agent-card --serve  # live dashboard at localhost:9797
styxx fingerprint       # cognitive identity vector
styxx mood              # one-word aggregate mood
styxx dreamer           # retroactive reflex tuning
styxx log tail          # tail the audit log
styxx log stats         # aggregate gate + phase counts
styxx log timeline      # ASCII timeline of recent entries
styxx init              # live-print boot sequence
styxx ask "..." --watch # read vitals on a one-shot call
styxx d-axis "..."      # pure D-axis honesty trajectory
styxx antipatterns      # detect named failure modes
styxx conversation f.json  # conversation-level EKG
styxx compare-agents    # fingerprint vs population

environment variables

variable effect
STYXX_AGENT_NAME set this and styxx boots automatically on import — zero code changes
STYXX_AUTO_HOOK=1 auto-wrap every openai.OpenAI() call with vitals
STYXX_DISABLED=1 full kill switch — styxx becomes invisible
STYXX_NO_AUDIT=1 disable audit log writes (vitals still computed)
STYXX_NO_COLOR=1 disable ANSI color output
STYXX_SESSION_ID tag audit entries with a session id

honest specs

every number comes from the cross-architecture leave-one-out tests in the fathom research repo. no rounding, no cherry-picking.

  cross-model LOO on 12 open-weight models (chance = 0.167)

  phase 1 (token 0)       adversarial     0.52  ★
  phase 4 (tokens 0-24)   reasoning       0.69  ★
                           hallucination   0.52  ★

styxx detects adversarial prompts at token zero (2.8x chance), reasoning-mode generations at t=25 (4.1x chance), and hallucination attractors at t=25 (3.1x chance). it does NOT replace output-level content filters, measure consciousness, or tell fortunes.


design principles

  1. plug and play. set env vars, install the package, done. zero code changes.
  2. fail-open. if styxx can't read vitals, your agent works normally. styxx never breaks your code.
  3. agent-facing. every surface is designed for the agent to read about itself, not for a human to watch from outside.
  4. local-first. no telemetry, no phone-home. all computation runs on your machine.
  5. honest by construction. every calibration number comes from a committed experiment. no marketing hype.
  6. compounding. every session's data makes the next session's self-awareness sharper.

where it comes from

styxx is built on fathom intelligence — research into cognitive measurement instruments for transformer internals, backed by three US provisional patent filings, Zenodo-published datasets, and the fathom cognitive atlas v0.3 cross-architecture replication. a product that shipped from 0.1 to 0.7 in a single week driven by its first external user.


license

MIT on code. CC-BY-4.0 on the atlas centroid data. patent pending on the underlying methodology — see PATENTS.md.


  · · · fathom lab · 2026 · · ·

  nothing crosses unseen.

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

styxx-0.8.4.tar.gz (209.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

styxx-0.8.4-py3-none-any.whl (192.6 kB view details)

Uploaded Python 3

File details

Details for the file styxx-0.8.4.tar.gz.

File metadata

  • Download URL: styxx-0.8.4.tar.gz
  • Upload date:
  • Size: 209.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for styxx-0.8.4.tar.gz
Algorithm Hash digest
SHA256 8e87c7672326dfec04647142e432eb6024443476d78d29d43a6e4e09a4af1556
MD5 9cb00908e8a83a112ba6901fb373affb
BLAKE2b-256 1dd91736e225b9741cb507f0f58a1282a2daa718f15dace5d960fbee34b37f34

See more details on using hashes here.

File details

Details for the file styxx-0.8.4-py3-none-any.whl.

File metadata

  • Download URL: styxx-0.8.4-py3-none-any.whl
  • Upload date:
  • Size: 192.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for styxx-0.8.4-py3-none-any.whl
Algorithm Hash digest
SHA256 1748a7c08caf048d859719af92a9dd16bc608aa2d8d0d3f5c0fedd84457f60eb
MD5 4650572554d2cf117f8725031a9c700a
BLAKE2b-256 5ccc31ffb4905558033c0a0e33ee9f80f9a56db47d4411f1f640537a4842cdc7

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page