Governed runtime + skills for a multi-domain personal health agent (recovery, running, sleep, stress, strength, nutrition).
Health Agent Infra
Health Agent Infra is a local governance runtime for agentic personal-health
software. It turns a natural-language health agent into a bounded operator:
you talk to the agent, the agent invokes the local hai CLI, and
deterministic Python owns rules, validation, state, and commits.
It is both a working single-user package and a reference architecture for the code/skill split. It is not a chatbot or a hosted coaching app; it is the boundary that lets an LLM work over health data without owning the policy engine, the database, or the final write path.
What this is
A Claude Code agent is the intended operator. You ask it to check readiness,
log a gym session, explain yesterday's recommendation, or record how the day
went. The agent maps that request onto validated hai commands, reads a
governed snapshot, posts one bounded proposal per domain, and lets the runtime
commit the final plan atomically to local SQLite.
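As a rough illustration of what "commit the final plan atomically" can mean with SQLite, here is a minimal sketch using Python's standard `sqlite3` module. The `plan` and `recommendation` tables, their columns, and the `commit_plan` helper are hypothetical and are not the package's actual schema or code; the point is only that a failure part-way through leaves no partial plan behind.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE plan (plan_id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE recommendation ("
        "plan_id TEXT NOT NULL, domain TEXT NOT NULL, action TEXT NOT NULL, "
        "UNIQUE (plan_id, domain))"
    )
    return conn


def commit_plan(conn, plan_id, recommendations):
    """Write a plan row and its per-domain rows in one transaction.

    `recommendations` is a list of (domain, action) pairs.
    """
    # The connection context manager commits on success and rolls back
    # if any statement inside the block raises.
    with conn:
        conn.execute("INSERT INTO plan (plan_id) VALUES (?)", (plan_id,))
        conn.executemany(
            "INSERT INTO recommendation (plan_id, domain, action) VALUES (?, ?, ?)",
            [(plan_id, domain, action) for domain, action in recommendations],
        )
```

If a second `commit_plan` call carried two rows for the same domain, the uniqueness constraint would raise `sqlite3.IntegrityError` and neither the plan row nor any recommendation row from that call would persist.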
The core rule:
Skills never mutate actions; code never improvises coaching prose.
Python owns classification bands, R-rules, X-rules, schema validation, supersession, review linkage, and transactions. Markdown skills own rationale, uncertainty, clarification, and natural-language handoff back to the user.
The package stores state locally and has no telemetry path. Pull commands only call the configured data source, currently intervals.icu or Garmin Connect. If you drive the runtime with a hosted LLM agent, any context you send to that host is governed by that host's data policy; Health Agent Infra does not control the model provider.
Health Agent Infra is for technical users who want the convenience of a conversational health agent without handing the model unchecked authority over personal health data.
What ships today
| Surface | Shipped shape |
|---|---|
| Domains | 6: recovery, running, sleep, stress, strength, nutrition |
| Skills | 14 packaged markdown skills, including intent-router and expert-explainer |
| CLI contract | 52 annotated hai commands with mutation class, idempotency, JSON mode, exit codes, and agent-safety metadata |
| State | 21 SQLite migrations, local-only by default |
| Synthesis | 10 X-rule evaluators across two phases, committed in one transaction |
| Verification | 2135 collected tests, 28 packaged deterministic eval scenarios |
Why it is different
- Natural-language front end, deterministic write path. The normal product loop is conversational, but every mutation routes through validated CLI commands and local transactions.
- Local-first runtime. State lives in SQLite under your home directory. No Health Agent Infra account, no daemon, no hosted backend.
- Governed, not generative. Python owns deterministic policy; skills only narrate and ask for clarification over already-constrained actions.
- Agent-native by contract. `hai capabilities --json` exposes every subcommand's mutation class, idempotency, JSON behavior, exit codes, and agent-safe flag. The `intent-router` skill maps natural-language intent to that contract, so the CLI is the agent's tool surface rather than a list of commands the user must memorize.
- Auditable by construction. Pulls, accepted state, proposals, X-rule firings, final recommendations, and review outcomes persist in typed tables. Inspect with `hai today`, `hai explain --operator`, `hai doctor`, and `hai stats`; these surfaces reconcile supersede chains and hide schema churn that raw SQL will not.
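To make the "agent-native by contract" idea concrete, the sketch below filters a capabilities manifest down to the commands an agent may call. The JSON shape, the key names (`commands`, `name`, `mutation_class`, `agent_safe`), the `read_only` value, and the placeholder command names are all assumptions made for illustration; the real output of `hai capabilities --json` may be structured differently.

```python
import json

# Illustrative manifest only: placeholder commands with made-up metadata,
# not the real output of `hai capabilities --json`.
SAMPLE_MANIFEST = json.loads("""
{"commands": [
  {"name": "example-read",  "mutation_class": "read_only",    "agent_safe": true},
  {"name": "example-write", "mutation_class": "writes_state", "agent_safe": true},
  {"name": "example-admin", "mutation_class": "writes_state", "agent_safe": false}
]}
""")


def agent_safe_commands(manifest, read_only=False):
    """Return names of commands flagged agent-safe.

    With read_only=True, also drop anything that mutates state.
    """
    names = []
    for cmd in manifest["commands"]:
        if not cmd["agent_safe"]:
            continue
        if read_only and cmd["mutation_class"] != "read_only":
            continue
        names.append(cmd["name"])
    return names
```

An agent built this way would decide what it may invoke from the manifest rather than from a hard-coded command list.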
v0.1.9 closed a focused hardening review on top of the v0.1.8 four-round Codex audit baseline. The release-by-release audit index is in AUDIT.md.
What the loop looks like
User: "Plan today, but I slept badly and my legs feel heavy."
Agent: Reads `hai capabilities`, logs the readiness note, runs `hai daily`.
Runtime: Pulls evidence, projects state, classifies six domains, applies R-rules.
Agent: Invokes domain skills and posts `DomainProposal` rows with `hai propose`.
Runtime: Applies X-rules, commits the plan atomically, schedules review.
User: Reads `hai today`; asks "why did you soften the run?"
Agent: Runs `hai explain --operator` and answers from persisted rows.
How the daily loop completes
hai daily is the orchestrator the agent drives. It does not finish the
full judgment loop in one call:
1. `pull` fetches evidence from the configured source and writes a `sync_run_log` row for freshness telemetry.
2. `clean` normalizes evidence into typed accepted-state rows. v0.1.9 makes this fail-closed: a DB projection failure exits non-OK rather than silently leaving downstream callers with stale state.
3. `snapshot` builds the per-domain bundle, with `classified_state` and `policy_result` populated on every domain regardless of whether the caller passed an evidence bundle.
4. `gaps` enumerates user-closeable intake gaps.
5. `proposal_gate` reports `awaiting_proposals`, `incomplete`, or `complete`.
When the gate is not complete, the agent invokes the per-domain readiness
skills, posts one DomainProposal per expected domain with
hai propose --domain <d>, then re-runs hai daily. --domains <csv>
narrows the expected set for partial-day runs. Direct hai synthesize
enforces the same six-domain completeness gate by default; pass
--domains '' to opt out (rare; it matches the permissive pre-v0.1.9 behavior).
The full contract is in
reporting/docs/agent_integration.md.
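The gate-then-propose cycle described above can be sketched as a small planning function on the agent side. It only builds argument lists from the `hai propose --domain <d> --proposal-json <p>` and `hai daily` forms shown in this README; the `next_steps` helper, the way the agent tracks which domains already have proposals, and the `proposals/<domain>.json` file layout are hypothetical.

```python
def next_steps(gate_status, expected_domains, proposed_domains, proposal_dir="proposals"):
    """Return the argv lists an agent would run next for a given gate status.

    gate_status is one of "awaiting_proposals", "incomplete", "complete".
    proposal_dir and the per-domain file naming are assumptions.
    """
    if gate_status == "complete":
        return []  # nothing left to post; the runtime can finish the loop
    missing = [d for d in expected_domains if d not in proposed_domains]
    steps = [
        ["hai", "propose", "--domain", d, "--proposal-json", f"{proposal_dir}/{d}.json"]
        for d in missing
    ]
    steps.append(["hai", "daily"])  # re-run the orchestrator after posting
    return steps
```

For example, with the gate reporting `incomplete` and only `recovery` proposed out of `recovery` and `sleep`, the function returns one `hai propose` call for `sleep` followed by `hai daily`.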
Install
The commands below are the agent-operable surface. You can run them by hand,
but the intended daily loop is natural language first: tell the agent what
you want, let it inspect hai capabilities, and let it invoke the right
validated command.
pipx install health-agent-infra # or: pip install -e .
hai init # scaffolds state + config + skills
hai auth intervals-icu # preferred live source
hai daily # orchestrates pull -> clean ->
# snapshot -> gaps -> proposal gate;
# the agent then posts proposals
hai today # read today's plan in plain language
--source defaults to intervals_icu when credentials are configured, else
csv for the committed fixture. Garmin Connect live scraping remains
best-effort and rate-limited; use --source garmin_live only when you
explicitly want it. The shortcut hai init --with-auth --with-first-pull
exists, but in v0.1.9 it is the Garmin-first-pull wizard, not the
intervals.icu setup path.
On macOS, credentials are stored in the OS keyring. The first hai pull
may ask for access; choose Always Allow if you want scripted runs such
as hai daily to continue without hanging on a prompt.
Full agent wiring notes live in
reporting/docs/agent_integration.md.
Where your data lives
Everything the runtime stores stays on your machine. Three locations matter:
| What | Default path | Override |
|---|---|---|
| State DB | ~/.local/share/health_agent_infra/state.db | $HAI_STATE_DB, --db-path |
| Intake / proposal JSONL | ~/.health_agent/ | $HAI_BASE_DIR, --base-dir |
| Config (thresholds.toml) | macOS: ~/Library/Application Support/hai/; Linux: ~/.config/hai/ | hai config init --path <p> |
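A minimal sketch of how the state DB override could resolve, using the default path and the `$HAI_STATE_DB` / `--db-path` names from the table. The precedence order (flag, then environment variable, then default) is an assumption, and `resolve_state_db` is a hypothetical helper, not the package's own resolver; `hai doctor` remains the way to see the paths the runtime actually resolved.

```python
import os
from pathlib import Path

DEFAULT_STATE_DB = Path("~/.local/share/health_agent_infra/state.db")


def resolve_state_db(cli_db_path=None, env=None):
    """Pick the state DB path: --db-path, then $HAI_STATE_DB, then the default.

    The precedence order here is assumed, not confirmed by the README.
    """
    env = os.environ if env is None else env
    if cli_db_path:
        return Path(cli_db_path).expanduser()
    if env.get("HAI_STATE_DB"):
        return Path(env["HAI_STATE_DB"]).expanduser()
    return DEFAULT_STATE_DB.expanduser()
```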
Run hai doctor to confirm resolved paths, schema version, source
freshness, and skill installation status. It also warns when the applied
migration set has gaps even if MAX(version) looks current.
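The migration-gap warning is worth a small example, because checking only `MAX(version)` cannot see a skipped migration. The sketch below assumes migrations are numbered with consecutive integers starting at 1; that numbering and the `missing_migrations` helper are assumptions for illustration, not the logic `hai doctor` uses.

```python
def missing_migrations(applied_versions):
    """Return version numbers absent below the highest applied version.

    Assumes versions are consecutive integers starting at 1.
    """
    applied = set(applied_versions)
    if not applied:
        return []
    return [v for v in range(1, max(applied) + 1) if v not in applied]
```

With applied versions 1, 2, 4, and 5, the maximum looks current at 5 while version 3 was never applied, which is the case the warning exists to catch.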
Reading your plan
hai today is the non-agent-mediated user surface. It resolves supersede
chains and renders the canonical plan for a date:
hai today # today, markdown on TTY / plain elsewhere
hai today --as-of 2026-04-23 # specific date
hai today --domain recovery # narrow to one domain
hai today --format json # machine-readable
For dense audit output, use hai explain --operator or hai explain.
Both reconstruct the plan from persisted rows; they do not recompute the
runtime state.
Recording your day
After the next day's run schedules review events, record how yesterday went:
hai review record --outcome-json <path>
hai review summary [--domain recovery]
Outcomes are append-only and re-link when a plan has been superseded. If
you recorded an outcome against the morning plan but re-authored the day
after lunch, hai review record routes the outcome to the canonical leaf's
matching-domain recommendation.
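The re-linking behaviour can be pictured as walking a supersede chain to its last plan and then picking that plan's recommendation for the same domain. In this sketch the chain is a plain dict from a plan id to the id that superseded it, and recommendations are keyed by `(plan_id, domain)`; both structures, the ids, and the function names are hypothetical stand-ins for the runtime's persisted rows.

```python
def canonical_leaf(plan_id, superseded_by):
    """Follow plan_id -> superseding plan links until reaching the leaf."""
    seen = set()
    current = plan_id
    while current in superseded_by:
        if current in seen:
            raise ValueError(f"supersede chain contains a cycle at {current!r}")
        seen.add(current)
        current = superseded_by[current]
    return current


def route_outcome(plan_id, domain, superseded_by, recommendations):
    """Return the recommendation id an outcome should attach to."""
    leaf = canonical_leaf(plan_id, superseded_by)
    return recommendations[(leaf, domain)]
```

An outcome recorded against a morning plan that was later superseded would therefore land on the afternoon plan's recommendation for the same domain.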
followed_recommendation and self_reported_improvement must be strict
booleans (true / false), not "yes", 1, or truthy strings.
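A pre-flight check on the outcome JSON can catch the strict-boolean rule before the file reaches `hai review record`. This is a sketch of the rule as stated here, not the runtime's validator: `strict_bool_errors` is a hypothetical helper, it only inspects the two named fields when they are present, and it does not decide whether they are required.

```python
import json

STRICT_BOOL_FIELDS = ("followed_recommendation", "self_reported_improvement")


def strict_bool_errors(outcome):
    """Return one message per named field whose value is not a real boolean."""
    errors = []
    for field in STRICT_BOOL_FIELDS:
        if field not in outcome:
            continue  # presence rules are out of scope for this sketch
        value = outcome[field]
        # isinstance(1, bool) is False, so integers and strings are rejected.
        if not isinstance(value, bool):
            errors.append(f"{field}: expected true/false, got {value!r}")
    return errors


good = json.loads('{"followed_recommendation": true, "self_reported_improvement": false}')
bad = json.loads('{"followed_recommendation": 1, "self_reported_improvement": "yes"}')
```

Here `good` passes with no errors, while `bad` yields one error per field.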
Manual intake lives under:
hai intake gym|exercise|nutrition|stress|note|readiness ...
Nutrition is a daily total, not per-meal. Re-calling within the same day creates a supersede chain; log it once at the end of the day.
Six domains in v1
recovery - running - sleep - stress - strength - nutrition
Each domain ships schemas, classification bands, policy rules, and a
readiness skill. Synthesis reconciles proposals through 10 X-rule
evaluators across two phases. Nutrition is macros-only in v1; see
reporting/docs/non_goals.md.
Calibration timeline
A fresh install can produce recommendations on day one, but several signals need history before they carry much meaning:
| Window | What works |
|---|---|
| Days 1-14 | Cold-start mode for running, strength, and stress. Expect to review flags consciously. |
| Day 14 | Cold-start window closes. HRV and RHR rolling baselines begin to stabilize. |
| Days 14-28 | Recovery, sleep, and stress become more calibrated against trailing-7d trend. |
| Day 28 | ACWR's chronic-load denominator is full. Strength volume_ratio stops mechanically reading as 4x. |
| Day 60+ | Trend bands start carrying real signal. |
| Around day 90 | Steady state; remaining uncertainty is structural rather than history-bounded. |
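One plausible reading of the "mechanically reading as 4x" remark is simple window arithmetic: if an acute mean over 7 days is divided by a chronic mean that always divides by 28, then a user with only 7 days of history gets a ratio of 28 / 7 = 4 regardless of how they trained. The sketch below shows that arithmetic. The 7-day and 28-day windows, the fixed-divisor averaging, and the function names are assumptions; the package's actual ACWR and volume_ratio formulas are not given in this README.

```python
def window_mean(daily_loads, window_days):
    """Mean over a fixed window, treating days with no history as zero load."""
    recent = daily_loads[-window_days:]
    return sum(recent) / window_days


def load_ratio(daily_loads, acute_days=7, chronic_days=28):
    """Acute mean divided by chronic mean, or None when there is no chronic load."""
    chronic = window_mean(daily_loads, chronic_days)
    if chronic == 0:
        return None
    return window_mean(daily_loads, acute_days) / chronic


one_week = [100] * 7     # constant load, only 7 days of history
four_weeks = [100] * 28  # same load, full 28-day window
```

Under these assumptions `load_ratio(one_week)` is 4.0 and `load_ratio(four_weeks)` is 1.0, even though the training load is identical in both cases.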
Code-derived marker: COLD_START_THRESHOLD_DAYS = 14 in
src/health_agent_infra/core/state/snapshot.py. Cold-start relaxation is
asymmetric by design: running, strength, and stress can soften some
coverage blocks; recovery, sleep, and nutrition do not. Nutrition keeps
deferring on insufficient evidence rather than relaxing into a
low-confidence guess.
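The asymmetry can be expressed in a few lines. The constant and the list of relaxable domains come from the paragraph above; the `history_days` input, the strict less-than boundary at day 14, and both function names are assumptions for illustration rather than the logic in `snapshot.py`.

```python
COLD_START_THRESHOLD_DAYS = 14

# Domains the README says can soften some coverage blocks during cold start.
RELAXABLE_DOMAINS = frozenset({"running", "strength", "stress"})


def in_cold_start(history_days):
    """True while the user has fewer days of history than the threshold.

    Whether day 14 itself counts as cold start is an assumption here.
    """
    return history_days < COLD_START_THRESHOLD_DAYS


def may_relax_coverage(domain, history_days):
    """Only relaxable domains may soften coverage, and only during cold start."""
    return in_cold_start(history_days) and domain in RELAXABLE_DOMAINS
```

So on day 5, `running` may relax while `nutrition` may not, and once the window closes no domain relaxes.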
Permanent caveats:
- intervals.icu does not expose sleep efficiency, body battery, or Garmin all-day stress.
- v1 nutrition is macros-only, so micronutrient coverage is unavailable at source.
What the system refuses to do
- No medical claims or diagnosis-shaped language.
- No autonomous training-plan or diet-plan generation.
- No state mutation without the relevant validated CLI path and, for agent-proposed intent/target activation, explicit user commit.
- No package telemetry or hosted state backend.
- No skill-side arithmetic for bands, scores, R-rules, or X-rules.
Full scope boundaries are in
reporting/docs/non_goals.md and
SECURITY.md.
CLI surface
This is the contract an agent operates after translating user intent from natural language. Humans can use it directly for setup, debugging, and audit; the normal product loop is still conversational.
# Evidence + intake
hai pull [--source intervals_icu|garmin_live|csv] --date <d>
hai clean --evidence-json <p>
hai intake gym|exercise|nutrition|stress|note|readiness ...
# Agent flow
hai daily [--domains <csv>]
hai propose --domain <d> --proposal-json <p>
hai synthesize --as-of <d> --user-id <u>
hai synthesize --bundle-only
# State + audit
hai state init | migrate | read | snapshot | reproject [--cascade-synthesis]
hai capabilities [--json | --markdown]
hai explain --for-date <d> --user-id <u>
hai today | hai doctor | hai stats
# Review, memory, intent, targets
hai review schedule | record | summary
hai memory set | list | archive
hai intent training add-session | training list | sleep set-window | list | archive
hai target set | list | archive
# Ops + research + evals
hai auth intervals-icu | garmin | status
hai config init | show | validate | diff
hai planned-session-types
hai research topics | search --topic <t>
hai eval run --domain <d> | --synthesis [--json]
hai setup-skills
The authoritative surface is hai capabilities --markdown or
reporting/docs/agent_cli_contract.md.
Roadmap
Now / Next / Later lives in ROADMAP.md. The detailed,
audited release plan is
reporting/plans/multi_release_roadmap.md.
Where to read next
- ARCHITECTURE.md - one-page architecture
- AUDIT.md - release audit index
- HYPOTHESES.md - five falsifiable hypotheses
- ROADMAP.md - Now / Next / Later
- CONTRIBUTING.md - code-vs-skill contribution rules
- REPO_MAP.md - every top-level entry classified
- SECURITY.md - vulnerability reporting and scope of trust
- reporting/docs/architecture.md - full pipeline
- reporting/docs/non_goals.md - scope discipline
- reporting/docs/x_rules.md - X-rule catalogue
- reporting/docs/tour.md - 10-minute reading tour
Citing this work
See CITATION.cff. If you are writing about the project's claims rather than the package itself, use HYPOTHESES.md as the canonical statement of the bets and their falsification criteria.
License
MIT. See LICENSE.