Windows personal ops agent with pluggable providers

home-agent

Windows-friendly personal operations agent for triaging email and calendar items with:

  • provider plugins
  • deterministic urgency rules
  • human-reviewed memory rules
  • Claude/Codex classification
  • semantic retrieval over historical items
  • swappable vector search backends (sqlite-vec or Qdrant)
  • SQLite-backed observability for costs, runs, and retrieval traces

Why this project exists

home-agent is built around a simple problem: inboxes and calendars contain operational risk, but most personal automation tools are either brittle rule engines or opaque LLM demos.

This project aims for a middle ground:

  • deterministic rules handle obvious signals
  • reviewed memory rules capture repeated patterns
  • retrieval finds semantically similar historical items
  • the LLM gets compact prior context before making the final urgency call

The result is a small local-first AI system that is easier to reason about than a generic agent loop.

Architecture

Current classification flow:

  1. provider plugins collect recent email and calendar items
  2. rule scoring assigns a first-pass urgency score
  3. approved memory rules can boost priority deterministically
  4. items are rendered into canonical text
  5. the canonical text is embedded through a pluggable embedding provider
  6. similar historical items are retrieved through a swappable vector backend
  7. the Claude shell runner receives the current item plus bounded retrieval context
  8. usage, retrieval traces, items, todos, and runs are persisted in SQLite
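
The deterministic part of the flow above (steps 2 to 4) can be sketched as follows. This is a minimal illustration, not the project's actual code: the `Item` class, the keyword weights, the boost values, and the way a `subject_token:<word>` rule key is interpreted are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """Hypothetical item shape; the real schema may differ."""
    kind: str                      # "email" or "calendar"
    subject: str
    body: str = ""
    score: float = 0.0
    notes: list[str] = field(default_factory=list)


# Assumed keyword weights for the first-pass rule score.
URGENT_TOKENS = {"deadline": 0.4, "due": 0.3, "final notice": 0.5}


def rule_score(item: Item) -> float:
    """Step 2: sum the weights of matching keywords, capped at 1.0."""
    text = f"{item.subject} {item.body}".lower()
    return min(1.0, sum(w for tok, w in URGENT_TOKENS.items() if tok in text))


def apply_memory_rules(item: Item, approved: dict[str, float]) -> float:
    """Step 3: approved rules add a fixed boost when their token matches."""
    boost = 0.0
    for rule_key, rule_boost in approved.items():
        prefix, _, token = rule_key.partition(":")
        if prefix == "subject_token" and token in item.subject.lower():
            boost += rule_boost
            item.notes.append(rule_key)   # keep a trace of which rule fired
    return min(1.0, item.score + boost)


def canonical_text(item: Item) -> str:
    """Step 4: one stable text rendering that is later embedded."""
    return f"[{item.kind}] {item.subject.strip()}\n{item.body.strip()}".strip()


def first_pass(item: Item, approved: dict[str, float]) -> str:
    item.score = rule_score(item)
    item.score = apply_memory_rules(item, approved)
    return canonical_text(item)
```

Because every step here is a pure function of the item and the approved rules, the score is reproducible before any embedding or LLM call happens.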

Key design choices:

  • SQLite-first storage keeps the architecture lightweight and inspectable
  • sqlite-vec is the default retrieval backend because it fits the local-first design
  • Qdrant is a first-class optional backend for a more production-style vector-service setup
  • embeddings are pluggable across local and API-backed providers
  • Claude shell execution is intentionally preserved for now so retrieval can be investigated independently of SDK migration
  • LangGraph is intentionally not used because this pipeline is mostly linear

Quick start

uv sync --dev
uv run pytest
uv run mypy
uv run home-agent doctor
uv run home-agent run --debug

Auth setup

Set provider app credentials in .env:

GOOGLE_CLIENT_ID=...
GOOGLE_CLIENT_SECRET=...
MICROSOFT_CLIENT_ID=...
MICROSOFT_TENANT_ID=consumers

Initialize tokens (stored encrypted under .data/tokens):

uv run home-agent auth google --init
uv run home-agent auth microsoft --init

Memory review commands

uv run home-agent memory list-candidates --status pending
uv run home-agent memory approve --rule-key subject_token:university
uv run home-agent memory reject --rule-key subject_token:promo --reason "noise"

Embeddings and retrieval

The embeddings pipeline is additive. It does not replace rules or reviewed memory.

What gets stored:

  • canonical item text used for embeddings
  • compact summary text used for retrieval context
  • embedding vectors keyed by provider/model
  • retrieval traces for both retrieved and prompt_included stages

Supported embedding providers:

  • local_sentence_transformers
  • voyage

Supported retrieval backends:

  • sqlite_vec (default)
  • qdrant (optional)

Useful commands:

uv run home-agent embeddings backfill --dry-run
uv run home-agent embeddings backfill --kind email
uv run home-agent embeddings backfill --limit 100 --batch-size 25
uv run home-agent embeddings backfill --rebuild
uv run home-agent retrieval doctor
uv run home-agent retrieval stats
uv run home-agent retrieval rebuild-index

Why backfill matters:

  • retrieval is weak if the corpus starts empty
  • backfill makes the feature immediately testable on historical items
  • changing rendering or embedding logic can be handled with --rebuild

Example retrieval value

Keyword-only memory can miss cases like:

  • new item: FIT3171 project due Friday
  • older item: assignment deadline tomorrow

Those strings may not share the exact keyword you approved, but they are semantically related. The retrieval layer can surface the older urgent item and pass it to the LLM as prior context.

Another example:

  • new item: final notice: action required
  • older item: urgent submission reminder

If the older item was previously classified as high urgency, retrieval can make the new decision more consistent and easier to explain later.
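
The idea behind these examples is nearest-neighbour search over embedding vectors. The sketch below uses hand-written three-dimensional vectors purely as stand-ins; real embeddings come from the configured provider and have far more dimensions, and the function names are hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: list[float], corpus: dict[str, list[float]], k: int = 3) -> list[str]:
    """Return the ids of the k most similar stored items, best first."""
    scored = sorted(
        ((cosine(query, vec), item_id) for item_id, vec in corpus.items()),
        reverse=True,
    )
    return [item_id for _, item_id in scored[:k]]


# Toy vectors, invented for illustration only.
corpus = {
    "assignment deadline tomorrow": [0.9, 0.1, 0.0],
    "weekly newsletter": [0.0, 0.2, 0.9],
}
query = [0.8, 0.3, 0.1]   # stands in for "FIT3171 project due Friday"

nearest = top_k(query, corpus, k=1)
```

With these toy vectors the deadline item ranks first even though the two strings share no keyword, which is the gap keyword-only memory leaves open.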

Vector backend choices

sqlite-vec

Use sqlite-vec when you want:

  • local-first runtime
  • one-database deployment
  • minimal infrastructure
  • a strong “pick the right tool for the scale” engineering story

This is the default backend in home-agent.

Qdrant

Use Qdrant when you want:

  • a dedicated vector service
  • a stronger production-style portfolio signal
  • easier future growth toward larger corpora or service-based deployment

This backend is optional and selected through config.
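
One way to keep backends swappable is a small interface plus a name-based factory. The sketch below is an assumption about how such a seam could look, not the project's actual interface: `VectorBackend`, `InMemoryBackend`, and `make_backend` are hypothetical names, and the in-memory class stands in for sqlite-vec or Qdrant so the example has no external dependencies.

```python
from typing import Protocol


class VectorBackend(Protocol):
    """Hypothetical minimal interface a retrieval backend would satisfy."""

    def upsert(self, item_id: str, vector: list[float]) -> None: ...
    def search(self, vector: list[float], k: int) -> list[str]: ...


class InMemoryBackend:
    """Stand-in backend; ranks by dot product, so vectors should be normalized."""

    def __init__(self) -> None:
        self._vectors: dict[str, list[float]] = {}

    def upsert(self, item_id: str, vector: list[float]) -> None:
        self._vectors[item_id] = vector

    def search(self, vector: list[float], k: int) -> list[str]:
        def score(item_id: str) -> float:
            return sum(x * y for x, y in zip(vector, self._vectors[item_id]))

        return sorted(self._vectors, key=score, reverse=True)[:k]


# A real registry would map "sqlite_vec" and "qdrant" to their adapters.
BACKENDS: dict[str, type] = {"memory": InMemoryBackend}


def make_backend(name: str) -> VectorBackend:
    """Resolve a backend from its configured name."""
    try:
        return BACKENDS[name]()
    except KeyError:
        raise ValueError(f"unknown retrieval backend: {name!r}") from None
```

The rest of the pipeline then depends only on `upsert` and `search`, so changing the configured name does not touch the classification code.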

LLM runners

  • The scheduler uses Claude by default, invoked as claude -p --output-format json
  • Codex wiring is available via codex exec --json
  • Usage and cost metadata are persisted in the SQLite table llm_usage
  • Retrieval context is appended to the prompt in a bounded form rather than dumping raw historical content
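
"Bounded" here can be read as a cap on both the number of neighbours and the total characters added to the prompt. The sketch below shows that idea under assumed limits and an assumed prompt layout; only the `claude -p --output-format json` flags come from this README, and the command is built but not executed.

```python
def bounded_context(
    neighbours: list[tuple[str, str]],   # (summary, past urgency), best match first
    *,
    max_items: int = 3,                  # assumed cap on neighbours
    max_chars: int = 600,                # assumed cap on total context size
) -> str:
    """Render at most max_items summaries without exceeding max_chars."""
    lines: list[str] = []
    used = 0
    for summary, urgency in neighbours[:max_items]:
        line = f"- ({urgency}) {summary}"
        if used + len(line) > max_chars:
            break                        # stop rather than truncate mid-summary
        lines.append(line)
        used += len(line) + 1            # +1 for the joining newline
    return "\n".join(lines)


def build_prompt(item_text: str, neighbours: list[tuple[str, str]]) -> str:
    """Hypothetical prompt layout: current item first, prior context after."""
    context = bounded_context(neighbours)
    if not context:
        return f"Classify urgency:\n{item_text}"
    return f"Classify urgency:\n{item_text}\n\nSimilar past items:\n{context}"


def claude_argv(prompt: str) -> list[str]:
    """Argument list for the shell runner, suitable for subprocess.run."""
    return ["claude", "-p", prompt, "--output-format", "json"]
```

Passing compact summaries instead of raw historical bodies keeps prompt size, and therefore cost, predictable as the corpus grows.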

Logging

Runtime logs go to .data/logs/home-agent.jsonl by default.

  • uv run home-agent run writes structured JSON logs and keeps raw shell payloads off by default
  • uv run home-agent run --debug enables debug console logging and raw Claude/Codex stdout/stderr capture
  • uv run home-agent run --raw-shell-io enables raw subprocess output capture without changing the rest of the console verbosity
  • uv run home-agent run --log-dir .data/custom-logs overrides the log directory for that run

Useful fields in the JSON log:

  • run lifecycle: orchestrator.run.start, orchestrator.run.completed
  • plugin and item flow: orchestrator.plugin.collection.completed, orchestrator.item.processed
  • subprocess boundaries: llm.claude.command.*, llm.codex.command.*, notifications.toast.*
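
Since the log is JSON Lines, a few lines of Python are enough to summarise a run. The sketch below assumes each record stores its event name under an `event` key; that field name is a guess, so adjust `key` to match the actual log schema.

```python
import json
from collections import Counter
from typing import Iterable


def count_events(lines: Iterable[str], prefix: str = "", key: str = "event") -> Counter:
    """Count log records whose event name starts with `prefix`."""
    counts: Counter = Counter()
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue                      # skip partially written lines
        name = record.get(key) if isinstance(record, dict) else None
        if isinstance(name, str) and name.startswith(prefix):
            counts[name] += 1
    return counts


# Invented sample records using event names listed above.
sample = [
    '{"event": "orchestrator.run.start"}',
    '{"event": "orchestrator.item.processed"}',
    '{"event": "orchestrator.item.processed"}',
    '{"event": "llm.claude.command.start"}',
    "not json",
]
summary = count_events(sample, prefix="orchestrator.")
```

Against a real log, the same function accepts an open file handle, for example `count_events(open(".data/logs/home-agent.jsonl", encoding="utf-8"), "llm.")`.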

Config file support:

[logging]
directory = ".data/logs"
file_name = "home-agent.jsonl"
console_level = "INFO"
file_level = "DEBUG"
capture_raw_payloads = false
subprocess_preview_chars = 4000

Observability

SQLite persists:

  • runs
  • items
  • todos
  • memory_candidates
  • memory_rules
  • memory_reviews
  • budget
  • llm_usage
  • item_text_representations
  • item_embeddings
  • retrieval_events

This makes the system inspectable after each run instead of relying on prompt anecdotes.
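
Because everything lands in SQLite, post-run questions become plain SQL. The sketch below aggregates cost per runner from `llm_usage`; the table name comes from this README, but the column names and sample rows are assumptions, so an in-memory database with an invented schema is used to keep the example self-contained.

```python
import sqlite3


def cost_by_runner(conn: sqlite3.Connection) -> dict[str, float]:
    """Total recorded cost per LLM runner (column names are hypothetical)."""
    rows = conn.execute(
        "SELECT runner, SUM(cost_usd) FROM llm_usage GROUP BY runner ORDER BY runner"
    ).fetchall()
    return {runner: round(total, 4) for runner, total in rows}


# Invented schema and rows standing in for the real database file.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE llm_usage ("
    "run_id TEXT, runner TEXT, input_tokens INTEGER, "
    "output_tokens INTEGER, cost_usd REAL)"
)
conn.executemany(
    "INSERT INTO llm_usage VALUES (?, ?, ?, ?, ?)",
    [
        ("run-1", "claude", 1200, 150, 0.01),
        ("run-2", "claude", 2400, 300, 0.02),
        ("run-2", "codex", 800, 90, 0.005),
    ],
)
totals = cost_by_runner(conn)
```

The same pattern applies to `retrieval_events` or `runs`: inspect the real schema first, then adapt the query to the actual column names.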

Vector storage details:

  • relational source-of-truth data stays in SQLite
  • sqlite-vec uses an in-database vector index when selected
  • Qdrant stores vectors externally while retrieval traces still persist in SQLite

Development workflow

Implementation in this repo is intended to follow:

  • test-driven development with vertical red-green-refactor slices
  • strict typing with mypy --strict
  • one commit per completed phase of work

Current verification commands:

uv run pytest
uv run mypy
