Windows personal ops agent with pluggable providers
home-agent
Windows-friendly personal operations agent for triaging email and calendar items with:
- provider plugins
- deterministic urgency rules
- human-reviewed memory rules
- Claude/Codex classification
- semantic retrieval over historical items
- swappable vector search backends (sqlite-vec or Qdrant)
- SQLite-backed observability for costs, runs, and retrieval traces
Why this project exists
home-agent is built around a simple problem: inboxes and calendars contain operational risk, but most personal automation tools are either brittle rule engines or opaque LLM demos.
This project aims for a middle ground:
- deterministic rules handle obvious signals
- reviewed memory rules capture repeated patterns
- retrieval finds semantically similar historical items
- the LLM gets compact prior context before making the final urgency call
The result is a small local-first AI system that is easier to reason about than a generic agent loop.
Architecture
Current classification flow:
- provider plugins collect recent email and calendar items
- rule scoring assigns a first-pass urgency score
- approved memory rules can boost priority deterministically
- items are rendered into canonical text
- the canonical text is embedded through a pluggable embedding provider
- similar historical items are retrieved through a swappable vector backend
- the Claude shell runner receives the current item plus bounded retrieval context
- usage, retrieval traces, items, todos, and runs are persisted in SQLite
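The first deterministic steps of this flow (rule scoring followed by an approved memory-rule boost) can be sketched as below. This is a minimal illustration, not the project's actual code: the Item type, the keyword weights, the boost values, and every function name are assumptions made for the example. Only the subject_token:<token> rule-key shape is taken from the memory review commands in this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical minimal item shape; the real model is not shown in this README."""
    subject: str
    body: str


# Hypothetical keyword weights for the first-pass rule score.
URGENCY_KEYWORDS = {"deadline": 3, "due": 2, "final notice": 4}

# Hypothetical approved memory rules, keyed like subject_token:<token>.
APPROVED_MEMORY_RULES = {"subject_token:university": 2}


def rule_score(item: Item) -> int:
    """Sum the weights of every urgency keyword found in the item text."""
    text = f"{item.subject} {item.body}".lower()
    return sum(weight for keyword, weight in URGENCY_KEYWORDS.items() if keyword in text)


def memory_boost(item: Item) -> int:
    """Add the boost of each approved subject_token rule matching a subject word."""
    tokens = set(item.subject.lower().split())
    boost = 0
    for rule_key, weight in APPROVED_MEMORY_RULES.items():
        kind, _, token = rule_key.partition(":")
        if kind == "subject_token" and token in tokens:
            boost += weight
    return boost


def first_pass_urgency(item: Item) -> int:
    """Deterministic score computed before any embedding or LLM call."""
    return rule_score(item) + memory_boost(item)
```

Because both steps are plain functions over the item text, the same input always produces the same score, which is what keeps this part of the pipeline easy to test.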
Key design choices:
- SQLite-first storage keeps the architecture lightweight and inspectable
- sqlite-vec is the default retrieval backend because it fits the local-first design
- Qdrant is a first-class optional backend for a more production-style vector-service setup
- embeddings are pluggable across local and API-backed providers
- Claude shell execution is intentionally preserved for now so retrieval can be investigated independently of SDK migration
- LangGraph is intentionally not used because this pipeline is mostly linear
Quick start
uv sync --dev
uv run pytest
uv run mypy
uv run home-agent doctor
uv run home-agent run --debug
Auth setup
Set provider app credentials in .env:
GOOGLE_CLIENT_ID=...
GOOGLE_CLIENT_SECRET=...
MICROSOFT_CLIENT_ID=...
MICROSOFT_TENANT_ID=consumers
Initialize tokens (stored encrypted under .data/tokens):
uv run home-agent auth google --init
uv run home-agent auth microsoft --init
Memory review commands
uv run home-agent memory list-candidates --status pending
uv run home-agent memory approve --rule-key subject_token:university
uv run home-agent memory reject --rule-key subject_token:promo --reason "noise"
Embeddings and retrieval
The embeddings pipeline is additive. It does not replace rules or reviewed memory.
What gets stored:
- canonical item text used for embeddings
- compact summary text used for retrieval context
- embedding vectors keyed by provider/model
- retrieval traces for both the retrieved and prompt_included stages
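One way to picture "embedding vectors keyed by provider/model" is a composite primary key, so the same item can hold vectors from several providers or models side by side. The sketch below uses Python's built-in sqlite3 module. The table name item_embeddings appears in this README, but the column names, the float32 BLOB encoding, and the helper functions are assumptions for illustration, not the project's real schema.

```python
import sqlite3
import struct


def pack_vector(vector: list[float]) -> bytes:
    """Encode a vector as little-endian float32 bytes (assumed storage format)."""
    return struct.pack(f"<{len(vector)}f", *vector)


def unpack_vector(blob: bytes) -> list[float]:
    """Decode bytes written by pack_vector back into a list of floats."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and create the hypothetical embeddings table if missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS item_embeddings (
            item_id  TEXT NOT NULL,
            provider TEXT NOT NULL,
            model    TEXT NOT NULL,
            vector   BLOB NOT NULL,
            PRIMARY KEY (item_id, provider, model)
        )
        """
    )
    return conn


def upsert_embedding(conn, item_id, provider, model, vector):
    """Insert or overwrite the vector stored for one (item, provider, model) key."""
    conn.execute(
        "INSERT OR REPLACE INTO item_embeddings VALUES (?, ?, ?, ?)",
        (item_id, provider, model, pack_vector(vector)),
    )


def load_embedding(conn, item_id, provider, model):
    """Return the stored vector, or None when that key has no embedding yet."""
    row = conn.execute(
        "SELECT vector FROM item_embeddings "
        "WHERE item_id = ? AND provider = ? AND model = ?",
        (item_id, provider, model),
    ).fetchone()
    return None if row is None else unpack_vector(row[0])
```

Keying by provider and model means that switching embedding models does not silently mix vectors from different vector spaces.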
Supported embedding providers:
- local_sentence_transformers
- voyage
Supported retrieval backends:
- sqlite_vec (default)
- qdrant (optional)
Useful commands:
uv run home-agent embeddings backfill --dry-run
uv run home-agent embeddings backfill --kind email
uv run home-agent embeddings backfill --limit 100 --batch-size 25
uv run home-agent embeddings backfill --rebuild
uv run home-agent retrieval doctor
uv run home-agent retrieval stats
uv run home-agent retrieval rebuild-index
Why backfill matters:
- retrieval is weak if the corpus starts empty
- backfill makes the feature immediately testable on historical items
- changing rendering or embedding logic can be handled with --rebuild
Example retrieval value
Keyword-only memory can miss cases like:
- new item: FIT3171 project due Friday
- older item: assignment deadline tomorrow
Those strings may not share the exact keyword you approved, but they are semantically related. The retrieval layer can surface the older urgent item and pass it to the LLM as prior context.
Another example:
- new item: final notice: action required
- older item: urgent submission reminder
If the older item was previously classified as high urgency, retrieval can make the new decision more consistent and easier to explain later.
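The idea behind these examples can be shown with cosine similarity over toy vectors. The three-dimensional vectors below are hand-written stand-ins, not output from a real embedding model, and the function names are hypothetical. The sketch only illustrates how an older item can rank first without sharing any keyword with the new one.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], corpus: dict[str, list[float]], k: int = 1):
    """Return the k (score, text) pairs most similar to the query, best first."""
    scored = [(cosine_similarity(query, vector), text) for text, vector in corpus.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy vectors: invented values standing in for real embeddings.
HISTORY = {
    "assignment deadline tomorrow": [0.9, 0.1, 0.0],
    "weekly newsletter": [0.0, 0.2, 0.9],
}

# Stand-in vector for the new item "FIT3171 project due Friday".
NEW_ITEM_VECTOR = [0.8, 0.2, 0.1]

best_score, best_text = top_k(NEW_ITEM_VECTOR, HISTORY, k=1)[0]
```

In this toy setup best_text is the older deadline item, even though the two subjects have no words in common.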
Vector backend choices
sqlite-vec
Use sqlite-vec when you want:
- local-first runtime
- one-database deployment
- minimal infrastructure
- a strong “pick the right tool for the scale” engineering story
This is the default backend in home-agent.
Qdrant
Use Qdrant when you want:
- a dedicated vector service
- a stronger production-style portfolio signal
- easier future growth toward larger corpora or service-based deployment
This backend is optional and selected through config.
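A config-selected backend is commonly modelled as a small interface plus a registry keyed by the configured name. The sketch below is an assumption about how such a seam could look, not the project's real interface: VectorBackend, InMemoryBackend, BACKENDS, and make_backend are all hypothetical names, and the in-memory class merely stands in for code that would wrap sqlite-vec or Qdrant.

```python
from typing import Protocol


class VectorBackend(Protocol):
    """Hypothetical interface that every retrieval backend would satisfy."""

    def upsert(self, item_id: str, vector: list[float]) -> None: ...

    def search(self, query: list[float], k: int) -> list[str]: ...


class InMemoryBackend:
    """Stand-in backend; a real one would delegate to sqlite-vec or Qdrant."""

    def __init__(self) -> None:
        self._vectors: dict[str, list[float]] = {}

    def upsert(self, item_id: str, vector: list[float]) -> None:
        self._vectors[item_id] = vector

    def search(self, query: list[float], k: int) -> list[str]:
        # Rank stored items by dot product with the query, best first.
        def score(item_id: str) -> float:
            return sum(a * b for a, b in zip(query, self._vectors[item_id]))

        return sorted(self._vectors, key=score, reverse=True)[:k]


# Both names map to the stand-in here purely so the example runs.
BACKENDS = {"sqlite_vec": InMemoryBackend, "qdrant": InMemoryBackend}


def make_backend(name: str = "sqlite_vec") -> VectorBackend:
    """Build the backend named in config, rejecting unknown names early."""
    try:
        factory = BACKENDS[name]
    except KeyError:
        raise ValueError(f"unknown retrieval backend: {name!r}") from None
    return factory()
```

With a seam like this, the rest of the pipeline only calls upsert and search, so switching backends is a config change rather than a code change.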
LLM runners
- Scheduler default uses Claude via claude -p --output-format json
- Codex wiring is available via codex exec --json
- Usage and cost metadata are persisted in SQLite table llm_usage
- Retrieval context is appended to the prompt in a bounded form rather than dumping raw historical content
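"Bounded" retrieval context could mean capping both the number of prior items and the total characters added to the prompt. The sketch below shows one such policy. The function name, the default limits, and the bullet formatting are assumptions for illustration; the project's real limits are not stated in this README.

```python
def build_retrieval_context(
    summaries: list[str],
    max_items: int = 3,
    max_chars: int = 400,
) -> str:
    """Join compact summaries into a prompt block that never exceeds max_chars.

    Summaries are expected best-match first. Each is collapsed to a single
    line, and the loop stops as soon as the next line would break the budget.
    """
    lines: list[str] = []
    used = 0
    for summary in summaries[:max_items]:
        line = "- " + " ".join(summary.split())
        cost = len(line) + 1  # one extra character for the joining newline
        if used + cost > max_chars:
            break
        lines.append(line)
        used += cost
    return "\n".join(lines)
```

Dropping whole summaries instead of cutting one mid-sentence keeps each piece of prior context readable, at the cost of sometimes using less than the full budget.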
Logging
Runtime logs now go to .data/logs/home-agent.jsonl by default.
- uv run home-agent run writes structured JSON logs and keeps raw shell payloads off by default
- uv run home-agent run --debug enables debug console logging and raw Claude/Codex stdout/stderr capture
- uv run home-agent run --raw-shell-io enables raw subprocess output capture without changing the rest of the console verbosity
- uv run home-agent run --log-dir .data/custom-logs overrides the log directory for that run
Useful fields in the JSON log:
- run lifecycle: orchestrator.run.start, orchestrator.run.completed
- plugin and item flow: orchestrator.plugin.collection.completed, orchestrator.item.processed
- subprocess boundaries: llm.claude.command.*, llm.codex.command.*, notifications.toast.*
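Since the log is JSON Lines, the names above can be filtered with a few lines of Python. This sketch assumes each record stores its name under a key called event. That key is a guess, as this README does not show the log schema, so adjust event_field to match the real file.

```python
import json
from typing import Iterable, Iterator


def iter_events(
    lines: Iterable[str],
    prefix: str,
    event_field: str = "event",  # assumed key name; not confirmed by the README
) -> Iterator[dict]:
    """Yield parsed log records whose event name starts with the given prefix."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partially written or corrupted lines
        if not isinstance(record, dict):
            continue
        name = record.get(event_field, "")
        if isinstance(name, str) and name.startswith(prefix):
            yield record
```

For example, passing an open handle on .data/logs/home-agent.jsonl with the prefix "orchestrator.run." would yield only the run lifecycle records, provided the assumed key name matches.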
Config file support:
[logging]
directory = ".data/logs"
file_name = "home-agent.jsonl"
console_level = "INFO"
file_level = "DEBUG"
capture_raw_payloads = false
subprocess_preview_chars = 4000
Observability
SQLite persists:
- runs
- items
- todos
- memory_candidates
- memory_rules
- memory_reviews
- budget
- llm_usage
- item_text_representations
- item_embeddings
- retrieval_events
This makes the system inspectable after each run instead of relying on prompt anecdotes.
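A quick way to inspect a run afterwards is to count rows per table. The sketch below uses only the sqlite3 standard library and the sqlite_master catalog, so it makes no assumption about column names. The function name is hypothetical, and the location of the database file is not stated in this README.

```python
import sqlite3


def table_row_counts(conn: sqlite3.Connection) -> dict[str, int]:
    """Return {table_name: row_count} for every user table in the database."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
    ]
    # Table names come from the catalog itself, so quoting them is sufficient.
    return {
        name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
        for name in names
    }
```

Comparing the counts before and after a run gives a quick check of how many items, usage rows, and retrieval events that run produced.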
Vector storage details:
- relational source-of-truth data stays in SQLite
- sqlite-vec uses an in-database vector index when selected
- Qdrant stores vectors externally while retrieval traces still persist in SQLite
Development workflow
Implementation in this repo is intended to follow:
- test-driven development with vertical red-green-refactor slices
- strict typing with mypy --strict
- one commit per completed phase of work
Current verification commands:
uv run pytest
uv run mypy