Thin CLI for the hosted Ask Monarch source plane

Ask Monarch

Teammate Setup

Install Ask Monarch for a teammate with one command:

uvx ask-monarch setup --code MNR-BUSE-U8E8

The code is stable until Josh or an admin rotates it. This command installs or refreshes the persistent CLI, exchanges the setup code, configures Codex and Claude Code, runs doctor, runs the canonical verification query, and reports the source id.

Ask Monarch is the conversational front door to Monarch's memory and capabilities.

A teammate should be able to ask Monarch what it knows, inspect the evidence, challenge the answer, and improve the memory as new artifacts appear.

Under the hood, Ask Monarch does not query the live bucket or a loose pile of files on every question. It follows the source contract:

map the source -> access it -> enumerate it -> parse it into atomic units -> verify receipts -> make it queryable

Once a source passes that gate, Ask Monarch answers from the parsed indexes with provenance. That is what makes answers fast, repeatable, and inspectable.

Read ASKMONARCH-SPEC.md for the deeper product contract and future direction.

Architecture

Ask Monarch is artifact-first: raw company artifacts flow through the monarch-source contract into a searchable store, then the ask-monarch CLI is the supported ask path.

[Figure: Ask Monarch architecture]

Runtime: the GitHub repo manintheandes/ask-monarch is deployed to the ask-monarch-server VM; the CLI runs on the user or agent machine and calls the hosted API on that VM.

Visual Explainers

Ask Monarch's architecture and assay explainers are repo-owned visual assets. Manim scenes live in visuals/manim. Remotion demo videos live in videos. Both are scoped away from the hosted API and CLI runtime.

cd visuals/manim
uv run manim -pql scenes.py AskMonarchArchitecture

For assay visuals, start with Manim's plotting examples and map source-backed rows onto axes, curves, markers, guide lines, and highlighted regions. See visuals/manim/README.md.

For the current visual evidence loop, run the root verifier:

python3 scripts/verify_visual_evidence.py

That checks the source-action hook, overlay resolver, assay-overlay CLI dry-run, Manim brand palette, reusable Manim assay templates, and a smoke still render.

For Remotion video work, use the repo-owned brand tokens in videos/src/brand.ts and validate with npm run validate:brand from the videos folder.

Current State

Ask Monarch currently has ten verified source lanes. For the fuller human-readable source list, see docs/source-lanes.md.

Source lane | What it covers | Status
bucket | gs://monarch-videos-new, including experiment outputs, CSV rows, spreadsheets, images, videos, PDFs, model metadata, structured sidecars, and serving tables for recurring assay questions including DART, contact/no-contact, phytotoxicity, fly oviposition, and DBM oviposition image evidence | Parsed and queryable
presentations | 37 Monarch update decks from the Storyboards_Presentations Google Drive folder, scoped to January 1, 2026 onward | Parsed and queryable
company_profile | Monarch mission, problem, status quo, approach, headquarters, organisms, and crop-testing context | Parsed and queryable
people_directory | Monarch employees, advisors, and partnership facts | Parsed and queryable
company_context | Company context from Monarch_for_askmonarch.pdf, parsed into PDF pages, text blocks, and embedded-image references | Parsed and queryable
teams_recaps | Recurring Monarch R&D, Monarch Weekly, and Monarch Team Sync Teams meeting transcripts, refreshed daily after midnight | Parsed and queryable
scientific_refs | Scoped scientific reference library, including chemical ecology, receptor/pathway, modeling, assay-prototype, scalability, EPA registration-support, and Teams-discussed journal-article material | Parsed and queryable
milestones_doc | monarch_milestones Google Doc, scoped to one fetched revision and parsed into paragraph units | Parsed and queryable
compound_inventory | Monarch compound / chemical inventory workbook from the user-provided Updated Chemical Inventory.xlsx attachment | Parsed and queryable
chemistry_analysis | gs://monarch-chemistry-analysis, including chemical activity registry rows, docking result rows, JSON score sidecars, protein-structure metadata, model metadata, images, scripts, and text sidecars | Parsed and queryable

Current presentation receipt:

  • 37 decks
  • 557 slides
  • 1,087 slide text or notes units
  • 673 media/chart/link references
  • 0 explicit gaps

The bucket receipt currently proves that all live bucket objects are parsed and that CSV rows are materialized atomically in SQLite. The large CSV-row count is not an animal count; it is a source-unit count from processed frame/result files.

Current Teams recap receipt:

  • 24 meeting instances across R&D, Weekly, and Team Sync through May 12, 2026
  • 24 matching transcript files
  • 12,595 speaker/timestamp/utterance units
  • 0 explicit gaps

Current company profile receipt:

  • 7 company profile units
  • 0 explicit gaps

Current people directory receipt:

  • 15 people, advisor, and partnership units
  • 0 explicit gaps

Current company context receipt:

  • 22 PDF pages
  • 89 text blocks
  • 59 embedded-image references
  • 171 total atomic units
  • 0 explicit gaps

Current milestones doc receipt:

  • 1 Google Doc revision
  • 30 paragraph units
  • 0 tables
  • 0 comments
  • 0 explicit gaps

Current compound / chemical inventory receipt:

  • 8 workbook sheets
  • 1,806 rows with values
  • 9,280 non-empty cell units
  • 0 explicit gaps
  • Original SharePoint link access still failed for the current connector identity; the parsed source is the attached workbook copy.

Current chemistry analysis receipt:

  • 11,302 bucket objects
  • 550,576 CSV row units
  • 30,412 JSON units
  • 99,206 text line units
  • 3,637 protein-structure metadata units
  • 0 explicit gaps

Current Braintrust receipt:

  • Project: Andes / ask-monarch
  • Project id: 28216333-66fb-42a6-923b-622bda7a2fcc
  • Query responses include trace_id, trace_span_id, trace_project, trace_project_id, and trace_url when BRAINTRUST_API_KEY is configured.
  • Query traces include spans for trace.user_question, trace.tool_call.1.ask_monarch_source_query, trace.selected_evidence, trace.source_receipts, and trace.final_answer.
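
As a rough illustration of how a client might read those trace fields, here is a minimal sketch. The five field names come from the receipt above; the function name `extract_trace_fields`, the `rows` key, and both sample payloads are hypothetical, and the real response may carry more keys.

```python
# Trace field names as listed in the Braintrust receipt.
TRACE_FIELDS = (
    "trace_id",
    "trace_span_id",
    "trace_project",
    "trace_project_id",
    "trace_url",
)


def extract_trace_fields(response: dict) -> dict:
    """Return whichever trace fields are present in a query response.

    An empty dict means the response carries no trace fields, for
    example because BRAINTRUST_API_KEY was not configured on the server.
    """
    return {key: response[key] for key in TRACE_FIELDS if key in response}


# Hypothetical payloads, for illustration only.
traced = {"rows": [], "trace_id": "abc", "trace_url": "https://example.invalid/trace/abc"}
untraced = {"rows": []}

print(extract_trace_fields(traced))
print(extract_trace_fields(untraced))  # {}
```

Treating missing trace fields as an empty result, rather than as an error, keeps a client usable when tracing is not configured.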

How To Ask

In any capable coding or research agent, use the repo-owned Ask Monarch skill:

what is the most effective compound against mosquitoes [$askmonarch](/Users/josh/.codex/skills/askmonarch/SKILL.md)

The user should not need to know which source, database, table, bucket, or deck contains the answer. $askmonarch means: use the skill's routing rules, call the ask-monarch CLI for evidence, answer from the returned rows, and say clearly when the memory does not yet contain enough evidence.

Personal mailbox access is a live overlay, not shared company memory. A mapped Josh-only Ask Monarch token can access the shared Monarch source plane and the mapped josh@monarchcrops.com mailbox through Microsoft Graph; the shared/team token cannot access personal mail, and personal mail is not indexed into shared SQLite.

For direct shell use, the remote CLI calls the hosted Ask Monarch service:

ask-monarch health
ask-monarch summary
ask-monarch query "what are the most effective repellents?"
ask-monarch query "list the experiments from last week" --limit 10
ask-monarch query "what did today's presentation cover?" --source presentations --limit 8
ask-monarch presentation --latest --open
ask-monarch presentation --date 2026-01-02 --open
ask-monarch query "what did the latest R&D meeting cover?" --source teams_recaps --limit 8
ask-monarch query "what does Monarch do?" --source company_profile --limit 8
ask-monarch query "who works at Monarch?" --limit 50
ask-monarch query "what does the company context say about residues?" --source company_context --limit 8
ask-monarch search "2pp mosquito" --limit 5
ask-monarch search damage --source presentations --limit 5
ask-monarch search "spatial repellents" --source teams_recaps --limit 5
ask-monarch search "acetone" --source compound_inventory --limit 5
ask-monarch company-profile-search mission --limit 5
ask-monarch people-search advisor --kind advisors --limit 10
ask-monarch company-context-search glossary --limit 5
ask-monarch query "62 chemicals" --source milestones_doc --limit 5
ask-monarch milestones-search "62 chemicals" --limit 5
ask-monarch compound-inventory-search acetone --limit 5
ask-monarch search DEET --source chemistry_analysis --limit 5
ask-monarch sql "SELECT COUNT(*) AS rows FROM csv_rows" --source chemistry_analysis
ask-monarch csv-search IR3535 --name-like combined_results.csv --limit 5
ask-monarch sql "SELECT COUNT(*) AS slides FROM slide_units" --source presentations
ask-monarch --open-trace query "what does Monarch do?" --source company_profile --limit 5
ask-monarch trace-url TRACE_ID
ask-monarch personal-mail-status
ask-monarch personal-mail-search "mosquito field trial" --days 30 --limit 10
ask-monarch personal-mail-read GRAPH_MESSAGE_ID
ask-monarch personal-mail-draft --to person@example.com --subject "Subject" --body "Body"
ask-monarch personal-mail-reply-draft GRAPH_MESSAGE_ID --body "Reply body"
ask-monarch personal-mail-send-draft GRAPH_DRAFT_ID

The CLI returns JSON so Codex, Claude Code, shell scripts, and dashboards can consume the same source evidence reliably.
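
For scripts that consume that JSON, a thin wrapper might look like the sketch below. It uses only the `query` subcommand and the `--source` and `--limit` flags shown above. The helper names are hypothetical, and it assumes the CLI is on PATH, is already configured, and prints a single JSON document to stdout.

```python
import json
import subprocess
from typing import Optional


def build_query_command(question: str, source: Optional[str] = None,
                        limit: Optional[int] = None) -> list:
    """Build the argv list for `ask-monarch query`."""
    argv = ["ask-monarch", "query", question]
    if source is not None:
        argv += ["--source", source]
    if limit is not None:
        argv += ["--limit", str(limit)]
    return argv


def run_query(question: str, source: Optional[str] = None,
              limit: Optional[int] = None):
    """Run the CLI and parse its stdout as JSON.

    Assumption: stdout holds exactly one JSON document. A non-zero exit
    status raises subprocess.CalledProcessError because check=True.
    """
    completed = subprocess.run(
        build_query_command(question, source, limit),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)


print(build_query_command("what does Monarch do?", source="company_profile", limit=8))
```

Passing the question as one argv element avoids shell quoting problems when a question contains quotes or apostrophes.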

How Corrections Improve Ask Monarch

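When Ask Monarch gives a weak answer, the fix should become part of the system, not just part of one chat. There are three places to make that correction: the agent instructions, the shared rulebook, and the tool. Each is described after the regression gates below.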

For source-question regressions, run the live drill:

python3 scripts/run_askmonarch_regression.py

The drill replays known tricky questions against the hosted source service and checks both the returned source values and the $askmonarch skill rules that should route future agents correctly. When a case fails, reproduce it manually, patch the skill/source/parser/CLI layer that caused the miss, then add or tighten the case before rerunning the drill.

Latency counts too. The drill prints wall-clock time per case, summarizes the slowest case, and fails any case over the default per-case budget of 10 seconds. Use --default-max-seconds to tune the budget for a run, or set max_seconds on a case that deserves a tighter or looser threshold.
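
The budget rule can be sketched as follows. This is not the drill's actual code: the case dictionaries and the `name` and `elapsed_seconds` keys are hypothetical, and only the 10-second default and the per-case `max_seconds` override come from the description above.

```python
DEFAULT_MAX_SECONDS = 10.0  # default per-case budget described above


def over_budget(cases: list, default_max_seconds: float = DEFAULT_MAX_SECONDS) -> list:
    """Return the names of cases whose wall-clock time exceeds their budget.

    A case-level "max_seconds" overrides the run-level default.
    """
    failures = []
    for case in cases:
        budget = case.get("max_seconds", default_max_seconds)
        if case["elapsed_seconds"] > budget:
            failures.append(case["name"])
    return failures


def slowest(cases: list):
    """Return the case with the largest wall-clock time, or None if empty."""
    return max(cases, key=lambda case: case["elapsed_seconds"], default=None)


# Hypothetical timings, for illustration only.
cases = [
    {"name": "a", "elapsed_seconds": 3.2},
    {"name": "b", "elapsed_seconds": 12.5},                     # over the default
    {"name": "c", "elapsed_seconds": 15.0, "max_seconds": 20},  # looser budget
    {"name": "d", "elapsed_seconds": 4.0, "max_seconds": 2},    # tighter budget
]

print(over_budget(cases))       # ['b', 'd']
print(slowest(cases)["name"])   # c
```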

When intentionally failure-mining, use plausible teammate questions, not impossible trivia. Count a failure when the hosted source path returns no usable evidence, the route is ambiguous, units cannot be verified, or the answer would require inventing a linkage. Each useful failure should leave behind a durable rule, a regression case, or a named source/tooling gap.

For answer-quality regressions, run the live answer contract gate:

python3 scripts/verify_askmonarch_answer_quality.py

For a clean pass/fail check that does not rewrite the committed ledger outputs:

python3 scripts/verify_askmonarch_answer_quality.py --check-only

That gate checks that composed answers include the requested factual claims, that evidence rows have citations, that broad questions cite multiple source lanes when needed, and that negative/source-gap answers name the checked sources instead of making absolute claims over unqueryable artifacts. It writes:

evals/askmonarch_answer_quality_results.json
evals/askmonarch_answer_quality_findings.md
evals/askmonarch_answer_quality_ledger.html

Each ledger case must explain what failed, why it failed, how the correction was made durable, and the exact verification command that proves the correction. See evals/askmonarch_quality_probe_summary_2026-05-03.md and evals/askmonarch_quality_probe_summary_2026-05-03.html for the May 3 hard probe refresh summary.

1. Fix the agent instructions

Update the $askmonarch skill when Codex or Claude misunderstood the question or chose the wrong source.

Example: if someone asks "how many mosquitoes have we used?" the skill should say to use the assay ledger's n_mosquitoes field, not CSV frame rows.

Example: if someone asks "what compounds does Monarch have in inventory?", the skill should route to compound_inventory and count or list Master!D compound names, not bucket search snippets. If they ask "where is X?", return the matched Master row fields such as Location/Bin, Vendor, CAS, Quantity, SDS, and cell provenance.

Example: if someone asks "what experiments were run today?", the skill should resolve today to an absolute date, check the assay ledger by experiment date, then separately check bucket object updates for new uploaded or processed data. If the two disagree, say that run date and upload date are different clocks.
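
To make the first example concrete, the sketch below shows why frame rows and animal counts diverge. The schema is hypothetical: the `assay_ledger` and `frame_rows` tables and their columns are invented for illustration, and only the `n_mosquitoes` field name comes from the text above.

```python
import sqlite3

# In-memory database with a hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE assay_ledger (assay_id TEXT PRIMARY KEY, n_mosquitoes INTEGER);
    CREATE TABLE frame_rows (assay_id TEXT, frame_number INTEGER);
    """
)

# Two assays: 20 and 25 animals.
conn.executemany("INSERT INTO assay_ledger VALUES (?, ?)", [("A1", 20), ("A2", 25)])

# Each assay produces hundreds of processed frame rows.
conn.executemany(
    "INSERT INTO frame_rows VALUES (?, ?)",
    [("A1", n) for n in range(300)] + [("A2", n) for n in range(450)],
)

animals = conn.execute("SELECT SUM(n_mosquitoes) FROM assay_ledger").fetchone()[0]
frames = conn.execute("SELECT COUNT(*) FROM frame_rows").fetchone()[0]

print(animals)  # 45: the answer to "how many mosquitoes have we used?"
print(frames)   # 750: a source-unit count, not an animal count
```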

2. Fix the shared rulebook

Update the README or spec when the rule should be visible to everyone working on Ask Monarch.

Example: if presentations are queryable only from January 1, 2026 onward, the spec should say that clearly so future agents do not treat every old deck as source-grade.

3. Fix the tool

Add or improve an API/CLI command when the same question will come up again and should not depend on a fresh ad hoc SQL query each time.

Example: if animal-use questions become common, add a command like ask-monarch animal-use mosquitoes that always uses the right ledger field and returns the right evidence.

For most Monarch teammates, the workflow stays simple: ask through Codex, Claude, or another agent and inspect the evidence. Behind the scenes, the agent uses the skill and ask-monarch CLI. Power users can also use the remote CLI directly. Everyone benefits when recurring corrections are captured in these shared layers.

Hosted Service

Ask Monarch is deployed on a Google Compute Engine VM.

Piece                   Value
VM                      ask-monarch-server
Google Cloud project    gen-lang-client-0407939408
Zone                    us-central1-a
Repo clone on VM        /home/josh/ask-monarch
Public HTTPS endpoint   https://ask-monarch.34.121.138.236.sslip.io
Local VM service        http://127.0.0.1:8787
Systemd service         ask-monarch-http.service

The service is read-only. Public traffic enters through HTTPS and bearer-token auth, then nginx proxies to the private localhost service on the VM.

Health check:

curl -H "Authorization: Bearer $ASK_MONARCH_TOKEN" \
  https://ask-monarch.34.121.138.236.sslip.io/health

Unauthenticated requests should return 401.
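
The same check can be scripted with the Python standard library. The sketch below mirrors the curl command above; the function names are hypothetical, and it assumes only that `/health` answers a GET carrying a bearer token.

```python
import urllib.error
import urllib.request
from typing import Optional

BASE_URL = "https://ask-monarch.34.121.138.236.sslip.io"


def build_health_request(base_url: str, token: Optional[str]) -> urllib.request.Request:
    """Build a GET request for /health, with a bearer token when one is given."""
    headers = {}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(f"{base_url}/health", headers=headers)


def health_status(base_url: str, token: Optional[str], timeout: float = 10.0) -> int:
    """Return the HTTP status code of the health endpoint.

    urlopen raises HTTPError for 4xx/5xx responses, so the code is read
    from the exception in that case (for example 401 without a token).
    """
    request = build_health_request(base_url, token)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code
```

Calling `health_status(BASE_URL, None)` is a quick way to confirm that the unauthenticated path is still rejected, while `health_status(BASE_URL, os.environ["ASK_MONARCH_TOKEN"])` exercises the authenticated path.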

Team Onboarding

Do not send broad team invites until the clean-teammate simulation passes and the Codex app, Codex terminal, Claude app, and Claude Code terminal surfaces are verified or explicitly recorded as manually confirmed.

Admin preflight:

python3 scripts/simulate_teammate_onboarding.py --mode temp-home

The simulator creates a fresh one-time setup code, installs into a fake clean home, verifies Codex and Claude global setup, runs ask-monarch doctor, runs source-plane parity checks, and asks canonical Monarch questions for answer parity. It writes reports under output/onboarding-simulations/.

For each teammate, create one personal setup code:

ask-monarch onboarding-code create --email person@monarchcrops.com --label "Person setup" --ttl-hours 168

Send that teammate only their own install command:

ASK_MONARCH_SETUP_CODE='MNR-XXXX-YYYY' bash -c 'curl -fsSL https://ask-monarch.34.121.138.236.sslip.io/install.sh | bash'
ask-monarch doctor

After installation, they should open a fresh Codex or Claude thread before asking Monarch questions.

Source Inspection

You can also pull up assay videos during Q&A. A useful Ask Monarch answer should not stop at a numeric result when the underlying assay is inspectable. When an answer identifies an exact assay row, source video path, or filename, the CLI can resolve the linked raw video from the verified bucket index, download it into the local media cache, and open it:

ask-monarch assay-media --filename IMG_0049_Contact_3PP.mov --open
ask-monarch assay-media --row 1255 --source-name mosquitoes/Mosquito_DART_assay_Results_GCS.xlsx --sheet Assays --open

This is the explicit raw-media exception path. Ask Monarch still answers from verified source indexes first; raw videos are opened only after the indexed source record has been resolved. If a raw video is not linked yet, the command returns diagnostics instead of pretending the video is available.

Assay metric overlays are the next inspectable layer. When an assay has processed metrics, Ask Monarch should be able to move from:

Peak mean avoidance was 0.683 at 05:30.

to:

Peak mean avoidance was 0.683 at 05:30.
Source: advanced_metrics.csv#row=6.
Artifact: raw assay video with the avoidance curve overlaid.

The assay-overlay command does that for supported processed metric CSVs. DART and related tube assays use advanced_metrics.csv with T*_avoidance columns. Contact/non-contact repellency videos use frame_metrics.csv with contact_pi and inside/on-band/outside counts. The command resolves the matching raw video from the verified bucket index and renders a synced time-lapse overlay. The answer source is still the verified SQLite row; the overlay is an inspection artifact. Today this runs from the repo/VM environment where the verified SQLite index is present. With --open, local assay videos and rendered overlays open in a browser player.

ask-monarch assay-overlay \
  --metric-csv 'mosquitoes/processed/_2026_04_28_14_44_46_BHomocyclo3_mov_results/_2026_04_28_14_44_46_BHomocyclo3_mov_results/advanced_metrics.csv' \
  --output videos/out/mosquito-bhomocyclo-overlay.mp4 \
  --speed 15 \
  --open

The renderer automatically discovers T*_avoidance columns, narrows to the assay tubes when source metadata declares them, resolves the raw video object, downloads it into videos/source/, and prints a proof payload with the peak row, timestamp, source CSV provenance, raw media URI, and output video probe.
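
The column discovery and peak lookup described above can be illustrated with a small sketch. This is not the renderer's code: the `timestamp` column name, the 1-based data-row numbering, and the sample values are assumptions, and only the `T*_avoidance` column pattern comes from the text.

```python
import csv
import io
import re

# Matches columns such as T1_avoidance, T2_avoidance, ...
AVOIDANCE_COLUMN = re.compile(r"^T\d+_avoidance$")


def peak_mean_avoidance(csv_text: str, time_column: str = "timestamp") -> dict:
    """Find the row with the highest mean across all T*_avoidance columns.

    Assumes every avoidance cell holds a number and that data rows are
    numbered from 1 (the real `#row=` convention may differ).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = [name for name in (reader.fieldnames or []) if AVOIDANCE_COLUMN.match(name)]
    if not columns:
        raise ValueError("no T*_avoidance columns found")

    best = None
    for row_number, row in enumerate(reader, start=1):
        mean = sum(float(row[name]) for name in columns) / len(columns)
        if best is None or mean > best["mean_avoidance"]:
            best = {
                "row": row_number,
                "time": row[time_column],
                "mean_avoidance": mean,
                "columns": columns,
            }
    if best is None:
        raise ValueError("no data rows found")
    return best


# Hypothetical metric rows, for illustration only.
SAMPLE = """timestamp,T1_avoidance,T2_avoidance,notes
00:30,0.10,0.20,a
01:00,0.50,0.70,b
01:30,0.40,0.60,c
"""

print(peak_mean_avoidance(SAMPLE))
```

Returning the row number alongside the value is what lets an answer cite a specific source row rather than only a number.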

Fast verification:

python3 scripts/verify_assay_metric_overlays.py

Three verified examples:

# Fly repellency: 2PP, treatment tubes T5-T8, peak 50:30 / 0.654
ask-monarch assay-overlay \
  --metric-csv 'flies/processed/2026-04-22 12-12-14_results/2026-04-22 12-12-14_results/advanced_metrics.csv' \
  --output videos/out/fly-2pp-overlay.mp4 \
  --speed 30 \
  --open

# Mosquito DART: BHomocyclo, tubes T1-T3, peak 05:30 / 0.683
ask-monarch assay-overlay \
  --metric-csv 'mosquitoes/processed/_2026_04_28_14_44_46_BHomocyclo3_mov_results/_2026_04_28_14_44_46_BHomocyclo3_mov_results/advanced_metrics.csv' \
  --output videos/out/mosquito-bhomocyclo-overlay.mp4 \
  --speed 15 \
  --open

# Fly non-contact repellency: 2PP, frame-level contact_pi
ask-monarch assay-overlay \
  --metric-csv 'contact_repel_videos/flies/processed/2PP_non_contact/frame_metrics.csv' \
  --output videos/out/fly-2pp-non-contact-overlay.mp4 \
  --speed 15 \
  --open

The R&D presentation lane is openable too. Indexed Monarch update decks start with the January 2, 2026 presentation, and ask-monarch presentation --open opens the matching Google Slides locator returned by the verified presentations index:

ask-monarch presentation --latest --open
ask-monarch presentation --date 2026-01-02 --open
ask-monarch presentation --query "3-Step Process" --open

Every query row that has source provenance can also return source_actions. Those actions are the provenance resolver layer: they do not change the answer source, which remains the verified SQLite index, but they tell a client how to inspect the underlying artifact when one is available. A returned action can point to a raw video, Google Slides deck or slide, transcript annotation, CSV row, workbook cell, PDF page, image, assay metric overlay, or other artifact. Power users can open any returned locator directly:

ask-monarch resolve-source --locator 'https://docs.google.com/presentation/d/.../edit#slide=2' --open
ask-monarch resolve-source --locator 'gs://monarch-videos-new/path/to/video.mov' --open
ask-monarch assay-overlay --locator 'gs://monarch-videos-new/path/to/advanced_metrics.csv#row=6' --open
ask-monarch assay-overlay --locator 'gs://monarch-videos-new/path/to/frame_metrics.csv#row=10' --open
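
The locators above share a common shape: a URI plus an optional fragment such as `#row=6` or `#slide=2`. A client that wants to inspect a locator before opening it could split it as in the sketch below. The function name and the returned keys are hypothetical, and the CLI's own resolver may parse locators differently.

```python
from urllib.parse import parse_qs, urlsplit


def parse_locator(locator: str) -> dict:
    """Split a locator into scheme, authority, path, and fragment fields.

    For gs:// locators the authority is the bucket name. Fragment pairs
    such as row=6 are returned as strings.
    """
    parts = urlsplit(locator)
    fragment = {key: values[0] for key, values in parse_qs(parts.fragment).items()}
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "path": parts.path.lstrip("/"),
        "fragment": fragment,
    }


print(parse_locator("gs://monarch-videos-new/path/to/advanced_metrics.csv#row=6"))
```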

The installer saves the token locally at ~/.config/ask-monarch/config.json. To reconfigure it later:

ask-monarch configure

See docs/remote-cli.md for the full CLI reference.

Agent Skills

This repo owns the agent skills/instructions that make the experience feel like asking the company rather than running a database query:

skills/askmonarch/SKILL.md
skills/braintrust/SKILL.md
skills/correction/SKILL.md
skills/monarch-source/SKILL.md

The hosted installer writes:

  • Codex global instructions: ~/.codex/AGENTS.md
  • Codex skills: ~/.codex/skills/
  • Claude global instructions: ~/.claude/CLAUDE.md
  • Claude skill: ~/.claude/skills/ask-monarch/

Install or refresh repo-owned Codex skills on the VM with:

python3 scripts/install_codex_skills.py

monarch-source defines when a Monarch artifact becomes source-grade. $askmonarch is the conversational front door over source-grade Monarch memory. correction defines what to do when an Ask Monarch answer needs a durable routing or answer fix. braintrust defines the Ask Monarch trace inspection and verification workflow. The canonical text is host-agnostic: Codex, Claude, Gemini, Pi, OpenCode, and other agents should follow the same source-routing contract when they can run the CLI or call the hosted service.

Source Lanes

Bucket

  • Source id: monarch_videos_new
  • Bucket: gs://monarch-videos-new
  • Project: gen-lang-client-0407939408
  • Parser: scripts/source_pipeline.py
  • Verification: scripts/verify_source_complete.py
  • Index on VM: /home/josh/ask-monarch/artifacts/monarch-videos-new/source_index.sqlite

Useful commands:

python3 scripts/parse_monarch_bucket.py --manifest-only
python3 scripts/source_pipeline.py init
python3 scripts/source_pipeline.py parse --lane structured
python3 scripts/source_pipeline.py parse --lane video
python3 scripts/source_pipeline.py parse --lane model
python3 scripts/source_pipeline.py parse --lane pdf
python3 scripts/source_pipeline.py parse --lane image
python3 scripts/source_pipeline.py build-derived
python3 scripts/source_pipeline.py status
python3 scripts/verify_source_complete.py

build-derived creates the typed assay tables used for exact Q&A, including dart_assay_results and assay_media_links. assay_media_links is the source-grade bridge from an exact assay record, or indexed video metadata when a typed ledger row is not available, to its raw gs:// assay video. The remote CLI exposes it as ask-monarch assay-media ... --open, which can cache and open the video for inspection during an Ask Monarch session.

Presentations

  • Source id: monarch_storyboards_presentations_2026
  • Folder: Storyboards_Presentations
  • Folder id: 1-dL59DdeoPySGbqE5cjGqednyp4CIzch
  • Boundary: Monarch update decks from January 1, 2026 onward
  • Parser: scripts/parse_storyboards_presentations.py
  • Query helper: scripts/query_storyboards_presentations.py
  • Index on VM: /home/josh/ask-monarch/artifacts/storyboards-presentations-2026/source_index.sqlite
  • Receipt: docs/storyboards-presentations-2026-source.md

Useful commands:

ask-monarch presentation --latest --open
ask-monarch presentation --date 2026-01-02 --open
ask-monarch presentation --query "3-Step Process" --open
python3 scripts/parse_storyboards_presentations.py
python3 scripts/query_storyboards_presentations.py 2PP --limit 5
python3 scripts/query_storyboards_presentations.py damage --limit 5

ask-monarch presentation resolves decks through the hosted presentations source index and opens the Google Slides locator stored in deck or slide provenance. This source does not claim the entire historical presentation folder. Older decks, PDFs, notebooks, interviews, Teams messages, emails, academic papers, and dashboards are future source candidates until they pass the same source gate.

Monarch Teams Recaps

  • Source id: monarch_recurring_teams_recaps
  • Source lane: teams_recaps
  • Source system: Microsoft Teams / Microsoft Graph
  • Boundary: recurring Monarch R&D meetings from January 2, 2026 onward, Monarch Weekly meetings from May 5, 2026 onward, and Monarch Team Sync meetings from May 6, 2026 onward
  • Freshness: VM refresh runs just after midnight America/Los_Angeles and includes meetings through the previous Pacific calendar day
  • Refresher: scripts/refresh_teams_recaps_source.py
  • Fetcher: scripts/fetch_monarch_r_and_d_teams_recaps.py
  • Parser: scripts/parse_monarch_r_and_d_teams_recaps.py
  • Query helper: scripts/query_monarch_r_and_d_teams_recaps.py
  • Index on VM: /home/josh/ask-monarch/artifacts/monarch-r-and-d-teams-recaps/source_index.sqlite
  • Receipt: docs/monarch-r-and-d-teams-recaps-source.md
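
The freshness boundary above ("through the previous Pacific calendar day") is easy to get wrong when the clock is in UTC. A minimal sketch of the date arithmetic, assuming a timezone-aware timestamp and a system time zone database; the function name `refresh_cutoff` is hypothetical and this is not the refresher's code:

```python
from datetime import date, datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # needs a tz database on the host

PACIFIC = ZoneInfo("America/Los_Angeles")


def refresh_cutoff(now: datetime) -> date:
    """Return the last calendar day a refresh running at `now` would include.

    The timestamp is converted to Pacific time first, so a UTC clock
    that has already rolled over does not shift the boundary.
    """
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    return now.astimezone(PACIFIC).date() - timedelta(days=1)


# Just after midnight Pacific on May 13, 2026.
print(refresh_cutoff(datetime(2026, 5, 13, 0, 5, tzinfo=PACIFIC)))  # 2026-05-12

# 06:30 UTC on May 13 is still 23:30 on May 12 in Pacific time.
print(refresh_cutoff(datetime(2026, 5, 13, 6, 30, tzinfo=timezone.utc)))  # 2026-05-11
```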

Useful commands:

python3 scripts/refresh_teams_recaps_source.py --no-restart
python3 scripts/query_monarch_r_and_d_teams_recaps.py "spatial repellents" --limit 5

This lane is meeting-transcript memory, not the whole Teams universe. Teams messages, channels, company-wide email sources, papers, dashboards, and older meeting transcripts remain future source candidates until they pass the same source gate. Personal mailbox access, when mapped for a personal token, is a private live overlay and does not promote those emails into shared company memory.

Company Profile

  • Source id: monarch_company_profile
  • Source lane: company_profile
  • Raw source: sources/monarch-company-profile/company_profile.json
  • Parser: scripts/parse_company_lanes.py
  • Index on VM: /home/josh/ask-monarch/artifacts/monarch-company-profile/source_index.sqlite
  • Receipt: docs/monarch-company-profile-source.md

Useful commands:

python3 scripts/parse_company_lanes.py
ask-monarch company-profile-search mission --limit 5
ask-monarch search "where is Monarch based" --source company_profile --limit 5

This lane is durable company framing, not an automatically discovered corporate knowledge base. Update the JSON source when the company framing changes, then rerun the parser.

People Directory

  • Source id: monarch_people_directory
  • Source lane: people_directory
  • Raw source: sources/monarch-people-directory/people_directory.json
  • Parser: scripts/parse_company_lanes.py
  • Index on VM: /home/josh/ask-monarch/artifacts/monarch-people-directory/source_index.sqlite
  • Receipt: docs/monarch-people-directory-source.md

Useful commands:

python3 scripts/parse_company_lanes.py
ask-monarch people-search Avinash --limit 5
ask-monarch people-search advisor --kind advisors --limit 20
ask-monarch sql "select name, role, lane, provenance_grain from units where lane = 'employees' order by name" --source people_directory --limit 50
ask-monarch sql "select name, role, lane, provenance_grain from units order by lane, name" --source people_directory --limit 50

This lane covers people facts explicitly entered into the source JSON. It should not be treated as a full HRIS, inbox, or LinkedIn scrape.

Company Context

  • Source id: monarch_company_context
  • Source lane: company_context
  • Raw source: sources/monarch-company-context/Monarch_for_askmonarch.pdf
  • Boundary: exactly this one 22-page PDF file
  • Parser: scripts/parse_monarch_company_context.py
  • Index on VM: /home/josh/ask-monarch/artifacts/monarch-company-context/source_index.sqlite
  • Receipt: docs/monarch-company-context-source.md

Useful commands:

python3 scripts/parse_monarch_company_context.py
ask-monarch company-context-search glossary --limit 5
ask-monarch search "USDA Insecticide Residue Report" --source company_context --limit 5
ask-monarch sql "select page_number, unit_index, text, provenance_grain from text_units where text like '%USDA%'" --source company_context --limit 20

This is company context sourced from PDF intake evidence. Current employee/advisor questions should use people_directory first; older names in the PDF do not override current people-directory truth.

Milestones Doc

  • Source id: monarch_milestones_doc
  • Source lane: milestones_doc
  • Source system: Google Docs / Google Drive connector
  • Boundary: exactly the monarch_milestones Google Doc revision fetched on May 1, 2026
  • Parser: scripts/parse_monarch_milestones_doc.py
  • Query helper: scripts/query_monarch_milestones_doc.py
  • Index on VM: /home/josh/ask-monarch/artifacts/monarch-milestones-doc/source_index.sqlite
  • Receipt: docs/monarch-milestones-doc-source.md

Useful commands:

python3 scripts/parse_monarch_milestones_doc.py
ask-monarch query "phytotoxicity" --source milestones_doc --limit 5
ask-monarch milestones-search "62 chemicals" --limit 5
ask-monarch sql "select paragraph_index, text, provenance_grain from paragraph_units" --source milestones_doc --limit 5

This lane is a document-revision snapshot. Refetch the Google Doc and rerun the parser before using it for claims about later milestone edits.

Querying Locally

The hosted path is preferred for shared use. Local scripts are still useful for debugging and parser development:

python3 scripts/query_source.py --summary
python3 scripts/query_source.py --q "Preference Index" --kind pdf_page --limit 5
python3 scripts/query_source.py --csv-q IR3535 --name-like combined_results.csv --limit 5
python3 scripts/query_source.py --sql "select name, row_number, row_json, provenance_grain from csv_rows_with_provenance where row_json like '%IR3535%' limit 5"

See docs/querying-monarch-source.md.

Braintrust Tracing

Ask Monarch query requests emit Braintrust traces when the runtime has the Python dependency and an API key:

python3 -m pip install -r requirements.txt
export BRAINTRUST_API_KEY=...
export ASK_MONARCH_BRAINTRUST_PROJECT=ask-monarch
export ASK_MONARCH_BRAINTRUST_PROJECT_ID=28216333-66fb-42a6-923b-622bda7a2fcc

Tracing fails open: if the SDK or API key is missing, source queries still run and simply omit trace fields. When tracing is active, every /query response includes a Braintrust full trace URL. The local Braintrust context is stored in .bt/config.json.

Deep Research Max

Normal Ask Monarch questions should query verified source indexes directly. Deep Research Max is only for long-form external research, literature comparison, scientific synthesis, market/context scans, or visual analysis grounded in Monarch evidence.

Deep Research does this:

research question -> verified multi-source evidence bundle -> Gemini Deep Research Max -> cited report

The bundle currently includes verified lanes such as bucket, presentations, teams_recaps, company_profile, people_directory, and company_context when relevant. The milestones_doc lane is queryable through the hosted HTTP API and remote CLI, but should be added to Deep Research bundles before relying on it in that mode. If a required source lane is missing or has unverified receipts, Deep Research stops before launch instead of producing a partial internal-evidence report.
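
The stop-before-launch rule can be sketched as a preflight check. The names below (`BundleNotReady`, `check_bundle`, and the receipts mapping) are hypothetical and only illustrate the rule stated above; they are not the script's actual interface.

```python
class BundleNotReady(Exception):
    """Raised when a required lane is missing or has an unverified receipt."""


def check_bundle(required_lanes: list, receipts: dict) -> list:
    """Return the required lanes if all are present and verified.

    `receipts` maps a lane name to True when its receipt is verified.
    Any missing or unverified lane aborts before launch instead of
    allowing a partial internal-evidence report.
    """
    missing = [lane for lane in required_lanes if lane not in receipts]
    unverified = [lane for lane in required_lanes if lane in receipts and not receipts[lane]]
    if missing or unverified:
        raise BundleNotReady(f"missing={missing} unverified={unverified}")
    return list(required_lanes)


# Hypothetical receipt states, for illustration only.
receipts = {"bucket": True, "presentations": True, "teams_recaps": False}

print(check_bundle(["bucket", "presentations"], receipts))

try:
    check_bundle(["bucket", "teams_recaps", "milestones_doc"], receipts)
except BundleNotReady as error:
    print(error)
```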

Bundle-only:

python3 scripts/deep_research_monarch.py "Compare our 2PP fly and mosquito assay signal against published 2-propylphenol repellency literature."

Launch Gemini Deep Research Max:

GEMINI_API_KEY=... python3 scripts/deep_research_monarch.py "Compare our 2PP fly and mosquito assay signal against published 2-propylphenol repellency literature." --launch --wait

Outputs live under:

reports/deep-research/

See docs/deep-research-mode.md.

Daily Refresh

The VM should refresh the scheduled parsed source indexes daily, just after midnight America/Los_Angeles. The scheduled lanes are bucket (gs://monarch-videos-new), chemistry_analysis (gs://monarch-chemistry-analysis), and teams_recaps (recurring Teams meeting transcripts).

The main bucket path is incremental: reuse the current verified SQLite index, enumerate the live bucket, invalidate only added/changed/deleted objects, parse those objects, then verify before replacing the live SQLite files:

live source -> staged incremental parse -> receipts/gaps -> verification -> atomic swap
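
The invalidation step amounts to a manifest diff. The sketch below assumes each manifest maps an object name to some change fingerprint, such as a generation number or checksum. That representation and the function name are assumptions made for illustration, not the refresh script's actual format.

```python
def diff_manifests(previous: dict, live: dict) -> dict:
    """Classify objects as added, changed, or deleted between two manifests.

    Both arguments map object name -> fingerprint. Objects whose
    fingerprint is unchanged are left out, so only the returned names
    need to be re-parsed or removed from the staged index.
    """
    added = sorted(live.keys() - previous.keys())
    deleted = sorted(previous.keys() - live.keys())
    changed = sorted(
        name for name in live.keys() & previous.keys() if live[name] != previous[name]
    )
    return {"added": added, "changed": changed, "deleted": deleted}


# Hypothetical manifests, for illustration only.
previous = {"a.csv": "g1", "b.mov": "g1", "c.pdf": "g1"}
live = {"a.csv": "g1", "b.mov": "g2", "d.json": "g1"}

print(diff_manifests(previous, live))
# {'added': ['d.json'], 'changed': ['b.mov'], 'deleted': ['c.pdf']}
```

Sorting the result keeps refresh logs and receipts stable from run to run.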

Repo-owned pieces:

  • scripts/refresh_source_index.py
  • scripts/refresh_chemistry_analysis_source.py
  • scripts/refresh_teams_recaps_source.py
  • scripts/refresh_bucket_sources.py
  • deploy/systemd/ask-monarch-refresh.service
  • deploy/systemd/ask-monarch-refresh.timer
  • docs/daily-source-refresh.md

Operations

VM service commands:

sudo systemctl status ask-monarch-http.service
sudo systemctl restart ask-monarch-http.service
sudo journalctl -u ask-monarch-http.service -n 100 --no-pager
sudo systemctl status nginx
sudo nginx -t

Deploy the latest GitHub code to the VM:

gcloud compute ssh ask-monarch-server \
  --project gen-lang-client-0407939408 \
  --zone us-central1-a \
  --command 'cd /home/josh/ask-monarch && git pull --ff-only origin main && python3 scripts/install_codex_skills.py'

Repo Layout

ASKMONARCH-SPEC.md              product/source contract
config/source-map.yaml          source registry and query-plane map
docs/                           operating docs and source receipts
scripts/                        parsers, query tools, HTTP service, CLI, refresh
skills/                         repo-owned Codex skills
deploy/systemd/                 VM service and refresh units
reports/deep-research/          generated Deep Research bundles and reports

Generated source-plane outputs are intentionally not committed to GitHub. Keep the repo limited to code, specs, scripts, skills, and docs, and keep large parsed artifacts on the VM or in artifact storage.

Source Contract

A source is not queryable just because it exists. Ask Monarch treats a source as queryable only when:

  • it is listed in config/source-map.yaml
  • it can be accessed through an authenticated route
  • its live contents are enumerated
  • its artifacts are parsed into atomic units
  • each unit has source-grade provenance
  • receipt and gap files are written
  • verification passes
  • query tools can return evidence with exact locators

That contract is the core of the system: do the hard parsing once, verify it, store it in a parsed index, and query the index fast.
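The queryability gate above can be read as an all-or-nothing checklist. The sketch below is illustrative only: SourceChecks, failed_checks, and is_queryable are hypothetical names that mirror the eight conditions, not the repo's verification code.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SourceChecks:
    """One flag per contract condition; field names are illustrative."""
    listed_in_source_map: bool
    authenticated_access: bool
    contents_enumerated: bool
    parsed_into_units: bool
    units_have_provenance: bool
    receipts_and_gaps_written: bool
    verification_passed: bool
    exact_locators_returned: bool


def failed_checks(checks: SourceChecks) -> list[str]:
    """Return the names of the conditions that did not pass, in contract order."""
    return [f.name for f in fields(checks) if not getattr(checks, f.name)]


def is_queryable(checks: SourceChecks) -> bool:
    """A source is queryable only when every condition holds."""
    return not failed_checks(checks)
```

Returning the failed condition names, rather than a bare boolean, keeps the reason a source is blocked inspectable in the same way answers are.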
