The most precise FHIR agent for Latin America

SaludAI

The first open-source FHIR reasoning agent with multi-LLM benchmarks and LATAM localization — every reasoning step auditable via Langfuse, designed for public health systems.

CI · License: Apache 2.0 · Benchmark: 84% · Python 3.12+ · Tests: 696 · Coverage: 95%


Ask: "Pacientes con diabetes tipo 2 mayores de 60 en Buenos Aires"

Get: A structured, sourced answer — with every reasoning step traced in Langfuse.

SaludAI translates clinical questions in natural language into FHIR R4 API calls, resolves medical terminology (SNOMED CT, CIE-10, LOINC, ATC), navigates multi-resource references, and returns traceable answers. Built and tested against Argentine synthetic data on HAPI FHIR. Designed to extend to other FHIR-compliant systems via locale packs.
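To make the translation concrete, here is a hypothetical sketch of the kind of FHIR R4 search the agent might emit for the example query above. The chained search parameters and the SNOMED CT code 44054006 (diabetes mellitus type 2) are illustrative assumptions, not SaludAI's actual output:

```python
# Illustrative only: one plausible FHIR R4 search for
# "Pacientes con diabetes tipo 2 mayores de 60 en Buenos Aires".
# Codes and chained parameters are assumptions, not agent output.
from urllib.parse import urlencode

params = {
    "code": "http://snomed.info/sct|44054006",       # diabetes mellitus type 2
    "subject:Patient.birthdate": "le1965-12-31",     # older than 60 (as of 2025)
    "subject:Patient.address-state": "Buenos Aires",
    "_include": "Condition:subject",                 # also return the Patient resources
}
url = "http://localhost:8890/fhir/Condition?" + urlencode(params)
print(url)
```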

Multi-LLM Benchmark

Evaluated on 100 questions across 200 synthetic Argentine patients (3,182 FHIR resources, 10 resource types). Inspired by FHIR-AgentBench (Verily/KAIST/MIT).

Model Accuracy Simple (16) Medium (41) Complex (43) Errors* P50 Latency
Claude Sonnet 4.5 84.0% 94% 93% 72% 8 12.7s
Claude Haiku 4.5 77.0% 100% 80% 65% 7 6.6s
GPT-4o 63.0% 100% 73% 40% 3 14.4s
Llama 3.3 70B 48.0% 94% 63% 16% 9 6.5s
Qwen 3.5 9B 25.0% 50% 29% 12% 1 11.8s

*Errors = agent exceeded iteration budget (8 steps) and could not produce an answer. These count as incorrect in the accuracy score.

All models use the same agent loop, tools, and system prompt. Differences reflect reasoning ability, tool calling reliability, and schema handling. Questions cover terminology resolution, multi-hop reference traversal, server-side counting, aggregation, and temporal filtering across 10 FHIR resource types.

Benchmark scope: 100 internally written questions evaluated on 200 synthetic patients with curated terminology codes. This benchmark tracks our development progress; it is not comparable to clinical benchmarks like FHIR-AgentBench (2,931 clinician-written questions on real de-identified data). We plan to evaluate against their public dataset. See the experiment log for detailed methodology and per-question analysis.
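As a sanity check on the table above, each model's overall accuracy is just the question-weighted mean of its per-tier accuracies (tier sizes taken from the column headers):

```python
# Question-weighted overall accuracy, using Claude Sonnet 4.5's
# per-tier scores from the table above.
tiers = {"simple": (16, 0.94), "medium": (41, 0.93), "complex": (43, 0.72)}

total_questions = sum(n for n, _ in tiers.values())
overall = sum(n * acc for n, acc in tiers.values()) / total_questions
print(f"{overall:.0%}")  # → 84%
```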

Architecture

graph TB
    User["User (NL query)"]

    subgraph Agent["saludai-agent"]
        Planner["Query Planner<br/>(FHIR knowledge graph)"]
        Loop["Agent Loop<br/>(plan → execute → evaluate)"]
        Tools["Tools"]
    end

    subgraph ToolSet["Tool Registry"]
        T1["resolve_terminology<br/>SNOMED CT · CIE-10 · LOINC · ATC"]
        T2["search_fhir<br/>with auto-pagination"]
        T3["count_fhir<br/>server-side _summary=count"]
        T4["get_resource<br/>direct reference lookup"]
        T5["execute_code<br/>sandboxed Python"]
    end

    subgraph Core["saludai-core"]
        FHIR["FHIR Client (httpx)"]
        Term["Terminology Resolver<br/>(rapidfuzz)"]
        QB["Query Builder"]
        Locale["Locale Pack (AR)"]
    end

    HAPI["HAPI FHIR R4<br/>200 patients · 3,182 resources"]
    LLM["LLM Provider<br/>(Anthropic · OpenAI · Ollama)"]
    Langfuse["Langfuse<br/>(observability)"]

    User --> Loop
    Loop --> Planner
    Planner --> Loop
    Loop --> Tools
    Tools --> T1 & T2 & T3 & T4 & T5
    T1 --> Term
    T2 & T3 & T4 --> FHIR
    T5 --> Loop
    FHIR --> HAPI
    Loop --> LLM
    Loop --> Langfuse
    Term --> Locale
    QB --> Locale

Key design decisions:

  • No LangChain. The agent loop is ~300 lines of Python. Every step is auditable and traceable. We chose simplicity over framework magic — see ADR-002.
  • Hybrid Query Planner. A plan-and-execute pattern with a FHIR knowledge graph (resource relationships + query pattern catalog). The planner classifies the question and selects a strategy before the agent starts calling tools — see ADR-009.
  • Action Space Reduction. Instead of suggesting tools via prompt, we remove irrelevant tools from the LLM's context based on the query plan. The model can't misuse what it can't see.
  • Provider-agnostic. Same agent loop works with Claude, GPT-4o, Llama, or Qwen. Swap the model, keep everything else.
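Action space reduction can be sketched in a few lines. The strategy names and tool groupings below are hypothetical placeholders, not SaludAI's actual planner catalog (see ADR-009 for the real design); the point is that the tool list handed to the LLM is filtered, not merely hinted at in the prompt:

```python
# Hypothetical sketch of action-space reduction: strategy names and
# tool groupings are illustrative, not SaludAI's actual identifiers.
ALL_TOOLS = {
    "resolve_terminology", "search_fhir", "count_fhir",
    "get_resource", "execute_code",
}

# Which tools each plan strategy is allowed to expose to the LLM.
STRATEGY_TOOLS = {
    "count":  {"resolve_terminology", "count_fhir"},
    "lookup": {"resolve_terminology", "search_fhir", "get_resource"},
}

def tools_for_plan(strategy: str) -> set[str]:
    """Return only the tools the LLM will see for this plan; fall back to all."""
    return STRATEGY_TOOLS.get(strategy, ALL_TOOLS)

print(sorted(tools_for_plan("count")))  # → ['count_fhir', 'resolve_terminology']
```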

Quick Start

# Clone and install
git clone https://github.com/saludai-labs/saludai.git
cd saludai
uv sync

# Start HAPI FHIR with 200 synthetic Argentine patients
docker compose up -d

# Wait ~30s for seeding, then verify
curl "http://localhost:8890/fhir/Patient?_summary=count"

# Run the agent
uv run saludai query "¿Cuántos pacientes tienen diabetes tipo 2?"

# Run the benchmark
uv run python -m benchmarks.run_eval

# Run tests (696 tests, 95% coverage)
uv run pytest

Prerequisites: Python 3.12+, uv, Docker

Usage

MCP Server (Claude Desktop / Claude Code / Cursor)

SaludAI exposes its tools via the Model Context Protocol:

# Start MCP server (stdio transport)
uv run saludai-mcp

Add to your MCP client config (claude_desktop_config.json):

{
  "mcpServers": {
    "saludai": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/saludai", "saludai-mcp"],
      "env": {
        "SALUDAI_FHIR_SERVER_URL": "http://localhost:8890/fhir"
      }
    }
  }
}

REST API

uv run saludai serve
# POST http://localhost:8000/query {"query": "Pacientes con hipertensión en Córdoba"}
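A minimal standard-library client sketch for the endpoint above. The request is built but not sent here; the response schema is not documented on this page, so none is assumed:

```python
# Build a POST request for the /query endpoint shown above.
# Uncomment the last line to send it against a running server.
import json
from urllib import request

payload = json.dumps({"query": "Pacientes con hipertensión en Córdoba"}).encode("utf-8")
req = request.Request(
    "http://localhost:8000/query",
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)
print(req.get_method(), req.full_url)
# with request.urlopen(req) as resp: print(json.load(resp))
```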

CLI

uv run saludai query "Medicaciones más frecuentes en pacientes mayores de 70"

Project Structure

saludai/
├── packages/
│   ├── saludai-core/       # FHIR client, terminology resolver, query builder, locale packs
│   ├── saludai-agent/      # Agent loop, planner, tools, LLM abstraction
│   ├── saludai-mcp/        # MCP server (Claude Desktop, Cursor, etc.)
│   └── saludai-api/        # FastAPI REST interface
├── benchmarks/             # 100-question eval framework + results
├── data/seed/              # Deterministic synthetic data generator (200 AR patients)
├── notebooks/              # Interactive Jupyter demos (3 notebooks)
└── docs/                   # Architecture, ADRs, experiments, roadmap

Built for Latin America

SaludAI is open source, auditable, and self-hostable — built for Argentina's health system, with an architecture designed to scale across Latin America:

  • Argentine terminology: SNOMED CT Argentine edition, CIE-10 (Argentine adaptation), LOINC, ATC — with fuzzy matching via rapidfuzz
  • Locale packs: Country-specific bundles of terminology, system prompts, and FHIR metadata. Argentina ships built-in; add your country by implementing a locale pack
  • openRSD-aware: Locale pack references Argentina's national FHIR profiles
  • Synthetic data that looks real: 200 patients with Argentine names, DNI, 18 provinces weighted by population
  • Spanish-first prompts: The agent reasons in the language of the data

Loading a locale pack:

from saludai_core.locales import load_locale_pack

pack = load_locale_pack("ar")  # SNOMED CT AR + CIE-10 AR + LOINC + ATC

Observability

Every agent run is fully traced in Langfuse:

  • Query plan generation (planner output)
  • Each iteration: LLM call, tool selection, tool execution, result
  • Token usage and cost per step
  • Final answer with evaluation score

Set up Langfuse Cloud (free tier) or self-hosted:

export LANGFUSE_PUBLIC_KEY=pk-...
export LANGFUSE_SECRET_KEY=sk-...
export LANGFUSE_HOST=https://cloud.langfuse.com

Notebooks

Notebook Description
01-getting-started FHIR client, terminology resolver, query builder
02-agent-queries Natural language queries with the agent loop
03-benchmark-eval Run and analyze the 100-question benchmark evaluation

Contributing

Contributions are welcome! See CONTRIBUTING.md for development setup, code style, and PR guidelines.

License

Apache 2.0 — see LICENSE for details.
