FastAPI ASGI middleware for emitting PHI-safe audit events for behavioral healthcare systems

bh-fastapi-audit

Pure ASGI middleware for emitting PHI-safe audit events for behavioral healthcare systems, designed for teams building modern healthcare APIs.

This project emits audit events conforming to the bh-audit-schema standard (v1.1):
https://github.com/bh-healthcare/bh-audit-schema

Why

Behavioral health systems handle highly sensitive regulated data. Audit logging is often implemented inconsistently across services, making access review and incident investigation unnecessarily difficult.

The goal of this library is to make consistent, structured audit trails easy to adopt in FastAPI services without logging raw PHI.

Status

This project is an implementation layer that turns the bh-audit-schema standard into working FastAPI middleware.

Current version: v0.4.0 — Runtime validation, DENIED outcome with denial callbacks, schema negotiation (v1.0/v1.1), vendored dual-schema support.

v0.4 (current)

  • Runtime event validation — optional validate_events=True checks every event against the vendored JSON schema before emission, with configurable failure modes (drop, log_and_emit, raise)
  • DENIED outcome with denial callback: the get_denial_reason callback provides rich denial categories (e.g. RoleDenied, ConsentRequired, CrossOrgAccessDenied) for compliance queries
  • Configurable denied status codes: denied_status_codes config (default {401, 403})
  • Schema negotiation: target_schema_version config ("1.0" or "1.1") controls which schema version is emitted, enabling gradual migration
  • Vendored dual schemas — both v1.0 and v1.1 schemas bundled for offline validation
  • Validation timing: validation_time_ms_total counter for monitoring validation overhead

v0.3

  • Pure ASGI middleware — no BaseHTTPMiddleware, supports streaming responses
  • Non-blocking async emission — bounded asyncio.Queue (default 10k events) with background drain task
  • Typed event blocks: TypedDict definitions for all event sub-blocks (AuditEvent, ActorBlock, etc.)
  • Frozen config: AuditConfig is immutable after creation
  • Schema v1.1 — vendored bh-audit-schema v1.1 with HIPAA/SOC compliance rules, DENIED status, conditional FAILURE validation
  • Schema validation in CI — emitted events validated against the vendored JSON schema
  • PyPI distribution — pip install bh-fastapi-audit
  • PHI-safe defaults (no bodies, safe headers only, error sanitization)
  • Captures: service, actor, action, resource, outcome, correlation, metadata
  • Pluggable sinks:
    • MemorySink — in-memory for testing (bounded optional)
    • JsonlFileSink — JSON Lines file for local dev and demos
    • LoggingSink — Python logging for cloud platforms (CloudWatch, Cloud Logging, Azure Monitor, Kubernetes)
    • SQLAlchemySink — relational database storage (Postgres, SQLite, etc., via SQLAlchemy Core)
  • Redaction utilities for error message sanitization

The bh-audit-schema v1.0 and v1.1 JSON schemas are vendored into this package to enable offline validation.

Quickstart

from fastapi import FastAPI
from bh_fastapi_audit import AuditMiddleware, AuditConfig, MemorySink

app = FastAPI()

sink = MemorySink()
config = AuditConfig(
    service_name="example-bh-api",
    service_environment="dev",
    emit_mode="sync",  # use "queue" (default) for non-blocking in production
)

app.add_middleware(AuditMiddleware, sink=sink, config=config)

@app.get("/patients/{patient_id}")
def get_patient(patient_id: str):
    return {"patient_id": patient_id}

Each request emits an audit event like:

{
  "schema_version": "1.1",
  "event_id": "c1d2e3f4-1111-2222-3333-444455556666",
  "timestamp": "2026-03-28T12:00:00.000Z",
  "service": { "name": "example-bh-api", "environment": "dev" },
  "actor": { "subject_id": "unknown", "subject_type": "service" },
  "action": { "type": "READ", "data_classification": "UNKNOWN" },
  "resource": { "type": "get_patient" },
  "http": { "method": "GET", "route_template": "/patients/{patient_id}", "status_code": 200 },
  "outcome": { "status": "SUCCESS" }
}

Production Example: Container Logging (CloudWatch / GCP / K8s)

from fastapi import FastAPI
from bh_fastapi_audit import AuditMiddleware, AuditConfig, LoggingSink

app = FastAPI()

app.add_middleware(
    AuditMiddleware,
    sink=LoggingSink(logger_name="audit"),
    config=AuditConfig(service_name="my-api", service_environment="prod"),
)

When deployed in containers, audit events are emitted as structured JSON logs to stdout and collected by your platform logging system (CloudWatch, Cloud Logging, Azure Monitor, Fluentd, etc.). No SDK dependencies required.

Non-blocking async emission

Since v0.3, emit_mode defaults to "queue". Events are enqueued without blocking the request path and emitted by a background task:

config = AuditConfig(
    service_name="my-api",
    emit_mode="queue",       # default — non-blocking
    queue_size=10_000,       # default — bounded to prevent unbounded memory growth
    queue_drain_timeout=5.0, # seconds to wait on shutdown
)

When the queue is full, events are dropped and events_dropped_total is incremented. Call await middleware.shutdown() on app shutdown to drain remaining events.

Use emit_mode="sync" for testing or when you need deterministic ordering.
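
The drop-on-full behavior described above can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: the QueueEmitter class, its method names, and the demo function are hypothetical, and only the events_dropped_total counter name comes from this README.

```python
import asyncio


class QueueEmitter:
    """Illustrative stand-in for queue-mode emission (not the library's class)."""

    def __init__(self, sink, queue_size=10_000):
        self._sink = sink  # any callable that accepts one event dict
        self._queue = asyncio.Queue(maxsize=queue_size)
        self.events_dropped_total = 0
        self._task = None

    def start(self):
        # Must be called from inside a running event loop.
        self._task = asyncio.create_task(self._drain())

    def enqueue(self, event):
        # Never blocks the request path: drop and count when the queue is full.
        try:
            self._queue.put_nowait(event)
        except asyncio.QueueFull:
            self.events_dropped_total += 1

    async def _drain(self):
        while True:
            event = await self._queue.get()
            try:
                self._sink(event)
            except Exception:
                pass  # a real implementation would apply its failure mode here
            finally:
                self._queue.task_done()

    async def shutdown(self, drain_timeout=5.0):
        # Give pending events a bounded amount of time to reach the sink.
        try:
            await asyncio.wait_for(self._queue.join(), timeout=drain_timeout)
        except asyncio.TimeoutError:
            pass
        if self._task is not None:
            self._task.cancel()
            try:
                await self._task
            except asyncio.CancelledError:
                pass


async def demo():
    received = []
    emitter = QueueEmitter(received.append, queue_size=2)
    for i in range(3):  # the third event overflows the 2-slot queue
        emitter.enqueue({"event_id": i})
    emitter.start()
    await emitter.shutdown()
    return received, emitter.events_dropped_total
```

Running asyncio.run(demo()) delivers the first two events and counts one drop, which is the trade-off a bounded queue makes: memory stays capped at the cost of losing events under sustained overload.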

Production hardening

Sink failure isolation

By default, sink failures are logged but never break your request handling:

config = AuditConfig(
    service_name="my-api",
    emit_failure_mode="log",       # "silent", "log" (default), or "raise"
    failure_logger_name="bh.audit.internal",
)
  • "silent" — swallow errors, increment counter only
  • "log" — log a compact summary (event_id, service, action, resource) without the full payload
  • "raise" — re-raise the original exception (use in dev/test)
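
The three failure modes listed above amount to a small wrapper around the sink call. The sketch below is an assumption about how such isolation could look, not the library's code: emit_with_isolation and the counters dict are hypothetical, and the counter names follow the stats example in this README.

```python
import logging


def emit_with_isolation(sink, event, mode="log",
                        logger_name="bh.audit.internal", counters=None):
    """Call sink(event) and handle failures per mode: 'silent', 'log', or 'raise'."""
    counters = counters if counters is not None else {}
    try:
        sink(event)
        counters["events_emitted_total"] = counters.get("events_emitted_total", 0) + 1
    except Exception as exc:
        counters["emit_failures_total"] = counters.get("emit_failures_total", 0) + 1
        if mode == "raise":
            raise
        if mode == "log":
            # Compact summary only; the full payload is never written to the log.
            logging.getLogger(logger_name).warning(
                "audit emit failed (%s): event_id=%s service=%s action=%s resource=%s",
                type(exc).__name__,
                event.get("event_id"),
                event.get("service", {}).get("name"),
                event.get("action", {}).get("type"),
                event.get("resource", {}).get("type"),
            )
        # mode == "silent": the counter increment above is the only trace.
    return counters
```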

Client IP opt-in

Client IP is excluded from audit events by default. Enable explicitly:

config = AuditConfig(
    service_name="my-api",
    include_client_ip=True,   # default: False
)

Metadata restrictions

Metadata values must be scalar JSON types (str, int, float, bool, None). Dict, list, and tuple values are silently dropped, and strings longer than max_metadata_value_length are truncated:

config = AuditConfig(
    service_name="my-api",
    metadata_allowlist=frozenset({"content_length", "status_family"}),
    max_metadata_value_length=200,
    get_metadata=lambda req, status: {"content_length": req.headers.get("content-length")},
)
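
To make the restrictions concrete, here is a rough sketch of the filtering rules as described above. The filter_metadata function is hypothetical and only mirrors the documented behavior (allowlist, scalar-only values, truncation). It is not taken from the library.

```python
SCALAR_TYPES = (str, int, float, bool, type(None))


def filter_metadata(raw, allowlist, max_len=200):
    """Keep allowlisted keys with scalar values; truncate long strings."""
    clean = {}
    for key, value in raw.items():
        if key not in allowlist:
            continue  # keys outside the allowlist are never emitted
        if not isinstance(value, SCALAR_TYPES):
            continue  # dict, list, tuple and other containers are dropped
        if isinstance(value, str) and len(value) > max_len:
            value = value[:max_len]
        clean[key] = value
    return clean
```

With an empty allowlist the result is always an empty dict, matching the "empty = no metadata" default.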

Internal counters

Track emission health via the middleware's stats:

# After app startup, access via the middleware instance:
# middleware.stats.snapshot()
# {"events_emitted_total": 42, "emit_failures_total": 0, ...}

Sinks

Sinks determine where audit events are stored. Choose based on your deployment:

MemorySink (testing)

from bh_fastapi_audit import MemorySink

sink = MemorySink()           # unbounded — for tests
sink = MemorySink(maxlen=100) # bounded — for dev
# After requests: sink.events contains all emitted events
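
For intuition, a bounded in-memory sink can be modeled with collections.deque. The BoundedMemorySink class below is a hypothetical stand-in, and it assumes that the oldest events are evicted once maxlen is reached. This README does not say which events the real MemorySink keeps.

```python
from collections import deque


class BoundedMemorySink:
    """Hypothetical in-memory sink; maxlen=None means unbounded."""

    def __init__(self, maxlen=None):
        # Assumption: when full, deque silently evicts the oldest event.
        self._events = deque(maxlen=maxlen)

    def emit(self, event):
        self._events.append(event)

    @property
    def events(self):
        # Return a copy so callers cannot mutate the internal buffer.
        return list(self._events)
```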

JsonlFileSink (local dev, demos)

Writes one JSON object per line. Thread-safe, flushes by default.

from bh_fastapi_audit import JsonlFileSink

sink = JsonlFileSink("/var/log/audit/events.jsonl")
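
Because the format is one JSON object per line, the file can be written and read back with the standard library. The sketch below shows the general JSON Lines pattern with a lock and an explicit flush. JsonlWriter and read_events are hypothetical helpers, not part of the package.

```python
import json
import threading


class JsonlWriter:
    """Hypothetical append-only JSON Lines writer guarded by a lock."""

    def __init__(self, path):
        self._path = path
        self._lock = threading.Lock()

    def emit(self, event):
        # Compact separators keep each event on a single short line.
        line = json.dumps(event, separators=(",", ":"))
        with self._lock:
            with open(self._path, "a", encoding="utf-8") as fh:
                fh.write(line + "\n")
                fh.flush()


def read_events(path):
    """Parse a JSON Lines file into a list of event dicts."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

A reader like read_events is handy in local demos for checking what was actually emitted.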

LoggingSink (cloud deployments)

Emits one compact JSON audit event per request using Python logging.

from bh_fastapi_audit import LoggingSink

sink = LoggingSink(logger_name="bh.audit", level="INFO")

No SDK dependencies, no retries, no buffering. The cloud platform handles collection.
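
Since the sink relies on Python logging, the application decides where the "bh.audit" logger writes. One possible setup, using only the standard logging module, sends the raw JSON message to stdout with no extra prefix so the platform collector can parse each line. The helper names configure_audit_logger and log_event are hypothetical.

```python
import json
import logging
import sys


def configure_audit_logger(name="bh.audit", stream=None):
    """Route the named logger to a stream (stdout by default), message only."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.handlers = [handler]  # replace any handlers from earlier calls
    logger.propagate = False     # avoid duplicate lines via the root logger
    return logger


def log_event(logger, event):
    # One compact JSON object per log record.
    logger.info(json.dumps(event, separators=(",", ":")))
```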

SQLAlchemySink (production database)

Stores events in a relational database with query-friendly columns plus full JSON.

from bh_fastapi_audit import SQLAlchemySink

sink = SQLAlchemySink("postgresql://user:pass@localhost/mydb")

See docs/indexing.md for recommended database indexes and query examples.

Runtime Event Validation

v0.4 adds optional runtime validation against the vendored JSON schema:

config = AuditConfig(
    service_name="my-api",
    validate_events=True,                       # enable validation
    validation_failure_mode="log_and_emit",      # "drop" (default), "log_and_emit", or "raise"
)
  • "drop" — increment counters and silently discard invalid events
  • "log_and_emit" — log validation errors but still emit the event
  • "raise" — raise AuditValidationError (use in dev/test)

Validation timing is tracked in stats.snapshot()["validation_time_ms_total"].
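
The interaction between the three validation modes and the timing counter can be sketched as follows. This is an illustration under stated assumptions. apply_validation is hypothetical, the validate argument stands in for a JSON-schema check that returns a list of error messages, the AuditValidationError class here is a local stand-in for the error type named above, and validation_failures_total is an invented counter name.

```python
import logging
import time


class AuditValidationError(Exception):
    """Local stand-in for the error type named in this README."""


def apply_validation(event, validate, mode="drop", stats=None):
    """Return the event if it should be emitted, or None if it is dropped."""
    stats = stats if stats is not None else {}
    started = time.perf_counter()
    errors = validate(event)  # hypothetical: returns a list of error messages
    elapsed_ms = (time.perf_counter() - started) * 1000.0
    stats["validation_time_ms_total"] = (
        stats.get("validation_time_ms_total", 0.0) + elapsed_ms
    )
    if not errors:
        return event
    # Hypothetical counter name, used only for this sketch.
    stats["validation_failures_total"] = stats.get("validation_failures_total", 0) + 1
    if mode == "raise":
        raise AuditValidationError("; ".join(errors))
    if mode == "log_and_emit":
        logging.getLogger("bh.audit.internal").warning(
            "invalid audit event %s: %s", event.get("event_id"), errors
        )
        return event
    return None  # "drop"
```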

DENIED Outcome and Denial Callbacks

HTTP 401/403 produce outcome.status: "DENIED" (v1.1) with an error_type. Provide a callback for richer denial categories:

def denial_reason(request, exc_info):
    # str() keeps this safe whether exc_info[0] is a message string or an exception class
    if exc_info and "consent" in str(exc_info[0]).lower():
        return "ConsentRequired"
    return None  # fall back to default

config = AuditConfig(
    service_name="my-api",
    get_denial_reason=denial_reason,
    denied_status_codes=frozenset({401, 403, 451}),  # customize
)
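
As a rough model of how status codes could map to outcomes, consider the sketch below. Only the DENIED mapping for denied_status_codes comes from this README. The classify_outcome function, the "AccessDenied" fallback label, and the rule that other codes of 400 and above map to FAILURE are assumptions made for illustration.

```python
def classify_outcome(status_code,
                     denied_status_codes=frozenset({401, 403}),
                     denial_reason=None,
                     default_error_type="AccessDenied"):
    """Map an HTTP status code to an outcome block (illustrative only)."""
    if status_code in denied_status_codes:
        # A callback result, when present, replaces the default category.
        return {
            "status": "DENIED",
            "error_type": denial_reason or default_error_type,  # fallback is hypothetical
        }
    if status_code >= 400:
        return {"status": "FAILURE"}  # assumption: remaining 4xx/5xx are failures
    return {"status": "SUCCESS"}
```

Adding 451 to denied_status_codes, as in the config above, would move that code from the FAILURE branch to the DENIED branch in this model.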

Schema Negotiation

Control which schema version emitted events conform to:

config = AuditConfig(
    service_name="my-api",
    target_schema_version="1.0",  # or "1.1" (default)
)

With "1.0", DENIED outcomes are downgraded to FAILURE for backward compatibility. See docs/migrating-1.0-to-1.1.md for a full migration guide.
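
The downgrade rule can be pictured as a small transform applied just before emission. The function below is a hypothetical sketch of that one rule. The real migration may change other fields as well, which the migration guide would cover.

```python
def negotiate_schema(event, target_schema_version="1.1"):
    """Return a copy of the event shaped for the target schema version."""
    if target_schema_version not in ("1.0", "1.1"):
        raise ValueError(f"unsupported schema version: {target_schema_version!r}")
    out = dict(event)
    out["schema_version"] = target_schema_version
    outcome = dict(out.get("outcome", {}))
    # v1.0 has no DENIED status, so it is reported as FAILURE instead.
    if target_schema_version == "1.0" and outcome.get("status") == "DENIED":
        outcome["status"] = "FAILURE"
    out["outcome"] = outcome
    return out
```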

Configuration

AuditConfig supports (frozen after creation):

Option                     Default                       Description
-------------------------  ----------------------------  -----------
service_name               (required)                    Name of the service emitting events
service_environment        "unknown"                     Environment (prod, staging, dev)
service_version            None                          Service version string
default_actor_id           "unknown"                     Default actor when no auth context
default_actor_type         "service"                     Default actor type ("human" or "service")
get_actor                  None                          Callback (Request) -> dict for custom actor extraction
get_action                 None                          Callback (Request) -> dict for custom action extraction
get_resource               None                          Callback (Request, int) -> dict for custom resource extraction
get_metadata               None                          Callback (Request, int) -> dict for custom metadata
metadata_allowlist         frozenset()                   Allowed metadata keys (empty = no metadata)
excluded_paths             frozenset({"/health", ...})   Paths to skip auditing
emit_failure_mode          "log"                         How to handle sink failures
failure_logger_name        "bh.audit.internal"           Logger name for internal diagnostics
max_metadata_value_length  200                           Max string length for metadata values
include_client_ip          False                         Whether to include client IP
emit_mode                  "queue"                       "sync" or "queue" (non-blocking)
queue_size                 10_000                        Maximum pending events in queue
queue_drain_timeout        5.0                           Seconds to wait for queue drain on shutdown
validate_events            False                         Enable runtime JSON-schema validation
validation_failure_mode    "drop"                        "drop", "log_and_emit", or "raise"
get_denial_reason          None                          Callback (Request, exc_info) -> str|None for denial categorization
denied_status_codes        frozenset({401, 403})         Status codes that produce DENIED outcome
target_schema_version      "1.1"                         Schema version for emitted events ("1.0" or "1.1")

PHI-safe defaults

This library is designed to be safe by default:

  • No bodies: Never reads or logs request/response bodies
  • Route templates: Uses /patients/{id} not /patients/12345
  • Safe headers only: Only extracts correlation headers (no Authorization, Cookie)
  • Error sanitization: Exception messages are stripped of SSN/email/phone patterns and truncated

PHI safety is enforced by tests that assert synthetic PHI tokens never appear in emitted events.
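
Pattern-based error sanitization of the kind described above might look like the following. The regular expressions and the sanitize_error_message function are illustrative assumptions covering common US formats. They are not the library's actual patterns, and regex redaction of this kind is best-effort rather than exhaustive.

```python
import re

# Illustrative patterns only; order matters because SSNs are replaced first.
_PATTERNS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "[EMAIL]"),
    (re.compile(r"(?:\+?1[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b"), "[PHONE]"),
]


def sanitize_error_message(message, max_len=200):
    """Replace SSN, email, and phone patterns, then truncate."""
    for pattern, token in _PATTERNS:
        message = pattern.sub(token, message)
    return message[:max_len]
```

A test in the spirit of the synthetic-token checks mentioned above would feed a message containing fake identifiers through the function and assert that none of them survive.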

Installation

Requires Python 3.11+

pip install bh-fastapi-audit

Optional dependencies

pip install bh-fastapi-audit[sqlalchemy]    # Database sink

Note: jsonschema is a required dependency as of v0.4.0 (runtime validation support).

Development installation

git clone https://github.com/bh-healthcare/bh-fastapi-audit
cd bh-fastapi-audit
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev,sqlalchemy,jsonschema]"

License

Apache 2.0
