Cloud-agnostic Python audit logger for emitting PHI-safe behavioral healthcare audit events conforming to bh-audit-schema v1.0
bh-audit-logger
Cloud-agnostic Python utilities for emitting privacy-preserving audit events for behavioral healthcare systems.
Events conform to bh-audit-schema v1.0: https://github.com/bh-healthcare/bh-audit-schema
Why
Audit logging in healthcare is often inconsistent across services and jobs. This library provides a small, boring, correct baseline for emitting structured audit events from any Python code — Lambdas, workers, CLIs, ETL jobs, cron scripts — without logging raw PHI.
It is not tied to FastAPI (see bh-fastapi-audit for middleware-based logging).
Quickstart
pip install bh-audit-logger
from bh_audit_logger import AuditLogger, AuditLoggerConfig

logger = AuditLogger(
    config=AuditLoggerConfig(
        service_name="overstory-datalake",
        service_environment="prod",
    )
)

logger.audit(
    "READ",
    actor={"subject_id": "service_lambda", "subject_type": "service"},
    resource={"type": "Patient", "id": "patient_123"},
    outcome={"status": "SUCCESS"},
    correlation={"request_id": "req_abc"},
)
By default, events are emitted as one compact JSON line via Python logging (stdout-friendly).
Example output
{"schema_version":"1.0","event_id":"6d3f0f6b-0c1a-4b9f-9d6f-9f6f7f5b2b0a","timestamp":"2026-02-17T12:00:00Z","service":{"name":"overstory-datalake","environment":"prod"},"actor":{"subject_id":"service_lambda","subject_type":"service"},"action":{"type":"READ","data_classification":"UNKNOWN"},"resource":{"type":"Patient","id":"patient_123"},"outcome":{"status":"SUCCESS"},"correlation":{"request_id":"req_abc"}}
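For readers curious how a "one compact JSON line per event" emitter can be built on the standard library, here is a minimal sketch. It is not the library's implementation: the logger name `bh.audit.sketch` and the helper `to_compact_json` are hypothetical, and an in-memory stream stands in for stdout so the result can be inspected.

```python
import io
import json
import logging


def to_compact_json(event: dict) -> str:
    """Serialize an event as a single line with no extra whitespace."""
    return json.dumps(event, separators=(",", ":"), ensure_ascii=False)


# In-memory stream used here instead of sys.stdout (assumption for the demo).
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(message)s"))  # the JSON is the whole line

sketch_logger = logging.getLogger("bh.audit.sketch")  # hypothetical logger name
sketch_logger.setLevel(logging.INFO)
sketch_logger.addHandler(handler)
sketch_logger.propagate = False  # avoid duplicate lines via the root logger

event = {
    "schema_version": "1.0",
    "action": {"type": "READ"},
    "outcome": {"status": "SUCCESS"},
}
sketch_logger.info(to_compact_json(event))
line = stream.getvalue()
```

Passing `sys.stdout` to `logging.StreamHandler` instead of the in-memory stream would send the same line to standard output.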
Production usage: container logging
from bh_audit_logger import AuditLogger, AuditLoggerConfig, LoggingSink

logger = AuditLogger(
    config=AuditLoggerConfig(
        service_name="my-service",
        service_environment="prod",
    ),
    sink=LoggingSink(logger_name="bh.audit", level="INFO"),
)
Works anywhere stdout is collected: CloudWatch, GCP Cloud Logging, Azure Monitor, Kubernetes logging pipelines.
AWS Lambda / serverless
import logging
from bh_audit_logger import AuditLogger, AuditLoggerConfig, LoggingSink

# Make sure INFO-level audit records are not filtered out. basicConfig() does
# nothing when a root handler is already attached (as in the AWS Lambda Python
# runtime, which forwards records to CloudWatch), so the level is also set explicitly.
logging.basicConfig(level=logging.INFO)
logging.getLogger().setLevel(logging.INFO)

audit = AuditLogger(
    config=AuditLoggerConfig(
        service_name="patient-export-lambda",
        service_environment="prod",
        service_version="2026.02.17.1",
    ),
    sink=LoggingSink(logger_name="bh.audit", level="INFO"),
)

def handler(event, context):
    audit.audit_access(
        "EXPORT",
        actor={"subject_id": "service_lambda", "subject_type": "service"},
        resource={"type": "PatientExport", "id": event.get("export_id", "unknown")},
        phi_touched=True,
        data_classification="PHI",
        correlation={"request_id": context.aws_request_id},
    )
    # ... do work ...
Each invocation emits one compact JSON line to stdout. Most platforms ingest stdout by default; configure your runtime logging pipeline as needed.
Production hardening
Sink failure isolation
By default, sink failures are logged but never propagate to your application logic:
config = AuditLoggerConfig(
    service_name="my-service",
    emit_failure_mode="log",  # "silent", "log" (default), or "raise"
    failure_logger_name="bh.audit.internal",
)
- "silent": swallow errors, increment counter only
- "log": log a compact summary (event_id, service, action, resource) without the full payload
- "raise": re-raise the original exception (use in dev/test)
Metadata restrictions
Metadata values are enforced to be scalar JSON types (str, int, float, bool, None). Dict, list, and tuple values are silently dropped. Long strings are truncated:
config = AuditLoggerConfig(
    service_name="my-service",
    metadata_allowlist={"batch_id", "region"},
    max_metadata_value_length=200,  # default; truncated strings end with "..."
)
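The filtering rules above can be sketched in a few lines of plain Python. This is an illustration only, not the library's code: `filter_metadata` is a hypothetical helper, and it assumes that a truncated string is cut so that its total length, including the trailing "...", equals the configured maximum.

```python
SCALAR_TYPES = (str, int, float, bool, type(None))


def filter_metadata(metadata: dict, allowlist: set, max_len: int = 200) -> dict:
    """Keep allowlisted keys with scalar values; truncate long strings."""
    cleaned = {}
    for key, value in metadata.items():
        if key not in allowlist:
            continue  # key was never opted in
        if not isinstance(value, SCALAR_TYPES):
            continue  # dict, list, tuple, etc. are silently dropped
        if isinstance(value, str) and len(value) > max_len:
            # Assumption: the "..." suffix counts toward max_len.
            value = value[: max(max_len - 3, 0)] + "..."
        cleaned[key] = value
    return cleaned


result = filter_metadata(
    {
        "batch_id": "b-1",
        "region": "x" * 300,
        "patient_name": "Jane Doe",  # not allowlisted, so it is dropped
        "tags": ["a", "b"],          # allowlisted but not scalar, so it is dropped
    },
    allowlist={"batch_id", "region", "tags"},
)
```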
Internal counters
Track emission health via lightweight counters:
logger = AuditLogger(config=config)
# ... emit events ...
print(logger.stats.snapshot())
# {"events_emitted_total": 42, "emit_failures_total": 0, "events_dropped_total": 0, "validation_failures_total": 0}
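As a rough picture of what lightweight counters with a `snapshot()` method can look like, here is a minimal thread-safe sketch. The class name `Counters` and its `increment` method are hypothetical; only the counter names and `snapshot()` come from the example above.

```python
import threading


class Counters:
    """Hypothetical thread-safe counter set with a point-in-time snapshot."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = {
            "events_emitted_total": 0,
            "emit_failures_total": 0,
            "events_dropped_total": 0,
            "validation_failures_total": 0,
        }

    def increment(self, name: str, amount: int = 1) -> None:
        # The lock keeps concurrent read-modify-write updates from being lost.
        with self._lock:
            self._counts[name] += amount

    def snapshot(self) -> dict:
        # Return a copy so callers cannot mutate internal state.
        with self._lock:
            return dict(self._counts)


counters = Counters()
counters.increment("events_emitted_total")
counters.increment("events_emitted_total")
counters.increment("emit_failures_total")
snap = counters.snapshot()
```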
Synchronous emission
Audit emission is synchronous in v0.2.x. For high-throughput systems, use LoggingSink (which defers I/O to your logging pipeline) or plan for async sinks in v0.3.
Sinks
| Sink | Use case | Notes |
|---|---|---|
| LoggingSink (default) | Production | One compact JSON line per event via Python logging; stdout-friendly |
| JsonlFileSink | Local dev, demos | Appends to a .jsonl file; thread-safe, flush-on-write by default |
| MemorySink | Tests | Stores events in a list; use len(sink) and sink.events in assertions |
Pass any sink to AuditLogger(config=..., sink=...). Omit sink to get LoggingSink by default.
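To make the relationship between sinks and the failure modes from "Sink failure isolation" concrete, here is a standalone sketch. None of these names are the library's API: `ListSink`, `BrokenSink`, `safe_emit`, and the `emit` method name are assumptions made for illustration.

```python
import logging


class ListSink:
    """Hypothetical in-memory sink: stores events in a list."""

    def __init__(self):
        self.events = []

    def emit(self, event: dict) -> None:  # method name is an assumption
        self.events.append(event)

    def __len__(self) -> int:
        return len(self.events)


class BrokenSink:
    """Hypothetical sink that always fails, to exercise the failure modes."""

    def emit(self, event: dict) -> None:
        raise OSError("disk full")


def safe_emit(sink, event, mode="log", failure_logger_name="bh.audit.internal", counters=None):
    """Emit an event; return True on success and False if the failure was contained."""
    try:
        sink.emit(event)
        return True
    except Exception:
        if counters is not None:
            counters["emit_failures_total"] = counters.get("emit_failures_total", 0) + 1
        if mode == "raise":
            raise  # dev/test: surface the original exception
        if mode == "log":
            # Compact summary only; the full payload is never logged.
            logging.getLogger(failure_logger_name).warning(
                "audit emit failed: event_id=%s service=%s action=%s resource=%s",
                event.get("event_id"),
                event.get("service", {}).get("name"),
                event.get("action", {}).get("type"),
                event.get("resource", {}).get("type"),
            )
        return False  # "silent" and "log" both contain the failure


sink = ListSink()
ok = safe_emit(sink, {"event_id": "e1", "action": {"type": "READ"}})
failure_counts = {}
contained = safe_emit(BrokenSink(), {"event_id": "e2"}, mode="silent", counters=failure_counts)
```

In this sketch only the "raise" mode lets the sink's exception reach the caller, which matches the intent that audit failures should not break application logic outside dev and test.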
Configuration
AuditLoggerConfig fields:
| Field | Type | Default | Description |
|---|---|---|---|
| service_name | str | required | Name of the service emitting events |
| service_environment | str | "unknown" | Deployment environment (prod, staging, dev) |
| service_version | str \| None | None | Service version/build identifier |
| default_actor_id | str | "unknown" | Default actor when none provided |
| default_actor_type | str | "service" | Default actor type (human/service) |
| metadata_allowlist | set[str] | set() | Allowed metadata keys (empty = no metadata) |
| sanitize_errors | bool | True | Sanitize error messages (redact SSN/email/phone) |
| error_message_max_len | int | 200 | Max length for sanitized error messages |
| time_source | Callable | utcnow | Injectable time source for testing |
| id_factory | Callable | uuid4 | Injectable ID factory for testing |
| schema_version | str | "1.0" | Locked to 1.0 unless overridden |
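The value of an injectable time source and ID factory is easiest to see in a test. The sketch below does not use the library; `build_envelope` is a hypothetical stand-in that accepts the same two kinds of callables so that its output becomes fully deterministic.

```python
import uuid
from datetime import datetime, timezone


def _utc_now() -> datetime:
    return datetime.now(timezone.utc)


def build_envelope(action_type, *, time_source=_utc_now, id_factory=uuid.uuid4):
    """Hypothetical helper: builds the ID and timestamp fields of an event."""
    moment = time_source().astimezone(timezone.utc)
    return {
        "schema_version": "1.0",
        "event_id": str(id_factory()),
        "timestamp": moment.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "action": {"type": action_type},
    }


# Fixed replacements, as a test would inject them.
def fixed_time() -> datetime:
    return datetime(2026, 2, 17, 12, 0, 0, tzinfo=timezone.utc)


def fixed_id() -> uuid.UUID:
    return uuid.UUID("6d3f0f6b-0c1a-4b9f-9d6f-9f6f7f5b2b0a")


envelope = build_envelope("READ", time_source=fixed_time, id_factory=fixed_id)
```

With both callables fixed, a test can compare the whole event against a literal dictionary instead of patching `datetime` or `uuid`.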
PHI-safe by default (via allowlists and error sanitization)
- No request/response bodies — the library never tries to capture payloads
- Metadata is opt-in and strictly allowlisted: only keys in metadata_allowlist pass through; values must be scalar JSON types (str, int, float, bool, null)
- Error messages are sanitized: SSN, email, phone patterns are redacted and messages are length-capped
- PHI safety is enforced by tests that assert synthetic PHI tokens never appear in emitted events
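A pattern-based sanitizer of the kind described in the list above might look roughly like this. The regular expressions, the replacement labels, and the function name `sanitize_error` are all assumptions for illustration; the library's actual patterns may differ, and simple patterns like these will not catch every format.

```python
import re

# Illustrative US-centric patterns (assumptions, not the library's own).
PATTERNS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[REDACTED_SSN]"),
    (re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"), "[REDACTED_EMAIL]"),
    (re.compile(r"(?<!\d)(?:\(\d{3}\)\s?|\d{3}[-. ])\d{3}[-. ]\d{4}(?!\d)"), "[REDACTED_PHONE]"),
]


def sanitize_error(message: str, max_len: int = 200) -> str:
    """Redact SSN, email, and phone patterns, then cap the length."""
    for pattern, label in PATTERNS:
        message = pattern.sub(label, message)
    return message[:max_len]


sanitized = sanitize_error(
    "lookup failed for 123-45-6789, contact jane.doe@example.com or 555-867-5309"
)
```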
Important: This library does not attempt to detect or remove PHI from user-supplied IDs or free-text fields beyond the configured allowlist and error-message sanitization. Treat resource IDs (e.g. patient_id) as sensitive and prefer surrogate identifiers wherever possible. The goal is safe defaults, not total PHI stripping.
Do not do this
# BAD: patient name in metadata
logger.audit("READ", resource={"type": "Patient"}, metadata={"patient_name": "Jane Doe"})
# BAD: full stack trace in error (may contain PHI from variables)
logger.audit("READ", resource={"type": "Patient"}, error=traceback.format_exc())
# BAD: MRN or SSN as a resource ID
logger.audit("READ", resource={"type": "Patient", "id": "123-45-6789"})
Instead, use surrogate IDs, keep metadata to operational keys (job name, batch ID, region), and let sanitize_errors=True (the default) handle error messages.
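One common way to obtain a surrogate ID, shown here purely as an illustration and not as something this library provides, is a keyed hash of the real identifier. The function `surrogate_id`, the prefix, and the 16-character truncation are assumptions; the secret key must be stored and rotated outside the code, and a keyed hash is pseudonymization rather than anonymization.

```python
import hashlib
import hmac


def surrogate_id(raw_id: str, secret_key: bytes, prefix: str = "pt") -> str:
    """Derive a stable, non-reversible surrogate from a sensitive identifier."""
    digest = hmac.new(secret_key, raw_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{prefix}_{digest[:16]}"  # truncated for readability (assumption)


# Demo key only; a real key would come from a secrets manager.
demo_key = b"example-key-do-not-use"
first = surrogate_id("123-45-6789", demo_key)
second = surrogate_id("123-45-6789", demo_key)
other_key = surrogate_id("123-45-6789", b"a-different-key")
```

The same input and key always give the same surrogate, so audit events for one patient can still be correlated without the raw identifier appearing in the log.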
Schema conformance
All events conform to bh-audit-schema v1.0. Required fields:
- schema_version = "1.0"
- event_id (UUID)
- timestamp (UTC ISO 8601)
- service (name, environment)
- actor (subject_id, subject_type)
- action (type)
- resource (type)
- outcome (status)
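As a quick illustration of the required fields, the following standalone check reports which of them are absent from an event. It is a sketch, not a substitute for real schema validation: `missing_required` is a hypothetical helper, and it checks presence only, not types or formats.

```python
# Top-level field -> required nested keys (empty tuple means a plain value).
REQUIRED_FIELDS = {
    "schema_version": (),
    "event_id": (),
    "timestamp": (),
    "service": ("name", "environment"),
    "actor": ("subject_id", "subject_type"),
    "action": ("type",),
    "resource": ("type",),
    "outcome": ("status",),
}


def missing_required(event: dict) -> list:
    """Return dotted paths of required fields that are absent from the event."""
    missing = []
    for field, nested_keys in REQUIRED_FIELDS.items():
        if field not in event:
            missing.append(field)
            continue
        for key in nested_keys:
            value = event[field]
            if not isinstance(value, dict) or key not in value:
                missing.append(f"{field}.{key}")
    return missing


example_event = {
    "schema_version": "1.0",
    "event_id": "6d3f0f6b-0c1a-4b9f-9d6f-9f6f7f5b2b0a",
    "timestamp": "2026-02-17T12:00:00Z",
    "service": {"name": "overstory-datalake", "environment": "prod"},
    "actor": {"subject_id": "service_lambda", "subject_type": "service"},
    "action": {"type": "READ", "data_classification": "UNKNOWN"},
    "resource": {"type": "Patient", "id": "patient_123"},
    "outcome": {"status": "SUCCESS"},
}
problems = missing_required(example_event)
```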
Optional schema validation
pip install "bh-audit-logger[jsonschema]"
from bh_audit_logger import validate_event
event = {...}
validate_event(event) # raises ValidationError on failure
Validates against the vendored bh-audit-schema v1.0 JSON schema included in the package.
Related projects
- bh-audit-schema: github.com/bh-healthcare/bh-audit-schema — the schema standard
- bh-fastapi-audit: github.com/bh-healthcare/bh-fastapi-audit — FastAPI middleware for automatic audit logging
License
Apache 2.0