Python client for the Netrias harmonization API

Netrias Client

Python toolkit for working with the Netrias recommendation and harmonization services. The client wraps the HTTP APIs with strong typing, logging, and guard rails so analytics code can focus on describing data rather than orchestrating requests.

Highlights

  • Stateful client facade – instantiate NetriasClient and call client.configure(...) once.
  • Column discovery helpers – derive column samples from CSV files, invoke the recommendation service, and normalize responses into MappingDiscoveryResult models.
  • Adapter utilities – convert discovery output into harmonization-ready manifest payloads while applying confidence filters and CDE overrides.
  • Asynchronous harmonization loop – submit jobs, poll for completion, download results, and version output files automatically to avoid accidental overwrites.
  • Extended timing logs – discovery and harmonization emit duration metrics so you can spot slow calls quickly during live runs.

Installation

The project targets Python 3.12+.

pip install netrias_client

# optional AWS helpers (gateway bypass)
pip install netrias_client[aws]

We recommend managing environments with uv:

# create or update a project that depends on netrias_client
uv add netrias_client

# install optional AWS helpers (gateway bypass)
uv add netrias_client[aws]

For local development within this repository:

uv sync --group dev              # install development tooling
uv sync --group aws --group dev  # include optional AWS dependencies

Configuration

All client entry points require explicit configuration. Create a NetriasClient, then provide the API key; discovery and harmonization endpoints remain fixed by the library.

from pathlib import Path

from netrias_client import NetriasClient
from netrias_client._models import LogLevel

client = NetriasClient()
client.configure(
    api_key="<netrias api key>",
    # Optional overrides:
    timeout=21600.0,               # seconds (default: 6 hours)
    log_level=LogLevel.INFO,
    confidence_threshold=0.80,     # discovery adapter filter, 0.0–1.0
    discovery_use_gateway_bypass=True,  # toggle Lambda bypass (default: True)
    log_directory=Path("logs/netrias"),  # optional per-client log files
)

Configuration errors raise ClientConfigurationError. Calling configure again replaces the active settings snapshot and reinitializes the dedicated logger (refreshing file handlers when log_directory is supplied).
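
One way to keep the API key out of source files is to build the configure(...) arguments from environment variables. The helper below is a minimal sketch, not part of netrias_client: build_configure_kwargs and the NETRIAS_CONFIDENCE_THRESHOLD variable name are hypothetical, while NETRIAS_API_KEY matches the variable used by the live test scripts. It only validates what this README states (a non-empty key and a threshold within 0.0–1.0).

```python
import os


def build_configure_kwargs(env=None):
    """Collect keyword arguments for client.configure(...) from a mapping of variables.

    Hypothetical helper for illustration; it is not provided by netrias_client.
    """
    env = os.environ if env is None else env

    api_key = (env.get("NETRIAS_API_KEY") or "").strip()
    if not api_key:
        raise ValueError("NETRIAS_API_KEY is not set")

    kwargs = {"api_key": api_key}

    # NETRIAS_CONFIDENCE_THRESHOLD is an assumed variable name, used only in this sketch.
    raw_threshold = env.get("NETRIAS_CONFIDENCE_THRESHOLD")
    if raw_threshold is not None:
        threshold = float(raw_threshold)
        if not 0.0 <= threshold <= 1.0:
            raise ValueError("confidence threshold must be between 0.0 and 1.0")
        kwargs["confidence_threshold"] = threshold

    return kwargs
```

With a configured environment, the result can be passed straight through as client.configure(**build_configure_kwargs()).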

End-to-End Workflow

The typical harmonization flow contains two steps:

from pathlib import Path

from netrias_client import NetriasClient

client = NetriasClient()
client.configure(api_key="<netrias api key>")

csv_path = Path("/path/to/source.csv")
schema = "ccdi"

# 1. Ask the recommendation service for potential targets.
manifest_payload = client.discover_mapping_from_csv(
    source_csv=csv_path,
    target_schema=schema,
)

# 2. Kick off harmonization directly with the manifest payload.
result = client.harmonize(source_path=csv_path, manifest=manifest_payload)
print(result.status)
print(result.description)
print(result.file_path)

  • client.discover_mapping_from_csv(...) samples up to 25 values per column (configurable), calls the API, and returns a manifest-ready payload (including static metadata such as CDE routes/IDs where configured).
  • client.harmonize(...) submits a job and polls GET /v1/jobs/{jobId} until the backend returns success or failure. Downloaded CSVs are written next to the source file (versioned as data.harmonized.v1.csv, etc.). Pass manifest_output_path= if you also want to persist the manifest JSON for inspection.
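
To illustrate the versioned output naming, here is a minimal sketch of how a non-overwriting path could be chosen. It is not the library's implementation: next_versioned_path is a hypothetical helper, and the `<stem>.harmonized.v<N>.csv` pattern is inferred only from the data.harmonized.v1.csv example above.

```python
from pathlib import Path


def next_versioned_path(source: Path, label: str = "harmonized") -> Path:
    """Return the first '<stem>.<label>.v<N>.csv' beside `source` that does not exist yet.

    Hypothetical helper; the naming pattern is an assumption based on the README example.
    """
    version = 1
    while True:
        candidate = source.with_name(f"{source.stem}.{label}.v{version}.csv")
        if not candidate.exists():
            return candidate
        version += 1
```

Probing for the first unused version number means a rerun never replaces an earlier result. The check is not atomic, though, so concurrent writers would need extra coordination.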

Timing Logs

Both discovery and harmonization log the elapsed seconds for each full operation, including runs that end in a timeout or transport failure. Sample output:

INFO netrias_client: discover mapping start: schema=ccdi columns=12
INFO netrias_client: discover mapping complete: schema=ccdi suggestions=0 duration=47.12s
INFO netrias_client: harmonize start: file=data.csv
INFO netrias_client: harmonize finished: file=data.csv status=succeeded duration=182.45s

Use these metrics to separate slow API responses from downstream processing overhead.
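
To compare against the client's own duration logs, you can time your downstream processing in a similar format. The context manager below is a small sketch using only the standard library; log_duration and the timing_sketch logger name are hypothetical and not part of netrias_client.

```python
import logging
import time
from contextlib import contextmanager

# Hypothetical logger name for this sketch; the client uses its own dedicated logger.
logger = logging.getLogger("timing_sketch")


@contextmanager
def log_duration(label: str):
    """Log elapsed wall-clock seconds for the wrapped block, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        logger.info("%s duration=%.2fs", label, elapsed)
```

Wrapping a post-processing step in `with log_duration("post-process"):` places its duration next to the client's entries, which helps show whether the API call or the local work dominates.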

Adapter Notes

Discovery results are normalized to manifest payloads automatically; unmatched columns are logged so you can expand coverage. Confidence thresholds come from configure(confidence_threshold=...) and default to 0.8.
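
The filtering idea can be sketched as follows. This is an illustration under stated assumptions, not the adapter's actual code: split_by_confidence is hypothetical, as are the input shape (column name mapped to a list of (target, confidence) pairs) and the use of `>=` for the comparison.

```python
def split_by_confidence(suggestions, threshold=0.8):
    """Keep the best suggestion per column when it meets the threshold.

    `suggestions` maps a column name to a list of (target, confidence) pairs.
    Returns (accepted, unmatched): a column -> target dict and a list of
    columns with no suggestion at or above the threshold.
    Hypothetical helper; the real adapter's data model may differ.
    """
    accepted = {}
    unmatched = []
    for column, candidates in suggestions.items():
        best = max(candidates, key=lambda pair: pair[1], default=None)
        if best is not None and best[1] >= threshold:
            accepted[column] = best[0]
        else:
            unmatched.append(column)
    return accepted, unmatched
```

The unmatched list corresponds to the columns the client logs. Lowering the threshold trades broader coverage for a higher risk of incorrect mappings.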

Gateway Bypass (Temporary)

The module netrias_client._gateway_bypass exposes invoke_cde_recommendation_alias(...), a stopgap helper that calls the cde-recommendation Lambda alias directly. This avoids API Gateway’s short timeout window but requires AWS credentials with lambda:InvokeFunction permission and the boto3 dependency.

from netrias_client._gateway_bypass import invoke_cde_recommendation_alias

result = invoke_cde_recommendation_alias(
    target_schema="ccdi",
    columns={"study_name": ["foo", "bar"]},
    alias="prod",
    region_name="us-east-2",
)

Install boto3 (or the netrias_client[aws] extra) before importing the bypass module, and rotate IAM credentials frequently. Once API Gateway limits are raised, prefer the standard discovery flow again.
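
The columns argument in the call above maps each column name to a list of sample values. The following is a minimal sketch of building such a mapping from a CSV with the standard library. sample_columns is a hypothetical helper rather than the library's sampler; skipping blank cells and de-duplicating values are assumptions, and the default limit of 25 mirrors the figure quoted earlier.

```python
import csv
from pathlib import Path


def sample_columns(csv_path: Path, limit: int = 25) -> dict[str, list[str]]:
    """Collect up to `limit` distinct, non-empty values per column from a CSV file.

    Hypothetical helper for illustration; not the sampler used by netrias_client.
    """
    samples: dict[str, list[str]] = {}
    with csv_path.open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        for name in reader.fieldnames or []:
            samples[name] = []
        for row in reader:
            for name, values in samples.items():
                value = (row.get(name) or "").strip()
                if value and value not in values and len(values) < limit:
                    values.append(value)
            # Stop reading once every column has reached the limit.
            if all(len(values) >= limit for values in samples.values()):
                break
    return samples
```

The returned dictionary has the same shape as the columns={"study_name": ["foo", "bar"]} argument shown in the example call.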

Testing & Tooling

The repository ships with pytest-based integration tests plus lint/type tooling.

uv run pytest
uv run ruff check
uv run basedpyright
uv build                 # produce wheel + sdist

Live verification scripts are located under live_test/ and require a populated .env file containing NETRIAS_API_KEY (and optionally harmonization overrides while services converge).

Project Layout

src/netrias_client/
    __init__.py          # re-exported public surface
    _adapter.py          # discovery → manifest conversion
    _client.py           # NetriasClient facade and state management
    _config.py           # settings validation helpers
    _core.py             # harmonization workflow
    _discovery.py        # discovery wrappers and CSV sampling
    _errors.py           # exception taxonomy
    _http.py             # HTTP primitives (submit/poll/download)
    _io.py               # streaming helpers
    _logging.py          # standardized logger setup
    _models.py           # dataclasses for structured responses
    _validators.py       # filesystem and payload validation

Tests reside under src/netrias_client/tests/ and are excluded from the published wheel to keep installs slim; run them locally via uv run pytest.

Contributing

  1. uv sync --group dev (add --group aws if needed) to create the virtual environment.
  2. uv run pytest to ensure the suite passes prior to committing.
  3. Follow the repo conventions: keep functions focused, prefer typed interfaces, and favor logging key transitions over verbose chatter.

Pull requests should include updated documentation or fixtures when they alter API behavior or the manifest contract.
