Python client for Hugr Arrow IPC over multipart/mixed

hugr-client

Python client for the Hugr Data Mesh platform. Query data via GraphQL, get results as Arrow tables, pandas DataFrames, or interactive Perspective viewers.

Uses the Hugr IPC protocol (multipart/mixed with Arrow IPC) for efficient data transfer.

Installation

pip install hugr-client

For interactive map visualizations (KeplerGL):

pip install "hugr-client[viz]"

Quick Start

from hugr import HugrClient

client = HugrClient()  # reads connection from ~/.hugr/connections.json
result = client.query("{ core { data_sources { name } } }")

# Interactive Perspective viewer in JupyterLab
result

# pandas DataFrame
df = result.df("data.core.data_sources")

# pyarrow Table (zero-copy, no pandas overhead)
table = result.parts["data.core.data_sources"].to_arrow()
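
The dotted strings passed to result.df() and result.parts[...] mirror the nesting of the GraphQL response: data first, then each field name down to the selected collection. As a rough illustration of that naming convention only (the helper walk_path and the sample payload are hypothetical and are not part of hugr-client), the same path can be resolved against a plain nested dict:

```python
from typing import Any


def walk_path(payload: dict[str, Any], path: str) -> Any:
    """Follow a dotted path such as 'data.core.data_sources' through nested dicts."""
    node: Any = payload
    for key in path.split("."):
        node = node[key]  # raises KeyError if the path does not match the query shape
    return node


# Hypothetical JSON-shaped response for: { core { data_sources { name } } }
sample = {"data": {"core": {"data_sources": [{"name": "pg"}, {"name": "duckdb"}]}}}

rows = walk_path(sample, "data.core.data_sources")
print([row["name"] for row in rows])
```

A path that does not match the shape of the query fails with a KeyError in this sketch; how hugr-client itself reports a missing part is not covered here.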

Connection

From connections.json (recommended)

When using JupyterLab with hugr-kernel, connections are managed via the connection manager UI. hugr-client reads the same configuration:

# Default connection
client = HugrClient()

# Named connection
client = HugrClient.from_connection("production")

From environment variables

# Uses HUGR_URL, HUGR_API_KEY, HUGR_TOKEN env vars
client = HugrClient()

Variable             Description
HUGR_URL             Hugr server URL (e.g., http://localhost:15000/ipc)
HUGR_API_KEY         API key for authentication
HUGR_TOKEN           Bearer token for authentication
HUGR_API_KEY_HEADER  Custom API key header name (default: X-Hugr-Api-Key)
HUGR_ROLE_HEADER     Custom role header name (default: X-Hugr-Role)
HUGR_CONFIG_PATH     Custom path to connections.json
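
To make the header-related variables concrete, the sketch below assembles HTTP request headers from those settings. It is an assumption about how the values map onto headers, not hugr-client's actual code: build_headers is a hypothetical helper, the default header names are the ones listed for HUGR_API_KEY_HEADER and HUGR_ROLE_HEADER, and the "Authorization: Bearer" form is simply the usual convention for bearer tokens.

```python
from typing import Optional


def build_headers(
    api_key: Optional[str] = None,
    token: Optional[str] = None,
    role: Optional[str] = None,
    api_key_header: str = "X-Hugr-Api-Key",  # documented default for HUGR_API_KEY_HEADER
    role_header: str = "X-Hugr-Role",        # documented default for HUGR_ROLE_HEADER
) -> dict[str, str]:
    """Hypothetical helper: turn credentials into HTTP headers."""
    headers: dict[str, str] = {}
    if api_key:
        headers[api_key_header] = api_key
    if token:
        # Assumed: the conventional bearer-token header form.
        headers["Authorization"] = f"Bearer {token}"
    if role:
        headers[role_header] = role
    return headers


print(build_headers(api_key="sk-demo", role="analyst"))
print(build_headers(api_key="sk-demo", api_key_header="X-Custom-Key"))
```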

Explicit parameters

client = HugrClient(
    url="http://localhost:15000/ipc",
    api_key="sk-...",
    api_key_header="X-Custom-Key",  # optional custom header
    role="analyst",
)

Priority: explicit parameters > environment variables > connections.json
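
The priority rule can be read as a three-step fallback per setting. The sketch below shows that order for a single value; resolve_setting, the "url" key, and the sample hosts are hypothetical, and the real layout of connections.json is not shown here.

```python
from typing import Mapping, Optional


def resolve_setting(
    explicit: Optional[str],
    env: Mapping[str, str],
    env_name: str,
    file_config: Mapping[str, str],
    file_key: str,
) -> Optional[str]:
    """Return the first value found: explicit argument, then env var, then file entry."""
    if explicit is not None:
        return explicit
    if env_name in env:
        return env[env_name]
    return file_config.get(file_key)


env = {"HUGR_URL": "http://env-host:15000/ipc"}
file_config = {"url": "http://file-host:15000/ipc"}  # hypothetical connections.json entry

print(resolve_setting("http://localhost:15000/ipc", env, "HUGR_URL", file_config, "url"))
print(resolve_setting(None, env, "HUGR_URL", file_config, "url"))
print(resolve_setting(None, {}, "HUGR_URL", file_config, "url"))
```

Passing os.environ as env applies the same lookup to the real process environment.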

Working with Results

Multipart responses

A single query can return several data parts, each addressed by its dotted path in the response (for example, data.devices):

result = client.query("""
{
    devices { id name geom }
    drivers { id name }
}
""")

# Access individual parts
result.parts["data.devices"].df()
result.parts["data.drivers"].to_arrow()

# Display all parts (Perspective viewer in JupyterLab)
result
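
For readers unfamiliar with multipart/mixed framing, the sketch below splits such a body into named parts using only the standard library. It illustrates the general framing rather than the Hugr IPC protocol itself: the X-Part-Path header is a hypothetical name chosen for this example, and the short ASCII payloads stand in for real Arrow IPC bytes.

```python
from email import message_from_bytes


def split_multipart(content_type: str, body: bytes) -> dict[str, bytes]:
    """Split a multipart/mixed HTTP body into {part name: payload bytes}."""
    # The email parser needs the Content-Type header (with its boundary) before the body.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    message = message_from_bytes(raw)
    parts: dict[str, bytes] = {}
    for part in message.get_payload():
        name = part.get("X-Part-Path", "")  # hypothetical header, not from the Hugr spec
        parts[name] = part.get_payload(decode=True)
    return parts


# Two placeholder parts separated by the boundary "sep".
body = (
    b"--sep\r\n"
    b"Content-Type: application/octet-stream\r\n"
    b"X-Part-Path: data.devices\r\n"
    b"\r\n"
    b"ARROW-IPC-BYTES-1\r\n"
    b"--sep\r\n"
    b"Content-Type: application/octet-stream\r\n"
    b"X-Part-Path: data.drivers\r\n"
    b"\r\n"
    b"ARROW-IPC-BYTES-2\r\n"
    b"--sep--\r\n"
)

parts = split_multipart('multipart/mixed; boundary="sep"', body)
print(sorted(parts))
```

Each payload in a real response would then be handed to an Arrow IPC reader; this sketch stops at the framing step.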

Data access methods

part = result.parts["data.devices"]

# pandas DataFrame
df = part.df()

# pyarrow Table (zero-copy)
table = part.to_arrow()

# GeoDataFrame (with geometry decoding)
gdf = part.to_geo_dataframe("geom")

# or via shortcut
gdf = result.gdf("data.devices", "geom")

# JSON record (for object parts)
record = result.record("data.drivers_by_pk")

Geometry support

Geometry fields are automatically detected from server metadata. Supported formats: WKB, GeoJSON, H3Cell.

# GeoDataFrame with CRS
gdf = result.gdf("data.devices", "geom")
print(gdf.crs)  # EPSG:4326

# Nested geometry (auto-flattens to target field)
gdf = result.gdf("data.drivers", "devices.geom")

# GeoJSON export
layers = result.geojson_layers()
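
As background on the WKB format mentioned above, the sketch below decodes the simplest case, a 2D point, by hand. It is not how hugr-client decodes geometry (that is not described here), and decode_wkb_point is a hypothetical helper; a general-purpose geometry library such as shapely, listed among the dependencies, covers every geometry type.

```python
import struct


def decode_wkb_point(wkb: bytes) -> tuple[float, float]:
    """Decode a 2D WKB Point into (x, y)."""
    # Byte 0 is the byte-order flag: 1 = little-endian, 0 = big-endian.
    byte_order = "<" if wkb[0] == 1 else ">"
    # Bytes 1-4 hold the geometry type as an unsigned 32-bit integer (1 = Point).
    (geom_type,) = struct.unpack(byte_order + "I", wkb[1:5])
    if geom_type != 1:
        raise ValueError(f"expected WKB Point, got geometry type {geom_type}")
    # Bytes 5-20 hold the x and y coordinates as two 64-bit floats.
    x, y = struct.unpack(byte_order + "dd", wkb[5:21])
    return x, y


# Build a little-endian WKB point for (13.4, 52.5) and decode it again.
wkb = struct.pack("<BIdd", 1, 1, 13.4, 52.5)
print(decode_wkb_point(wkb))
```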

Interactive visualization

With hugr-client[viz]:

result.explore_map()  # KeplerGL interactive map

In JupyterLab with hugr-perspective-viewer:

result  # renders as Perspective viewer with table/map/charts

Streaming API

For large datasets, use WebSocket streaming to process data in batches:

import asyncio
from hugr import connect_stream

async def main():
    client = connect_stream()

    # Stream Arrow batches
    async with await client.stream("{ devices { id name geom } }") as stream:
        async for batch in stream.chunks():
            print(f"Batch: {batch.num_rows} rows")

    # Collect into DataFrame
    async with await client.stream("{ devices { id name } }") as stream:
        df = await stream.to_pandas()

    # Row-by-row processing
    async with await client.stream("{ devices { id status } }") as stream:
        async for row in stream.rows():
            if row["status"] == "active":
                print(row["id"])

asyncio.run(main())

Stream methods

Method              Description
stream.chunks()     Async generator of Arrow RecordBatch
stream.rows()       Async generator of dict rows
stream.to_pandas()  Collect all batches into DataFrame
stream.count()      Count total rows
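
The relationship between batch-level and row-level iteration can be sketched without a server. In the example below, fake_chunks is a hypothetical stand-in for stream.chunks(), and plain lists of dicts replace Arrow RecordBatch objects; rows and count only mimic the documented behaviour and are not hugr-client's implementation.

```python
import asyncio
from typing import Any, AsyncIterator

Batch = list[dict[str, Any]]  # stand-in for an Arrow RecordBatch


async def fake_chunks() -> AsyncIterator[Batch]:
    """Hypothetical source that yields two small batches."""
    yield [{"id": 1, "status": "active"}, {"id": 2, "status": "idle"}]
    yield [{"id": 3, "status": "active"}]


async def rows(chunks: AsyncIterator[Batch]) -> AsyncIterator[dict[str, Any]]:
    """Flatten batches into individual rows, one dict per row."""
    async for batch in chunks:
        for row in batch:
            yield row


async def count(chunks: AsyncIterator[Batch]) -> int:
    """Sum batch sizes without keeping any batch in memory."""
    total = 0
    async for batch in chunks:
        total += len(batch)
    return total


async def main() -> tuple[list[int], int]:
    active = [row["id"] async for row in rows(fake_chunks()) if row["status"] == "active"]
    total = await count(fake_chunks())
    return active, total


print(asyncio.run(main()))
```

Only one batch is held at a time in either helper, which is the practical reason to prefer streaming over collecting everything into a DataFrame for large results.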

Cancel long queries

async with await client.stream("{ large_dataset { ... } }") as stream:
    count = 0
    async for batch in stream.chunks():
        count += batch.num_rows
        if count > 10000:
            await client.cancel_current_query()
            break

ETL / Headless Usage

hugr-client also works outside Jupyter, with no spool files and no display overhead:

from hugr import HugrClient

client = HugrClient()
result = client.query("{ data_source { id value } }")

# Pure data access — no side effects
table = result.to_arrow("data.data_source")  # pyarrow.Table
df = result.df("data.data_source")            # pandas.DataFrame

Dependencies

Required: requests, requests-toolbelt, pyarrow, pandas, numpy, geopandas, shapely, websockets

Optional ([viz]): keplergl, pydeck, folium, matplotlib, mapclassify

License

MIT License. See LICENSE.
