Python client for Hugr Arrow IPC over multipart/mixed

hugr-client

Python client for the Hugr Data Mesh platform. Query data via GraphQL, get results as Arrow tables, pandas DataFrames, or interactive Perspective viewers.

Uses the Hugr IPC protocol (multipart/mixed with Arrow IPC) for efficient data transfer.
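
To make the multipart/mixed framing concrete, here is a minimal sketch that splits a multipart body into named parts using only the standard library. The boundary string, the `X-Part-Path` header, and the JSON payloads are all invented for illustration; the real protocol carries Arrow IPC bytes, and its actual part headers are not documented here.

```python
import json
from email.parser import BytesParser
from email.policy import default

# Hypothetical response body. The boundary, the X-Part-Path header, and the
# JSON payloads are assumptions; real parts would contain Arrow IPC bytes.
RAW_RESPONSE = (
    b'Content-Type: multipart/mixed; boundary="part-boundary"\r\n'
    b"\r\n"
    b"--part-boundary\r\n"
    b"Content-Type: application/json\r\n"
    b"X-Part-Path: data.devices\r\n"
    b"\r\n"
    b'{"rows": 2}\r\n'
    b"--part-boundary\r\n"
    b"Content-Type: application/json\r\n"
    b"X-Part-Path: data.drivers\r\n"
    b"\r\n"
    b'{"rows": 5}\r\n'
    b"--part-boundary--\r\n"
)


def split_parts(raw: bytes) -> dict[str, bytes]:
    """Map each part's (hypothetical) path header to its raw payload bytes."""
    message = BytesParser(policy=default).parsebytes(raw)
    return {
        str(part["X-Part-Path"]): part.get_payload(decode=True)
        for part in message.iter_parts()
    }


parts = split_parts(RAW_RESPONSE)
for path, payload in parts.items():
    print(path, json.loads(payload))
```

In practice `HugrClient` handles this framing for you; the sketch only shows why a single response can expose several independently addressable parts.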

Installation

pip install hugr-client

For interactive map visualizations (KeplerGL):

pip install "hugr-client[viz]"  # quotes keep shells such as zsh from expanding the brackets

Quick Start

from hugr import HugrClient

client = HugrClient()  # reads connection from ~/.hugr/connections.json
result = client.query("{ core { data_sources { name } } }")

# Interactive Perspective viewer in JupyterLab
result

# pandas DataFrame
df = result.df("data.core.data_sources")

# pyarrow Table (zero-copy, no pandas overhead)
table = result.parts["data.core.data_sources"].to_arrow()

Connection

From connections.json (recommended)

When using JupyterLab with hugr-kernel, connections are managed via the connection manager UI. hugr-client reads the same configuration:

# Default connection
client = HugrClient()

# Named connection
client = HugrClient.from_connection("production")

From environment variables

# Uses HUGR_URL, HUGR_API_KEY, HUGR_TOKEN env vars
client = HugrClient()

Variable              Description
HUGR_URL              Hugr server URL (e.g., http://localhost:15000/ipc)
HUGR_API_KEY          API key for authentication
HUGR_TOKEN            Bearer token for authentication
HUGR_API_KEY_HEADER   Custom API key header name (default: X-Hugr-Api-Key)
HUGR_ROLE_HEADER      Custom role header name (default: X-Hugr-Role)
HUGR_CONFIG_PATH      Custom path to connections.json

Explicit parameters

client = HugrClient(
    url="http://localhost:15000/ipc",
    api_key="sk-...",
    api_key_header="X-Custom-Key",  # optional custom header
    role="analyst",
)

Priority: explicit parameters > environment variables > connections.json
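
The priority order can be pictured as a simple first-match lookup. The following is a minimal sketch, not the library's implementation: `resolve_setting` is a hypothetical helper, and the flat `{"url": ...}` layout used for the connections.json entry is an assumption made for the example.

```python
def resolve_setting(name: str, explicit: dict, env: dict, file_config: dict):
    """Return the first value found: explicit > environment > connections.json."""
    candidates = (
        explicit.get(name),                 # keyword argument passed by the caller
        env.get(f"HUGR_{name.upper()}"),    # e.g. HUGR_URL, HUGR_API_KEY
        file_config.get(name),              # assumed flat entry from connections.json
    )
    for value in candidates:
        if value is not None:
            return value
    return None


env = {"HUGR_URL": "http://env-host:15000/ipc"}
file_config = {"url": "http://file-host:15000/ipc", "api_key": "sk-from-file"}

url_explicit = resolve_setting("url", {"url": "http://localhost:15000/ipc"}, env, file_config)
url_from_env = resolve_setting("url", {}, env, file_config)
key_from_file = resolve_setting("api_key", {}, env, file_config)
print(url_explicit, url_from_env, key_from_file)
```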

Working with Results

Multipart responses

A query that selects several fields returns a multipart response with one data part per field, keyed by its dotted path:

result = client.query("""
{
    devices { id name geom }
    drivers { id name }
}
""")

# Access individual parts
result.parts["data.devices"].df()
result.parts["data.drivers"].to_arrow()

# Display all parts (Perspective viewer in JupyterLab)
result
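
The part keys used above ("data.devices", "data.drivers") appear to mirror where each field sits in the GraphQL response. The sketch below illustrates that naming convention by walking a plain nested dictionary with a dotted path. `resolve_path` and the sample response are hypothetical and only demonstrate the convention, not how hugr-client stores parts.

```python
def resolve_path(response: dict, path: str):
    """Follow a dotted path such as 'data.core.data_sources' through nested dicts."""
    node = response
    for key in path.split("."):
        node = node[key]  # raises KeyError if the path does not exist
    return node


# Hypothetical JSON-shaped response for: { core { data_sources { name } } }
sample_response = {
    "data": {
        "core": {
            "data_sources": [{"name": "pg_main"}, {"name": "duck_files"}],
        }
    }
}

sources = resolve_path(sample_response, "data.core.data_sources")
print([source["name"] for source in sources])
```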

Data access methods

part = result.parts["data.devices"]

# pandas DataFrame
df = part.df()

# pyarrow Table (zero-copy)
table = part.to_arrow()

# GeoDataFrame (with geometry decoding)
gdf = part.to_geo_dataframe("geom")

# or via shortcut
gdf = result.gdf("data.devices", "geom")

# JSON record (for object parts)
record = result.record("data.drivers_by_pk")

Geometry support

Geometry fields are automatically detected from server metadata. Supported formats: WKB, GeoJSON, H3Cell.

# GeoDataFrame with CRS
gdf = result.gdf("data.devices", "geom")
print(gdf.crs)  # EPSG:4326

# Nested geometry (auto-flattens to target field)
gdf = result.gdf("data.drivers", "devices.geom")

# GeoJSON export
layers = result.geojson_layers()
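
For readers unfamiliar with WKB, one of the supported geometry formats, here is a minimal standard-library sketch that decodes a WKB Point. It follows the well-known WKB layout (one byte-order byte, a 32-bit geometry type, then two doubles) and is independent of hugr-client, which decodes geometry through its own dependencies. `decode_wkb_point` is a hypothetical helper name.

```python
import struct


def decode_wkb_point(wkb: bytes) -> tuple[float, float]:
    """Decode a 2D WKB Point into (x, y)."""
    # Byte 0: 1 means little-endian, 0 means big-endian.
    byte_order = "<" if wkb[0] == 1 else ">"
    # Bytes 1-4: geometry type code; 1 is Point.
    (geom_type,) = struct.unpack_from(f"{byte_order}I", wkb, 1)
    if geom_type != 1:
        raise ValueError(f"expected WKB Point (type 1), got type {geom_type}")
    # Bytes 5-20: x and y as IEEE 754 doubles.
    x, y = struct.unpack_from(f"{byte_order}2d", wkb, 5)
    return x, y


# Build a little-endian WKB Point for longitude 13.405, latitude 52.52.
sample_wkb = struct.pack("<BIdd", 1, 1, 13.405, 52.52)
point = decode_wkb_point(sample_wkb)
print(point)
```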

Interactive visualization

With hugr-client[viz]:

result.explore_map()  # KeplerGL interactive map

In JupyterLab with hugr-perspective-viewer:

result  # renders as Perspective viewer with table/map/charts

Streaming API

For large datasets, use WebSocket streaming to process data in batches:

import asyncio
from hugr import connect_stream

async def main():
    client = connect_stream()

    # Stream Arrow batches
    async with await client.stream("{ devices { id name geom } }") as stream:
        async for batch in stream.chunks():
            print(f"Batch: {batch.num_rows} rows")

    # Collect into DataFrame
    async with await client.stream("{ devices { id name } }") as stream:
        df = await stream.to_pandas()

    # Row-by-row processing
    async with await client.stream("{ devices { id status } }") as stream:
        async for row in stream.rows():
            if row["status"] == "active":
                print(row["id"])

asyncio.run(main())

Stream methods

Method               Description
stream.chunks()      Async generator of Arrow RecordBatch
stream.rows()        Async generator of dict rows
stream.to_pandas()   Collect all batches into DataFrame
stream.count()       Count total rows
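
To show how batch-level and row-level iteration relate, the sketch below uses a hypothetical `FakeStream` that stands in for a real stream. It is not the hugr-client implementation: batches are plain lists of dicts rather than Arrow RecordBatch objects, and no network is involved.

```python
import asyncio


class FakeStream:
    """Hypothetical stand-in for a stream; batches are lists of dict rows."""

    def __init__(self, batches):
        self._batches = batches

    async def chunks(self):
        for batch in self._batches:
            await asyncio.sleep(0)  # yield control, as a network read would
            yield batch

    async def rows(self):
        # Row iteration is just batch iteration flattened one level.
        async for batch in self.chunks():
            for row in batch:
                yield row

    async def count(self):
        total = 0
        async for batch in self.chunks():
            total += len(batch)
        return total


async def demo():
    batches = [
        [{"id": 1, "status": "active"}, {"id": 2, "status": "idle"}],
        [{"id": 3, "status": "active"}],
    ]
    active = [
        row["id"]
        async for row in FakeStream(batches).rows()
        if row["status"] == "active"
    ]
    total = await FakeStream(batches).count()
    return active, total


active_ids, total_rows = asyncio.run(demo())
print(active_ids, total_rows)
```

Iterating by batch keeps memory bounded to one batch at a time, while row iteration trades some speed for convenience.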

Cancel long queries

async with await client.stream("{ large_dataset { ... } }") as stream:
    count = 0
    async for batch in stream.chunks():
        count += batch.num_rows
        if count > 10000:
            await client.cancel_current_query()
            break
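
The same stop-early pattern can be wrapped in a small reusable helper. This is a sketch under stated assumptions: `consume_until` is a hypothetical function, the fake batches are plain lists (so it uses `len(batch)` where the real code uses `batch.num_rows`), and the cancel callback stands in for `client.cancel_current_query()`.

```python
import asyncio


async def consume_until(chunks, limit, cancel):
    """Consume batches until more than `limit` rows were seen, then cancel."""
    seen = 0
    async for batch in chunks:
        seen += len(batch)
        if seen > limit:
            await cancel()  # ask the server to stop producing data
            break
    return seen


async def demo():
    cancel_calls = []

    async def fake_chunks():
        # Hypothetical source: up to 100 batches of 4000 rows each.
        for _ in range(100):
            yield [0] * 4000

    async def fake_cancel():
        cancel_calls.append("cancelled")

    seen = await consume_until(fake_chunks(), 10000, fake_cancel)
    return seen, cancel_calls


rows_seen, cancel_calls = asyncio.run(demo())
print(rows_seen, cancel_calls)
```

Because the check runs once per batch, the helper can overshoot the limit by up to one batch before it cancels.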

ETL / Headless Usage

hugr-client works without Jupyter. No spool files, no display overhead:

from hugr import HugrClient

client = HugrClient()
result = client.query("{ data_source { id value } }")

# Pure data access — no side effects
table = result.to_arrow("data.data_source")  # pyarrow.Table
df = result.df("data.data_source")            # pandas.DataFrame

Dependencies

Required: requests, requests-toolbelt, pyarrow, pandas, numpy, geopandas, shapely, websockets

Optional ([viz]): keplergl, pydeck, folium, matplotlib, mapclassify

License

MIT License. See LICENSE.

