Python client for Hugr Arrow IPC over multipart/mixed

hugr-client

Python client for the Hugr Data Mesh platform. Query data via GraphQL and get the results as Arrow tables, pandas DataFrames, or interactive Perspective viewers.

Uses the Hugr IPC protocol (multipart/mixed with Arrow IPC) for efficient data transfer.
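To make the transport more concrete, here is a minimal sketch of how a multipart/mixed body can be split into parts using only the standard library. It is not hugr-client's implementation: the boundary value, the X-Part-Path header, the media types, and the payloads are all hypothetical, and a real client would hand each binary payload to an Arrow IPC reader instead of inspecting it directly.

```python
def split_multipart(body: bytes, boundary: str) -> list:
    """Split a multipart body into (headers, payload) pairs."""
    delimiter = b"\r\n--" + boundary.encode("ascii")
    # Prefix a CRLF so the first delimiter matches the same pattern as the rest.
    chunks = (b"\r\n" + body).split(delimiter)
    parts = []
    for chunk in chunks[1:]:  # chunks[0] is the (usually empty) preamble
        if chunk.startswith(b"--"):  # closing delimiter: "--boundary--"
            break
        head, _, payload = chunk.partition(b"\r\n\r\n")
        headers = {}
        for line in head.strip().split(b"\r\n"):
            if line:
                name, _, value = line.decode("latin-1").partition(":")
                headers[name.strip().lower()] = value.strip()
        parts.append((headers, payload))
    return parts


# Hypothetical response body: one JSON part and one binary part.
body = (
    b"--demo-boundary\r\n"
    b"Content-Type: application/json\r\n"
    b"X-Part-Path: extensions\r\n"
    b"\r\n"
    b'{"ok": true}'
    b"\r\n--demo-boundary\r\n"
    b"Content-Type: application/vnd.apache.arrow.stream\r\n"
    b"X-Part-Path: data.devices\r\n"
    b"\r\n"
    b"\x00\x01\r\n\x02"  # stand-in bytes, not a valid Arrow stream
    b"\r\n--demo-boundary--\r\n"
)

parts = split_multipart(body, "demo-boundary")
for headers, payload in parts:
    print(headers["x-part-path"], headers["content-type"], len(payload))
```

Splitting on the full CRLF-prefixed delimiter, instead of the boundary string alone, keeps line breaks inside a binary payload intact.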

Installation

pip install hugr-client

For interactive map visualizations (KeplerGL):

pip install "hugr-client[viz]"

Quick Start

from hugr import HugrClient

client = HugrClient()  # reads connection from ~/.hugr/connections.json
result = client.query("{ core { data_sources { name } } }")

# Interactive Perspective viewer in JupyterLab
result

# pandas DataFrame
df = result.df("data.core.data_sources")

# pyarrow Table (zero-copy, no pandas overhead)
table = result.parts["data.core.data_sources"].to_arrow()

Connection

From connections.json (recommended)

When using JupyterLab with hugr-kernel, connections are managed via the connection manager UI. hugr-client reads the same configuration:

# Default connection
client = HugrClient()

# Named connection
client = HugrClient.from_connection("production")

From environment variables

# Uses HUGR_URL, HUGR_API_KEY, HUGR_TOKEN env vars
client = HugrClient()

Variable             Description
HUGR_URL             Hugr server URL (e.g., http://localhost:15000/ipc)
HUGR_API_KEY         API key for authentication
HUGR_TOKEN           Bearer token for authentication
HUGR_API_KEY_HEADER  Custom API key header name (default: X-Hugr-Api-Key)
HUGR_ROLE_HEADER     Custom role header name (default: X-Hugr-Role)
HUGR_CONFIG_PATH     Custom path to connections.json
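
As an illustration of how these variables might map to HTTP request headers, here is a small sketch. The function build_auth_headers is hypothetical and not part of hugr-client. The default header names come from the table above. The use of the standard Authorization: Bearer scheme for HUGR_TOKEN, and the choice to send both credentials when both are set, are assumptions.

```python
from typing import Mapping, Optional


def build_auth_headers(env: Mapping[str, str], role: Optional[str] = None) -> dict:
    """Hypothetical helper: derive request headers from HUGR_* variables."""
    headers = {}
    api_key = env.get("HUGR_API_KEY")
    if api_key:
        # Header name is configurable; the default is taken from the table above.
        headers[env.get("HUGR_API_KEY_HEADER", "X-Hugr-Api-Key")] = api_key
    token = env.get("HUGR_TOKEN")
    if token:
        # Assumption: bearer tokens use the standard Authorization header.
        headers["Authorization"] = "Bearer " + token
    if role:
        headers[env.get("HUGR_ROLE_HEADER", "X-Hugr-Role")] = role
    return headers


example_env = {"HUGR_API_KEY": "sk-demo", "HUGR_API_KEY_HEADER": "X-Custom-Key"}
print(build_auth_headers(example_env, role="analyst"))
```

In a real script the env argument would typically be os.environ.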

Explicit parameters

client = HugrClient(
    url="http://localhost:15000/ipc",
    api_key="sk-...",
    api_key_header="X-Custom-Key",  # optional custom header
    role="analyst",
)

Priority: explicit parameters > environment variables > connections.json
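
The priority order can be sketched as a simple fallback chain. The helper resolve_setting below is hypothetical, and the flat config dictionary stands in for whatever structure connections.json actually uses, which this document does not specify.

```python
from typing import Mapping, Optional


def resolve_setting(
    name: str,
    explicit: Optional[str],
    env: Mapping[str, str],
    config: Mapping[str, str],
) -> Optional[str]:
    """Return the first defined value: explicit > environment > config file."""
    if explicit is not None:
        return explicit
    env_value = env.get("HUGR_" + name.upper())
    if env_value:
        return env_value
    return config.get(name)


env = {"HUGR_URL": "http://env-host:15000/ipc"}
config = {"url": "http://config-host:15000/ipc", "api_key": "from-config"}

print(resolve_setting("url", "http://explicit:15000/ipc", env, config))
print(resolve_setting("url", None, env, config))
print(resolve_setting("api_key", None, env, config))
```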

Working with Results

Multipart responses

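A query that selects several top-level fields returns a multipart response with one data part per field. Each part is keyed by its dotted path in the GraphQL response: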

result = client.query("""
{
    devices { id name geom }
    drivers { id name }
}
""")

# Access individual parts
result.parts["data.devices"].df()
result.parts["data.drivers"].to_arrow()

# Display all parts (Perspective viewer in JupyterLab)
result
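
Part keys such as data.devices read as dotted paths into the GraphQL response. The sketch below shows that naming convention on a plain nested dictionary. The function get_by_path and the sample payload are hypothetical and only illustrate the addressing scheme, not how hugr-client stores parts.

```python
def get_by_path(payload: dict, path: str):
    """Walk a nested dict using a dotted path such as 'data.devices'."""
    node = payload
    for key in path.split("."):
        node = node[key]  # raises KeyError if a segment is missing
    return node


# Hypothetical response shaped like the query above.
response = {
    "data": {
        "devices": [{"id": 1, "name": "sensor-a"}],
        "drivers": [{"id": 7, "name": "Ada"}],
    }
}

print(get_by_path(response, "data.devices"))
print(get_by_path(response, "data.drivers")[0]["name"])
```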

Data access methods

part = result.parts["data.devices"]

# pandas DataFrame
df = part.df()

# pyarrow Table (zero-copy)
table = part.to_arrow()

# GeoDataFrame (with geometry decoding)
gdf = part.to_geo_dataframe("geom")

# or via shortcut
gdf = result.gdf("data.devices", "geom")

# JSON record (for object parts)
record = result.record("data.drivers_by_pk")

Geometry support

Geometry fields are automatically detected from server metadata. Supported formats: WKB, GeoJSON, H3Cell.

# GeoDataFrame with CRS
gdf = result.gdf("data.devices", "geom")
print(gdf.crs)  # EPSG:4326

# Nested geometry (auto-flattens to target field)
gdf = result.gdf("data.drivers", "devices.geom")

# GeoJSON export
layers = result.geojson_layers()
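
For readers unfamiliar with WKB, the following sketch decodes a plain 2D point by hand with the standard library. It only illustrates the binary layout (byte-order flag, geometry type, coordinates). The function decode_wkb_point is hypothetical, it handles neither other geometry types nor SRID-extended variants, and hugr-client's own decoding most likely relies on shapely, which is listed among its dependencies.

```python
import struct


def decode_wkb_point(wkb: bytes) -> tuple:
    """Decode a 2D WKB point into an (x, y) tuple."""
    # Byte 0: 1 means little-endian, 0 means big-endian.
    endian = "<" if wkb[0] == 1 else ">"
    # Bytes 1-4: geometry type as an unsigned 32-bit integer (1 = Point).
    (geom_type,) = struct.unpack_from(endian + "I", wkb, 1)
    if geom_type != 1:
        raise ValueError("only plain 2D points are handled in this sketch")
    # Bytes 5-20: x and y as 64-bit floats.
    x, y = struct.unpack_from(endian + "dd", wkb, 5)
    return x, y


# Build a little-endian WKB point for the example.
wkb = struct.pack("<BIdd", 1, 1, 13.4, 52.5)
print(decode_wkb_point(wkb))  # (13.4, 52.5)
```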

Interactive visualization

With hugr-client[viz]:

result.explore_map()  # KeplerGL interactive map

In JupyterLab with hugr-perspective-viewer:

result  # renders as Perspective viewer with table/map/charts

Streaming API

For large datasets, use WebSocket streaming to process data in batches:

import asyncio
from hugr import connect_stream

async def main():
    client = connect_stream()

    # Stream Arrow batches
    async with await client.stream("{ devices { id name geom } }") as stream:
        async for batch in stream.chunks():
            print(f"Batch: {batch.num_rows} rows")

    # Collect into DataFrame
    async with await client.stream("{ devices { id name } }") as stream:
        df = await stream.to_pandas()

    # Row-by-row processing
    async with await client.stream("{ devices { id status } }") as stream:
        async for row in stream.rows():
            if row["status"] == "active":
                print(row["id"])

asyncio.run(main())
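
The relationship between batch iteration and row iteration can be shown without a server. In this sketch, fake_chunks is a hypothetical stand-in for a stream of batches (plain lists of dicts, not Arrow RecordBatch objects), and rows derives a row-level iterator from it. This is one plausible way such an API could be layered, not a description of hugr-client internals.

```python
import asyncio


async def fake_chunks():
    """Hypothetical stand-in for a stream that yields batches of rows."""
    batches = [
        [{"id": 1, "status": "active"}, {"id": 2, "status": "idle"}],
        [{"id": 3, "status": "active"}],
    ]
    for batch in batches:
        await asyncio.sleep(0)  # yield control, as a network read would
        yield batch


async def rows(chunks):
    """Flatten an async iterator of batches into an async iterator of rows."""
    async for batch in chunks:
        for row in batch:
            yield row


async def active_ids() -> list:
    return [row["id"] async for row in rows(fake_chunks()) if row["status"] == "active"]


print(asyncio.run(active_ids()))  # [1, 3]
```

Only one batch is held in memory at a time, which is the main reason to prefer batch or row iteration over collecting everything into a DataFrame.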

Stream methods

Method              Description
stream.chunks()     Async generator of Arrow RecordBatch
stream.rows()       Async generator of dict rows
stream.to_pandas()  Collect all batches into DataFrame
stream.count()      Count total rows

Cancel long queries

async with await client.stream("{ large_dataset { ... } }") as stream:
    count = 0
    async for batch in stream.chunks():
        count += batch.num_rows
        if count > 10000:
            await client.cancel_current_query()
            break
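
The same stop-early pattern can be tried without a server. In the sketch below, numbered_batches is a hypothetical async generator standing in for a stream, and its finally block marks the point where a real stream would release its connection. Closing the generator explicitly after the break ensures that cleanup runs promptly.

```python
import asyncio


async def numbered_batches(total: int, size: int, closed: list):
    """Hypothetical stream: yields lists of row numbers in batches."""
    try:
        for start in range(0, total, size):
            await asyncio.sleep(0)  # simulate waiting for the next batch
            yield list(range(start, min(start + size, total)))
    finally:
        closed.append(True)  # a real stream would release resources here


async def count_until(limit: int) -> tuple:
    closed = []
    stream = numbered_batches(total=100, size=30, closed=closed)
    count = 0
    try:
        async for batch in stream:
            count += len(batch)
            if count > limit:
                break  # stop consuming once enough rows have arrived
    finally:
        await stream.aclose()  # run the generator's cleanup immediately
    return count, bool(closed)


print(asyncio.run(count_until(50)))  # (60, True)
```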

ETL / Headless Usage

hugr-client also works outside Jupyter, for example in scripts and ETL jobs. In that mode it writes no spool files and adds no display overhead:

from hugr import HugrClient

client = HugrClient()
result = client.query("{ data_source { id value } }")

# Pure data access — no side effects
table = result.to_arrow("data.data_source")  # pyarrow.Table
df = result.df("data.data_source")            # pandas.DataFrame

Dependencies

Required: requests, requests-toolbelt, pyarrow, pandas, numpy, geopandas, shapely, websockets

Optional ([viz]): keplergl, pydeck, folium, matplotlib, mapclassify

License

MIT License. See LICENSE.
