GraphQL service for python dataframes and parquet datasets.
Project description
GraphQL service for ibis dataframes, arrow tables, and parquet datasets. The schema for a query API is derived automatically.
Version 2
When this project started, there was no out-of-core execution engine with performance comparable to PyArrow. So it effectively included one, based on datasets and Acero.
Since then the ecosystem has grown considerably: DuckDB, DataFusion, and Ibis. As of version 2, graphique is based on ibis, which provides a common dataframe API for multiple backends. This also lets graphique have a default but configurable backend.
Being a major version upgrade, there are incompatible changes from version 1. However, the overall API remains largely the same.
Usage
There is an example app which reads a parquet dataset.
env PARQUET_PATH=... uvicorn graphique.service:app
Open http://localhost:8000/ to try out the API in GraphiQL. There is a test fixture at ./tests/fixtures/zipcodes.parquet.
env PARQUET_PATH=... strawberry export-schema graphique.service:app.schema
outputs the graphql schema.
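Besides GraphiQL, the running service can be queried with any HTTP client. The following is a minimal sketch using only the Python standard library. It assumes the example app is served at http://localhost:8000/ and accepts standard GraphQL POST requests with a JSON body. The helper names build_request and run_query are hypothetical, and count is the row-count field listed under aggregation in the API section.

```python
import json
import urllib.request

URL = "http://localhost:8000/"  # assumed address of the example app


def build_request(query: str, url: str = URL) -> urllib.request.Request:
    """Wrap a GraphQL query string in a JSON POST request."""
    body = json.dumps({"query": query}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def run_query(query: str, url: str = URL) -> dict:
    """Send the query and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(build_request(query, url)) as response:
        return json.load(response)


# Building the request does not contact the server.
request = build_request("{ count }")
```

With the server running, run_query("{ count }") would return the decoded response, with the result under the standard "data" key.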
Configuration
The example app uses Starlette's config, which reads settings from environment variables or a .env file.
- PARQUET_PATH: path to the parquet directory or file
- FEDERATED = '': field name to extend type Query with a federated Table
- METRICS = False: include timings from apollo tracing extension
- COLUMNS = None: list of names, or mapping of aliases, of columns to select
- FILTERS = None: json filter query for which rows to read at startup
Configuration options exist to provide a convenient no-code solution, but are subject to change in the future. Using a custom app is recommended for production usage.
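The shape of these options can be sketched with the standard library alone. This is not how the example app reads them (it uses Starlette's config). The sketch only illustrates the documented names and defaults. It assumes COLUMNS and FILTERS are JSON-encoded and that METRICS is given as text. The helper load_settings, the path data/, and the column names state and county are hypothetical.

```python
import json
import os


def load_settings(environ=os.environ) -> dict:
    """Read the documented options, applying the documented defaults."""
    columns = environ.get("COLUMNS")
    filters = environ.get("FILTERS")
    return {
        "PARQUET_PATH": environ["PARQUET_PATH"],  # required, no default
        "FEDERATED": environ.get("FEDERATED", ""),
        # assumption: boolean supplied as text such as "true" or "1"
        "METRICS": environ.get("METRICS", "false").lower() in ("1", "true", "yes"),
        # assumption: JSON-encoded list of names or mapping of aliases
        "COLUMNS": json.loads(columns) if columns else None,
        # assumption: JSON-encoded filter query
        "FILTERS": json.loads(filters) if filters else None,
    }


# Hypothetical environment: only the path and a column selection are set.
settings = load_settings({"PARQUET_PATH": "data/", "COLUMNS": '["state", "county"]'})
```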
App
For more options create a custom ASGI app. Call graphique's GraphQL on an ibis Table or arrow Dataset.
Supply a mapping of names to datasets for multiple roots, and to enable federation.
import ibis
from graphique import GraphQL
source = ibis.read_*(...) # or ibis.connect(...).table(...) or pyarrow.dataset.dataset(...)
# apply initial projections or filters to `source`
app = GraphQL(source) # Table is root query type
app = GraphQL.federated({<name>: source, ...}, keys={<name>: [], ...}) # Tables on federated fields
Start like any ASGI app.
uvicorn <module>:app
API
types
- Dataset: interface for an ibis table or arrow dataset.
- Table: implements the Dataset interface. Adds typed row, columns, and filter fields from introspecting the schema.
- Column: interface for an ibis column. Each data type has a corresponding column implementation: Boolean, Int, BigInt, Float, Decimal, Date, Datetime, Time, Duration, Base64, String, Array, Struct. All columns have a values field for their list of scalars. Additional fields vary by type.
- Row: scalar fields. Tables are column-oriented, and graphique encourages that usage for performance. A single row field is provided for convenience, but a field for a list of rows is not. Requesting parallel columns is far more efficient.
selection
- slice: contiguous selection of rows
- filter: select rows by predicates
- join: join tables by key columns
- take: rows by index
- dropNull: remove rows with nulls
projection
- project: project columns with expressions
- columns: provides a field for every Column in the schema
- column: access a column of any type by name
- row: provides a field for each scalar of a single row
- cast: cast column types
- fillNull: fill null values
aggregation
- group: group by given columns, and aggregate the others
- distinct: group with all columns
- runs: provisionally group by adjacency
- unnest: unnest an array column
- count: number of rows
ordering
- order: sort table by given columns
- options limit and dense: select rows with smallest or largest values
Performance
Performance is dependent on the ibis backend, which defaults to duckdb. There are no internal Python loops. Scalars do not become Python types until serialized.
PyArrow is also used for partitioned dataset optimizations, and for any feature which ibis does not support. Table fields are lazily evaluated up until scalars are reached, and automatically cached as needed for multiple fields.
Installation
pip install graphique[server]
Dependencies
- ibis-framework (with duckdb or other backend)
- strawberry-graphql[asgi,cli]
- pyarrow
- isodate
- uvicorn (or other ASGI server)
Tests
100% branch coverage.
pytest [--cov]