GraphQL service for arrow tables and parquet data sets.
GraphQL service for arrow tables and parquet data sets. The schema is derived automatically.
Usage
% env PARQUET_PATH=... uvicorn graphique.service:app
Open http://localhost:8000/graphql to try out the API in GraphiQL. There is a test fixture at ./tests/fixtures/zipcodes.parquet.
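Besides GraphiQL, the endpoint can be queried from any HTTP client. The following is a minimal sketch using only the Python standard library. It assumes the service is running locally as shown above, and that `length` and `names` can be selected at the root of the query (an assumption based on the Table type described under Queries). `build_request` and `run_query` are hypothetical helper names, not part of graphique.

```python
import json
import urllib.request

GRAPHQL_URL = "http://localhost:8000/graphql"  # endpoint from the usage note

def build_request(query: str, url: str = GRAPHQL_URL) -> urllib.request.Request:
    """Build a JSON POST request for a GraphQL query (hypothetical helper)."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def run_query(query: str, url: str = GRAPHQL_URL) -> dict:
    """Send the query and decode the JSON response; needs a running service."""
    with urllib.request.urlopen(build_request(query, url)) as response:
        return json.load(response)

# Assumption: `length` and `names` are exposed at the root of the query.
QUERY = "{ length names }"
```

With the service running, `run_query(QUERY)` should return the decoded JSON response.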
Configuration
Graphique uses Starlette's config, which reads settings from environment variables or a .env file. Config variables are used as input to ParquetDataset.
- COLUMNS = None
- DEBUG = False
- DICTIONARIES = None
- INDEX = None
- MMAP = True
- PARQUET_PATH
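As an illustration, a .env file for the bundled test fixture could be generated as follows. This is a sketch under assumptions: the column name `zipcode` is hypothetical, and list-valued variables such as INDEX are assumed to be written as comma-separated strings, which this document does not confirm. `render_dotenv` is a hypothetical helper.

```python
# Hypothetical settings; the column name is an assumption for illustration.
settings = {
    "PARQUET_PATH": "./tests/fixtures/zipcodes.parquet",
    "INDEX": ["zipcode"],  # assumed to be a sorted column
    "DEBUG": True,
}

def render_dotenv(settings: dict) -> str:
    """Render settings as KEY=value lines.

    Lists are joined with commas (assumed format for list-valued variables).
    """
    lines = []
    for key, value in settings.items():
        if isinstance(value, (list, tuple)):
            value = ",".join(value)
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"

dotenv_text = render_dotenv(settings)
```

Writing `dotenv_text` to a file named .env in the working directory would then stand in for setting the same variables in the environment.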
Queries
A Table is the primary interface. It has fields for filtering, sorting, and grouping.
"""a column-oriented table"""
type Table {
    """number of rows"""
    length: Long!

    """column names"""
    names: [String!]!

    """fields for each column"""
    columns: Columns!

    """
    Return column of any type by name.
    This is typically only needed for aliased columns added by `apply` or `Groups.aggregate`.
    If the column is in the schema, `columns` can be used instead.
    """
    column(name: String!): Column!

    """Return scalar values at index."""
    row(index: Long! = 0): Row!

    """Return table slice."""
    slice(offset: Long! = 0, length: Long): Table!

    """
    Return tables grouped by columns, with stable ordering.
    `length` is the maximum number of tables to return.
    `count` filters and sorts tables based on the number of rows within each table.
    """
    group(by: [String!]!, reverse: Boolean! = false, length: Long, count: CountQuery): Groups!

    """
    Return table of first or last occurrences grouped by columns, with stable ordering.
    Optionally include counts in an aliased column.
    Faster than `group` when only scalars are needed.
    """
    unique(by: [String!]!, reverse: Boolean! = false, count: String! = ""): Table!

    """Return table slice sorted by specified columns."""
    sort(by: [String!]!, reverse: Boolean! = false, length: Long): Table!

    """Return table with minimum values per column."""
    min(by: [String!]!): Table!

    """Return table with maximum values per column."""
    max(by: [String!]!): Table!

    """
    Return table with rows which match all (by default) queries.
    `invert` optionally excludes matching rows.
    `reduce` is the binary operator to combine filters; within a column all predicates must match.
    """
    filter(query: Filters!, invert: Boolean! = false, reduce: Operator! = AND): Table!

    """
    Return view of table with functions applied across columns.
    If no alias is provided, the column is replaced and should be of the same type.
    If an alias is provided, a column is added and may be referenced in the `column` interface,
    and in the `by` arguments of grouping and sorting.
    """
    apply(...): Table!
}
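To make the `unique` semantics concrete, here is a pure-Python sketch of "first or last occurrences grouped by columns, with stable ordering", including the optional count column. It illustrates the described behavior on a list of dicts and is not graphique's implementation; the sample rows are made up, and the output order when `reverse` is set (order of appearance scanning from the end) is an assumption.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

Row = Dict[str, Hashable]

def unique(rows: Sequence[Row], by: Sequence[str],
           reverse: bool = False, count: str = "") -> List[Row]:
    """Keep the first (or last, if `reverse`) row for each distinct key in `by`."""
    ordered = list(reversed(rows)) if reverse else list(rows)
    firsts: Dict[Tuple, Row] = {}  # dicts keep insertion order: stable ordering
    counts: Dict[Tuple, int] = {}
    for row in ordered:
        key = tuple(row[name] for name in by)
        firsts.setdefault(key, row)  # only the first row seen per key is kept
        counts[key] = counts.get(key, 0) + 1
    result = []
    for key, row in firsts.items():
        row = dict(row)  # copy so the input rows are not mutated
        if count:
            row[count] = counts[key]  # counts go in the aliased column
        result.append(row)
    return result

# Made-up sample data for illustration.
rows = [
    {"state": "CA", "city": "Los Angeles"},
    {"state": "NY", "city": "New York"},
    {"state": "CA", "city": "San Diego"},
]
first_per_state = unique(rows, by=["state"])
last_per_state = unique(rows, by=["state"], reverse=True, count="n")
```

Here `first_per_state` keeps Los Angeles for CA, while `last_per_state` keeps San Diego and adds a count of 2 under the alias `n`.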
Performance
Graphique relies on native pyarrow routines wherever possible. Otherwise it falls back to using NumPy, with zero-copy views. Graphique also has custom optimizations for grouping, dictionary-encoded arrays, and chunked arrays.
Specifying an INDEX of columns indicates the table is sorted, and enables a binary search interface.
"""
Return table with matching values for compound `index`.
Queries must be a prefix of the `index`.
Only one non-equal query is allowed, and applied last.
"""
search(...): Table!
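The prefix rule follows from how binary search works on a compound sort key: equality on the leading columns narrows the table to a contiguous run, and a single range query can then be bisected within that run. The sketch below shows the idea on a sorted list of tuples standing in for the index columns. It is an illustration under assumptions, not graphique's implementation; `partition_point`, `search`, and the sample keys are hypothetical.

```python
from typing import Callable, Sequence, Tuple

def partition_point(lo: int, hi: int, pred: Callable[[int], bool]) -> int:
    """Return the first index in [lo, hi) where pred is False.

    Assumes pred is True for a (possibly empty) leading run, then False.
    """
    while lo < hi:
        mid = (lo + hi) // 2
        if pred(mid):
            lo = mid + 1
        else:
            hi = mid
    return lo

def search(keys: Sequence[Tuple], equals: Sequence = (),
           lower=None, upper=None) -> Tuple[int, int]:
    """Return the [start, stop) slice of sorted `keys` matching the queries.

    `equals` fixes a prefix of the index. `lower` (inclusive) and `upper`
    (exclusive) bound the next index column and are applied last.
    """
    prefix = tuple(equals)
    n = len(prefix)
    # Equal queries: rows whose first n values match form one contiguous run.
    start = partition_point(0, len(keys), lambda i: keys[i][:n] < prefix)
    stop = partition_point(start, len(keys), lambda i: keys[i][:n] <= prefix)
    # Non-equal query: column n is sorted within the run, so bisect again.
    if lower is not None:
        start = partition_point(start, stop, lambda i: keys[i][n] < lower)
    if upper is not None:
        stop = partition_point(start, stop, lambda i: keys[i][n] < upper)
    return start, stop

# Made-up keys, sorted by a hypothetical index of (state, county).
keys = [
    ("CA", "Alameda"), ("CA", "Marin"), ("CA", "Orange"),
    ("NY", "Kings"), ("NY", "Queens"),
]
ca_rows = search(keys, equals=["CA"])
ca_from_m = search(keys, equals=["CA"], lower="M")
```

A range query on `county` alone could not be answered this way, because counties are only sorted within each state; that is why queries must form a prefix of the index.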
Installation
% pip install graphique
Dependencies
- pyarrow >=3
- strawberry-graphql >=0.42
- uvicorn (or other ASGI server)
- pytz (optional timestamp support)
Tests
100% branch coverage.
% pytest [--cov]
Changes
0.3
- Pyarrow >=3 required
- any and all fields
- String column split field
0.2
- ListColumn and StructColumn types
- Groups type with aggregate field
- group and unique optimized
- pyarrow >= 2 required
- Statistical fields: mode, stddev, variance
- is_in, min, and max optimized