GraphQL service for arrow tables and parquet files.

GraphQL service for arrow tables and parquet files. The schema is derived automatically.

Usage

% env PARQUET_PATH=... uvicorn graphique.service:app [--reload]

Open http://localhost:8000/graphql to try out the API in GraphiQL. There is a test fixture at ./tests/fixtures/zipcodes.parquet.
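Besides GraphiQL, the endpoint can be queried from any HTTP client. The following is a minimal sketch using only the Python standard library. It assumes the service is running at the address above and that the table's `length` field is reachable from the root query; `build_request` and `execute` are illustrative helper names, not part of Graphique.

```python
import json
import urllib.request

# Assumed endpoint: the default uvicorn address mentioned above.
URL = "http://localhost:8000/graphql"


def build_request(query: str, url: str = URL) -> urllib.request.Request:
    """Build a POST request carrying a GraphQL query as a JSON body."""
    body = json.dumps({"query": query}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def execute(query: str, url: str = URL) -> dict:
    """Send the query and decode the JSON response (needs a running service)."""
    with urllib.request.urlopen(build_request(query, url)) as response:
        return json.load(response)


# Example call, assuming the root query exposes the table's `length` field:
# execute("{ length }")
```

Separating `build_request` from `execute` keeps the payload inspectable without a live server.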

Configuration

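Graphique uses Starlette's config, which reads settings from environment variables or a .env file. The config variables are passed as input to ParquetDataset. Defaults are shown below; PARQUET_PATH has no default and must be set.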

  • COLUMNS = None
  • DEBUG = False
  • DICTIONARIES = None
  • INDEX = None
  • MMAP = True
  • PARQUET_PATH
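To make the defaults above concrete, here is a rough standard-library stand-in for how such settings could be read from the environment. It is not Graphique's code, which delegates to Starlette's config. The parsing rules (comma-separated column names, the accepted boolean spellings) and the `read_settings` name are assumptions for illustration only.

```python
import os
from typing import Mapping, Optional


def read_settings(environ: Mapping[str, str] = os.environ) -> dict:
    """Read the documented variables, applying the documented defaults."""

    def names(key: str) -> Optional[list]:
        # Assumption: column lists are given as comma-separated names.
        value = environ.get(key)
        if value is None:
            return None
        return [name.strip() for name in value.split(",") if name.strip()]

    def flag(key: str, default: bool) -> bool:
        # Assumption: these spellings count as true; anything else is false.
        value = environ.get(key)
        if value is None:
            return default
        return value.strip().lower() in ("1", "true", "yes")

    return {
        "COLUMNS": names("COLUMNS"),
        "DEBUG": flag("DEBUG", False),
        "DICTIONARIES": names("DICTIONARIES"),
        "INDEX": names("INDEX"),
        "MMAP": flag("MMAP", True),
        "PARQUET_PATH": environ["PARQUET_PATH"],  # required, no default
    }
```

Passing a plain dict instead of `os.environ` makes the behavior easy to check, for example `read_settings({"PARQUET_PATH": "data.parquet", "INDEX": "state, county"})`, where the column names are hypothetical.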

Queries

A Table is the primary interface. It has fields for filtering, sorting, and grouping.

"""a column-oriented table"""
type Table {
  """number of rows"""
  length: Long!

  """fields for each column"""
  columns: Columns!

  """Return scalar values at index."""
  row(index: Long! = 0): Row!

  """Return table slice."""
  slice(offset: Long! = 0, length: Long): Table!

  """
  Return tables grouped by columns, with stable ordering.
  `length` is the maximum number of tables to return.
  `count` filters and sorts tables based on the number of rows within each table.
  """
  group(by: [String!]!, reverse: Boolean! = false, length: Long, count: LongReduce): [Table!]!

  """
  Return table of first or last occurrences grouped by columns, with stable ordering.
  """
  unique(by: [String!]!, reverse: Boolean! = false): Table!

  """Return table slice sorted by specified columns."""
  sort(by: [String!]!, reverse: Boolean! = false, length: Long): Table!

  """Return table with minimum values per column."""
  min(by: [String!]!): Table!

  """Return table with maximum values per column."""
  max(by: [String!]!): Table!

  """
  Return table with rows which match all (by default) queries.
  `invert` optionally excludes matching rows.
  `reduce` is the binary operator to combine filters; within a column all predicates must match.
  """
  filter(query: Filters!, invert: Boolean! = false, reduce: Operator! = AND): Table!
}
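The `filter` semantics described in the schema can be sketched in plain Python over a list of row dictionaries. This is only an illustration of the documented behavior, not Graphique's implementation, which operates on arrow columns. The predicate names (`equal`, `less`, and so on) are assumptions, since the `Filters` input type is not shown here, and the sample data is hypothetical.

```python
import operator
from functools import reduce as fold

# Assumed predicate names; the real `Filters` input type may differ.
PREDICATES = {
    "equal": operator.eq,
    "not_equal": operator.ne,
    "less": operator.lt,
    "less_equal": operator.le,
    "greater": operator.gt,
    "greater_equal": operator.ge,
}
OPERATORS = {"AND": operator.and_, "OR": operator.or_}


def filter_rows(rows, query, invert=False, reduce="AND"):
    """Select rows matching `query`, a mapping of column -> {predicate: value}."""
    combine = OPERATORS[reduce]
    selected = []
    for row in rows:
        # Within a column, all predicates must match.
        per_column = [
            all(PREDICATES[name](row[column], value) for name, value in predicates.items())
            for column, predicates in query.items()
        ]
        # Across columns, results are combined with the `reduce` operator.
        matched = fold(combine, per_column) if per_column else True
        # `invert` keeps the rows that did not match.
        if matched != invert:
            selected.append(row)
    return selected


rows = [
    {"state": "CA", "zipcode": 90001},
    {"state": "CA", "zipcode": 94103},
    {"state": "NY", "zipcode": 10001},
]
query = {"state": {"equal": "CA"}, "zipcode": {"greater": 90000, "less": 91000}}
```

With this data, the default `AND` keeps only the first row, `reduce="OR"` keeps both `CA` rows, and `invert=True` returns the rows that the default call excludes.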

Performance

Graphique relies on native pyarrow routines wherever possible. Otherwise it falls back to using NumPy, with zero-copy views. Graphique also has custom optimizations for grouping, dictionary-encoded arrays, and chunked arrays.

Specifying an INDEX of columns declares that the table is already sorted by those columns, which enables a binary search interface.

  """
  Return table with matching values for compound `index`.
  Queries must be a prefix of the `index`.
  Only one non-equal query is allowed, and applied last.
  """
  search(...): Table!
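The prefix rule follows from how binary search works on a compound sort key: each equality query narrows the range to rows where the next index column is itself sorted, and a single range query can then be applied last. The sketch below illustrates this with the standard `bisect` module over sorted tuples. It is a conceptual model under that assumption, not Graphique's implementation; `ColumnView`, `search`, and the sample keys are hypothetical.

```python
from bisect import bisect_left, bisect_right


class ColumnView:
    """Lazy view of one position of each key, so bisect needs no copy."""

    def __init__(self, keys, position):
        self.keys, self.position = keys, position

    def __len__(self):
        return len(self.keys)

    def __getitem__(self, index):
        return self.keys[index][self.position]


def search(keys, equal=(), lower=None, upper=None):
    """Return (start, stop) bounds of the matching rows in sorted `keys`.

    `equal` holds values for the leading index columns, in order.
    `lower` and `upper` are inclusive bounds on the next column, applied last.
    """
    start, stop = 0, len(keys)
    for position, value in enumerate(equal):
        column = ColumnView(keys, position)
        # Both bounds are computed from the previous range before reassignment.
        start, stop = (
            bisect_left(column, value, start, stop),
            bisect_right(column, value, start, stop),
        )
    column = ColumnView(keys, len(equal))
    if lower is not None:
        start = bisect_left(column, lower, start, stop)
    if upper is not None:
        stop = bisect_right(column, upper, start, stop)
    return start, stop


# Hypothetical index of (state, county), already sorted.
keys = [
    ("CA", "Alameda"),
    ("CA", "Los Angeles"),
    ("CA", "Marin"),
    ("NY", "Kings"),
    ("NY", "Queens"),
]
```

For example, `search(keys, equal=("CA",), lower="B")` narrows to the `CA` rows first and then to counties from `"B"` onward. A range query on the second column without an equality on the first would not be valid, because the second column is only sorted within a fixed first value.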

Installation

% pip install graphique

Dependencies

  • pyarrow >=1
  • strawberry-graphql >=0.30
  • pytz (optional timestamp support)

Tests

100% branch coverage.

% pytest [--cov]
