Transform GraphQL queries into Pandas data-frames 🚀 🍊.

Pluck 🚀 🍊

Pluck is a GraphQL client that transforms queries into Pandas data-frames.

Installation

Install Pluck from PyPI:

pip install pluck-graphql

Introduction

The easiest way to get started is to use pluck.read_graphql to execute a query.

Let's read the first five SpaceX launches into a data-frame:

import pluck

SpaceX = "https://api.spacex.land/graphql"

query = """
{
  launches(limit: 5) {
    mission_name
    launch_date_local
    rocket {
      rocket_name
    }
  }
}
"""
frame, = pluck.read_graphql(query, url=SpaceX)
frame
launches.mission_name  launches.launch_date_local  launches.rocket.rocket_name
Thaicom 6              2014-01-06T14:06:00-04:00   Falcon 9
AsiaSat 6              2014-09-07T01:00:00-04:00   Falcon 9
OG-2 Mission 2         2015-12-22T21:29:00-04:00   Falcon 9
FalconSat              2006-03-25T10:30:00+12:00   Falcon 1
CRS-1                  2012-10-08T20:35:00-04:00   Falcon 9

Implicit Mode

The query above uses implicit mode, in which the entire response is normalized into a single data-frame and the names of nested fields are joined with a period.
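
The period-separated column names can be illustrated without Pluck at all. The following is a minimal sketch in plain Python, not Pluck's implementation: the flatten helper and the hand-written sample response are assumptions made purely for illustration.

```python
def flatten(record, prefix=""):
    """Flatten nested dicts into one dict with period-separated keys."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, extending the dotted prefix.
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Hand-written sample shaped like the response to the query above
# (an assumption for illustration; nothing is fetched from a server).
response = {
    "launches": [
        {
            "mission_name": "Thaicom 6",
            "launch_date_local": "2014-01-06T14:06:00-04:00",
            "rocket": {"rocket_name": "Falcon 9"},
        },
        {
            "mission_name": "AsiaSat 6",
            "launch_date_local": "2014-09-07T01:00:00-04:00",
            "rocket": {"rocket_name": "Falcon 9"},
        },
    ]
}

# One flat row per launch, each key prefixed with the root field name.
rows = [flatten(launch, prefix="launches") for launch in response["launches"]]
print(list(rows[0]))
```

Rows in this shape could be passed to pandas.DataFrame to build a frame with the same column names as the one shown above.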

The return value from read_graphql is an instance of PluckResponse. This object is iterable and yields the data-frames in the query. Because this query uses implicit mode, the iterator yields only a single data-frame. The trailing comma after frame is still required, because it unpacks that single data-frame from the iterator.
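
The role of the trailing comma is ordinary Python iterable unpacking. The sketch below uses a hypothetical StubResponse class as a stand-in for an iterable response; it is not PluckResponse, and it holds a placeholder string rather than a real data-frame.

```python
class StubResponse:
    """Hypothetical stand-in for an iterable response (not PluckResponse)."""

    def __init__(self, frames):
        self._frames = list(frames)

    def __iter__(self):
        return iter(self._frames)


response = StubResponse(["launches-frame"])

# With the trailing comma, Python unpacks the single item from the iterable.
frame, = response

# Without it, the name is bound to the response object itself.
not_a_frame = response

print(frame)
print(type(not_a_frame).__name__)
```

If the iterable held more than one item, the single-target unpacking would raise a ValueError, which is why queries with several frames are unpacked into several names.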

@frame directive

Implicit mode is only the starting point: Pluck also provides a custom @frame directive, which gives finer control over what ends up in each data-frame.

The @frame directive specifies portions of the GraphQL response that we want to transform into data-frames. The directive is removed before the query is sent to the GraphQL server.
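
To make the idea of removing the directive concrete, here is a deliberately naive sketch. How Pluck actually strips the directive is not described here and may well differ; the strip_frame_directive helper and its regular expression are assumptions for illustration only.

```python
import re


def strip_frame_directive(query):
    """Remove every '@frame' token, plus the whitespace before it.

    Naive text substitution: unlike a real GraphQL parser, it would also
    alter '@frame' inside string literals or comments.
    """
    return re.sub(r"\s*@frame\b", "", query)


query = """
{
  launches(limit: 5) @frame {
    mission_name
  }
}
"""

server_query = strip_frame_directive(query)
print(server_query)
```

The server then receives a standard query that contains no directive it does not know about.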

Using the same query, rather than use implicit mode, let's pluck the launches field from the response:

query = """
{
  launches(limit: 5) @frame {
    mission_name
    launch_date_local
    rocket {
      rocket_name
    }
  }
}
"""
launches, = pluck.read_graphql(query, url=SpaceX)
launches
mission_name    launch_date_local          rocket.rocket_name
Thaicom 6       2014-01-06T14:06:00-04:00  Falcon 9
AsiaSat 6       2014-09-07T01:00:00-04:00  Falcon 9
OG-2 Mission 2  2015-12-22T21:29:00-04:00  Falcon 9
FalconSat       2006-03-25T10:30:00+12:00  Falcon 1
CRS-1           2012-10-08T20:35:00-04:00  Falcon 9

The column names are no longer prefixed with launches because it is now the root of the data-frame.

Multiple @frame directives

We can also pluck multiple data-frames from a single GraphQL query.

Let's query the first five SpaceX rockets as well:

query = """
{
  launches(limit: 5) @frame {
    mission_name
    launch_date_local
    rocket {
      rocket_name
    }
  }
  rockets(limit: 5) @frame {
    name
    type
    company
    height {
      meters
    }
    mass {
      kg
    }
  }
}
"""
launches, rockets = pluck.read_graphql(query, url=SpaceX)

Now we have the original launches and a new rockets data-frame:

rockets
name          type    company  height.meters  mass.kg
Falcon 1      rocket  SpaceX   22.25          30146
Falcon 9      rocket  SpaceX   70             549054
Falcon Heavy  rocket  SpaceX   70             1420788
Starship      rocket  SpaceX   118            1335000

Lists

When a response includes a list, the data-frame is automatically expanded to include one row per item in the list. This is repeated for every subsequent list in the response.

For example, let's query the first five capsules and which missions they have been used for:

query = """
{
  capsules(limit: 5) @frame {
    id
    type
    status
    missions {
      name
    }
  }
}
"""
capsules, = pluck.read_graphql(query, url=SpaceX)
capsules
id    type        status     missions.name
C105  Dragon 1.1  unknown    CRS-3
C101  Dragon 1.0  retired    COTS 1
C109  Dragon 1.1  destroyed  CRS-7
C110  Dragon 1.1  active     CRS-8
C110  Dragon 1.1  active     CRS-14
C106  Dragon 1.1  active     CRS-4
C106  Dragon 1.1  active     CRS-11
C106  Dragon 1.1  active     CRS-19

Rather than five rows, we have eight; each row contains a capsule and one of its missions.
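
The row expansion can be sketched in plain Python. This is an illustration of the idea rather than Pluck's code: the expand helper is hypothetical, the sample data is hand-written, and the handling of empty lists is an assumption of this sketch.

```python
def expand(value, name=""):
    """Return flat rows for `value`, one row per item of every list found."""
    if isinstance(value, list):
        # Each list item contributes its own row(s).
        # Assumption: an empty list yields no rows in this sketch.
        return [row for item in value for row in expand(item, name)]
    if isinstance(value, dict):
        rows = [{}]
        for key, child in value.items():
            child_name = f"{name}.{key}" if name else key
            parts = expand(child, child_name)
            # Combine the rows built so far with every row of this field.
            rows = [{**row, **part} for row in rows for part in parts]
        return rows
    # Scalar leaf: a single row with a single column.
    return [{name: value}]


# Hand-written sample shaped like part of the capsules response above.
capsules = [
    {"id": "C105", "type": "Dragon 1.1", "status": "unknown",
     "missions": [{"name": "CRS-3"}]},
    {"id": "C110", "type": "Dragon 1.1", "status": "active",
     "missions": [{"name": "CRS-8"}, {"name": "CRS-14"}]},
]

rows = expand(capsules)
for row in rows:
    print(row)
```

Two capsules produce three rows here, because the capsule with two missions is repeated once per mission.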

Nested @frame directives

Frames can also be nested. If a nested @frame is within a list, the rows from every item in that list are combined into a single data-frame.

For example, we can pluck the first five cores and their missions:

query = """
{
  cores(limit: 5) @frame {
    id
    status
    missions @frame {
      name
      flight
    }
  }
}
"""
cores, missions = pluck.read_graphql(query, url=SpaceX)

Now we have the cores:

cores
id     status    missions.name                 missions.flight
B1015  lost      CRS-6                         22
B0006  lost      CRS-1                         9
B1034  lost      Inmarsat-5 F4                 40
B1016  lost      TürkmenÄlem 52°E / MonacoSAT  23
B1025  inactive  CRS-9                         32
B1025  inactive  Falcon Heavy Test Flight      55

And we also have the missions data-frame that has been combined from every item in cores:

missions
name                          flight
CRS-6                         22
CRS-1                         9
Inmarsat-5 F4                 40
TürkmenÄlem 52°E / MonacoSAT  23
CRS-9                         32
Falcon Heavy Test Flight      55
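
The relationship between the two frames can be sketched with plain lists of dicts. The sample data is hand-written and the construction below is only an illustration of the described behaviour, not Pluck's implementation.

```python
# Hand-written sample shaped like part of the cores response above.
cores = [
    {"id": "B1015", "status": "lost",
     "missions": [{"name": "CRS-6", "flight": 22}]},
    {"id": "B1025", "status": "inactive",
     "missions": [
         {"name": "CRS-9", "flight": 32},
         {"name": "Falcon Heavy Test Flight", "flight": 55},
     ]},
]

# Outer frame: one row per (core, mission) pair, with dotted column names.
core_rows = [
    {
        "id": core["id"],
        "status": core["status"],
        "missions.name": mission["name"],
        "missions.flight": mission["flight"],
    }
    for core in cores
    for mission in core["missions"]
]

# Nested frame: the missions of every core, concatenated in order.
mission_rows = [dict(mission) for core in cores for mission in core["missions"]]

print(len(core_rows), len(mission_rows))
```

Both lists have the same number of rows, but the nested one carries only the mission columns, without the missions prefix.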

Aliases

Column names can be modified using normal GraphQL aliases.

For example, let's tidy up the field names in the launches data-frame:

query = """
{
  launches(limit: 5) @frame {
    mission: mission_name
    launch_date: launch_date_local
    rocket {
      name: rocket_name
    }
  }
}
"""
launches, = pluck.read_graphql(query, url=SpaceX)
launches
mission         launch_date                rocket.name
Thaicom 6       2014-01-06T14:06:00-04:00  Falcon 9
AsiaSat 6       2014-09-07T01:00:00-04:00  Falcon 9
OG-2 Mission 2  2015-12-22T21:29:00-04:00  Falcon 9
FalconSat       2006-03-25T10:30:00+12:00  Falcon 1
CRS-1           2012-10-08T20:35:00-04:00  Falcon 9

Leaf fields

The @frame directive can also be used on leaf fields.

For example, we can extract only the name of the mission from past launches:

query = """
{
  launchesPast(limit: 5) {
    mission: mission_name @frame
  }
}
"""
launches, = pluck.read_graphql(query, url=SpaceX)
launches
mission
Starlink-15 (v1.0)
Sentinel-6 Michael Freilich
Crew-1
GPS III SV04 (Sacagawea)
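
As a rough picture of what a frame on a leaf field contains, the sketch below collects one aliased leaf value per list item into single-column rows. The sample response is hand-written and the code is an illustration only, not Pluck's implementation.

```python
# Hand-written sample shaped like part of the launchesPast response above;
# the key is "mission" because the query aliases mission_name.
response = {
    "launchesPast": [
        {"mission": "Starlink-15 (v1.0)"},
        {"mission": "Sentinel-6 Michael Freilich"},
        {"mission": "Crew-1"},
    ]
}

# One single-column row per launch, keyed by the alias of the leaf field.
rows = [{"mission": launch["mission"]} for launch in response["launchesPast"]]

for row in rows:
    print(row["mission"])
```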
