Data Access Platform client library
Reason this release was yanked: Version is deprecated.
Data Access Platform Client Library
Data Access Platform (DAP) acts as a single source of data for analytics at Instructure. It provides efficient access to data collected across various educational products in bulk with high fidelity and low latency, adhering to a canonical data model.
The outgoing interface for DAP is the Query API, which is an HTTP REST service. Users initiate asynchronous queries to retrieve data associated with their account. This client library is a Python wrapper around the DAP API.
Each DAP user acts as a data administrator for the organization they represent. They have full read access to the top-level account and all descendant sub-accounts. For example, in Canvas, the top of the organization hierarchy is uniquely identified by a root account ID, and each data record is associated with a root account ID. A DAP user with Canvas access can query data that are assigned the user's root account ID.
DAP API requires authentication. The client library takes care of authentication behind the scenes provided you have the appropriate API key, and passes the token to each API operation it invokes. Refer to the documentation of Instructure API Gateway Service to learn more about the authentication process.
Under the hood, API users must first obtain a JSON Web Token (JWT) from the authentication endpoint of Instructure API Gateway Service, and then pass the JWT to all subsequent calls to DAP API.
Major features
- List the names of tables available for querying
- Download the JSON schema of a selected table
- Fetch a full table snapshot
- Fetch incremental updates since a specific point in time
- Save data in several output formats: CSV, TSV, JSON, Parquet
- Download output to a local directory
Getting started
Accessing DAP API requires a URL to an endpoint, and an API key. Once obtained, they can be set as environment variables (recommended), or passed as command-line arguments:
Use environment variables for authentication
First, configure the environment with what you have in your setup instructions:
export DAP_API_URL=https://api-gateway.instructure.com
export DAP_API_KEY=aCBd3V...U1aaaa
With environment variables set, you can issue dap commands directly:
dap incremental --namespace canvas --table accounts --since 2022-07-13T09:30:00+02:00
Use command-line arguments for authentication
Unless you set environment variables, you need to pass the endpoint URL and API key to the dap command explicitly:
dap --base-url https://api-gateway.instructure.com --api-key aCBd3V...U1aaaa incremental --namespace canvas --table accounts --since 2022-07-13T09:30:00+02:00
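The two ways of supplying credentials can also be handled from a script. The following is a minimal sketch, not part of the client library: build_dap_argv is a hypothetical helper that assembles an argument list for the dap command. It adds --base-url and --api-key only when values are given explicitly, and otherwise relies on DAP_API_URL and DAP_API_KEY being set. The API key shown is a placeholder.

```python
import os
from typing import List, Mapping, Optional


def build_dap_argv(
    subcommand_args: List[str],
    base_url: Optional[str] = None,
    api_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Assemble an argument list for the dap command (hypothetical helper).

    Explicit values are passed as --base-url / --api-key. When a value is
    omitted, the matching environment variable must be set instead.
    """
    env = os.environ if env is None else env
    argv = ["dap"]
    for flag, value, variable in (
        ("--base-url", base_url, "DAP_API_URL"),
        ("--api-key", api_key, "DAP_API_KEY"),
    ):
        if value is not None:
            # Global options go before the subcommand, as in the example above.
            argv += [flag, value]
        elif not env.get(variable):
            raise ValueError(f"pass {flag} or set {variable}")
    return argv + subcommand_args


# Usage with a stand-in environment; the key is a placeholder, not a real one.
example_env = {
    "DAP_API_URL": "https://api-gateway.instructure.com",
    "DAP_API_KEY": "<your API key>",
}
print(build_dap_argv(
    ["incremental", "--namespace", "canvas", "--table", "accounts",
     "--since", "2022-07-13T09:30:00+02:00"],
    env=example_env,
))
```

The resulting list can be handed to subprocess.run(argv, check=True) on a machine where the dap command is installed.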
Command-line usage
Invoking the command-line utility with --help shows usage, required and optional arguments:
dap --help
dap incremental --help
dap snapshot --help
dap list --help
dap schema --help
Common use cases
Chain a snapshot query with an incremental query
When you start using DAP, you will typically want to begin by downloading a snapshot of each table you need. The snapshot query response body contains a field called at, which captures the point in time of the data lake state that the snapshot corresponds to. Copy this timestamp into the since field of an incremental query request. Chaining the two queries this way ensures that you do not miss any data.
Note that if a table has not received updates for a while (e.g. user profiles have not changed over the weekend), the value of at might be well behind the current time.
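The hand-off from snapshot to incremental query can be sketched in a few lines. This is an illustration under assumptions: incremental_request_from_snapshot is a hypothetical helper, the response is modelled as a plain dictionary holding only the at field described above, and any other request fields are left out. The timestamp is a made-up example.

```python
from datetime import datetime
from typing import Any, Dict


def incremental_request_from_snapshot(snapshot_response: Dict[str, Any]) -> Dict[str, Any]:
    """Carry the snapshot's `at` timestamp over as `since` (hypothetical helper)."""
    at = snapshot_response["at"]
    # Sanity check only: raises ValueError if the value is not an ISO 8601
    # timestamp. A trailing "Z" is rewritten for older Python versions.
    datetime.fromisoformat(at.replace("Z", "+00:00"))
    # The original string is passed on unchanged so no precision is lost.
    return {"since": at}


# Made-up snapshot response for illustration.
snapshot = {"at": "2022-07-13T09:30:00+02:00"}
print(incremental_request_from_snapshot(snapshot))
```

Passing the timestamp through verbatim, rather than re-formatting it, avoids accidentally shifting the boundary between the two queries.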
Chain an incremental query with another
To fetch the most recent changes since a previous incremental query, chain the next request to the previous response using since and until. The until of the previous response becomes the since of the next request. The until of the next request should typically be omitted; DAP API populates it automatically. This allows you to fetch the most recent changes for a table. If a table has not received updates for a while, the timestamps you see in the response may lag behind the current time.
For example, suppose you submit an incremental query job #82 and receive a response whose until is 2021-07-28T19:00. You can then pass 2021-07-28T19:00 as the value for since in your next incremental query job #83. Job #83 would then return 2021-07-28T19:00 as the value of since (the exact value you submitted), and might return 2021-07-28T21:00 as until (the latest point in time for which data is available).
If you choose to fill in until in a request (which is not necessary in most cases), its value must be in the time range DAP has data for. Otherwise, your request is rejected.
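The chaining rule lends itself to a simple loop. The sketch below is illustrative only: fetch_increments and fake_run_query are hypothetical names, and fake_run_query stands in for a real call to DAP API by pretending that data is always available up to two hours after since. It reuses the timestamps from the example above.

```python
from datetime import datetime, timedelta
from typing import Any, Callable, Dict, List

Query = Dict[str, Any]


def fetch_increments(
    run_query: Callable[[Query], Query], since: str, rounds: int
) -> List[Query]:
    """Run `rounds` incremental queries, chaining until -> since."""
    responses = []
    for _ in range(rounds):
        # `until` is omitted on purpose; the service fills it in.
        response = run_query({"since": since})
        responses.append(response)
        # The `until` of this response becomes the `since` of the next request.
        since = response["until"]
    return responses


def fake_run_query(request: Query) -> Query:
    """Stand-in for a real query: echoes `since`, sets `until` two hours later."""
    since = datetime.fromisoformat(request["since"])
    until = since + timedelta(hours=2)
    return {"since": request["since"], "until": until.isoformat(timespec="minutes")}


for response in fetch_increments(fake_run_query, "2021-07-28T19:00", rounds=2):
    print(response)
```

In a real setup, the last until value would be stored between runs so that the next scheduled job can pick up where the previous one stopped.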
Get the list of tables available for querying
The list command returns all table names in a given namespace.
Download the latest schema for a table
The schema endpoint returns the latest schema of a table as a JSON Schema document. The schema command enables you to download the schema of a specified table as a JSON file.
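Once a schema file has been downloaded, it can be inspected with the standard json module. The sketch below assumes a plain JSON Schema object whose fields are listed under a top-level properties key; the actual layout of a DAP table schema may be nested differently, and the sample document here is invented for illustration, not a real table schema.

```python
import json
from typing import Any, Dict, List


def property_names(schema: Dict[str, Any]) -> List[str]:
    """Return the sorted property names of a JSON Schema object."""
    # Assumes the fields sit under a top-level "properties" key.
    return sorted(schema.get("properties", {}))


# Invented sample document; a real schema would be read from the downloaded
# file with json.load(open(path)).
sample = json.loads(
    '{"type": "object",'
    ' "properties": {"id": {"type": "integer"}, "name": {"type": "string"}},'
    ' "required": ["id"]}'
)
print(property_names(sample))
```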
Source Distribution
Built Distribution
File details
Details for the file instructure-dap-client-0.2.3.tar.gz.
File metadata
- Download URL: instructure-dap-client-0.2.3.tar.gz
- Upload date:
- Size: 19.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.13
File hashes
Algorithm | Hash digest
---|---
SHA256 | c1c3eb4297c3b3102a5ecacedb8d18859146bc9de56e2a31f3ede9ff43d90055
MD5 | 8730be8484a57e21684976837ca258d2
BLAKE2b-256 | bc908e0373c0c00a395f8603fab7c04fab88f48afa0a893a097f46e5a70ddc8b
File details
Details for the file instructure_dap_client-0.2.3-py3-none-any.whl.
File metadata
- Download URL: instructure_dap_client-0.2.3-py3-none-any.whl
- Upload date:
- Size: 19.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.13
File hashes
Algorithm | Hash digest
---|---
SHA256 | b483a20663bccf34139381006492bb8d32fc74a73ea3a993d7078e9a12b94ae9
MD5 | a350ab951c9832d14cd0eeafc2176b88
BLAKE2b-256 | b932d78fc4aac0e1d9a71947490c11568e4e4720471d09fb059f3c99926418a5