
Dataset Exchange API Client Library


DX API Client Library

Welcome to the DX API Client Library! This library provides a Python interface to the DX API for managing installations and datasets and for uploading and retrieving dataset records.


Features

  • Authenticate with the DX API using JWT tokens.
  • Manage installations and datasets.
  • Upload and download data to and from datasets.
  • Synchronous and asynchronous support.
  • Context managers for handling authentication scopes.

Installation

You can install the library using pip:

pip install dx-api-client

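Note: dx-api-client is a placeholder. Replace it with the actual published package name. The distribution files listed under Download files are named mig_dx_api, so the published name may differ from dx-api-client.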

Prerequisites

  • Python 3.10 or higher.
  • An application ID (app_id) and a corresponding private key in PEM format.
  • DX API access credentials.

Getting Started

Authentication

The library uses JWT tokens for authentication. You need to provide your app_id and the path to your private key file when initializing the client.
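To make the token format concrete, here is a minimal sketch of how a JWT is assembled: a base64url-encoded header and payload joined by a dot, followed by a signature over both. This is not the library's implementation. The claim names (iss, iat, exp), the function name make_demo_jwt, and the use of HS256 with a shared secret are assumptions chosen so the example runs with the standard library alone; a client configured with a PEM private key would presumably use an asymmetric algorithm instead.

```python
import base64
import hashlib
import hmac
import json
import time


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT format requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_demo_jwt(app_id: str, secret: bytes, now: int, ttl: int = 600) -> str:
    """Build a token of the form header.payload.signature (HS256 demo only)."""
    header = {"alg": "HS256", "typ": "JWT"}
    # Hypothetical claims: issuer = app_id, plus issued-at and expiry times.
    payload = {"iss": app_id, "iat": now, "exp": now + ttl}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(
        secret, signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{b64url(signature)}"


token = make_demo_jwt("your_app_id", b"demo-secret", now=int(time.time()))
print(token)
```

The resulting string has three dot-separated parts. Only the holder of the signing key can produce a valid third part, which is what lets the server trust the app_id in the payload.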

Initialization

from dx_api_client import DX

# Initialize the client
dx = DX(app_id='your_app_id', private_key_path='path/to/private_key.pem')

Alternatively, you can set the environment variables DX_CONFIG_APP_ID and DX_CONFIG_PRIVATE_KEY_PATH:

export DX_CONFIG_APP_ID='your_app_id'
export DX_CONFIG_PRIVATE_KEY_PATH='path/to/private_key.pem'

And initialize the client without arguments:

dx = DX()
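The fallback from constructor arguments to environment variables can be pictured with the sketch below. It is an illustration only: the helper resolve_setting is hypothetical, and the precedence shown (an explicit argument wins over the environment) is an assumption rather than documented behavior of DX.

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    explicit: Optional[str],
    env_name: str,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return the explicit argument if given, else the environment value."""
    if explicit is not None:
        return explicit
    try:
        return env[env_name]
    except KeyError:
        raise ValueError(f"pass the value explicitly or set {env_name}") from None


# A stand-in environment so the example does not depend on the real one.
demo_env = {
    "DX_CONFIG_APP_ID": "your_app_id",
    "DX_CONFIG_PRIVATE_KEY_PATH": "path/to/private_key.pem",
}

app_id = resolve_setting(None, "DX_CONFIG_APP_ID", demo_env)
key_path = resolve_setting("other/key.pem", "DX_CONFIG_PRIVATE_KEY_PATH", demo_env)
print(app_id, key_path)
```

Failing early with a message that names the missing variable makes a misconfigured deployment easier to diagnose than a later authentication error would.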

Usage

Who Am I

Retrieve information about the authenticated user:

user_info = dx.whoami()
print(user_info)

Managing Installations

Listing Installations

installations = dx.get_installations()
for installation in installations:
    print(installation.name)

Accessing an Installation Context

Use the installation context to perform operations related to a specific installation:

# Find an installation by name or ID
installation = dx.installations.find(install_id=1)

# Use the installation context
with dx.installation(installation) as ctx:
    # Perform operations within the context
    datasets = list(ctx.datasets)
    for dataset in datasets:
        print(dataset.name)

Or enter the context with a lookup by ID:

with dx.installation(install_id=1) as ctx:
    # Perform operations within the context
    datasets = list(ctx.datasets)
    for dataset in datasets:
        print(dataset.name)

Managing Datasets

Listing Datasets

with dx.installation(installation) as ctx:
    for dataset in ctx.datasets:
        print(dataset.name)

Creating a Dataset

from dx_api_client import DatasetSchema, SchemaProperty

# Define the schema
schema = DatasetSchema(
    properties=[
        SchemaProperty(name='id', type='string', required=True),
        SchemaProperty(name='value', type='number', required=True),
    ],
    primary_key=['id']
)

# Create the dataset
with dx.installation(installation) as ctx:
    new_dataset = ctx.datasets.create(
        name='My Dataset',
        description='A test dataset',
        schema=schema.model_dump()  # this can also be defined as a dictionary
    )

Uploading Data to a Dataset

data = [
    {'id': '1', 'value': 10},
    {'id': '2', 'value': 20},
]

with dx.installation(installation) as ctx:
    dataset_ops = ctx.datasets.find(name='My Dataset')
    # validate_records=True validates the records against the schema using Pydantic
    dataset_ops.load(data, validate_records=True)
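To give a sense of what validating records against a schema involves, the sketch below checks required fields, basic types, and primary-key uniqueness in plain Python. It does not reproduce the library's Pydantic-based validation. The dictionary layout of the schema, the TYPE_CHECKS mapping, and the function find_record_errors are all assumptions made for illustration.

```python
from numbers import Number

# Assumed dictionary form of the schema from "Creating a Dataset".
schema = {
    "properties": [
        {"name": "id", "type": "string", "required": True},
        {"name": "value", "type": "number", "required": True},
    ],
    "primary_key": ["id"],
}

# Hypothetical mapping from schema type names to Python checks.
TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    # bool is a subclass of int, so exclude it explicitly.
    "number": lambda v: isinstance(v, Number) and not isinstance(v, bool),
}


def find_record_errors(records, schema):
    """Return a list of readable problems; an empty list means all pass."""
    errors = []
    seen_keys = set()
    for index, record in enumerate(records):
        for prop in schema["properties"]:
            name = prop["name"]
            if name not in record:
                if prop.get("required"):
                    errors.append(f"record {index}: missing required field '{name}'")
                continue
            if not TYPE_CHECKS[prop["type"]](record[name]):
                errors.append(f"record {index}: field '{name}' is not a {prop['type']}")
        # Track primary-key tuples to detect duplicates within the batch.
        key = tuple(record.get(k) for k in schema["primary_key"])
        if key in seen_keys:
            errors.append(f"record {index}: duplicate primary key {key}")
        seen_keys.add(key)
    return errors


good = [{"id": "1", "value": 10}, {"id": "2", "value": 20}]
bad = [{"id": "1", "value": "ten"}, {"id": "1"}]
print(find_record_errors(good, schema))
for problem in find_record_errors(bad, schema):
    print(problem)
```

Running a check like this before an upload reports every problem in the batch at once instead of stopping at the first rejected record.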

Retrieving Records from a Dataset

with dx.installation(installation) as ctx:
    dataset_ops = ctx.datasets.find(name='My Dataset')
    records = dataset_ops.records()
    for record in records:
        print(record)

Asynchronous Usage

The library supports asynchronous operations using async/await.

import asyncio

from dx_api_client import DX

async def main():
    dx = DX()
    # `installation` is assumed to have been looked up beforehand
    async with dx.installation(installation) as ctx:
        async for dataset in ctx.datasets:
            print(dataset.name)

        dataset = await ctx.datasets.find(name='My Dataset')

        data = [
            {'id': '1', 'value': 10},
            {'id': '2', 'value': 20},
        ]

        await dataset.load(data)

        async for record in dataset.records():
            print(record)

asyncio.run(main())

Examples

Example: Loading Data from a File

with dx.installation(installation) as ctx:
    dataset = ctx.datasets.get(id='00000000-0000-0000-0000-000000000000')
    dataset.load_from_file('data.csv')
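If the records start out in memory, a file such as data.csv can be produced with the standard library first. The sketch below assumes the file should carry a header row whose column names match the schema property names; that layout and the helper write_records_csv are assumptions, not a documented requirement of load_from_file.

```python
import csv
import os
import tempfile

records = [
    {"id": "1", "value": 10},
    {"id": "2", "value": 20},
]


def write_records_csv(path, rows, fieldnames):
    """Write rows to a CSV file with a header row, one record per line."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


# Write into a temporary directory so the example leaves the working tree alone.
csv_path = os.path.join(tempfile.mkdtemp(), "data.csv")
write_records_csv(csv_path, records, fieldnames=["id", "value"])

# Read the file back to confirm its contents.
with open(csv_path, newline="", encoding="utf-8") as handle:
    round_trip = list(csv.DictReader(handle))
print(round_trip)
```

CSV carries no type information, so every value reads back as a string. Any conversion to the schema's number type would have to happen on the receiving side.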

Example: Uploading Data from a URL

with dx.installation(installation) as ctx:
    dataset = ctx.datasets.find(name='My Dataset')
    dataset.load_from_url('https://example.com/data.csv')

Note: This README assumes that the package name is dx-api-client and that the code is properly packaged and available for installation via pip. Adjust the instructions accordingly based on the actual package name and installation method.


Download files


Source Distribution

mig_dx_api-0.1.4.tar.gz (54.4 kB)

Uploaded Source

Built Distribution

mig_dx_api-0.1.4-py3-none-any.whl (11.8 kB)

Uploaded Python 3

File details

Details for the file mig_dx_api-0.1.4.tar.gz.

File metadata

  • Download URL: mig_dx_api-0.1.4.tar.gz
  • Size: 54.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.5

File hashes

Hashes for mig_dx_api-0.1.4.tar.gz
Algorithm Hash digest
SHA256 cde131cc42c3bf5ceb5c4a743e8c2ca0a8d3c91333b5f384b81b7bbed2d7cfde
MD5 95d0f5b2e1fb9e0c5e359dd96e9613a4
BLAKE2b-256 ece27dd4606d2947e060afd050a2ba3d0ecc2c7f968f8594379ffbc4d743a6a0


File details

Details for the file mig_dx_api-0.1.4-py3-none-any.whl.

File metadata

  • Download URL: mig_dx_api-0.1.4-py3-none-any.whl
  • Size: 11.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.5

File hashes

Hashes for mig_dx_api-0.1.4-py3-none-any.whl
Algorithm Hash digest
SHA256 a692e6583b2ca97853ea6cc7b534e05adb082de747ed56aee9f53ace3526417b
MD5 6cdce9a12c68eee8ba4a320453e89aa8
BLAKE2b-256 b72c8d92084e3da8ecbe1e373facc6deb884078ff89989db51586792a3f45ec7
