
A Python API wrapping services of the Superb Data Kraken (SDK)


superb-data-klient

superb-data-klient is a lightweight library for accessing services of the Superb Data Kraken (SDK) platform. It wraps the platform's services and resources in a single Python client object that handles authorization, data fetching, and indexing.

It is primarily intended for use in a JupyterHub environment within the platform itself, but it can be configured for other environments as well.

Installation and Supported Versions

$ python -m pip install superb-data-klient

superb-data-klient officially supports Python 3.7+.

Usage

Authentication

Before using the API, it is necessary to authenticate against the OIDC provider of the SDK. This happens when the client object is instantiated, and there are two ways to do it.

  1. Using system environment variables. This is the default and should be used within a Jupyter environment; simply instantiating the client object is enough.

    import superbdataklient as sdk
    client = sdk.SDKClient()
    

    This, however, assumes that access and refresh tokens are available via the environment variables SDK_ACCESS_TOKEN and SDK_REFRESH_TOKEN (for supplying them manually, see the sketch after this list).

  2. Using login credentials

    import superbdataklient as sdk
    sdk.SDKClient(username='hasslethehoff', password='lookingforfreedom')
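
When using option 1 outside of the Jupyter environment, the tokens can be exported manually before instantiating the client. A minimal sketch, assuming valid tokens have already been obtained (the values below are placeholders):

import os
import superbdataklient as sdk

# Placeholder tokens: in practice these come from the SDK's OIDC provider.
os.environ['SDK_ACCESS_TOKEN'] = '<access-token>'
os.environ['SDK_REFRESH_TOKEN'] = '<refresh-token>'

client = sdk.SDKClient()  # picks the tokens up from the environment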
    

Configuration

By default, the client is configured for the default SDK instance, but it ships with settings for several other instances as well.

Setting Environment

import superbdataklient as sdk
client = sdk.SDKClient(env='sdk-dev')
client = sdk.SDKClient(env='sdk')

Overwriting Settings

client = sdk.SDKClient(domain='mydomain.ai', realm='my-realm', client_id='my-client-id', api_version='v13.37')

Examples

Organizations

client.organization_get_all()
client.organization_get_by_id(1337)
client.organization_get_by_name('my-organization')

Spaces

Get all spaces of a given organization

organization_id = 1234
client.space_get_all(organization_id)
client.space_get_by_id(organization_id, space_id)
client.space_get_by_name(organization_id, space_name)

Index

List all indices accessible with the given credentials

indices = client.index_get_all()

Get a specific document

document = client.index_get_document(index_name, doc_id)

Get all documents of an index:

documents = client.index_get_all_documents("index_name")

Get documents of an index lazily with a generator:

documents = client.index_get_documents("index-name")
for document in documents:
   print(document)

Write documents to an index

documents = [
   {
      "_id": 123,
      "name": "document01",
      "value": "value"
   },
   {
      "_id": 1337,
      "name": "document02",
      "value": "value"
   }
]
index_name = "index"
client.index_documents(documents, index_name)

The optional field _id is parsed and used as the document ID when indexing to OpenSearch.
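
If _id is omitted, the documents are indexed without an explicit ID (presumably OpenSearch generates one in that case). A minimal sketch, reusing client and index_name from above:

documents_without_ids = [
   {"name": "document03", "value": "value"},
   {"name": "document04", "value": "value"}
]
# No "_id" field given, so the document IDs are assumed to be generated on the OpenSearch side.
client.index_documents(documents_without_ids, index_name)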

List all indices accessible with the given credentials, filtered by organization, space, and type

client.index_filter_by_space("my-organization", "my-space", "index-type")

Use .* instead of my-space to get indices from all spaces of the given organization.

index_type is either ANALYSIS or MEASUREMENTS
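
For example, a wildcard query over all spaces of an organization might look like this (assuming the type string is passed exactly as listed above):

client.index_filter_by_space("my-organization", ".*", "ANALYSIS")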

Create an application index

mapping = {
   ...
}
client.application_index_create("my-application-index", "my-organization", "my-space", mapping)

Delete an application index by name

client.application_index_delete("my-organization_my-space_analysis_my-application-index")
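
The index name in the example above appears to follow the pattern <organization>_<space>_<type>_<name>. A small sketch that builds such a name, assuming this pattern holds:

# Naming pattern inferred from the example above; not an official API guarantee.
organization, space, index_type, name = "my-organization", "my-space", "analysis", "my-application-index"
full_index_name = f"{organization}_{space}_{index_type}_{name}"
client.application_index_delete(full_index_name)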

Storage

List files in Storage

files = client.storage_list_blobs(org_name, space_name)

Download files from Storage to a local directory

files = [
   'file01.txt',
   'directory/file02.json'
]
client.storage_download_files(organization='my-organization', space='my-space', files=files, local_dir='tmp')

Download files from Storage to a local directory, filtered by a regular expression (regex)

files = [
   'file01.txt',
   'directory/file02.json'
]
client.storage_download_files_with_regex(organization='my-organization', space='my-space', files=files, local_dir='tmp', regex=r'.*json$')

Upload files from a local directory to Storage. A meta.json file must be included and will be validated against a schema.

files = [
   'meta.json',
   'file01.txt',
   'file02.txt'
]

client.storage_upload_files_to_loadingzone(organization='my-organization', space='my-space', files=files, local_dir='tmp')
