
A Python API wrapping services of the Superb Data Kraken (SDK)


superb-data-klient

superb-data-klient provides a Python interface to the services of the Superb Data Kraken platform (SDK). With it you can fetch and index data and manage indices, spaces, and organizations on the SDK.

It is designed primarily for the Jupyter Hub environment within the platform, but it can be set up in other environments as well.

Installation and Supported Versions

$ python -m pip install superb-data-klient

Usage

Authentication

To begin, authenticate against the SDK's OIDC provider. Authentication happens when the client object is instantiated, using one of the following methods:

  1. System Environment Variables (recommended for Jupyter environments):

    import superbdataklient as sdk
    client = sdk.SDKClient()
    

    This approach leverages environment variables SDK_ACCESS_TOKEN and SDK_REFRESH_TOKEN.

  2. Login Credentials:

    import superbdataklient as sdk
    client = sdk.SDKClient(username='hasslethehoff', password='lookingforfreedom')
    
  3. Authentication Code Flow:

    If neither of the methods above applies, authentication falls back to the authorization code flow.

    CAUTION: This method only works in a browser environment.
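
Before relying on the first method, it can help to confirm that both token variables are actually set. The following is a minimal sketch using only the standard library; `missing_token_vars` is a hypothetical helper, not part of superb-data-klient, and treating an empty value as missing is an assumption.

```python
import os

# Variable names as given in the description of method 1.
TOKEN_VARS = ("SDK_ACCESS_TOKEN", "SDK_REFRESH_TOKEN")


def missing_token_vars(environ=None):
    """Return the token variables that are unset or empty in `environ`.

    Hypothetical helper; defaults to the process environment.
    """
    if environ is None:
        environ = os.environ
    return [name for name in TOKEN_VARS if not environ.get(name)]


# Possible usage (requires the package and a reachable SDK instance):
# import superbdataklient as sdk
# if not missing_token_vars():
#     client = sdk.SDKClient()
```

If the returned list is not empty, one of the other two methods is presumably needed.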

Configuration

While the default settings cater to the standard SDK instance, configurations for various other instances are also available.

Setting Environment

import superbdataklient as sdk
client = sdk.SDKClient(env='sdk-dev')
client = sdk.SDKClient(env='sdk')

Overwriting Settings

client = sdk.SDKClient(domain='mydomain.ai', realm='my-realm', client_id='my-client-id', api_version='v13.37')

Proxy

To use the SDK client behind a company proxy, add the following parameters to the constructor.
NOTE: The environment variables "http_proxy" and "https_proxy" override the proxy settings passed to SDKClient, so unset them before creating the client.

import superbdataklient as sdk
client = sdk.SDKClient(username='hasslethehoff',
                       password='lookingforfreedom',
                       proxy_http="http://proxy.example.com:8080",
                       proxy_https="https://proxy.example.com:8080",
                       proxy_user="proxyusername",
                       proxy_pass="proxyuserpassword")
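
Since the proxy environment variables take precedence over the constructor arguments, they can be removed from the current process before the client is created. A minimal sketch, assuming only the two lowercase names mentioned in the note matter; `clear_proxy_env` is a hypothetical helper, not part of the library.

```python
import os


def clear_proxy_env(environ=None, names=("http_proxy", "https_proxy")):
    """Delete the given proxy variables from `environ`.

    Returns the names that were actually removed. Defaults to the
    process environment; only affects the current process.
    """
    if environ is None:
        environ = os.environ
    removed = []
    for name in names:
        if name in environ:
            del environ[name]
            removed.append(name)
    return removed


# Possible usage: call clear_proxy_env() once, then construct the
# client with the proxy_* parameters.
```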

Examples

Organizations

Get details of all organizations, or retrieve by ID or name:

client.organization_get_all()
client.organization_get_by_id(1337)
client.organization_get_by_name('my-organization')

Spaces

To retrieve spaces related to an organization:

organization_id = 1234
space_id = 5678       # placeholder ID
space = 'my-space'    # placeholder name
client.space_get_all(organization_id)
client.space_get_by_id(organization_id, space_id)
client.space_get_by_name(organization_id, space)

Index

Retrieve a specific document:

document = client.index_get_document(index_name, doc_id)

Fetch all documents within an index:

documents = client.index_get_all_documents("index_name")

Iterate through documents using a generator:

documents = client.index_get_documents("index-name")
for document in documents:
   print(document)
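
Because the documents arrive lazily, a sample can be taken without reading the whole index. A minimal sketch with `itertools.islice`; `first_documents` is a hypothetical helper and works on any iterable, so it makes no assumption about the library beyond the generator behaviour described above.

```python
from itertools import islice


def first_documents(documents, limit):
    """Return at most `limit` items from an iterable of documents.

    Consumes only as many items as it returns.
    """
    return list(islice(documents, limit))


# Possible usage:
# sample = first_documents(client.index_get_documents("index-name"), 10)
```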

Index multiple documents:

documents = [
   {"_id": 123, "name": "document01", "value": "value"},
   {"_id": 1337, "name": "document02", "value": "value"}
]
index_name = "index"
client.index_documents(documents, index_name)

Note: The optional _id field is used as the document ID for indexing in OpenSearch.
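
Since _id becomes the document ID, two documents with the same _id in one batch would target the same OpenSearch document. A small pre-check can catch that before indexing. This is a sketch only; `duplicate_ids` is a hypothetical helper and not part of the library.

```python
from collections import Counter


def duplicate_ids(documents):
    """Return the `_id` values that occur more than once.

    Documents without an `_id` key are ignored. The result keeps the
    order in which each ID was first seen.
    """
    counts = Counter(doc["_id"] for doc in documents if "_id" in doc)
    return [doc_id for doc_id, n in counts.items() if n > 1]


# Possible usage:
# if not duplicate_ids(documents):
#     client.index_documents(documents, index_name)
```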

Filter indices by organization, space, and type:

client.index_filter_by_space("my-organization", "my-space", "index-type")

For all spaces in an organization, use * instead of a space name. Available index_type values are ANALYSIS or MEASUREMENTS.
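
The two rules above (the * wildcard and the allowed type values) can be wrapped in a thin helper. A minimal sketch: `filter_indices` is hypothetical, it assumes the type values are case-sensitive and spelled exactly as listed, and it only forwards to `index_filter_by_space` with the argument order shown above.

```python
# Index types as listed in the text; exact casing is an assumption.
INDEX_TYPES = ("ANALYSIS", "MEASUREMENTS")
ALL_SPACES = "*"


def filter_indices(client, organization, index_type, space=ALL_SPACES):
    """Validate `index_type`, then delegate to the client.

    `client` is expected to be an SDKClient (or any object that offers
    index_filter_by_space). `space` defaults to all spaces.
    """
    if index_type not in INDEX_TYPES:
        raise ValueError(
            f"index_type must be one of {INDEX_TYPES}, got {index_type!r}"
        )
    return client.index_filter_by_space(organization, space, index_type)


# Possible usage:
# filter_indices(client, "my-organization", "MEASUREMENTS")
```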

Create an application index:

mapping = {
   ...
}
client.application_index_create("my-application-index", "my-organization", "my-space", mapping)

Remove an application index by its name:

client.application_index_delete("my-organization_my-space_analysis_my-application-index")
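
The name passed to the delete call appears to combine the organization, the space, the literal `analysis`, and the index name used at creation. The sketch below encodes that pattern; it is inferred from this single example only and is an assumption, so verify the actual index name before deleting anything. `application_index_name` is a hypothetical helper.

```python
def application_index_name(organization, space, index_name):
    """Build a full application index name.

    Assumed pattern, inferred from the example above:
    <organization>_<space>_analysis_<index_name>
    """
    return f"{organization}_{space}_analysis_{index_name}"


# Possible usage:
# full_name = application_index_name(
#     "my-organization", "my-space", "my-application-index")
# client.application_index_delete(full_name)
```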

Storage

List files in Storage:

files = client.storage_list_blobs("my-organization", "space")

Download specific files from Storage:

files = ['file01.txt', 'directory/file02.json']
client.storage_download_files(organization='my-organization', space='my-space', files=files, local_dir='tmp')

Use regex patterns for file downloads:

files = ['file01.txt', 'directory/file02.json']
client.storage_download_files_with_regex(organization='my-organization', space='my-space', files=files, local_dir='tmp', regex=r'.*json$')
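
To preview which names a pattern would select, the same expression can be tried locally with Python's `re` module. This is a sketch under an assumption: how the library applies the pattern (for example `re.search` versus `re.match`) is not stated here, and the hypothetical helper below uses `re.search`. For a pattern such as `.*json$` both give the same result.

```python
import re


def matching_files(files, regex):
    """Return the file names in which `regex` finds a match."""
    pattern = re.compile(regex)
    return [name for name in files if pattern.search(name)]


# Possible usage:
# matching_files(['file01.txt', 'directory/file02.json'], r'.*json$')
```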

Upload files from a local directory. Include a valid meta.json unless the metadataGenerate property on the space is set to true:

files = ['meta.json', 'file01.txt', 'file02.txt']
client.storage_upload_files_to_loadingzone(organization='my-organization', space='my-space', files=files, local_dir='tmp')
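
The meta.json requirement can be checked before the upload starts. A minimal sketch: `upload_with_meta_check` and its `metadata_generate` flag are hypothetical. The flag is supplied by the caller and is not read from the space, and the check only looks for the name `meta.json` in the file list without validating its content.

```python
def upload_with_meta_check(client, organization, space, files, local_dir,
                           metadata_generate=False):
    """Refuse to upload without meta.json unless metadata is generated.

    `metadata_generate` should mirror the space's metadataGenerate
    property; the caller is responsible for passing the right value.
    """
    if not metadata_generate and "meta.json" not in files:
        raise ValueError("meta.json is missing from the file list")
    return client.storage_upload_files_to_loadingzone(
        organization=organization,
        space=space,
        files=files,
        local_dir=local_dir,
    )


# Possible usage:
# upload_with_meta_check(client, 'my-organization', 'my-space',
#                        ['meta.json', 'file01.txt'], 'tmp')
```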

