A Python API wrapping services of the Superb Data Kraken (SDK)

superb-data-klient

superb-data-klient offers a streamlined interface to the services of the Superb Data Kraken platform (SDK). With the library, you can fetch and index data and manage indices, spaces, and organizations on the SDK.

Designed primarily for a Jupyter Hub environment within the platform, it's versatile enough to be set up in other environments too.

Installation and Supported Versions

$ python -m pip install superb-data-klient

Usage

Authentication

To begin, authenticate against the SDK's OIDC provider. This is achieved when instantiating the client object:

System Environment Variables (recommended for Jupyter environments):
```
import superbdataklient as sdk
client = sdk.SDKClient()
```
This approach reads the tokens from the environment variables SDK_ACCESS_TOKEN and SDK_REFRESH_TOKEN.
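Before instantiating the client this way, it can help to confirm that both variables are actually set. The following is a minimal sketch using only the standard library; the helper name `missing_token_vars` is hypothetical and not part of superb-data-klient.

```python
import os

# Variable names taken from the text above.
TOKEN_VARS = ("SDK_ACCESS_TOKEN", "SDK_REFRESH_TOKEN")


def missing_token_vars(environ=None):
    """Return the token variables that are unset or empty (hypothetical helper)."""
    env = os.environ if environ is None else environ
    return [name for name in TOKEN_VARS if not env.get(name)]


missing = missing_token_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("Both token variables are set.")
```

If the list is empty, calling `sdk.SDKClient()` without arguments should find the tokens as described above.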

Login Credentials:

import superbdataklient as sdk
client = sdk.SDKClient(username='hasslethehoff', password='lookingforfreedom')
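To avoid hard-coding credentials in a notebook or script, they can be read from the environment or prompted for interactively. This is a sketch under assumptions: `SDK_USERNAME` and `SDK_PASSWORD` are hypothetical variable names chosen for this example, and `read_credentials` is not part of the library.

```python
import getpass
import os


def read_credentials(environ=None, ask=getpass.getpass):
    """Return (username, password) without hard-coding them.

    SDK_USERNAME and SDK_PASSWORD are hypothetical names used only in this
    sketch; the library itself does not define them.
    """
    env = os.environ if environ is None else environ
    # Fall back to interactive prompts when a variable is missing or empty.
    username = env.get("SDK_USERNAME") or input("SDK username: ")
    password = env.get("SDK_PASSWORD") or ask("SDK password: ")
    return username, password


# Intended use (requires superb-data-klient and valid credentials):
# username, password = read_credentials()
# client = sdk.SDKClient(username=username, password=password)
```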

Authentication Code Flow:

If neither of the methods above fits, authentication falls back to the OIDC authorization code flow.

CAUTION: This method only works in a browser environment.

NOTE: If your user account was linked from an external identity provider, your account in the SDK identity provider (Keycloak) does not have a password by default. To enable login via basic authentication, you need to set a password through self-service first.

Follow these steps to set your password:

- Go to the self-service portal for your environment: https://{domain}/auth/realms/{realm}/account/ (e.g. https://app.sdk-cloud.de/auth/realms/efs-sdk/account/).
- Set a password for your account.
- Once the password is set, you can log in using basic authentication (option 2, Login Credentials).
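The portal address can be derived from the URL pattern above. A small sketch; `account_url` is a hypothetical helper name, and the domain and realm are the example values from the text:

```python
def account_url(domain: str, realm: str) -> str:
    """Build the self-service portal URL from the pattern shown above."""
    return f"https://{domain}/auth/realms/{realm}/account/"


print(account_url("app.sdk-cloud.de", "efs-sdk"))
# prints: https://app.sdk-cloud.de/auth/realms/efs-sdk/account/
```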

Configuration

While the default settings cater to the standard SDK instance, configurations for various other instances are also available.

Setting Environment

import superbdataklient as sdk
client = sdk.SDKClient(env='sdk-dev')  # connect to the 'sdk-dev' instance
client = sdk.SDKClient(env='sdk')      # connect to the 'sdk' instance

Overwriting Settings

client = sdk.SDKClient(domain='mydomain.ai', realm='my-realm', client_id='my-client-id', api_version='v13.37')

Proxy

To use the SDK client behind a company proxy, pass the following proxy parameters to the constructor.
NOTE: The environment variables "http_proxy" and "https_proxy" override the proxy settings passed to SDKClient, so unset them before configuring the client.

from superbdataklient import SDKClient

client = SDKClient(username='hasslethehoff',
                   password='lookingforfreedom',
                   proxy_http="http://proxy.example.com:8080",
                   proxy_https="https://proxy.example.com:8080",
                   proxy_user="proxyusername",
                   proxy_pass="proxyuserpassword")
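Since the NOTE above says the "http_proxy" and "https_proxy" environment variables take precedence, one way to unset them for the current Python process is sketched below. `clear_proxy_env` is a hypothetical helper; it only handles the two lowercase names mentioned in the text, and whether uppercase variants matter is not covered here.

```python
import os

# The two variable names mentioned in the NOTE above.
PROXY_VARS = ("http_proxy", "https_proxy")


def clear_proxy_env(environ=None):
    """Remove the proxy variables from the mapping and return what was removed."""
    env = os.environ if environ is None else environ
    removed = {}
    for name in PROXY_VARS:
        if name in env:
            removed[name] = env.pop(name)
    return removed


# Call clear_proxy_env() before constructing SDKClient with proxy_* arguments.
# The returned dict lets you restore the values afterwards if needed.
```

This only affects the running process, not the surrounding shell.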

Logging

You can pass a user-defined logger to the SDKClient constructor, which makes it easier to integrate the client's log output into an existing logging setup. When a logger is provided, the client's methods forward their log messages to it; otherwise log output is printed to stdout / stderr.

import logging
from superbdataklient import SDKClient

# Configure the logger
my_logger = logging.getLogger('sdk_logger')
my_logger.setLevel(logging.DEBUG)
console_handler = logging.StreamHandler()
console_handler.setLevel(logging.DEBUG)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
console_handler.setFormatter(formatter)
my_logger.addHandler(console_handler)

# Pass the logger to SDKClient
client = SDKClient(logger=my_logger)
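In a Jupyter environment, re-running the configuration cell above would attach a second handler to the same logger and duplicate every message. A sketch of one way to guard against that, using only the standard `logging` module; `get_sdk_logger` is a hypothetical helper name:

```python
import logging


def get_sdk_logger(name="sdk_logger", level=logging.DEBUG):
    """Return a configured logger, attaching the console handler only once."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    # logging.getLogger returns the same object for the same name, so a
    # non-empty handler list means this function has already configured it.
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setLevel(level)
        handler.setFormatter(
            logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
        )
        logger.addHandler(handler)
    return logger


# Intended use: client = SDKClient(logger=get_sdk_logger())
```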

Examples

Organizations

Get details of all organizations, or retrieve by ID or name:

client.organization_get_all()
client.organization_get_by_id(1337)
client.organization_get_by_name('my-organization')

Spaces

To retrieve spaces related to an organization:

organization_id = 1234
space_id = 5678     # placeholder space ID
space = 'my-space'  # placeholder space name
client.space_get_all(organization_id)
client.space_get_by_id(organization_id, space_id)
client.space_get_by_name(organization_id, space)

Index

Retrieve a specific document by index name and document ID:

document = client.index_get_document(index_name, doc_id)

Fetch all documents within an index:

documents = client.index_get_all_documents("index_name")

Iterate through documents using a generator:

documents = client.index_get_documents("index-name")
for document in documents:
    print(document)
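Because the documents arrive through a generator, they can be processed in fixed-size batches without loading the whole index into memory. The sketch below uses a stand-in generator so it runs offline; `in_batches` is a hypothetical helper and the batch size is arbitrary.

```python
from itertools import islice


def in_batches(documents, size):
    """Yield lists of at most `size` items from any iterable."""
    iterator = iter(documents)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Stand-in for client.index_get_documents("index-name") so the sketch runs offline.
fake_documents = ({"_id": i} for i in range(5))
for batch in in_batches(fake_documents, 2):
    print(len(batch))
# prints 2, 2, 1 (one number per line)
```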

Index multiple documents:

documents = [
    {"_id": 123, "name": "document01", "value": "value"},
    {"_id": 1337, "name": "document02", "value": "value"}
]
index_name = "index"
client.index_documents(documents, index_name)

Note: The optional _id field is used as the document ID for indexing in OpenSearch.
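Since `_id` becomes the document ID, two documents with the same `_id` in one batch would refer to the same document. How the library handles that case is not stated here, so a local check before indexing may be worthwhile. A sketch with a hypothetical helper:

```python
from collections import Counter


def duplicate_ids(documents):
    """Return the _id values that occur more than once.

    Documents without an _id field are skipped, since the field is optional.
    """
    counts = Counter(doc["_id"] for doc in documents if "_id" in doc)
    return [doc_id for doc_id, count in counts.items() if count > 1]


documents = [
    {"_id": 123, "name": "document01", "value": "value"},
    {"_id": 1337, "name": "document02", "value": "value"},
    {"_id": 123, "name": "document03", "value": "value"},
]
print(duplicate_ids(documents))
# prints: [123]
```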

Filter indices by organization, space, and type:

client.index_filter_by_space("my-organization", "my-space", "index-type")

To match all spaces in an organization, pass * instead of a space name. Available index_type values are ANALYSIS and MEASUREMENTS.

Create an application index:

mapping = {
   ...
}
client.application_index_create("my-application-index", "my-organization", "my-space", mapping)

Remove an application index by its name:

client.application_index_delete("my-organization_my-space_analysis_my-application-index")
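The name passed to the delete call above looks like the organization, space, index type, and application index name joined by underscores. That pattern is inferred from this single example, not from documented behavior, so treat the following helper as an assumption and verify the actual name (for instance via index_filter_by_space) before deleting anything.

```python
def full_application_index_name(organization, space, index_name, index_type="analysis"):
    """Compose a full index name following the pattern inferred from the example.

    Both the pattern and the default index_type are assumptions of this sketch.
    """
    return f"{organization}_{space}_{index_type}_{index_name}"


print(full_application_index_name("my-organization", "my-space", "my-application-index"))
# prints: my-organization_my-space_analysis_my-application-index
```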

Storage

List files in Storage:

files = client.storage_list_blobs("my-organization", "space")

Download specific files from Storage:

files = ['file01.txt', 'directory/file02.json']
client.storage_download_files(organization='my-organization', space='my-space', files=files, local_dir='tmp')

Use regex patterns for file downloads:

files = ['file01.txt', 'directory/file02.json']
client.storage_download_files_with_regex(organization='my-organization', space='my-space', files=files, local_dir='tmp', regex=r'.*json$')
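To preview locally which names a pattern would select, the standard `re` module is enough. This sketch uses `re.search`; which matching function the library applies is not stated here, although for the anchored pattern from the example `search`, `match`, and `fullmatch` agree. `preview_matches` is a hypothetical helper.

```python
import re


def preview_matches(names, pattern):
    """Return the names matched by the pattern (using re.search)."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.search(name)]


files = ['file01.txt', 'directory/file02.json']
print(preview_matches(files, r'.*json$'))
# prints: ['directory/file02.json']
```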

Upload files from a local directory. If the metadataGenerate property on the space is not set to true, the upload must include a valid meta.json:

files = ['meta.json', 'file01.txt', 'file02.txt']
client.storage_upload_files(organization='my-organization', space='my-space', files=files, local_dir='tmp')
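A quick local check can catch a missing or malformed meta.json before the upload starts. The sketch below only verifies that the file is listed, exists, and parses as JSON; whatever schema the SDK expects for meta.json is not checked. `check_meta_json` is a hypothetical helper.

```python
import json
from pathlib import Path


def check_meta_json(local_dir, files):
    """Return None if meta.json is listed and parses as JSON, else an error message."""
    if "meta.json" not in files:
        return "meta.json is not in the file list"
    path = Path(local_dir) / "meta.json"
    if not path.is_file():
        return f"{path} does not exist"
    try:
        json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return f"{path} is not valid JSON: {exc}"
    return None


# Intended use before calling client.storage_upload_files(...):
# problem = check_meta_json('tmp', ['meta.json', 'file01.txt', 'file02.txt'])
# if problem:
#     raise ValueError(problem)
```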

To monitor the status of the upload, pass a progress_callback function with the following signature:

def progress_callback(uploaded: int, total: int) -> None:

where:

- uploaded: the number of bytes uploaded so far.
- total: the total size of the file in bytes.

def progress_callback(uploaded, total):
    # Update your progress bar here; this example just prints the byte counts.
    print(f"{uploaded} of {total} bytes uploaded")

files = ['meta.json', 'file01.txt', 'file02.txt']
client.storage_upload_files(organization='my-organization', space='my-space', files=files, local_dir='tmp', progress_callback=progress_callback)
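A slightly fuller callback might report a percentage and tolerate empty files. The sketch below follows the (uploaded, total) signature described above; `make_progress_printer` and its `label` parameter are hypothetical additions for illustration.

```python
def make_progress_printer(label="upload"):
    """Return a callback with the (uploaded, total) signature described above."""

    def progress_callback(uploaded: int, total: int) -> None:
        # Guard against empty files so we never divide by zero.
        percent = 100.0 if total == 0 else 100.0 * uploaded / total
        print(f"{label}: {uploaded}/{total} bytes ({percent:.1f}%)")

    return progress_callback


callback = make_progress_printer("file01.txt")
callback(512, 2048)
# prints: file01.txt: 512/2048 bytes (25.0%)
```

The returned function can then be passed as `progress_callback=callback` to storage_upload_files.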

Release history

- 1.5.0 (Nov 4, 2024), this version
- 1.4.0 (Oct 10, 2024)
- 1.3.0 (Jul 24, 2024)
- 1.2.3 (Jul 8, 2024)
- 1.2.2 (Jul 4, 2024)
- 1.2.1 (Jun 28, 2024)
- 1.2.0 (Mar 6, 2024)
- 1.1.0 (Jan 18, 2024)
- 1.0.2 (Nov 7, 2023)
- 1.0.1 (Aug 17, 2023)
- 1.0.0 (Jun 19, 2023)

Download files

Source Distribution

superb_data_klient-1.5.0.tar.gz (30.5 kB), uploaded Nov 4, 2024, Source

Built Distribution

superb_data_klient-1.5.0-py3-none-any.whl (25.0 kB), uploaded Nov 4, 2024, Python 3

File details

Details for the file superb_data_klient-1.5.0.tar.gz.

File metadata

Download URL: superb_data_klient-1.5.0.tar.gz
Upload date: Nov 4, 2024
Size: 30.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.9.11

File hashes

Hashes for superb_data_klient-1.5.0.tar.gz
Algorithm	Hash digest
SHA256	`b8ae343a7fdde94e4577c48fc95e8e88170adc18352868bed80d01dad59ec8a1`
MD5	`408a7b3ac38af6b64fd5c511ea514f73`
BLAKE2b-256	`c5282dda5441cb5ef626d62e443a5d38dc0b91f6043541722678f8b074b1ecb6`


File details

Details for the file superb_data_klient-1.5.0-py3-none-any.whl.

File metadata

Download URL: superb_data_klient-1.5.0-py3-none-any.whl
Upload date: Nov 4, 2024
Size: 25.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.9.11

File hashes

Hashes for superb_data_klient-1.5.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1b650aa264a45253fc8e7558a9bd34c46fbc8a306dae94f8b81c6d99d00949a4`
MD5	`cf6aad788e659242f549a0418a3e39b0`
BLAKE2b-256	`0e11fed33f0b84fe443e8e7b4bf066511e5fbbd563fce8f7da04e6ed12c043bf`

