A Python API wrapping services of the Superb Data Kraken (SDK)
superb-data-klient
superb-data-klient offers a streamlined interface to various services of the Superb Data Kraken platform (SDK). With the library, you can fetch and index data and manage indices, spaces, and organizations on the SDK.
Designed primarily for a Jupyter Hub environment within the platform, it's versatile enough to be set up in other environments too.
Installation and Supported Versions
$ python -m pip install superb-data-klient
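To confirm that the installation succeeded, the installed version can be read from the package metadata. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of superb-data-klient.

```python
from importlib import metadata


def installed_version(dist_name="superb-data-klient"):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # Raised when no distribution with that name is installed.
        return None


if __name__ == "__main__":
    print(installed_version())
```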
Usage
Authentication
To begin, authenticate against the SDK's OIDC provider. This is achieved when instantiating the client object:
- System Environment Variables (recommended for Jupyter environments):
import superbdataklient as sdk
client = sdk.SDKClient()
This approach leverages environment variables SDK_ACCESS_TOKEN and SDK_REFRESH_TOKEN.
- Login Credentials:
import superbdataklient as sdk
client = sdk.SDKClient(username='hasslethehoff', password='lookingforfreedom')
- Authentication Code Flow:
If neither of the methods above fits, authentication falls back to the authorization code flow.
CAUTION: This method only works in a browser environment.
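Since the environment-variable method depends on SDK_ACCESS_TOKEN and SDK_REFRESH_TOKEN being present, it can help to check for them before instantiating the client. The sketch below uses only the standard library; the helper `missing_token_vars` is hypothetical, and how the client itself reacts to a missing variable is not covered here.

```python
import os

# Variable names as listed in the authentication section above.
TOKEN_VARS = ("SDK_ACCESS_TOKEN", "SDK_REFRESH_TOKEN")


def missing_token_vars(environ=None):
    """Return the token variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in TOKEN_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_token_vars()
    if missing:
        print("Not set:", ", ".join(missing))
    else:
        print("Both token variables are set.")
```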
Configuration
While the default settings cater to the standard SDK instance, configurations for various other instances are also available.
Setting Environment
import superbdataklient as sdk
client = sdk.SDKClient(env='sdk-dev')
client = sdk.SDKClient(env='sdk')
Overwriting Settings
client = sdk.SDKClient(domain='mydomain.ai', realm='my-realm', client_id='my-client-id', api_version='v13.37')
Proxy
To use the SDK client behind a company proxy, add the following configuration parameters to the constructor.
NOTE: The environment variables "http_proxy" and "https_proxy" override the proxy settings passed to the SDKClient, so remove them before configuring the SDKClient.
client = sdk.SDKClient(username='hasslethehoff',
                       password='lookingforfreedom',
                       proxy_http="http://proxy.example.com:8080",
                       proxy_https="https://proxy.example.com:8080",
                       proxy_user="proxyusername",
                       proxy_pass="proxyuserpassword")
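Removing the two proxy variables named in the note can be done from Python for the current process before the client is created. This is a minimal sketch; `clear_proxy_env` is a hypothetical helper, and it only touches the lowercase names mentioned above (uppercase variants, if your system sets them, are left alone).

```python
import os

# Only the names mentioned in the note above.
PROXY_VARS = ("http_proxy", "https_proxy")


def clear_proxy_env(environ=None):
    """Remove the proxy variables from the mapping and return what was removed."""
    env = os.environ if environ is None else environ
    removed = {}
    for name in PROXY_VARS:
        if name in env:
            removed[name] = env.pop(name)
    return removed


if __name__ == "__main__":
    # Affects only this Python process, not the parent shell.
    print("Removed:", sorted(clear_proxy_env()))
```

The returned dictionary makes it possible to restore the values afterwards if other code in the same process relies on them.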
Examples
Organizations
Get details of all organizations, or retrieve by ID or name:
client.organization_get_all()
client.organization_get_by_id(1337)
client.organization_get_by_name('my-organization')
Spaces
To retrieve spaces related to an organization:
organization_id = 1234
space_id = 5678  # placeholder value
space = 'my-space'  # placeholder name
client.space_get_all(organization_id)
client.space_get_by_id(organization_id, space_id)
client.space_get_by_name(organization_id, space)
Index
Retrieve a specific document:
document = client.index_get_document(index_name, doc_id)
Fetch all documents within an index:
documents = client.index_get_all_documents("index_name")
Iterate through documents using a generator:
documents = client.index_get_documents("index-name")
for document in documents:
    print(document)
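Because the documents arrive through a generator, only as many as are consumed need to be held in memory. The sketch below shows one way to peek at the first few items with `itertools.islice`; it assumes the return value behaves like an ordinary Python iterator, and `fake_documents` is a stand-in so the example runs without a connection to the SDK.

```python
from itertools import islice


def take(documents, n):
    """Return at most the first n items of an iterator as a list."""
    return list(islice(documents, n))


def fake_documents():
    """Stand-in for client.index_get_documents(...); yields made-up documents."""
    for i in range(1000):
        yield {"_id": i, "name": f"document{i:02d}"}


if __name__ == "__main__":
    for document in take(fake_documents(), 3):
        print(document)
```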
Index multiple documents:
documents = [
    {"_id": 123, "name": "document01", "value": "value"},
    {"_id": 1337, "name": "document02", "value": "value"}
]
index_name = "index"
client.index_documents(documents, index_name)
Note: The optional _id field is used as the document ID for indexing in OpenSearch.
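When the source records already carry a natural key, it can be copied into _id before indexing so that the key becomes the document ID. This is a minimal sketch in plain Python; the helper `with_document_ids` and the field name `serial` are hypothetical.

```python
def with_document_ids(records, id_key):
    """Return copies of the records with `_id` set from `id_key` where present."""
    documents = []
    for record in records:
        document = dict(record)  # shallow copy, the input is left untouched
        if id_key in document:
            document["_id"] = document[id_key]
        documents.append(document)
    return documents


if __name__ == "__main__":
    records = [
        {"serial": 123, "name": "document01"},
        {"name": "document02"},  # no key, so no _id is added
    ]
    print(with_document_ids(records, "serial"))
```

The resulting list has the same shape as the documents list above and could be passed to client.index_documents in its place.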
Filter indices by organization, space, and type:
client.index_filter_by_space("my-organization", "my-space", "index-type")
For all spaces in an organization, use * instead of a space name. Available index_type values are ANALYSIS or MEASUREMENTS.
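The two rules above (the * wildcard and the two allowed index types) can be captured in a small helper that prepares the arguments before the call. This is a sketch under assumptions: `filter_arguments` is hypothetical, and it treats the type names as case-sensitive uppercase strings because that is how they are written above.

```python
INDEX_TYPES = ("ANALYSIS", "MEASUREMENTS")
ALL_SPACES = "*"


def filter_arguments(organization, index_type, space=None):
    """Build the (organization, space, index_type) triple, defaulting to all spaces."""
    if index_type not in INDEX_TYPES:
        raise ValueError(f"index_type must be one of {INDEX_TYPES}, got {index_type!r}")
    return (organization, ALL_SPACES if space is None else space, index_type)


if __name__ == "__main__":
    print(filter_arguments("my-organization", "ANALYSIS"))
    print(filter_arguments("my-organization", "MEASUREMENTS", space="my-space"))
```

The tuple can then be unpacked into the call, for example client.index_filter_by_space(*filter_arguments("my-organization", "ANALYSIS")).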
Create an application index:
mapping = {
...
}
client.application_index_create("my-application-index", "my-organization", "my-space", mapping)
Remove an application index by its name:
client.application_index_delete("my-organization_my-space_analysis_my-application-index")
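Judging from the two examples above, the full index name appears to be composed of the organization, the space, the segment "analysis", and the application index name, joined by underscores. That pattern is inferred from the example strings only, not from documented behaviour, so treat the hypothetical helper below as a sketch and verify the real name (for instance via index_filter_by_space) before deleting anything.

```python
def application_index_name(organization, space, index):
    """Compose a full index name following the pattern seen in the example above.

    Assumption: the "analysis" segment is fixed. This is inferred, not documented.
    """
    return f"{organization}_{space}_analysis_{index}"


if __name__ == "__main__":
    print(application_index_name("my-organization", "my-space", "my-application-index"))
```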
Storage
List files in Storage:
files = client.storage_list_blobs("my-organization", "space")
Download specific files from Storage:
files = ['file01.txt', 'directory/file02.json']
client.storage_download_files(organization='my-organization', space='my-space', files=files, local_dir='tmp')
Use regex patterns for file downloads:
files = ['file01.txt', 'directory/file02.json']
client.storage_download_files_with_regex(organization='my-organization', space='my-space', files=files, local_dir='tmp', regex=r'.*json$')
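To preview which file names a pattern such as r'.*json$' would select, the same expression can be tried locally with the standard `re` module. This is a sketch only: `select_files` is hypothetical, and whether the client applies the pattern with search, match, or fullmatch semantics is an assumption (for this particular pattern all three give the same result).

```python
import re


def select_files(files, regex):
    """Return the file names in which the pattern is found."""
    pattern = re.compile(regex)
    return [name for name in files if pattern.search(name)]


if __name__ == "__main__":
    files = ['file01.txt', 'directory/file02.json']
    print(select_files(files, r'.*json$'))
```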
Upload files from a local directory. Ensure the presence of a valid meta.json if the metadataGenerate property on the space is not set to true:
files = ['meta.json', 'file01.txt', 'file02.txt']
client.storage_upload_files_to_loadingzone(organization='my-organization', space='my-space', files=files, local_dir='tmp')
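A quick local check before uploading can catch a missing or malformed meta.json early. The sketch below only verifies that the file is listed, exists, and parses as JSON; it does not check the metadata schema the SDK expects, and the helper `meta_json_problem` is hypothetical.

```python
import json
from pathlib import Path


def meta_json_problem(local_dir, files):
    """Return a description of the problem with meta.json, or None if it looks fine."""
    if "meta.json" not in files:
        return "meta.json is not in the file list"
    path = Path(local_dir) / "meta.json"
    try:
        with path.open(encoding="utf-8") as handle:
            json.load(handle)
    except FileNotFoundError:
        return f"{path} does not exist"
    except ValueError as error:
        # Covers json.JSONDecodeError and decoding errors alike.
        return f"{path} is not valid JSON: {error}"
    return None


if __name__ == "__main__":
    files = ['meta.json', 'file01.txt', 'file02.txt']
    problem = meta_json_problem('tmp', files)
    print(problem or "meta.json looks fine")
```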
Hashes for superb_data_klient-1.2.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 795606ee12b6a85c3ddd611faeff911d6de5c9b6b5f53b88dd39297e3af03a38
MD5 | 540acff9bbf4abf6964b2d6cef31561c
BLAKE2b-256 | cce13679e1ff214e6402b48c418139187d4b8771ae9011eda70ffcdc6fd0e141