Python client for Aitta HPC ML inference platform
A Python client library for the Aitta ML inference platform for HPC systems.

IMPORTANT: Both the API and the client library are still under heavy development. While we try to keep changes mostly backwards-compatible, breaking changes may occur. Access to Aitta is currently restricted to selected beta users.

Main client API classes

AccessTokenSource

Used by the client to obtain (and, when necessary, refresh) access tokens.

Client

Implements all requests to the Aitta API servers at a low level and is used by all other classes.

Useful members:

  • get_model_list(): Lists all models served by the API that can be accessed with the configured access credentials.
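
For instance, model discovery can be wrapped in a small helper. The sketch below is illustrative, not part of the library; it assumes only that get_model_list() returns an iterable of model entries, so adapt it to the actual return type:

```python
def list_accessible_models(client) -> list:
    """Return the models the configured access credentials can reach.

    `client` is expected to be an aitta_client.Client instance;
    get_model_list() is assumed to return an iterable of model entries.
    """
    return list(client.get_model_list())

# usage (requires a valid token and network access):
# token_source = StaticAccessTokenSource("your_api_key")
# client = Client("https://api-staging-aitta.2.rahtiapp.fi", token_source)
# for model in list_accessible_models(client):
#     print(model)
```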

Model

Represents a model and provides methods to perform inference.

Useful members:

  • load(model_id, client): Creates a Model instance for the given model ID, loading the relevant data from the Aitta API server.

Task

Represents an active inference task and provides methods to query the current status and results.

Useful members:

  • max_progress: The maximum number of progress increments of the task. Not all models support progress reporting; in that case the value is 1.
  • progress: The current progress towards completion of the task. The increments are chosen arbitrarily by the model; check max_progress for the total number of increments. If the model does not support progress reporting for this task, the value is 0 until the task completes successfully, at which point it is 1.
  • results: The outputs of the model for the inference request. Raises an IncompleteTaskError if the task is not yet completed, and an InferenceError if the task completed with a failure.
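
Putting these members together, a simple polling loop might look like the following sketch. The helper name and the sleep interval are illustrative, not part of the library; only the Task members described above (progress, max_progress, results) are assumed, including that progress reflects the current state on each access:

```python
import time

def wait_for_results(task, poll_interval: float = 2.0):
    """Poll a Task until it completes, then return its results.

    Assumes the Task members documented above: `progress`,
    `max_progress`, and `results` (which raises IncompleteTaskError
    while the task is still running, or InferenceError on failure).
    """
    while task.progress < task.max_progress:
        print(f"progress: {task.progress}/{task.max_progress}")
        time.sleep(poll_interval)
    return task.results
```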

Example usage

The example below shows how to use the Aitta API via the Python client library.

To access the Aitta API, the client needs a way to obtain access tokens, which is implemented in the form of an AccessTokenSource. For the time being, you can generate a static, model-specific token in the web frontend by opening the model's page, switching to the "API Key" tab and pressing the "Generate API key" button.

With the token thus obtained, you then create an instance of StaticAccessTokenSource for use with the client library.

Chat completion with the LumiOpen/Llama-Poro-2-70B-Instruct model

This example shows how to start a conversation with the model using OpenAI’s chat completion feature.

NOTE: It is recommended to set max_tokens to a sufficiently high value to avoid responses being cut off.

from aitta_client import Model, Client, StaticAccessTokenSource
import openai

# configure Client instance with API URL and access token
access_token = "token_for_given_model"
token_source = StaticAccessTokenSource(access_token)
aitta_client = Client("https://api-staging-aitta.2.rahtiapp.fi", token_source)

# load the LumiOpen/Llama-Poro-2-70B-Instruct model
model = Model.load("LumiOpen/Llama-Poro-2-70B-Instruct", aitta_client)
print(model.description)

# configure OpenAI client to use the Aitta OpenAI compatibility endpoints
client = openai.OpenAI(api_key=token_source.get_access_token(), base_url=model.openai_api_url)
# perform chat completion with the OpenAI client
chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Write a novel about a bear talking about how to combine HPC with QC."
        }
    ],
    model=model.id,
    max_tokens=8000,
    stream=False  # response streaming is currently not supported by Aitta
)

print(chat_completion.choices[0].message.content)
