
Giskard Hub Client Library

The Giskard Hub is a platform that centralizes the validation process of LLM applications, empowering product teams to ensure all functional, business & legal requirements are met, and keeping them in close contact with the development team to avoid delayed deployment timelines.

The giskard_hub Python library provides a simple way for developers and data scientists to manage and evaluate LLM applications in their development workflow during the prototyping phase and for continuous integration testing.

Read the quickstart guide to get up and running with the giskard_hub library. You will learn how to execute local evaluations from a notebook, script or CLI, and synchronize them to the Giskard Hub platform.

Access the full docs at: https://docs.giskard.ai/

Install the client library

The library is compatible with Python 3.10 to 3.13.

pip install giskard-hub

Create a project and run an evaluation

You can now use the client to interact with the Hub and control it programmatically, independently of the UI. Let's start by initializing a client instance:

from giskard_hub import HubClient

hub = HubClient()

You can provide the API key and Hub URL as arguments. Head over to your Giskard Hub instance and click on the user icon in the top right corner. You will find your personal API key; click the button to copy it.

hub = HubClient(
    api_key="YOUR_GSK_API_KEY",
    hub_url="THE_GSK_HUB_URL",
)

You can now use the hub client to control the Giskard Hub! Let's start by creating a fresh project.

Create a project

project = hub.projects.create(
    name="My first project",
    description="This is a test project to get started with the Giskard Hub client library",
)

That's it! You have created a project. You will now see it in the Hub UI project selector.

Tip

If you have an already existing project, you can easily retrieve it. Either use hub.projects.list() to get a list of all projects, or use hub.projects.retrieve("YOUR_PROJECT_ID") to get a specific project.
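As an illustrative sketch, assuming each project object exposes the id and name attributes used elsewhere in this guide, you could look up a project by name in the list returned by hub.projects.list(). The helper below is hypothetical (not part of the library), and the stand-in objects merely mimic that shape:

```python
from types import SimpleNamespace

def find_project_by_name(projects, name):
    """Return the first project whose name matches, or None.

    `projects` can be the list returned by hub.projects.list();
    stand-in objects are used here for illustration.
    """
    return next((p for p in projects if p.name == name), None)

# Stand-in objects mimicking the id/name attributes used in this guide.
projects = [
    SimpleNamespace(id="proj-1", name="My first project"),
    SimpleNamespace(id="proj-2", name="Another project"),
]

match = find_project_by_name(projects, "My first project")
print(match.id)  # → proj-1
```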

Import a dataset

Let's now create a dataset and add a chat test case example.

# Let's create a dataset
dataset = hub.datasets.create(
    project_id=project.id,
    name="My first dataset",
    description="This is a test dataset",
)

We can now add a chat test case example to the dataset. This will be used for the model evaluation.

# Add a chat test case example
import random

hub.chat_test_cases.create(
    dataset_id=dataset.id,
    messages=[
        dict(role="user", content="What is the capital of France?"),
        dict(role="assistant", content="Paris"),
        dict(role="user", content="What is the capital of Germany?"),
    ],
    demo_output=dict(
        role="assistant",
        content="I don't know that!",
        metadata=dict(
            response_time=random.random(),
            test_metadata="No matter which kind of metadata",
        ),
    ),
    checks=[
        dict(identifier="correctness", params={"reference": "Berlin"}),
        dict(identifier="conformity", params={"rules": ["The agent should always provide short and concise answers."]}),
    ]
)

These are the attributes you can set for a chat test case (the only required attribute is messages):

  • messages: A list of messages in the chat. Each message is a dictionary with the following keys:

    • role: The role of the message, either "user" or "assistant".
    • content: The content of the message.
  • demo_output: A demonstration of a (possibly wrong) output from the model, with optional metadata. This is for demonstration purposes only.

  • checks: A list of checks that the chat test case should pass. This is used for evaluation. Each check is a dictionary with the following keys:
    • identifier: The identifier of the check. If it's a built-in check, you will also need to provide the params dictionary. The built-in checks are:
      • correctness: The output of the model should match the reference.
      • conformity: The chat should follow a set of rules.
      • groundedness: The output of the model should be grounded in the conversation.
      • string_match: The output of the model should contain a specific string (keyword or sentence).
      • metadata: The metadata output of the model should match a list of JSON path rules.
      • semantic_similarity: The output of the model should be semantically similar to the reference.
    • params: A dictionary of parameters for built-in checks. The parameters depend on the check type:
      • For the correctness check, the parameter is reference (type: str), which is the expected output.
      • For the conformity check, the parameter is rules (type: list[str]), which is a list of rules that the conversation should follow.
      • For the groundedness check, the parameter is context (type: str), which is the context in which the model should ground its output.
      • For the string_match check, the parameter is keyword (type: str), which is the string that the model's output should contain.
      • For the metadata check, the parameter is json_path_rules (type: list[dict]), which is a list of dictionaries with the following keys:
        • json_path: The JSON path to the value that the model's output should contain.
        • expected_value: The expected value at the JSON path.
        • expected_value_type: The expected type of the value at the JSON path, one of string, number, boolean.
      • For the semantic_similarity check, the parameters are reference (type: str) and threshold (type: float), where reference is the expected output and threshold is the similarity score below which the check will fail.
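To illustrate, each built-in check above can be written as a plain dictionary. The identifiers and parameter names come from the list above; the concrete values (reference strings, context, threshold, JSON path) are made-up examples:

```python
# Illustrative check dictionaries for each built-in check type.
# Identifiers and parameter names follow the list above; values are examples.
correctness = {"identifier": "correctness", "params": {"reference": "Berlin"}}
conformity = {
    "identifier": "conformity",
    "params": {"rules": ["The agent should always provide short and concise answers."]},
}
groundedness = {
    "identifier": "groundedness",
    "params": {"context": "Berlin has been the capital of Germany since 1990."},
}
string_match = {"identifier": "string_match", "params": {"keyword": "Berlin"}}
metadata_check = {
    "identifier": "metadata",
    "params": {
        "json_path_rules": [
            {
                "json_path": "$.response_time",
                "expected_value": 0.5,
                "expected_value_type": "number",
            }
        ]
    },
}
semantic_similarity = {
    "identifier": "semantic_similarity",
    "params": {"reference": "The capital of Germany is Berlin.", "threshold": 0.8},
}

checks = [correctness, conformity, groundedness, string_match, metadata_check, semantic_similarity]
```

Any of these dictionaries can be passed in the checks list of hub.chat_test_cases.create, as in the example above.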

You can add as many chat test cases as you want to the dataset.
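For instance, a batch of simple Q&A test cases can be built from plain data and then passed to hub.chat_test_cases.create in a loop. The create call itself needs a live Hub and a dataset_id, so it is commented out in this sketch:

```python
# Example Q&A pairs; the values are made up for this sketch.
qa_pairs = [
    ("What is the capital of Germany?", "Berlin"),
    ("What is the capital of Italy?", "Rome"),
    ("What is the capital of Spain?", "Madrid"),
]

test_cases = []
for question, expected in qa_pairs:
    test_cases.append(
        {
            "messages": [{"role": "user", "content": question}],
            "checks": [{"identifier": "correctness", "params": {"reference": expected}}],
        }
    )

# With a live Hub connection you would then create them:
# for tc in test_cases:
#     hub.chat_test_cases.create(dataset_id=dataset.id, **tc)

print(len(test_cases))  # → 3
```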

Again, you'll find your newly created dataset in the Hub UI.

Configure a model/agent

Before running our first evaluation, we'll need to set up a model. You'll need an API endpoint ready to serve the model. Then, you can configure the model API in the Hub:

model = hub.models.create(
    project_id=project.id,
    name="My Bot",
    description="A chatbot for demo purposes",
    url="https://my-model-endpoint.example.com/bot_v1",
    supported_languages=["en", "fr"],
    # if your model endpoint needs special headers:
    headers={"X-API-Key": "MY_TOKEN"},
)

We can test that everything is working well by running a chat with the model:

response = model.chat(
    messages=[
        dict(role="user", content="What is the capital of France?"),
        dict(role="assistant", content="Paris"),
        dict(role="user", content="What is the capital of Germany?"),
    ],
)

print(response)

If everything is working, this will return something like:

ModelOutput(
    message=ChatMessage(
        role='assistant',
        content='The capital of Germany is Berlin.'
    ),
    metadata={}
)

Run a remote evaluation

We can now launch a remote evaluation of our model!

eval_run = hub.evaluate(
    model=model,
    dataset=dataset,
    name="test-run",  # optional
)

The evaluation will run asynchronously on the Hub. To retrieve the results once the run is complete, you can use the following:

# This will block until the evaluation status is "finished"
eval_run.wait_for_completion()

# Print the evaluation metrics
eval_run.print_metrics()

Tip

You can directly pass IDs to the evaluate function, e.g. model=model_id and dataset=dataset_id, without having to retrieve the objects first.
