We help GenAI teams maintain high accuracy for their models in production.

These details have not been verified by PyPI

Project description

Future AGI

Welcome to Future AGI - Empowering GenAI Teams with Advanced Performance Management

Overview

Future AGI provides a cutting-edge platform designed to help GenAI teams maintain peak model accuracy in production environments. Our solution is purpose-built, scalable, and delivers results 10x faster than traditional methods.

Key Features

Simplified GenAI Performance Management: Streamline your workflow and focus on developing cutting-edge AI models.
Instant Evaluation: Score outputs without human-in-the-loop or ground truth, increasing QA team efficiency by up to 10x.
Advanced Error Analytics: Gain ready-to-use insights with comprehensive error tagging and segmentation.
Configurable Metrics: Define custom metrics tailored to your specific use case for precise model evaluation.

Quickstart

This guide will walk you through setting up an evaluation in Future AGI, allowing you to assess AI models and workflows efficiently. You can run evaluations via the Future AGI platform or using the Python SDK.

Access API Key

To authenticate while running evals, you need Future AGI's API keys. To get them, follow the steps below:

Go to your Future AGI dashboard.
Click Keys under the Developer option in the left column.
Copy both the API Key and the Secret Key.

Setup Evaluator

Install the Future AGI Python SDK with the following command:

pip install ai-evaluation

Then initialise the Evaluator:

from fi.evals import Evaluator

evaluator = Evaluator(
    fi_api_key="your_api_key",
    fi_secret_key="your_secret_key",
)

We recommend setting the FI_API_KEY and FI_SECRET_KEY environment variables before using the Evaluator class, instead of passing the keys as parameters.
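The recommendation above can be backed by a small pre-flight check. This is a minimal sketch, not part of the SDK: `load_fi_credentials` and `REQUIRED_VARS` are hypothetical names introduced here, and only the two variable names come from the text above.

```python
import os

# Variable names taken from the recommendation above.
REQUIRED_VARS = ("FI_API_KEY", "FI_SECRET_KEY")


def load_fi_credentials(env=None):
    """Return (api_key, secret_key) read from the environment.

    Hypothetical helper: raises RuntimeError naming every variable
    that is unset or empty, so a missing key is reported early.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["FI_API_KEY"], env["FI_SECRET_KEY"]
```

Calling `load_fi_credentials()` before constructing the Evaluator reports a missing key with a clear message instead of a failure later in the run.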

Running Your First Eval

This section walks you through the process of running your first evaluation using the Future AGI evaluation framework. To get started, we'll use Tone Evaluation as an example.

a. Using Python SDK

Define the Test Case

Create a test case containing the text input that will be evaluated for tone.

from fi.testcases import TestCase

test_case = TestCase(
    input='''
    Dear Sir, I hope this email finds you well. 
    I look forward to any insights or advice you might have 
    whenever you have a free moment.
    '''
)

You can also pass the data directly as a dictionary with valid keys. However, we recommend using the TestCase class when working with Future AGI evaluations.
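A minimal sketch of the dictionary form, assuming the key is named "input" to mirror the TestCase field above. The full set of valid keys is not listed here, so treat that name as an assumption. The sketch also removes the indentation that a triple-quoted literal carries.

```python
import textwrap

raw_email = '''
    Dear Sir, I hope this email finds you well.
    I look forward to any insights or advice you might have
    whenever you have a free moment.
    '''

# Strip the common indentation and the surrounding blank lines so the
# dictionary holds only the message text.
# The "input" key is an assumption that mirrors TestCase(input=...).
test_input = {"input": textwrap.dedent(raw_email).strip()}
```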

Configure the Evaluation Template

For Tone Evaluation, we use the Tone Evaluation Template to analyse the sentiment and emotional tone of the input.

from fi.evals.templates import Tone

tone_eval = Tone() # This is the evaluation template to use provided by Future AGI

The Future AGI documentation describes all the evals provided by Future AGI.

Run the Evaluation

Execute the evaluation and retrieve the results.

result = evaluator.evaluate(eval_templates=[tone_eval], inputs=[test_case])
tone_result = result.eval_results[0].metrics[0].value
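When several templates or test cases are evaluated together, indexing `[0]` returns only the first value. The sketch below flattens every metric value, assuming the result exposes the `eval_results`, `metrics`, and `value` attributes used above. `collect_metric_values` is a hypothetical helper, and the `SimpleNamespace` object is a stand-in that only imitates that shape; it is not a real SDK result.

```python
from types import SimpleNamespace


def collect_metric_values(result):
    """Flatten result.eval_results[*].metrics[*].value into a list."""
    values = []
    for eval_result in result.eval_results:
        for metric in eval_result.metrics:
            values.append(metric.value)
    return values


# Stand-in object with the same attribute shape as the result above.
# The values are placeholders, not real evaluation output.
fake_result = SimpleNamespace(
    eval_results=[
        SimpleNamespace(metrics=[SimpleNamespace(value="placeholder-a")]),
        SimpleNamespace(metrics=[SimpleNamespace(value="placeholder-b")]),
    ]
)

all_values = collect_metric_values(fake_result)
```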

To evaluate data with an evaluation template that you created yourself, call the evaluate function and pass the template's name in the eval_templates parameter.

from fi.evals import evaluate

result = evaluate(eval_templates="name-of-your-eval", inputs={
    "input": "your_input_text",
    "output": "your_output_text"
})

print(result.eval_results[0].metrics[0].value)

b. Using Web Interface

Select a Dataset

Before running an evaluation, ensure you have selected a dataset. If no dataset is available, follow the steps to Add Dataset on the Future AGI platform.

Access the Evaluation Panel

Navigate to your dataset.
Click on the Evaluate button in the top-right menu.
This will open the evaluation configuration panel.

Starting a New Evaluation

Click on the Add Evaluation button.
You will be directed to the Evaluation List page. You can either create your own evaluation or select from the available templates built by Future AGI.
Click on one of the available templates.
Write the name of the evaluation and select the required dataset column.

Check Error Localization if you want to localize errors in the dataset when a datapoint fails the evaluation.
Click on the Add & Run button.

Creating a New Evaluation

Future AGI provides a wide range of evaluation templates to choose from. You can also create your own evaluation, tailored to your needs, by following the steps below:

Click on the Create your own eval button after clicking on the Add Evaluation button.
Enter a name for the evaluation. This name identifies the evaluation in the evaluation list. Only lowercase letters, numbers, and underscores are allowed.
Select either Use Future AGI Agents or Use other LLMs.

Future AGI Agents are our own proprietary models trained on a vast variety of datasets to perform evaluations. These models vary in capabilities and are suited for different use cases:

TURING_LARGE – Flagship evaluation model that delivers best-in-class accuracy across multimodal inputs (text, images, audio). Recommended when maximal precision outweighs latency constraints.
TURING_SMALL – Compact variant that preserves high evaluation fidelity while lowering computational cost. Supports text and image evaluations.
TURING_FLASH – Latency-optimised version of TURING, providing high-accuracy assessments for text and image inputs with fast response times.
PROTECT – Real-time guardrailing model for safety, policy compliance, and content-risk detection. Offers very low latency on text and audio streams and permits user-defined rule sets.
PROTECT_FLASH – Ultra-fast binary guardrail for text content. Designed for first-pass filtering where millisecond-level turnaround is critical.
In the Rule Prompt, write the rules that the evaluation should follow. Use {{}} to create a key (variable); that variable is used later, when you configure the evaluation.
Choose the Output Type: Pass/Fail, Percentage, or Deterministic Choices.
- Pass/Fail: The evaluation will return either Pass or Fail.
- Percentage: The evaluation will return a Score between 0 and 100.
- Deterministic Choices: The evaluation will return a categorical choice from the list of choices.
Select the Tags that suit your use case.
Write the description of the evaluation that will be used to identify the evaluation in the evaluation list.
Select the Check Internet option to power your evaluation with the latest information.
Click on the Create Evaluation button.
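Two of the steps above can be checked locally before you fill in the form: the naming rule and the {{}} variables in a Rule Prompt. This is a minimal sketch. Both function names are hypothetical, and the platform's exact parsing of {{}} is not documented here, so the tolerance for spaces inside the braces is an assumption.

```python
import re

# Naming rule from the steps above: lowercase letters, digits, underscores.
NAME_PATTERN = re.compile(r"[a-z0-9_]+")

# Assumption: a variable is a word wrapped in double braces, and spaces
# inside the braces are tolerated.
VARIABLE_PATTERN = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def is_valid_eval_name(name):
    """Return True if the whole name satisfies the naming rule."""
    return NAME_PATTERN.fullmatch(name) is not None


def rule_prompt_variables(rule_prompt):
    """Return distinct {{variable}} names in order of first appearance."""
    seen = []
    for name in VARIABLE_PATTERN.findall(rule_prompt):
        if name not in seen:
            seen.append(name)
    return seen
```

For example, `is_valid_eval_name("Tone-Check")` is rejected because of the uppercase letters and the hyphen, while `tone_check_v2` passes.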

Project details

Release history

1.0.2: Apr 2, 2026
1.0.1: Mar 9, 2026
1.0.0: Feb 27, 2026
0.2.2: Nov 3, 2025
0.2.1: Oct 9, 2025
0.2.0: Oct 2, 2025
0.1.9 (this version): Oct 1, 2025
0.1.8: Sep 9, 2025
0.1.7: Aug 20, 2025
0.1.6: Aug 19, 2025
0.1.5: Jul 24, 2025
0.1.4: Jul 24, 2025
0.1.3: Jul 17, 2025
0.1.2: Jul 14, 2025
0.1.1: Jun 24, 2025
0.1.0: May 30, 2025

Download files

Source Distribution

ai_evaluation-0.1.9.tar.gz (32.4 kB)

Uploaded Oct 1, 2025 Source

Built Distribution

ai_evaluation-0.1.9-py3-none-any.whl (38.0 kB)

Uploaded Oct 1, 2025 Python 3

File details

Details for the file ai_evaluation-0.1.9.tar.gz.

File metadata

Download URL: ai_evaluation-0.1.9.tar.gz
Upload date: Oct 1, 2025
Size: 32.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.0.1 CPython/3.13.2 Darwin/24.6.0

File hashes

Hashes for ai_evaluation-0.1.9.tar.gz
Algorithm	Hash digest
SHA256	`2f9ee44306cbd24b042794b591ab5ee07209be508dee459c7b42a3b5a36acc55`
MD5	`a89cbc0b1a3f3847bae5ddaa1c15019d`
BLAKE2b-256	`eb9648b1e593c3786a34a5824f0f0c7308216c29c13305bfcb4501c9fea532b2`

File details

Details for the file ai_evaluation-0.1.9-py3-none-any.whl.

File metadata

Download URL: ai_evaluation-0.1.9-py3-none-any.whl
Upload date: Oct 1, 2025
Size: 38.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.0.1 CPython/3.13.2 Darwin/24.6.0

File hashes

Hashes for ai_evaluation-0.1.9-py3-none-any.whl
Algorithm	Hash digest
SHA256	`77ba1aba4b1ec1bde0e75e3e8269ac22e0728e89da58f6f4f00bdc7ff8debbba`
MD5	`de7913654441d06b2da3b6aeff2a3222`
BLAKE2b-256	`9ac17a3372b27cb344b5f1c40e03376ab05f10934027f97c693f0fc5c3d21b9b`

ai-evaluation 0.1.9
