Project description

openQA

openQA is an open-source framework designed to automate the testing and evaluation of AI models. It provides a comprehensive architecture for defining audit configurations, running tests, analyzing results, and generating reports.

Traditionally, testing AI models involves complex, subjective evaluations. AI Auditor takes a different approach: it uses a second Large Language Model (LLM) for distance-based scoring. The framework lets you:

  • Automate AI testing: Define clear audit configurations with pre-defined and programmatic inputs/outputs for consistent evaluation.
  • Achieve objective scoring: The second LLM objectively assesses the discrepancy between the AI model's output and the desired outcome, eliminating human bias.
  • Gain deeper insights: Generate detailed reports highlighting identified issues, distance scores, and areas for improvement.

AI Auditor goes beyond basic testing, providing a comprehensive and reliable solution for building trust in your AI models.

Key Features

  • Configurable Audits: Define audits with pre-defined and programmatic inputs/outputs for flexible testing scenarios.
  • Modular Architecture: Leverage separate components for configuration management, test execution, evaluation, and reporting.
  • Distance-Based Scoring: Employ a second LLM to calculate the distance between desired and actual outputs for objective scoring.
  • Detailed Reports: Generate comprehensive reports summarizing audit results, identified discrepancies, and scores.
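
As a rough illustration of distance-based scoring, the sketch below builds a prompt for a second (judge) LLM and parses a numeric distance from its reply. Every name here (`build_judge_prompt`, `parse_distance`, `score_distance`, `fake_judge`), the prompt wording, and the 0-to-1 scale are assumptions made for illustration and are not taken from openQA's actual API. The judge is passed in as a plain callable so that any LLM client could be plugged in.

```python
import re
from typing import Callable

# Hypothetical prompt template; the wording and the 0-1 scale are assumptions.
JUDGE_TEMPLATE = (
    "Compare the desired and actual outputs.\n"
    "Desired: {desired}\n"
    "Actual: {actual}\n"
    "Reply with one number between 0 (identical meaning) and 1 (unrelated)."
)


def build_judge_prompt(desired: str, actual: str) -> str:
    """Fill the template with the two outputs to be compared."""
    return JUDGE_TEMPLATE.format(desired=desired, actual=actual)


def parse_distance(reply: str) -> float:
    """Extract the first number from the judge's reply and clamp it to [0, 1]."""
    match = re.search(r"\d+(?:\.\d+)?", reply)
    if match is None:
        raise ValueError(f"no numeric score in judge reply: {reply!r}")
    return min(1.0, max(0.0, float(match.group())))


def score_distance(desired: str, actual: str, judge: Callable[[str], str]) -> float:
    """Ask the judge (any callable wrapping a second LLM) for a distance score."""
    return parse_distance(judge(build_judge_prompt(desired, actual)))


def fake_judge(prompt: str) -> str:
    """Stand-in for a real LLM call, used only so the sketch runs offline."""
    return "Distance: 0.25"


if __name__ == "__main__":
    print(score_distance("Paris", "The capital is Paris.", fake_judge))
```

Keeping the prompt construction and the reply parsing separate from the LLM call makes both steps easy to test without network access, and a malformed judge reply fails loudly instead of producing a silent score.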

Architecture:

The core architecture of AI Auditor consists of several interacting components:

  • Audit Config: This component defines the configuration for a specific audit. It includes:
    • Pre-defined inputs: Specific data or prompts to be fed to the AI model under test.
    • Programmatic input generation: Python or similar code to dynamically generate inputs based on specific criteria.
    • Desired outputs: Expected outputs from the AI model for the provided inputs. These can be pre-defined text, data structures, or scoring criteria.
  • Configuration Management: This component manages and stores audit configurations, allowing for easy creation, modification, and version control.
  • Runner: This component executes the AI model under test according to the specified configuration. It provides the defined inputs to the model and captures the generated outputs.
  • LLM Evaluator: This component utilizes a second Large Language Model (LLM) to compare the AI model's outputs with the desired outputs from the configuration. It calculates a distance score based on the closeness of the outputs, indicating potential discrepancies.
  • Reporting: This component generates comprehensive reports summarizing the audit results. It includes:
    • Tested AI model and configuration details.
    • Pre-defined and programmatic inputs used.
    • Desired outputs from the configuration.
    • Actual outputs generated by the AI model under test.
    • Distance score calculated by the LLM Evaluator.
    • Identified discrepancies or areas for improvement.
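
The components above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not openQA's real interface: `AuditCase`, `AuditConfig`, `run_audit`, `render_report`, `toy_model`, and the 0.5 threshold are all hypothetical names and values. To keep the example runnable offline, `placeholder_distance` uses a character-level similarity from Python's `difflib` in place of the LLM Evaluator; a real evaluator would call a second LLM instead.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher
from typing import Any, Callable, Dict, Iterable, List, Optional


@dataclass
class AuditCase:
    """One input paired with the output the model is expected to produce."""
    prompt: str
    desired: str


@dataclass
class AuditConfig:
    """Hypothetical audit config: pre-defined cases plus an optional generator."""
    name: str
    predefined: List[AuditCase] = field(default_factory=list)
    generator: Optional[Callable[[], Iterable[AuditCase]]] = None

    def cases(self) -> List[AuditCase]:
        """Return pre-defined cases followed by programmatically generated ones."""
        generated = list(self.generator()) if self.generator else []
        return self.predefined + generated


def placeholder_distance(desired: str, actual: str) -> float:
    """Stand-in for the LLM Evaluator: 0.0 means identical, 1.0 means no overlap."""
    return 1.0 - SequenceMatcher(None, desired, actual).ratio()


def run_audit(
    config: AuditConfig,
    model: Callable[[str], str],
    evaluator: Callable[[str, str], float],
) -> List[Dict[str, Any]]:
    """Runner: feed each input to the model, capture the output, and score it."""
    rows = []
    for case in config.cases():
        actual = model(case.prompt)
        rows.append({
            "input": case.prompt,
            "desired": case.desired,
            "actual": actual,
            "distance": evaluator(case.desired, actual),
        })
    return rows


def render_report(config: AuditConfig, rows: List[Dict[str, Any]],
                  threshold: float = 0.5) -> str:
    """Reporting: one line per case, flagging distances above the threshold."""
    lines = [f"Audit: {config.name}"]
    for row in rows:
        flag = "DISCREPANCY" if row["distance"] > threshold else "ok"
        lines.append(
            f"{row['input']!r}: desired={row['desired']!r} "
            f"actual={row['actual']!r} distance={row['distance']:.2f} [{flag}]"
        )
    return "\n".join(lines)


def toy_model(prompt: str) -> str:
    """Stand-in model under test: adds two integers, with one deliberate bug."""
    left, right = (int(part) for part in prompt.split("+"))
    return "3" if prompt == "1+1" else str(left + right)


if __name__ == "__main__":
    config = AuditConfig(
        name="addition",
        predefined=[AuditCase("2+2", "4")],
        generator=lambda: (AuditCase(f"{n}+1", str(n + 1)) for n in range(2)),
    )
    print(render_report(config, run_audit(config, toy_model, placeholder_distance)))
```

Because the model and the evaluator are both passed in as callables, the same runner can be reused with a different model under test or a different scoring method without changing the configuration or the report code.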

[Figure: AI Auditor component diagram]

Benefits:

  • Standardizes and simplifies AI model testing procedures.
  • Enables objective and consistent evaluation through distance-based scoring.
  • Improves transparency and interpretability of AI model behavior.
  • Generates detailed reports for informed decision-making.

Download files

Source Distribution

openqa-0.1.0.tar.gz (34.7 kB)

Uploaded Source

Built Distribution

openqa-0.1.0-py3-none-any.whl (14.7 kB)

Uploaded Python 3

File details

Details for the file openqa-0.1.0.tar.gz.

File metadata

  • Download URL: openqa-0.1.0.tar.gz
  • Upload date:
  • Size: 34.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.12.3

File hashes

Hashes for openqa-0.1.0.tar.gz
Algorithm Hash digest
SHA256 becbbdba858b51b010514ffd13fe58e9d7af7ada5f168d6c5d7adefb9dc7d056
MD5 bbcf7c5e2d87dae870e57cc607c6ddf0
BLAKE2b-256 91937dc73ea44f41b037289e8f5e8839819e92f3bb1e2cad71c2dc3fbe70e797

File details

Details for the file openqa-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: openqa-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 14.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.12.3

File hashes

Hashes for openqa-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 2a359cd4de18ca20eab623ff443c20488d3225f4ca2d3c01eea38e0b6c610588
MD5 7cbfe0adf47f9bf9a980f4e9b2f5a2d0
BLAKE2b-256 e4fd492c461fa6378890aa395528dd5dbd914512279992ca0d43c2dbee6a4e01
