A modular framework for evaluating and verifying agentic LLM outputs.

These details have not been verified by PyPI

License
- OSI Approved :: Apache Software License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

Hyperplane Eval

Hyperplane Eval is a Python testing framework that helps you find when and where your AI agents break. Instead of writing test cases by hand, you give Hyperplane a target function and a set of rules, and it systematically generates edge cases to map out your agent's "Safe Polytope": the region of input space in which your agent behaves reliably.

🚀 How It Works: Breadth-First Evaluation

Testing an AI agent is hard because the space of possible inputs is effectively unbounded. Hyperplane addresses this by decomposing inputs into "dimensions" of complexity (e.g., Urgency, Ambiguity, Formatting).

Instead of randomly guessing inputs, Hyperplane uses a breadth-first evaluation approach:

- Dimension Extraction: It automatically extracts relevant dimensions based on the rules you want to test.
- Grid Generation: It generates test scenarios spread evenly across these dimensions, using Sobol sequences (a low-discrepancy sampling scheme) rather than purely random points.
- Input Synthesis: It uses a strong LLM to generate realistic user inputs that match those specific dimension coordinates.
- Evaluation: It executes your local agent code with the generated inputs and evaluates the output against your rules using a Chain-of-Thought (CoT) judge.
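
The grid generation step can be illustrated with SciPy's Sobol sampler. This is a minimal sketch, not Hyperplane's own code: the dimension names are borrowed from the examples above, `sample_scenarios` is a hypothetical helper, and how the package actually configures its sampler is an assumption.

```python
from scipy.stats import qmc

# Hypothetical dimension names, taken from the examples in the text above.
DIMENSIONS = ["urgency", "ambiguity", "formatting"]


def sample_scenarios(m: int = 4) -> list[dict[str, float]]:
    """Return 2**m scenario coordinates, each value in [0, 1)."""
    # scramble=False keeps the sequence deterministic without needing a seed.
    sampler = qmc.Sobol(d=len(DIMENSIONS), scramble=False)
    # random_base2 draws exactly 2**m points, which preserves the balance
    # properties of the Sobol sequence.
    points = sampler.random_base2(m=m)  # shape: (2**m, len(DIMENSIONS))
    return [dict(zip(DIMENSIONS, map(float, row))) for row in points]


scenarios = sample_scenarios()
print(len(scenarios), scenarios[1])
```

With 16 points, each dimension receives exactly one sample in each interval of width 1/16, which is the kind of even coverage that plain random sampling does not guarantee.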

By doing this breadth-first scan across multiple dimensions simultaneously, Hyperplane creates a mathematical map of your agent's reliability and calculates its "Reliability Coverage" as a clear, comparable percentage.
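
One simple way to read "Reliability Coverage" is as the share of sampled scenarios that pass the judge. The sketch below assumes that definition; the package may compute it differently (for example, from a geometric volume), and `ScenarioResult`, `reliability_coverage`, and `failures_by_half` are illustrative names only.

```python
from dataclasses import dataclass


@dataclass
class ScenarioResult:
    coords: dict[str, float]  # position of the scenario in each dimension, in [0, 1)
    passed: bool              # verdict from the judge


def reliability_coverage(results: list[ScenarioResult]) -> float:
    """Percentage of scenarios that passed (assumed definition)."""
    if not results:
        raise ValueError("no results to score")
    return 100.0 * sum(r.passed for r in results) / len(results)


def failures_by_half(results: list[ScenarioResult], dimension: str) -> dict[str, int]:
    """Count failures in the lower and upper half of one dimension."""
    counts = {"low": 0, "high": 0}
    for r in results:
        if not r.passed:
            counts["low" if r.coords[dimension] < 0.5 else "high"] += 1
    return counts


results = [
    ScenarioResult({"urgency": 0.1, "ambiguity": 0.2}, True),
    ScenarioResult({"urgency": 0.4, "ambiguity": 0.9}, False),
    ScenarioResult({"urgency": 0.6, "ambiguity": 0.3}, True),
    ScenarioResult({"urgency": 0.9, "ambiguity": 0.8}, False),
]
print(f"{reliability_coverage(results):.1f}%")
print(failures_by_half(results, "ambiguity"))
```

In this toy data both failures sit in the upper half of the ambiguity axis, which is the kind of per-dimension signal a report can surface.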

🚦 CLI Integration

Hyperplane runs through an interactive CLI, so you don't need to write evaluation scripts or boilerplate code.

Setup & Installation

Install the framework via pip:

pip install hyperplane-eval

Running the CLI

Run the interactive CLI directly in your terminal from inside your project directory:

hyperplane

The wizard will immediately guide you through the evaluation setup:

- Target Selection: It will automatically scan your local Python files and let you pick the function that acts as your agent's entry point.
- Rule Definition: You define the rules your agent must follow in plain English (e.g., "Never offer a refund over $50").
- Configuration: You configure the depth (how many points to test) and breadth (how many dimensions to extract).
- Execution: The framework will spin up workers, generate the test space, execute your local code, and render a real-time terminal dashboard.

Once complete, Hyperplane generates an interactive HTML report showing exactly which dimensions cause your agent to fail, allowing you to easily identify blind spots in your system prompts.
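
To make the target selection and rule definition steps concrete, here is a toy entry point the wizard could plausibly pick up. The signature Hyperplane expects is not documented on this page, so the string-in, string-out shape, the module layout, and the names `handle_request` and `REFUND_CAP` are all assumptions for illustration.

```python
import re

# Hypothetical cap matching the example rule "Never offer a refund over $50".
REFUND_CAP = 50.0


def handle_request(user_message: str) -> str:
    """Toy agent entry point: takes a user message, returns a reply."""
    match = re.search(r"\$(\d+(?:\.\d+)?)", user_message)
    if match is None:
        return "Could you tell me the amount you would like refunded?"
    amount = float(match.group(1))
    if amount > REFUND_CAP:
        # Over the cap: decline rather than offer the refund.
        return f"I can't refund ${amount:.2f}; I have escalated this to a human agent."
    return f"Your refund of ${amount:.2f} has been approved."


print(handle_request("Please refund my $80 order"))
print(handle_request("Please refund my $20 order"))
```

A real agent would call an LLM instead of a regular expression; the point is only that the target is an ordinary local function and the rule is a plain-English sentence.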

🛠 Technology Stack

- Language: Python 3.10+
- Data Modeling: pydantic
- Math/Geometry: numpy, scipy (Sobol sequences, ConvexHull analysis)
- LLM Integration: litellm for universal API connectivity (OpenAI, Gemini, Anthropic, or a local vLLM server)
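
The ConvexHull analysis mentioned above could relate to the "Safe Polytope" roughly as follows. This is a sketch under assumptions: it treats the convex hull of the passing scenario coordinates as the safe region and reports its volume; how Hyperplane actually uses `ConvexHull` is not stated here, and `safe_hull_volume` is a hypothetical helper.

```python
import numpy as np
from scipy.spatial import ConvexHull


def safe_hull_volume(passing_points: np.ndarray) -> float:
    """Volume (area in 2D) of the convex hull around passing scenarios."""
    hull = ConvexHull(passing_points)
    return float(hull.volume)


# Toy 2D example: the agent passes only where the first dimension is <= 0.5.
passing = np.array([
    [0.0, 0.0],
    [0.5, 0.0],
    [0.0, 1.0],
    [0.5, 1.0],
    [0.25, 0.5],  # interior point, does not change the hull
])
print(safe_hull_volume(passing))
```

Because the unit square has area 1, the hull area here can be read directly as a fraction of the sampled space. A convex hull will overstate the safe region if failures lie inside it, so it is best treated as an upper bound.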

📄 License

This project is licensed under the Apache License, Version 2.0. See the LICENSE file for more information.

Release history

- 0.1.14: Jun 17, 2026
- 0.1.13: Jun 17, 2026
- 0.1.12: Jun 17, 2026
- 0.1.11: Jun 16, 2026
- 0.1.10: Jun 16, 2026
- 0.1.9: Jun 16, 2026
- 0.1.8: Jun 16, 2026
- 0.1.7: Jun 16, 2026
- 0.1.6: Jun 16, 2026
- 0.1.5: Jun 15, 2026 (this version)
- 0.1.4: Jun 15, 2026
- 0.1.3: Jun 15, 2026
- 0.1.2: Jun 15, 2026

Download files

Source Distribution

hyperplane_eval-0.1.5.tar.gz (67.1 kB)

Uploaded Jun 15, 2026 Source

Built Distribution

hyperplane_eval-0.1.5-py3-none-any.whl (80.9 kB)

Uploaded Jun 15, 2026 Python 3

File details

Details for the file hyperplane_eval-0.1.5.tar.gz.

File metadata

Download URL: hyperplane_eval-0.1.5.tar.gz
Upload date: Jun 15, 2026
Size: 67.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for hyperplane_eval-0.1.5.tar.gz
Algorithm	Hash digest
SHA256	`2f6d5c9348994d4b3b738eb84056434467574e2ada72457ca4a93c40cbf13648`
MD5	`682f0ce4a507629335be07699214cb70`
BLAKE2b-256	`6d8d0c49027098d405a521e99a0d5bf807b33ea2a904a17deb97bb66e6a259d6`
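
A downloaded archive can be checked against the SHA256 digest listed above with Python's standard `hashlib`. This is a minimal sketch; the local file path is an assumption, and `sha256_of` and `matches` are illustrative helper names.

```python
import hashlib
from pathlib import Path

# SHA256 digest for hyperplane_eval-0.1.5.tar.gz, copied from the table above.
EXPECTED_SHA256 = "2f6d5c9348994d4b3b738eb84056434467574e2ada72457ca4a93c40cbf13648"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with the expected one, ignoring letter case."""
    return sha256_of(path) == expected_hex.lower()


# Assumed download location; the check is skipped if the file is not present.
archive = Path("hyperplane_eval-0.1.5.tar.gz")
if archive.exists():
    print("digest OK" if matches(archive, EXPECTED_SHA256) else "digest MISMATCH")
```

pip can enforce the same check at install time when a requirements file pins hashes and is installed with `--require-hashes`.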

File details

Details for the file hyperplane_eval-0.1.5-py3-none-any.whl.

File metadata

Download URL: hyperplane_eval-0.1.5-py3-none-any.whl
Upload date: Jun 15, 2026
Size: 80.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for hyperplane_eval-0.1.5-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e87a691758ad428cf267f47491acbd23cae1b7e803602c1cc03c8764e0a68e47`
MD5	`3715d2cd4ca1fd793a5536397a52cc59`
BLAKE2b-256	`9b9af7a9cdef318b38fc2bd59608c1ff906e60dceddf0d8360049c1ceebed7a7`
