ContextCheck

A human-friendly framework for testing and evaluating LLMs, RAGs, and chatbots.

ContextCheck is an open-source framework designed to evaluate, test, and validate large language models (LLMs), Retrieval-Augmented Generation (RAG) systems, and chatbots. It provides tools to automatically generate queries, request completions, detect regressions, perform penetration tests, and assess hallucinations, ensuring the robustness and reliability of these systems. ContextCheck is configurable via YAML and can be integrated into continuous integration (CI) pipelines for automated testing.

Features

  • Simple test scenario definition using human-readable .yaml files
  • Flexible endpoint configuration for OpenAI, HTTP, and more
  • Customizable JSON request/response models
  • Support for variables and Jinja2 templating in YAML files
  • Response validation options, including heuristics, LLM-based judgment, and human labeling
  • Enhanced output formatting with the rich package for clear, readable displays
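
To make the "heuristics" validation option a little more concrete, here is a minimal sketch of what a heuristic response check can look like in plain Python. It is an illustration only and does not use or mirror ContextCheck's own API: the names `CheckResult`, `contains_all`, and `max_length` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    """Outcome of one heuristic check (hypothetical type, not part of ccheck)."""
    name: str
    passed: bool


def contains_all(response: str, required: list[str]) -> CheckResult:
    """Pass when every required phrase occurs in the response, ignoring case."""
    text = response.lower()
    missing = [phrase for phrase in required if phrase.lower() not in text]
    return CheckResult(name="contains_all", passed=not missing)


def max_length(response: str, limit: int) -> CheckResult:
    """Pass when the response is at most `limit` characters long."""
    return CheckResult(name="max_length", passed=len(response) <= limit)
```

Checks like these are cheap and deterministic, which is why they are often combined with LLM-based judgment or human labeling for answers that cannot be verified by string matching alone.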

Installation

For Users

Install the package directly from PyPI using pip:

pip install ccheck

After installation, you can access the ccheck CLI command:

ccheck --help

This will display all available options and help you get started with using ContextCheck.

For Developers

If you wish to contribute to the project or modify it for your own use, you can set up a development environment using Poetry.

  1. Fork your own copy of Addepto/contextcheck on GitHub.
  2. Clone the repository:
    git clone https://github.com/<your_username>/contextcheck.git
    cd contextcheck
  3. Ensure you have Poetry installed.
  4. Install the dependencies:
    poetry install
  5. Activate the virtual environment:
    poetry shell
  6. Run the ccheck CLI command using:
    poetry run ccheck --help

Tutorial

Please refer to the examples/ folder for the tutorial.

CLI Features

Output Test Results to Console

  • Run a single scenario and output results to the console:
    ccheck --output-type console --filename path/to/file.yaml
    
  • Run multiple scenarios and output results to the console:
    ccheck --output-type console --filename path/to/file.yaml path/to/another_file.yaml
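
When the list of scenario files is long or changes often, the multi-file command can be assembled programmatically. The sketch below builds the argument list from the files matching a pattern in a folder. It relies only on the `--output-type` and `--filename` options shown above; the helper name `build_ccheck_command` and its parameters are hypothetical.

```python
from pathlib import Path


def build_ccheck_command(folder: str, pattern: str = "*.yaml") -> list[str]:
    """Return a ccheck invocation covering the scenario files matching `pattern`."""
    scenarios = sorted(str(path) for path in Path(folder).glob(pattern))
    if not scenarios:
        raise ValueError(f"no scenario files matching {pattern!r} in {folder!r}")
    # Every matching file is passed after a single --filename option,
    # mirroring the multi-scenario command shown above.
    return ["ccheck", "--output-type", "console", "--filename", *scenarios]
```

The returned list can be handed to `subprocess.run` as-is, which avoids shell quoting problems with paths that contain spaces.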
    

Running in CI/CD

To automatically stop the CI/CD process if any test fails, add the --exit-on-failure flag. A failed test causes the script to exit with code 1:

ccheck --exit-on-failure --output-type console --folder my_tests
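
If a pipeline step is written in Python rather than shell, the exit code can be propagated explicitly. The following is a minimal sketch, assuming `ccheck` is on the PATH when the step runs; `run_gate` is a hypothetical helper name, and it works for any command.

```python
import subprocess


def run_gate(command: list[str]) -> int:
    """Run `command` and return its exit code without raising on failure."""
    # check=False: a non-zero exit code is reported to the caller
    # instead of raising CalledProcessError.
    completed = subprocess.run(command, check=False)
    return completed.returncode
```

A CI step could then end with `sys.exit(run_gate(["ccheck", "--exit-on-failure", "--output-type", "console", "--folder", "my_tests"]))`, so that a failing scenario fails the step as well.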

Set the OPENAI_API_KEY environment variable to run:

  • tests/scenario_openai.yaml
  • tests/scenario_defaults.yaml
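
A missing key usually surfaces only when the first API call fails, so it can help to check for it up front. This is a small sketch of such a pre-flight check; `missing_env` is a hypothetical helper and not part of ccheck.

```python
import os
from typing import Mapping, Optional


def missing_env(names: list[str], env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the variable names that are unset or empty in `env`.

    `env` defaults to the current process environment (os.environ).
    """
    source = os.environ if env is None else env
    return [name for name in names if not source.get(name)]
```

For example, `missing_env(["OPENAI_API_KEY"])` returns an empty list when the key is set, and a script can stop with a clear message otherwise.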

Contributing

Contributions are welcome!

Running Tests

To run tests:

poetry run pytest tests/

To include tests which require calling LLM APIs (currently OpenAI and Ollama), run one of:

poetry run pytest --openai          # includes tests that use OpenAI API
poetry run pytest --ollama          # includes tests that use Ollama API
poetry run pytest --openai --ollama # includes tests that use both OpenAI and Ollama API
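
For contributors unfamiliar with opt-in flags like these, the sketch below shows the common pytest pattern for them: a `conftest.py` registers the options and skips marked tests unless the matching flag is passed. This is an assumption about the general technique, not a copy of this project's conftest; the marker names `openai` and `ollama` and the helper `needs_skip` are hypothetical.

```python
# conftest.py (sketch): opt-in flags for tests that call external LLM APIs.
OPT_IN_FLAGS = ("openai", "ollama")


def needs_skip(keywords, enabled):
    """Return the first opt-in marker whose flag was not passed, else None."""
    for name in OPT_IN_FLAGS:
        if name in keywords and name not in enabled:
            return name
    return None


def pytest_addoption(parser):
    # Registers --openai and --ollama as boolean command-line options.
    for name in OPT_IN_FLAGS:
        parser.addoption(
            f"--{name}",
            action="store_true",
            default=False,
            help=f"include tests that use the {name} API",
        )


def pytest_collection_modifyitems(config, items):
    import pytest  # always available when pytest loads this file

    enabled = {name for name in OPT_IN_FLAGS if config.getoption(f"--{name}")}
    for item in items:
        blocked = needs_skip(item.keywords, enabled)
        if blocked:
            item.add_marker(pytest.mark.skip(reason=f"needs --{blocked}"))
```

With this pattern, a test decorated with `@pytest.mark.openai` is skipped by a plain `pytest` run and executed only when `--openai` is given. Custom markers should also be registered in the pytest configuration to avoid unknown-marker warnings.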

Acknowledgments

Made with ❤️ by the Addepto Team

ContextCheck is an extension of the ContextClue product, created by the Addepto team. This project is the result of our team’s dedication, combining innovation and expertise.

Addepto Team:

  • Radoslaw Bodus
  • Bartlomiej Grasza
  • Volodymyr Kepsha
  • Vadym Mariiechko
  • Michal Tarkowski

Like what we’re building? ⭐ Give it a star to support its development!

License

This project is licensed under the MIT License. See the LICENSE.md file for details.
