Evaluation of Generative AI Models
GENAI EVALUATION
GenAI Evaluation is a library that provides methods for evaluating differences between real and synthetic data.
Functions
- multivariate_ecdf: Computes the joint (multivariate) empirical cumulative distribution function (ECDF), in contrast to the univariate ECDF provided by packages such as statsmodels
- ks_statistic: Calculates the Kolmogorov-Smirnov (KS) statistic between two multivariate ECDFs
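To make the two quantities concrete, here is a minimal pure-Python sketch of a joint ECDF evaluated at a set of points, and of the KS distance as the largest absolute gap between two such ECDFs. It is not the library's implementation: the helper names joint_ecdf and ks_distance are hypothetical, and choosing the evaluation points ("nodes") as a random subset of the pooled observations is an assumption made only for illustration.

```python
import random


def joint_ecdf(sample, nodes):
    """For each node, return the fraction of rows in `sample` that are
    less than or equal to the node in every coordinate."""
    n = len(sample)
    return [
        sum(all(x <= z for x, z in zip(row, node)) for row in sample) / n
        for node in nodes
    ]


def ks_distance(ecdf_a, ecdf_b):
    """Largest absolute gap between two ECDFs evaluated at the same nodes."""
    return max(abs(a - b) for a, b in zip(ecdf_a, ecdf_b))


rng = random.Random(0)

# Two-dimensional toy data: the "synthetic" sample is shifted on purpose.
real = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(500)]
synth = [(rng.gauss(0.5, 1), rng.gauss(0.5, 1)) for _ in range(500)]

# Assumption: nodes are a random subset of the pooled observations.
nodes = rng.sample(real + synth, 200)

ks_same = ks_distance(joint_ecdf(real, nodes), joint_ecdf(real, nodes))
ks_diff = ks_distance(joint_ecdf(real, nodes), joint_ecdf(synth, nodes))
print(ks_same, ks_diff)
```

Comparing a sample with itself gives a distance of zero, while the shifted sample gives a value strictly between 0 and 1. Values closer to 0 indicate that the two joint distributions agree more closely at the evaluated nodes.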
Authors
- Dr. Vincent Granville - Research
- Rajiv Iyer - Development/Maintenance
Installation
The package can be installed with
pip install genai_evaluation
Tests
The tests can be run by cloning the repo and running:
pytest tests
If you run into issues running the tests, install the package locally first and then run them again:
pip install -e .
Usage
Start by importing the functions
from genai_evaluation import multivariate_ecdf, ks_statistic
Assuming we have two pandas dataframes (Real & Synthetic) and only numerical columns, we pass them to the multivariate_ecdf function which returns the computed multivariate ECDFs of both.
query_str, ecdf_real, ecdf_synth = multivariate_ecdf(real_data, synthetic_data, n_nodes=1000, verbose=True)
We then calculate the multivariate KS distance between the two ECDFs
ks_stat = ks_statistic(ecdf_real, ecdf_synth)
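The multivariate_ecdf call above assumes both dataframes contain only numerical columns. If your raw data also has text or categorical columns, one way to prepare the inputs is sketched below. The helper numeric_common_columns and the toy column names are hypothetical and not part of the package, and restricting both frames to the numeric columns they share is an assumption about what a sensible comparison needs.

```python
import pandas as pd


def numeric_common_columns(real: pd.DataFrame, synth: pd.DataFrame):
    """Return copies of both frames restricted to the numeric columns
    that appear in both, in the column order of `real`."""
    real_num = real.select_dtypes(include="number")
    synth_num = synth.select_dtypes(include="number")
    shared = [c for c in real_num.columns if c in synth_num.columns]
    return real_num[shared].copy(), synth_num[shared].copy()


# Toy frames with one non-numeric column ("city") that gets dropped.
real_raw = pd.DataFrame(
    {"age": [23, 35, 41], "income": [40.0, 52.5, 61.0], "city": ["A", "B", "C"]}
)
synth_raw = pd.DataFrame(
    {"age": [25, 33, 44], "income": [42.0, 50.0, 63.5], "city": ["B", "B", "A"]}
)

real_data, synthetic_data = numeric_common_columns(real_raw, synth_raw)
print(list(real_data.columns))
```

The resulting real_data and synthetic_data can then be passed to multivariate_ecdf as shown in the usage lines above.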
Motivation
The motivation for this package comes from Dr. Vincent Granville's paper "Generative AI Technology Break-through: Spectacular Performance of New Synthesizer".
If you have any tips or suggestions, please contact us by email.
History
0.1.0 (2023-09-11)
- First release on PyPI.
0.1.1 (2023-09-11)
Corrected
- Function name from compute_ecdf to multivariate_ecdf
0.1.2 (2023-09-11)
Enhanced
- Added a new parameter, verbose, to the multivariate_ecdf function
0.1.3 (2023-09-11)
Corrected
- Removed unnecessary docstrings from code
0.1.4 (2023-09-11)
Fixed
- Resolved issues with special characters in the column names
0.1.5 (2023-09-11)
Fixed
- The previous version treated the underscore as a special character; this version corrects that
Hashes for genai_evaluation-0.1.5-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 2c3c2f6a53d73e1fbb32a7937b412676726e092d9563a075085d5e03e29d755d
MD5 | dabd76718f4c9267ae7f335fee2ec62a
BLAKE2b-256 | 37cc66fc6a057d77ab19120e64f63d99e8841fea635c34d38b60239580d29359