# ServeQuery

An open-source framework to evaluate, test, and monitor ML and LLM-powered systems.
## :bar_chart: What is ServeQuery?
ServeQuery is an open-source Python library to evaluate, test, and monitor ML and LLM systems—from experiments to production.
- 🔡 Works with tabular and text data.
- ✨ Supports evals for predictive and generative tasks, from classification to RAG.
- 📚 100+ built-in metrics from data drift detection to LLM judges.
- 🛠️ Python interface for custom metrics.
- 🚦 Both offline evals and live monitoring.
- 💻 Open architecture: easily export data and integrate with existing tools.
ServeQuery is modular: you can start with one-off evaluations or host a full monitoring service.
### 1. Reports and Test Suites
Reports compute and summarize various data, ML and LLM quality evals.
- Start with Presets and built-in metrics or customize.
- Best for experiments, exploratory analysis and debugging.
- View interactive Reports in Python, export them as JSON, a Python dictionary, or HTML, or browse them in the monitoring UI.
Turn any Report into a Test Suite by adding pass/fail conditions.
- Best for regression testing, CI/CD checks, or data validation.
- Zero setup option: auto-generate test conditions from the reference dataset.
- Simple syntax to set test conditions, such as `gt` (greater than), `lt` (less than), etc.
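For intuition, conditions like `gt` and `lt` boil down to simple threshold predicates applied to computed metric values. A plain-Python sketch of the idea (the actual servequery helpers may differ in name and signature):

```python
def gt(threshold):
    """Pass when the metric value is greater than the threshold."""
    return lambda value: value > threshold

def lt(threshold):
    """Pass when the metric value is less than the threshold."""
    return lambda value: value < threshold

# Apply a condition to a computed metric value:
share_of_drifted_columns = 0.25
check = lt(0.3)  # test: share of drifted columns must be below 30%
print("PASS" if check(share_of_drifted_columns) else "FAIL")  # PASS
```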
### 2. Monitoring Dashboard

The Monitoring UI service helps you visualize metrics and test results over time.
## :woman_technologist: Install ServeQuery

To install from PyPI:

```bash
pip install servequery
```

To install using the conda installer, run:

```bash
conda install -c conda-forge servequery
```
## :arrow_forward: Getting started
### Reports

#### LLM evals
Import the necessary components:
```python
import pandas as pd

from servequery import Report
from servequery import Dataset, DataDefinition
from servequery.descriptors import Sentiment, TextLength, Contains
from servequery.presets import TextEvals
```
Create a toy dataset with questions and answers.
```python
eval_df = pd.DataFrame(
    [
        ["What is the capital of Japan?", "The capital of Japan is Tokyo."],
        ["Who painted the Mona Lisa?", "Leonardo da Vinci."],
        ["Can you write an essay?", "I'm sorry, but I can't assist with homework."],
    ],
    columns=["question", "answer"],
)
```
Create a ServeQuery Dataset object and add descriptors: row-level evaluators. We'll check the sentiment of each response, its length, and whether it contains words indicative of denial.

```python
eval_dataset = Dataset.from_pandas(
    eval_df,
    data_definition=DataDefinition(),
    descriptors=[
        Sentiment("answer", alias="Sentiment"),
        TextLength("answer", alias="Length"),
        Contains("answer", items=["sorry", "apologize"], mode="any", alias="Denials"),
    ],
)
```
You can view the dataframe with the added scores:

```python
eval_dataset.as_dataframe()
```
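For intuition, here is roughly what the two deterministic descriptors above compute, rebuilt in plain pandas. This is a sketch only: the Sentiment descriptor relies on a model and is omitted, and the library's own implementations differ in detail.

```python
import pandas as pd

eval_df = pd.DataFrame(
    [
        ["What is the capital of Japan?", "The capital of Japan is Tokyo."],
        ["Who painted the Mona Lisa?", "Leonardo da Vinci."],
        ["Can you write an essay?", "I'm sorry, but I can't assist with homework."],
    ],
    columns=["question", "answer"],
)

# TextLength: number of characters in each answer.
eval_df["Length"] = eval_df["answer"].str.len()

# Contains(mode="any"): True if any of the listed words appears in the answer.
denial_words = ["sorry", "apologize"]
eval_df["Denials"] = eval_df["answer"].str.lower().apply(
    lambda text: any(word in text for word in denial_words)
)

print(eval_df[["Length", "Denials"]])
```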
To get a summary Report with the distribution of scores:

```python
report = Report([
    TextEvals()
])

my_eval = report.run(eval_dataset)
my_eval

# my_eval.json()
# my_eval.dict()
```
You can also choose other evaluators, including LLM-as-a-judge, and configure pass/fail conditions.
#### Data and ML evals
Import the Report, an evaluation Preset, and a toy tabular dataset:

```python
import pandas as pd
from sklearn import datasets

from servequery import Report
from servequery.presets import DataDriftPreset

iris_data = datasets.load_iris(as_frame=True)
iris_frame = iris_data.frame
```
Run the Data Drift evaluation Preset, which tests for shifts in column distributions. Take the first 60 rows of the dataframe as "current" data and the remaining rows as reference. Get the output in a Jupyter notebook:

```python
report = Report(
    [DataDriftPreset(method="psi")],
    include_tests=True,
)

my_eval = report.run(iris_frame.iloc[:60], iris_frame.iloc[60:])
my_eval
```
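The `method="psi"` option refers to the Population Stability Index, one of the drift detection methods listed in the table below. A minimal NumPy sketch of PSI between two numeric samples, with bin edges taken from the reference sample (the library's implementation handles binning and edge cases differently):

```python
import numpy as np

def psi(reference, current, n_bins=10, eps=1e-6):
    """Population Stability Index between two numeric samples.

    Bin edges come from the reference sample; eps avoids log(0).
    Higher values mean a stronger distribution shift.
    """
    edges = np.histogram_bin_edges(reference, bins=n_bins)
    ref_share = np.histogram(reference, bins=edges)[0] / len(reference)
    cur_share = np.histogram(current, bins=edges)[0] / len(current)
    ref_share = np.clip(ref_share, eps, None)
    cur_share = np.clip(cur_share, eps, None)
    return float(np.sum((cur_share - ref_share) * np.log(cur_share / ref_share)))

rng = np.random.default_rng(42)
same = psi(rng.normal(0, 1, 1000), rng.normal(0, 1, 1000))      # same distribution
shifted = psi(rng.normal(0, 1, 1000), rng.normal(1, 1, 1000))   # mean shifted by 1
print(f"no shift: {same:.3f}, shifted: {shifted:.3f}")
```

A common rule of thumb reads PSI below 0.1 as no significant shift and above 0.2 as a significant one.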
You can also save an HTML file. You'll need to open it from the destination folder.

```python
my_eval.save_html("file.html")
```

To get the output as JSON or a Python dictionary:

```python
my_eval.json()
# my_eval.dict()
```
You can choose other Presets, create Reports from individual Metrics, and configure pass/fail conditions.
### Monitoring dashboard

Recommended step: create a virtual environment and activate it.

```bash
pip install virtualenv
virtualenv venv
source venv/bin/activate
```
After installing ServeQuery (`pip install servequery`), run the ServeQuery UI with the demo projects:

```bash
servequery ui --demo-projects all
```

Visit `localhost:8000` to access the UI.
## 🚦 What can you evaluate?
ServeQuery has 100+ built-in evals. You can also add custom ones.
Here are examples of things you can check:
| 🔡 Text descriptors | 📝 LLM outputs |
|---|---|
| Length, sentiment, toxicity, language, special symbols, regular expression matches, etc. | Semantic similarity, retrieval relevance, summarization quality, etc., with model- and LLM-based evals. |

| 🛢 Data quality | 📊 Data distribution drift |
|---|---|
| Missing values, duplicates, min-max ranges, new categorical values, correlations, etc. | 20+ statistical tests and distance metrics to compare shifts in data distribution. |

| 🎯 Classification | 📈 Regression |
|---|---|
| Accuracy, precision, recall, ROC AUC, confusion matrix, bias, etc. | MAE, ME, RMSE, error distribution, error normality, error bias, etc. |

| 🗂 Ranking (incl. RAG) | 🛒 Recommendations |
|---|---|
| NDCG, MAP, MRR, Hit Rate, etc. | Serendipity, novelty, diversity, popularity bias, etc. |
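Many of the tabular checks above map onto familiar pandas operations. A toy sketch of three data quality checks; the column names and the 0-120 age range are illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "age": [25, None, 41, 25, 133],
    "country": ["DE", "US", "US", "DE", "FR"],
})

# Missing values per column.
missing = df.isna().sum()

# Fully duplicated rows.
n_duplicates = int(df.duplicated().sum())

# Values outside an expected min-max range.
out_of_range = int(((df["age"] < 0) | (df["age"] > 120)).sum())

print(missing.to_dict(), n_duplicates, out_of_range)
```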