Open-source AI evaluation toolkit — hallucination detection, safety, industry-specific evals

syncreus-eval

Open-source AI evaluation toolkit for hallucination detection, safety scanning, bias analysis, and industry-specific evals. Runs locally without the Syncreus platform.

Installation

# Core (LLM-as-judge evaluators via Gemini)
pip install syncreus-eval

# With optional extras
pip install "syncreus-eval[accuracy]"          # fastembed for semantic similarity
pip install "syncreus-eval[safety]"            # Presidio PII scanning
pip install "syncreus-eval[prompt-injection]"  # LLM Guard injection detection
pip install "syncreus-eval[upload]"            # Upload results to Syncreus platform
pip install "syncreus-eval[all]"               # Everything (quotes keep shells such as zsh from expanding the brackets)

Quick Start

from syncreus_eval import evaluate, EvalType

# Hallucination detection
result = evaluate(
    EvalType.HALLUCINATION,
    ai_input="The Eiffel Tower is in Paris, France. It was built in 1889.",
    ai_output="The Eiffel Tower is in Paris, France. It was built in 1889 and is 300 meters tall.",
    gemini_key="your-gemini-api-key",
)
print(result.passed)    # True/False/None
print(result.details)   # Claim-level verdicts

# Performance tracking (no LLM needed)
result = evaluate(
    EvalType.PERFORMANCE,
    trace={"latency_ms": 150, "token_count_input": 100, "token_count_output": 50},
)
print(result.details)   # {"latency_ms": 150, "total_tokens": 150, ...}

# Run multiple evals at once
results = evaluate(
    [EvalType.HALLUCINATION, EvalType.IDEOLOGY],
    ai_input="Context here",
    ai_output="Response here",
    gemini_key="your-key",
)
for r in results:
    print(f"{r.eval_type.value}: passed={r.passed}")
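Because `evaluate()` returns a single result for one eval type and a list for several, calling code may want to normalize both shapes before iterating. A minimal sketch; `as_list` is a hypothetical helper, not part of `syncreus_eval`, and it works on any object.

```python
from typing import Any


def as_list(result_or_results: Any) -> list[Any]:
    """Wrap a single result in a list; copy an existing list or tuple."""
    # Hypothetical helper, not part of syncreus_eval.
    if isinstance(result_or_results, (list, tuple)):
        return list(result_or_results)
    return [result_or_results]
```

With this in place, `for r in as_list(evaluate(...))` reads the same whether one type or several were requested.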

Evaluation Types

General Purpose

Type	Description	Requires
`HALLUCINATION`	Detects unsupported factual claims	Gemini API key
`ACCURACY`	Golden dataset comparison via semantic similarity	`[accuracy]` extra
`CONSISTENCY`	Pairwise similarity across repeated prompts	`[accuracy]` extra
`PERFORMANCE`	Extracts latency, tokens, cost metrics	Nothing
`AGENT_TASK`	Verifies agent completion claim honesty	Gemini API key
`REGRESSION`	Baseline comparison (platform only)	Syncreus platform
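To illustrate what "pairwise similarity across repeated prompts" means for `CONSISTENCY`, here is a minimal sketch that averages cosine similarity over every pair of outputs. It assumes a toy bag-of-words vector in place of the fastembed embeddings that the `[accuracy]` extra provides; `bow_vector`, `cosine`, and `mean_pairwise_similarity` are hypothetical names, and the library's actual scoring may differ.

```python
import math
from collections import Counter
from itertools import combinations


def bow_vector(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_pairwise_similarity(outputs: list[str]) -> float:
    """Average similarity over all unordered pairs of outputs."""
    vectors = [bow_vector(o) for o in outputs]
    pairs = list(combinations(vectors, 2))
    if not pairs:
        raise ValueError("need at least two outputs to compare")
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


outputs = ["Paris is the capital", "Paris is the capital", "Lyon is the capital"]
score = mean_pairwise_similarity(outputs)
print(round(score, 3))
```

A score near 1.0 means the repeated responses agree closely; comparing it against a threshold turns it into a pass/fail check.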

Safety & Compliance

Type	Description	Requires
`SAFETY`	PII/sensitive data detection + content safety	`[safety]` extra
`BIAS`	Demographic parity / EEOC four-fifths rule	Nothing
`IDEOLOGY`	Political neutrality (OMB M-26-04)	Gemini API key
`PROMPT_INJECTION`	Injection attempt detection	`[prompt-injection]` extra
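The four-fifths rule referenced for `BIAS` compares each group's pass rate with the highest group's rate and flags any ratio below 0.8. A minimal sketch, assuming the `traces` shape shown in the API reference (a `metadata.demographic_group` label and a boolean `passed`); `four_fifths_check` is a hypothetical helper and the library's own computation may differ.

```python
from collections import defaultdict


def four_fifths_check(traces: list[dict], threshold: float = 0.8) -> dict:
    """Compute per-group pass rates and each group's ratio to the best rate."""
    totals: dict[str, int] = defaultdict(int)
    passes: dict[str, int] = defaultdict(int)
    for t in traces:
        group = t["metadata"]["demographic_group"]
        totals[group] += 1
        passes[group] += 1 if t["passed"] else 0

    rates = {g: passes[g] / totals[g] for g in totals}
    best = max(rates.values(), default=0.0)
    # If no group passes anything, there is no disparity to measure.
    ratios = {g: (r / best if best > 0 else 1.0) for g, r in rates.items()}
    return {
        "rates": rates,
        "impact_ratios": ratios,
        "passed": all(r >= threshold for r in ratios.values()),
    }


# Illustrative data: group A passes 8 of 10, group B passes 5 of 10.
traces = (
    [{"metadata": {"demographic_group": "A"}, "passed": True}] * 8
    + [{"metadata": {"demographic_group": "A"}, "passed": False}] * 2
    + [{"metadata": {"demographic_group": "B"}, "passed": True}] * 5
    + [{"metadata": {"demographic_group": "B"}, "passed": False}] * 5
)
report = four_fifths_check(traces)
print(report["impact_ratios"], report["passed"])
```

Here group B's rate (0.5) divided by group A's (0.8) is 0.625, which falls below 0.8, so the check fails.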

Industry-Specific

Type	Description	Requires
`HEALTHCARE`	Medical accuracy, drug safety, PHI detection	Gemini API key
`LEGAL`	Citation validity, holding fidelity	Gemini API key
`FINANCE`	Regulatory accuracy, numerical precision	Gemini API key
`CODE_ACCURACY`	API existence, function signatures	Gemini API key

API Reference

`evaluate()`

from syncreus_eval import evaluate, EvalType

result = evaluate(
    eval_type=EvalType.HALLUCINATION,  # or a list of types
    ai_input="...",
    ai_output="...",
    gemini_key="...",                  # or set GEMINI_API_KEY env var
    # Accuracy-specific:
    test_cases=[{"input_text": "...", "expected_output": "..."}],
    threshold=0.85,
    # Performance-specific:
    trace={"latency_ms": 100},         # plus token counts, as in Quick Start
    # Agent task-specific:
    verification_result="exit code 0",
    # Bias-specific:
    traces=[{"metadata": {"demographic_group": "A"}, "passed": True}],
    # Consistency-specific:
    outputs=["response1", "response2", "response3"],
    # Safety-specific:
    entity_whitelist=["aspirin"],
    enable_gemini_content_safety=True,
)

Returns an `EvalResult` for a single eval type, or a list of `EvalResult` objects when a list of types is passed:

class EvalResult:
    eval_type: EvalType
    passed: bool | None      # True/False/None (None = error or skipped)
    score: float | None       # Numeric score where applicable
    details: dict[str, Any]   # Evaluator-specific details
    error: bool               # Whether an error occurred
    error_message: str | None # Error description
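Since `passed` is tri-state and errors are reported separately, a CI gate has to decide how to treat `None`. A minimal sketch follows; `ResultStub` is a stand-in that mirrors a subset of the `EvalResult` fields listed above so the snippet runs without the package, and `triage` is a hypothetical helper, not a library function.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResultStub:
    """Stand-in mirroring a subset of EvalResult fields (assumption, not the real class)."""
    name: str
    passed: Optional[bool]
    error: bool = False
    error_message: Optional[str] = None


def triage(results: list[ResultStub]) -> dict[str, list[str]]:
    """Bucket results into passed / failed / errored / skipped by name."""
    buckets: dict[str, list[str]] = {
        "passed": [], "failed": [], "errored": [], "skipped": [],
    }
    for r in results:
        if r.error:
            buckets["errored"].append(r.name)
        elif r.passed is None:
            buckets["skipped"].append(r.name)
        elif r.passed:
            buckets["passed"].append(r.name)
        else:
            buckets["failed"].append(r.name)
    return buckets


# Illustrative inputs only; the error message is made up for the example.
summary = triage([
    ResultStub("hallucination", True),
    ResultStub("ideology", False),
    ResultStub("safety", None),
    ResultStub("accuracy", None, error=True, error_message="example failure"),
])
print(summary)
```

Checking `error` before `passed` keeps a crashed evaluator from being counted as merely skipped.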

`upload_results()` (optional)

from syncreus_eval import upload_results

upload_results(
    results=result,           # EvalResult or list
    api_key="syn_...",        # Syncreus API key
    endpoint="https://api.syncreus.com",
    trace_id="trace-123",     # optional
)

Requires the `upload` extra: pip install "syncreus-eval[upload]"

Environment Variables

Variable	Description
`GEMINI_API_KEY`	Google Gemini API key for LLM-as-judge evaluators
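The `gemini_key` argument and the `GEMINI_API_KEY` variable are two routes to the same credential. A minimal sketch of one way to resolve it before calling `evaluate()`, assuming an explicit argument should win over the environment; `resolve_gemini_key` is a hypothetical helper, and the library's own precedence is not documented here.

```python
import os
from typing import Optional


def resolve_gemini_key(explicit: Optional[str] = None) -> str:
    """Return the explicit key if given, else GEMINI_API_KEY from the environment."""
    # Hypothetical helper, not part of syncreus_eval.
    key = explicit or os.environ.get("GEMINI_API_KEY")
    if not key:
        raise RuntimeError("No Gemini key: pass gemini_key or set GEMINI_API_KEY")
    return key
```

Failing early with a clear message avoids discovering a missing key only after an LLM-as-judge evaluator has been invoked.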

License

MIT

Release history

Version	Released
0.2.0	Mar 30, 2026
0.1.0	Mar 29, 2026 (this version)

Download files

Source Distribution

syncreus_eval-0.1.0.tar.gz (23.2 kB), uploaded Mar 29, 2026, Source

Built Distribution

syncreus_eval-0.1.0-py3-none-any.whl (33.1 kB), uploaded Mar 29, 2026, Python 3

File details

Details for the file syncreus_eval-0.1.0.tar.gz.

File metadata

Download URL: syncreus_eval-0.1.0.tar.gz
Upload date: Mar 29, 2026
Size: 23.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for syncreus_eval-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`4b6f838e72478f56362fa60016e1e985aa037cbf89a40066e6bace79685083ba`
MD5	`8a57fe1199e404ee06a9b3563d08b5e6`
BLAKE2b-256	`65d420b067816ee23132856a249f6c07d86b2dc43171df24bf44a8bed9b852b8`

File details

Details for the file syncreus_eval-0.1.0-py3-none-any.whl.

File metadata

Download URL: syncreus_eval-0.1.0-py3-none-any.whl
Upload date: Mar 29, 2026
Size: 33.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for syncreus_eval-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`7a10ed3d301bc979d84bed83e718bde0cf47225ba6bded5d35c5ed2a458370c0`
MD5	`7d4c1c099b39b31a99fbcb5fa2434a5e`
BLAKE2b-256	`d0c200e67b7b22fa75b3d69b3facb915b924c72295e8a2e07b9605e8adff1922`
