Anti-Hallucination & Token Optimization library for Groq and Gemini APIs
🛡️ Hallutok
Anti-Hallucination & Token Optimization for Groq and Gemini APIs
Hallutok addresses two problems that waste your API quota:
| Problem | Hallutok's Solution |
|---|---|
| Long prompts burning through tokens | TokenOptimizer compresses prompts before sending |
| LLM making up facts / hedging | HallucinationValidator scores and flags sketchy responses |
✨ Features
- Token Optimization — whitespace cleanup, filler-phrase compression, deduplication, smart truncation, in-memory caching
- Anti-Hallucination — detects hedging language, ungrounded claims, numeric anomalies, contradictions
- Groq + Gemini — works with both APIs via thin, swappable provider adapters
- Zero hard dependencies — core library is pure Python; providers are optional extras
- Savings reporting — see exactly how many tokens you saved per call
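As a rough illustration of the in-memory caching idea, the sketch below memoizes responses under a hash of the whitespace-normalized prompt. `PromptCache` and `fake_llm` are hypothetical names invented for this example, not part of Hallutok's API, and the library's actual cache may work differently.

```python
import hashlib
from typing import Callable, Dict


class PromptCache:
    """Hypothetical in-memory cache keyed by a hash of the prompt text."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(prompt: str) -> str:
        # Collapse whitespace so trivially different prompts share one key.
        normalized = " ".join(prompt.split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get_or_call(self, prompt: str, call: Callable[[str], str]) -> str:
        key = self._key(prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Cache miss: pay for the call once, then remember the answer.
        self.misses += 1
        response = call(prompt)
        self._store[key] = response
        return response


def fake_llm(prompt: str) -> str:
    # Stand-in for a real API call; no network access is made.
    return f"echo: {prompt.strip()}"


cache = PromptCache()
first = cache.get_or_call("What is   a black hole?", fake_llm)
second = cache.get_or_call("What is a black hole?", fake_llm)
```

The second call differs only in spacing, so it is served from the cache and costs no tokens.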
📦 Installation
# Quote the extras so shells such as zsh do not expand the brackets
# With Groq support
pip install "hallutok[groq]"
# With Gemini support
pip install "hallutok[gemini]"
# Both
pip install "hallutok[all]"
🚀 Quick Start
Using Groq
from hallutok import HallutokClient
# Factory shortcut
client = HallutokClient.with_groq(
    api_key="gsk_your_groq_key",
    model="llama3-8b-8192",  # optional, this is the default
    temperature=0.3,  # lower = more factual
)
result = client.chat(
"Please note that I would like you to explain in order to help me "
"understand what black holes are and how they work. Can you please "
"provide a detailed explanation? It is important to note that I am "
"a beginner."
)
print(result.response)
print(result.token_report)
# {'tokens_before': 48, 'tokens_after': 19, 'tokens_saved': 29, 'percent_saved': 60.4}
if result.validation.is_likely_hallucination:
    print("⚠️ Flags:", result.validation.flags)
Using Gemini
from hallutok import HallutokClient
client = HallutokClient.with_gemini(
    api_key="AIza_your_gemini_key",
    model="gemini-1.5-flash",
)
result = client.chat("Explain quantum entanglement to a 10-year-old.")
print(result.response)
Using providers directly
from hallutok import HallutokClient
from hallutok.providers import GroqProvider, GeminiProvider
# Swap providers without changing anything else
provider = GroqProvider(api_key="gsk_...", model="mixtral-8x7b-32768")
# provider = GeminiProvider(api_key="AIza_...", model="gemini-1.5-pro")
client = HallutokClient(
    provider=provider,
    optimize_tokens=True,  # default: True
    validate_responses=True,  # default: True
    max_prompt_tokens=512,  # hard cap on prompt size
    temperature=0.4,
    max_response_tokens=1024,
    system_prompt="You are a factual assistant. Cite sources when possible.",
)
result = client.chat("What causes inflation?")
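To show why swappable providers keep the client code unchanged, here is a minimal sketch of the adapter pattern using `typing.Protocol`. Every name in it (`Provider`, `EchoProvider`, `UpperProvider`, `MiniClient`, `complete`) is hypothetical and chosen for illustration; Hallutok's real provider interface is not documented in this README and may look different.

```python
from dataclasses import dataclass
from typing import Protocol


class Provider(Protocol):
    """Hypothetical minimal interface a provider adapter might expose."""

    name: str

    def complete(self, prompt: str) -> str: ...


@dataclass
class EchoProvider:
    # Toy adapter: returns the prompt unchanged instead of calling an API.
    name: str = "echo"

    def complete(self, prompt: str) -> str:
        return prompt


@dataclass
class UpperProvider:
    # Second toy adapter with different behavior but the same interface.
    name: str = "upper"

    def complete(self, prompt: str) -> str:
        return prompt.upper()


class MiniClient:
    """Depends only on the Provider interface, so adapters can be swapped."""

    def __init__(self, provider: Provider) -> None:
        self.provider = provider

    def chat(self, prompt: str) -> dict:
        return {
            "provider": self.provider.name,
            "response": self.provider.complete(prompt),
        }


a = MiniClient(EchoProvider()).chat("hello")
b = MiniClient(UpperProvider()).chat("hello")
```

Because `MiniClient` never names a concrete adapter, switching backends is a one-line change at construction time.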
🔧 Components
TokenOptimizer
Use standalone if you only need compression:
from hallutok.optimizer import TokenOptimizer
opt = TokenOptimizer()
raw = """
Please note that I would like you to, in order to be helpful,
can you please explain, it is important to note that, machine learning
is a subset of AI. Machine learning is a subset of AI. Machine learning is a subset of AI.
"""
compressed = opt.optimize(raw, max_tokens=100)
print(compressed)
report = opt.savings_report(raw, compressed)
# {'tokens_before': 54, 'tokens_after': 12, 'tokens_saved': 42, 'percent_saved': 77.8}
What the optimizer does, in order:
- Normalize whitespace (collapse spaces, trim blank lines)
- Strip boilerplate ("Please note that", "I would like you to", etc.)
- Deduplicate repeated sentences
- Replace verbose phrases ("in order to" → "to", "due to the fact that" → "because", …)
- Truncate to `max_tokens` at a sentence boundary
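The steps above can be sketched as a small standalone pipeline. This is not Hallutok's implementation: the phrase lists are short examples taken from this README, `estimate_tokens` assumes one token per whitespace-separated word, and the function names are hypothetical.

```python
import re

# Example phrase lists only; the library's real lists are not shown here.
BOILERPLATE = [
    "please note that",
    "i would like you to",
    "it is important to note that",
    "can you please",
]
VERBOSE = {"in order to": "to", "due to the fact that": "because"}


def estimate_tokens(text: str) -> int:
    # Crude assumption: one token per whitespace-separated word.
    return len(text.split())


def split_sentences(text: str) -> list[str]:
    # Split after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def optimize(text: str, max_tokens: int = 100) -> str:
    # 1. Normalize whitespace.
    text = " ".join(text.split())
    # 2. Strip boilerplate phrases (case-insensitive), then re-normalize.
    for phrase in BOILERPLATE:
        text = re.sub(re.escape(phrase), "", text, flags=re.IGNORECASE)
    text = " ".join(text.split())
    # 3. Deduplicate sentences, keeping the first occurrence of each.
    seen, unique = set(), []
    for sentence in split_sentences(text):
        key = sentence.lower()
        if key not in seen:
            seen.add(key)
            unique.append(sentence)
    text = " ".join(unique)
    # 4. Replace verbose phrases with shorter equivalents.
    for verbose, concise in VERBOSE.items():
        text = re.sub(re.escape(verbose), concise, text, flags=re.IGNORECASE)
    # 5. Keep whole sentences while they fit the budget. In this sketch a
    #    first sentence longer than the budget yields an empty string.
    kept, used = [], 0
    for sentence in split_sentences(text):
        cost = estimate_tokens(sentence)
        if used + cost > max_tokens:
            break
        kept.append(sentence)
        used += cost
    return " ".join(kept)


raw = (
    "Please note that machine learning is a subset of AI.  "
    "Machine learning is a subset of AI. "
    "It is used in order to make predictions. It has many uses."
)
result = optimize(raw, max_tokens=14)
```

A real tokenizer would count differently from the word-based estimate, so the budget here is only approximate.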
HallucinationValidator
Use standalone to audit any text:
from hallutok.antihallucination import HallucinationValidator
validator = HallucinationValidator()
response = "I think maybe studies show that eating chocolate probably cures cancer."
result = validator.validate(response)
print(result.confidence_score) # e.g. 0.72
print(result.is_likely_hallucination) # True / False
print(result.flags) # list of issues found
print(result.warnings) # human-readable descriptions
print(result.suggestions) # what to do about it
print(result.cleaned_response) # response + disclaimer if flagged
Detection layers:
| Layer | What it catches |
|---|---|
| Hedging | "I think", "maybe", "perhaps", "I'm not sure", etc. |
| Ungrounded claims | "Studies show…", "Research suggests…" without citations |
| Numeric anomalies | Percentages over 100%, other implausible numbers |
| Contradictions | "always" + "never", "increases" + "decreases" in same text |
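The four layers in the table can be approximated with simple pattern checks. The sketch below is an assumption about how such rules might look, not the validator's actual logic: the word lists come from the table, while the flag names, the citation heuristic, and `find_flags` itself are hypothetical.

```python
import re

# Phrase lists taken from the table above; a real validator would need more.
HEDGES = ["i think", "maybe", "perhaps", "i'm not sure"]
UNGROUNDED = ["studies show", "research suggests"]
CONTRADICTORY_PAIRS = [("always", "never"), ("increases", "decreases")]


def find_flags(text: str) -> list[str]:
    lowered = text.lower()
    flags = []

    # Hedging: any hedge phrase appearing as whole words.
    if any(re.search(r"\b" + re.escape(h) + r"\b", lowered) for h in HEDGES):
        flags.append("hedging")

    # Ungrounded claims: an appeal to research with nothing citation-like
    # nearby. Assumed heuristic: [1], (2020), or a URL counts as a citation.
    has_citation = re.search(r"\[\d+\]|\(\d{4}\)|https?://", text) is not None
    if any(p in lowered for p in UNGROUNDED) and not has_citation:
        flags.append("ungrounded_claim")

    # Numeric anomalies: any percentage above 100.
    percentages = re.findall(r"(\d+(?:\.\d+)?)\s*%", text)
    if any(float(p) > 100 for p in percentages):
        flags.append("numeric_anomaly")

    # Contradictions: both words of an opposing pair in the same text.
    for a, b in CONTRADICTORY_PAIRS:
        if re.search(rf"\b{a}\b", lowered) and re.search(rf"\b{b}\b", lowered):
            flags.append("contradiction")
            break

    return flags


flags_a = find_flags(
    "I think maybe studies show that eating chocolate probably cures cancer."
)
flags_b = find_flags("The rate always increases by 150%, yet it never changes.")
flags_c = find_flags("Water boils at 100 degrees Celsius at sea level.")
flags_d = find_flags("Studies show sleep matters [1].")
```

Rules like these are cheap but blunt: a percentage above 100 can be legitimate (for example, growth figures), so flags are best treated as prompts for review, not verdicts.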
💡 Tips to Maximize Token Savings
- Avoid filler openers: "Can you please", "I would like you to", "It is important that"
- Don't repeat yourself: Hallutok deduplicates, but it's faster to not duplicate at all
- Use `max_prompt_tokens`: set a hard cap so you never accidentally send a 4k-token prompt
- Lower the temperature: a low value such as `temperature=0.3` makes output more deterministic, which tends to reduce hallucination risk
- Use a system prompt: instruct the model to cite sources and avoid speculation
- Check `token_report` per call: it tells you exactly what was saved
📊 ChatResult Fields
result.response # final (possibly cleaned) text
result.original_prompt # your original input
result.optimized_prompt # what was actually sent to the API
result.token_report # {tokens_before, tokens_after, tokens_saved, percent_saved}
result.validation # ValidationResult object
result.provider # "groq" or "gemini"
result.warnings # list of human-readable warnings
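The four `token_report` fields are related by simple arithmetic. The sketch below reproduces that relationship under the assumption that tokens are counted as whitespace-separated words; Hallutok's own counting method is not specified here, and both function names are hypothetical.

```python
def estimate_tokens(text: str) -> int:
    # Crude assumption: one token per whitespace-separated word.
    return len(text.split())


def savings_report(before: str, after: str) -> dict:
    tokens_before = estimate_tokens(before)
    tokens_after = estimate_tokens(after)
    tokens_saved = tokens_before - tokens_after
    # Guard against division by zero for an empty original prompt.
    percent_saved = (
        round(100 * tokens_saved / tokens_before, 1) if tokens_before else 0.0
    )
    return {
        "tokens_before": tokens_before,
        "tokens_after": tokens_after,
        "tokens_saved": tokens_saved,
        "percent_saved": percent_saved,
    }


report = savings_report(
    "please could you kindly explain black holes",
    "explain black holes",
)
```

The same formula matches the Quick Start figures: 29 tokens saved out of 48 rounds to 60.4 percent.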
🗺️ Roadmap
- Async support (`achat()`)
- Streaming responses
- OpenAI / Together AI provider adapters
- Per-call token budget enforcement
- Context window manager for multi-turn conversations
- More hallucination detection strategies (self-consistency, chain-of-thought verification)
📄 License
MIT License; see LICENSE for details.