Unified platform for self-hosted LLM inference + enterprise safety governance

These details have not been verified by PyPI

Project links

Project description

TurboPrivate AI — Private & Safe Enterprise AI Platform

Run powerful LLMs on your own hardware — 40–60% cheaper than public clouds, with built-in enterprise safety & governance.

Why TurboPrivate AI?

Full data sovereignty — nothing leaves your infrastructure
Dramatic cost reduction — INT4/AWQ quantization + smart routing
Enterprise Safety — powered by Mythos Safe (defensive evaluation, jailbreak protection, audit)
OpenAI compatible — drop-in replacement for your existing applications
One-command deploy — from bare metal to production in minutes

Key Features

TurboQuant Engine — State-of-the-art INT4/AWQ quantization with minimal quality loss
Mythos Safe — Multi-layer defensive safety (pre & post-flight gates)
Private RAG — Secure document ingestion and retrieval
Full-stack observability — Prometheus, Grafana, OpenTelemetry
Enterprise ready — RBAC, audit trail, multi-tenancy, compliance support
Hardware flexibility — RTX 4090, A100/H100, or even CPU-only

Performance (RTX 4090)

Model	Quant	Tokens/sec	VRAM Usage	Cost vs Groq/AWS
Llama 3.1 8B	INT4	110+	~5.8 GB	~8x cheaper
Qwen2.5 32B	INT4	45+	~22 GB	~6x cheaper
Llama 3.1 70B	INT4	18+	~48 GB	~5x cheaper

Quick Start

# 1. Deploy full stack (K8s)
turbo deploy --provider bare-metal --gpu auto

# 2. Serve model
turbo model serve meta-llama/Llama-3.1-8B --quant int4

# 3. Chat
turbo chat

Or use Docker Compose for quick testing:

docker compose up -d                    # dev
# docker compose -f docker-compose.prod.yml up -d  # production (GPU)

Pricing

Tier	Price	Best For	Includes
PoC / Pilot	€15,000 – €35,000	4–8 weeks trial	Deployment, 2 models, training, support
Enterprise License	€65,000 / year	Single cluster, up to 10 users	Full features, unlimited models, SLA 99.5%
Enterprise Plus	€120,000 – €180,000 / year	Multiple clusters, 50+ users	Priority support, custom verifiers, SOC2
Managed Service	€8,000 – €25,000 / month	No ops team	Fully managed by us

Volume discounts available for 3+ clusters.
All prices exclude hardware.

Interested in a private demo?
📅 Book a 30-min PoC Call | ✉️ Contact Sales

Architecture

CLI / SDK / Dashboard
        ↓
   API Gateway (FastAPI · Auth · Rate Limiting)
        ↓
┌─────────────────┐  ┌───────────────────┐
│  Mythos Safe    │  │  TurboQuant INT4  │
│  Verifiers ·    │  │  vLLM/llama.cpp   │
│  Audit Trail    │  │  Inference Engine │
└─────────────────┘  └───────────────────┘
        ↓
   Memory & RAG (TurboMemory · pdf2struct)
        ↓
┌──────────┐ ┌──────────┐ ┌──────────┐
│  K3s     │ │Monitoring│ │ Storage  │
│  Cluster │ │Prom/Graf │ │ PG/Redis │
└──────────┘ └──────────┘ └──────────┘

Demo

TurboPrivate AI deployment demo

Documentation

Architecture — Full system design
Deployment — Production deployment guide
CLI Reference — All CLI commands
API Reference — FastAPI routes
Safety Gate — Verifier configuration
Demo Assets — GIF recording tape + deploy script
SAP HANA RAG — LangChain + HANA vector store integration
SAP HANA Integration Guide — Cost calculator, security checklist, BYOM & compliance

Integrations

Changelog

0.1.6 (2026-05-16)

SAP HANA integration guide: cost calculator, security checklist, BYOM in AI Core, Med/Fintech compliance
Enterprise hardening best practices for self-hosted LLM + vector database deployments

0.1.5 (2026-05-16)

SAP HANA vector store integration example (LangChain + HanaDB + TurboPrivate AI RAG)
FastAPI RAG endpoint with similarity search + LLM generation
Document ingestion script with PDF/text support + HNSW index creation

0.1.4 (2026-05-13)

Production-hardened Helm charts (configmap, ingress, services templates)
Enhanced rate limiter with token bucket algorithm + per-route limits
Improved safety gate middleware with pre/post-flight hook chain
Realtime metrics visualization in dashboard endpoint
TurboQuant v3 quantization pipeline: AWQ + INT4 mixed-precision
Backup/restore CLI with age-encrypted snapshots
K3s provisioner with multi-node discovery + node labels
vLLM backend: speculative decoding toggle + prefix caching
llama.cpp backend: flash attention + GPU offloading
Worker refinements: quantize retry, eval timeout, ingestion dedup
CLI enhancements: model status, deploy progress, backup summary
PII detector regex expansion (passport, SSN, phone variants)
Vulnerability verifier: CVE-2025 scoring + dependency jail status
PDF/image ingestion with OCR fallback in RAG pipeline

0.1.3 (2026-05-13)

Extended demo GIF to 61s with 5-scene animation (intro, deploy, serve+chat, safety block, dashboard)
Switched README GIF to absolute GitHub raw URL for PyPI rendering

0.1.2 (2026-05-11)

Enterprise-ready README with pricing table and benchmarks
Added docs/ARCHITECTURE.md with system design diagrams
Added docs/DEPLOYMENT.md with production deployment guide
Added examples/ with HTTP, safety, RAG, and quantization samples
Added .env.example with all configuration options
Added benchmarks/ with RTX 4090 performance results
Switched license from MIT to Apache 2.0
Added turbo doctor CLI command for system health checks
Added GitHub Actions Docker build workflow
Updated pyproject.toml with full install extra

0.1.1 (2026-05-11)

Migrated to hatchling build system
Fixed missing InferenceEngine import in turbo.inference
Fixed TracerProvider bug in OpenTelemetry instrumentation
Added structured logging to all exception handlers
Consolidated Celery workers into shared worker.celery_app
Added CI workflow with ruff linting + pytest
Improved graceful shutdown (audit trail flush)
Updated dependencies (replaced unstructured with actual used libs)

License

Apache 2.0 — see LICENSE.

Built by Kubenew — ex-HPE engineer, 12+ years enterprise infrastructure

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

0.1.9

May 17, 2026

0.1.8

May 17, 2026

0.1.7

May 17, 2026

This version

0.1.6

May 17, 2026

0.1.5

May 16, 2026

0.1.4

May 13, 2026

0.1.3

May 13, 2026

0.1.2

May 11, 2026

0.1.1

May 11, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

turboprivate_ai-0.1.6.tar.gz (1.1 MB view details)

Uploaded May 17, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

turboprivate_ai-0.1.6-py3-none-any.whl (55.6 kB view details)

Uploaded May 17, 2026 Python 3

File details

Details for the file turboprivate_ai-0.1.6.tar.gz.

File metadata

Download URL: turboprivate_ai-0.1.6.tar.gz
Upload date: May 17, 2026
Size: 1.1 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for turboprivate_ai-0.1.6.tar.gz
Algorithm	Hash digest
SHA256	`e1d598380088acd50eb26a465b759abec0c8c45e8c11401df9afa31a1a72da10`
MD5	`f5efb0087e0f179d7ab1aaa6b7c7db05`
BLAKE2b-256	`38c6a9b83d469b74e288cf6855b210861c96a242b4a34cf3bf6f3e00e1e2aa0e`

See more details on using hashes here.

File details

Details for the file turboprivate_ai-0.1.6-py3-none-any.whl.

File metadata

Download URL: turboprivate_ai-0.1.6-py3-none-any.whl
Upload date: May 17, 2026
Size: 55.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for turboprivate_ai-0.1.6-py3-none-any.whl
Algorithm	Hash digest
SHA256	`8769765b81c15d80cf08270cd3ef1653e0aa2c4dce084df3de119c0387cb1ef5`
MD5	`dc31170ab8dd876cab3db6d7e1e4bc34`
BLAKE2b-256	`48d4b88169f1f088b1eb4e4a8d21454891e2c65c849e8953c31c449945091d94`

See more details on using hashes here.

turboprivate-ai 0.1.6

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Project description

TurboPrivate AI — Private & Safe Enterprise AI Platform

Why TurboPrivate AI?

Key Features

Performance (RTX 4090)

Quick Start

Pricing

Architecture

Demo

Documentation

Integrations

Changelog

0.1.6 (2026-05-16)

0.1.5 (2026-05-16)

0.1.4 (2026-05-13)

0.1.3 (2026-05-13)

0.1.2 (2026-05-11)

0.1.1 (2026-05-11)

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes