Skip to main content

Unified platform for self-hosted LLM inference + enterprise safety governance

Project description

TurboPrivate AI — Self-Hosted Enterprise AI Platform

Switch from OpenAI in 30 seconds. Drop-in compatible API with built-in safety, governance, and 40–60% cost reduction.

PyPI Python CI Downloads License Security

Run powerful LLMs on your own hardware — with enterprise safety, governance, and full data sovereignty.


Quick Start

One-Click Install

curl -fsSL https://get.turboprivate.ai | bash

Or via pip

pip install turboprivate-ai
turbo deploy --provider bare-metal --gpu auto
turbo model serve meta-llama/Llama-3.1-8B --quant int4
turbo chat

Docker Compose (Hardware-Aware)

git clone https://github.com/Kubenew/turboprivate-ai.git
cd turboprivate-ai

# Auto-detects GPU / Apple Silicon / CPU
curl -fsSL https://get.turboprivate.ai | bash

# Or manually:
docker compose -f docker-compose.gpu.yml up -d    # NVIDIA GPU
docker compose -f docker-compose.mac.yml up -d     # Apple Silicon
docker compose -f docker-compose.cpu.yml up -d     # CPU fallback

Why TurboPrivate AI?

Feature TurboPrivate AI Ollama vLLM OpenAI API
Data Sovereignty ✅ Full ✅ Full ✅ Full ❌ Cloud
Enterprise Safety ✅ Mythos Safe (7 verifiers) ❌ None ❌ None ⚠️ Basic
OpenAI Compatible ✅ 100% ✅ Partial ✅ Partial ✅ Native
INT4/AWQ Quantization ✅ TurboQuant v3 ✅ GGUF ✅ AWQ N/A
RAG Pipeline ✅ Built-in ❌ External External ❌ External
Audit Trail ✅ Immutable JSONL ❌ None ❌ None ⚠️ Limited
RBAC / Multi-tenant ✅ Enterprise ❌ None ❌ None ✅ Enterprise
Kubernetes Native ✅ Helm + K3s ❌ Manual ⚠️ Manual N/A
Cost (RTX 4090) ~8x cheaper Free Free $5-10/M tokens

🏢 For Enterprises

TurboPrivate AI is the Enterprise On-Premise AI Gateway — a secure, compliant orchestration layer between your corporate data and open-source models.

What We Are

  • Secure Wrapper: Mythos Safe gate with 7 verifiers (injection, PII, toxicity, etc.)
  • OpenAI Parity: 100% compatible API — swap base_url and you're done
  • SAP HANA Native: Direct, secure RAG connector with SQL injection guard + RLS
  • Audit & Compliance: Immutable JSONL logs, GDPR/HIPAA/SOC 2 ready
  • Hardware Agnostic: GPU, Apple Silicon, or CPU — auto-optimized

What We Are Not

  • Model Training: We don't train models from scratch
  • Custom UI: We integrate Open WebUI / LibreChat instead of building our own
  • Vector DB: We connect to Qdrant, Milvus, pgvector — we don't replace them

See docs/ENTERPRISE.md for architecture details.

Security & Compliance

  • Full data sovereignty: Nothing leaves your infrastructure
  • Mythos Safe: 7-layer defense (injection, PII, toxicity, hallucination, etc.)
  • Audit trail: Immutable JSONL logs with SIEM integration
  • RBAC: Fine-grained access control with OIDC/SAML support
  • Compliance ready: GDPR, HIPAA, SOC 2, PCI-DSS, ISO 27001

See SECURITY.md and docs/COMPLIANCE.md for details.

Enterprise Integrations

  • SAP HANA: Vector store + RAG pipeline (Guide)
  • SAP AI Core: BYOM deployment support
  • Kubernetes: Helm charts, HPA, multi-cluster
  • Observability: Prometheus, Grafana, OpenTelemetry
  • Secrets: HashiCorp Vault, AWS Secrets Manager, K8s Secrets

Support & SLAs

Tier Response Includes
Community GitHub Issues OSS core, docs, community support
PoC / Pilot 48h 4-8 week trial, 2 models, training
Enterprise 4h SLA 99.5%, unlimited models, TAM
Enterprise Plus 1h Multi-cluster, custom verifiers, SOC2

📅 Book a 30-min PoC Call | ✉️ Contact Sales


📊 Performance (RTX 4090)

Model Quant Tokens/sec VRAM Cost vs Cloud
Llama 3.1 8B INT4 110+ ~5.8 GB ~8x cheaper
Qwen2.5 32B INT4 45+ ~22 GB ~6x cheaper
Llama 3.1 70B INT4 18+ ~48 GB ~5x cheaper

Independent benchmarks: benchmarks/


🛡️ Architecture

CLI / SDK / Dashboard
        ↓
   API Gateway (FastAPI · Auth · Rate Limiting)
        ↓
┌─────────────────┐  ┌───────────────────┐
│  Mythos Safe    │  │  TurboQuant INT4  │
│  Verifiers ·    │  │  vLLM/llama.cpp   │
│  Audit Trail    │  │  Inference Engine │
└─────────────────┘  └───────────────────┘
        ↓
   Memory & RAG (TurboMemory · pdf2struct)
        ↓
──────────┐ ┌──────────┐ ┌──────────┐
│  K3s     │ │Monitoring│ │ Storage  │
│  Cluster │ │Prom/Graf │ │ PG/Redis │
└────────── └──────────┘ └──────────┘

🎬 Demo

TurboPrivate AI deployment demo


Documentation


🔄 Changelog

0.1.8 (2026-05-17)

  • SAP HANA Secure RAG Connector: SQL injection guard, RLS mapping, PII masking
  • Hardware-aware installer: auto-detects NVIDIA / Apple Silicon / CPU
  • Docker Compose profiles: gpu.yml, mac.yml, cpu.yml for optimal deployment
  • README overhaul: "What We Are / Are Not" transparency, Enterprise Gateway positioning
  • Modular architecture: decoupled inference backends, plug-and-play vector DBs

0.1.7 (2026-05-17)

  • SECURITY.md with threat model, hardening guide, SBOM, responsible disclosure
  • CONTRIBUTING.md with dev setup, testing, PR guidelines
  • Enterprise Deployment Guide: air-gapped, HA, secrets, proxy, hardware sizing
  • Compliance readiness: GDPR, HIPAA, SOC 2, PCI-DSS, ISO 27001, EU AI Act
  • One-click installer (install.sh) + docker-compose.full.yml with GPU passthrough
  • GitHub issue templates: bug report, feature request, security report
  • README overhaul: feature comparison table, "For Enterprises" section, badges

0.1.6 (2026-05-16)

  • SAP HANA integration guide: cost calculator, security checklist, BYOM, compliance
  • Enterprise hardening best practices
  • SECURITY.md and CONTRIBUTING.md added

0.1.5 (2026-05-16)

  • SAP HANA vector store integration (LangChain + HanaDB)
  • FastAPI RAG endpoint with similarity search
  • Document ingestion with PDF/text + HNSW index

0.1.4 (2026-05-13)

  • Production Helm charts (configmap, ingress, services)
  • TurboQuant v3: AWQ + INT4 mixed-precision
  • K3s provisioner with multi-node discovery
  • vLLM backend: speculative decoding + prefix caching

Full changelog →


📄 License

Apache 2.0 — see LICENSE.


Built by Kubenew — ex-HPE engineer, 12+ years enterprise infrastructure

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

turboprivate_ai-0.1.8.tar.gz (1.1 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

turboprivate_ai-0.1.8-py3-none-any.whl (58.1 kB view details)

Uploaded Python 3

File details

Details for the file turboprivate_ai-0.1.8.tar.gz.

File metadata

  • Download URL: turboprivate_ai-0.1.8.tar.gz
  • Upload date:
  • Size: 1.1 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for turboprivate_ai-0.1.8.tar.gz
Algorithm Hash digest
SHA256 eb1b68704ed84fe81a3b94bc4a406411f979f525bda88d9103e18415c100aa76
MD5 2c5dfe5ab5175b64fa5d04b2f90739de
BLAKE2b-256 a3c4436b65c40f65603bca07afa7b188a32847341d02a61f1700b6fa02985bd3

See more details on using hashes here.

File details

Details for the file turboprivate_ai-0.1.8-py3-none-any.whl.

File metadata

File hashes

Hashes for turboprivate_ai-0.1.8-py3-none-any.whl
Algorithm Hash digest
SHA256 98a9deb36f1f1ce279b15a4cd64eb16a59cca95f5d1a14b70eeda985cfff2268
MD5 14e02bff85ce8346c98d74fec765dc6f
BLAKE2b-256 a6fc35065881485daa8554dcf68e2ff768862bdd36bcceca3dce514f89ef522d

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page