Privacy-first LLM observability and budget control — full AI cost visibility with zero prompt storage.

DoCoreAI

The Privacy-First AI Cost & Governance Platform

🚧 Prototype Stage — Built on 17 months of research and validated against waste patterns observed across 20+ enterprise AI engagements. Closed source from v2.0. SOC2 certification in progress. Early adopters and design partners welcome.

The Problem Enterprise AI Teams Face Today

Your team ships an AI feature. Usage grows. Then one of two things happens.

Option A — You log everything. Prompts, responses, full request bodies. Now your compliance team is blocking the rollout. Legal wants a data retention policy. Security calls it a liability. The AI pilot stalls.

Option B — You skip logging. No compliance risk, but now you're flying blind. Costs spike overnight with no warning. A single bad deployment burns through the monthly budget by Tuesday. There's no audit trail, no governance, no way to explain the bill.

And even when logging is solved, the spending problem remains. LLM costs do not self-regulate. Without active budget control, a traffic surge at 10 AM can exhaust the entire day's budget by noon, leaving your service unavailable for the remaining 12 hours. Provider SDKs ship no native guardrails: no pacing, no prediction, no automatic intervention. You either overspend or you build all of that yourself.

Enterprise AI teams have been forced to choose between privacy and visibility, and then left to solve cost control entirely on their own. DoCoreAI addresses all three: privacy, visibility, and cost control.

What DoCoreAI Does

Think of an office whose electricity bill is driven by peak circuit capacity rather than by what it actually needed. Many LLM deployments waste spend in a similar way. The max_tokens ceiling is set high as a safety net. Providers bill for the tokens actually generated, but a generous ceiling lets responses run longer than the task requires, and you pay for every one of those extra tokens. The waste is silent, automatic, and compounds with every request.

DoCoreAI sits between your application and any LLM provider (OpenAI, Anthropic, Google, Groq, AWS Bedrock, Ollama) and acts as an autonomous cost and governance layer. It learns your actual usage patterns, predicts what each request genuinely needs, replaces the wasteful ceiling with a precise prediction, and paces your budget across the full day. DoCoreAI never stores your prompts or responses and never sends them out of your environment. Only cost and token metadata is aggregated.

Projected outcome: 40–70% reduction in LLM spend, based on token waste patterns observed across 20+ enterprise AI deployments. Enterprise pilots will validate this at scale.

See It in Action

▶ Watch the demo on YouTube

Architecture — Privacy by Design

DoCoreAI runs as a sidecar in the same Python environment as your application. It monkey-patches all active LLM SDK calls automatically at startup — no imports, no wrappers, no changes to your existing code. Every request passes through governance checks, token prediction, budget validation, pacing adjustment, and soft limit injection before the LLM call is made. After the response arrives, actuals are compared to predictions and fed back into the learning loop. Your application sees none of this. It just gets the response.
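
As a rough illustration of the monkey-patching idea described above, the sketch below wraps a method on a stand-in client class. Everything in it is hypothetical: FakeCompletions, patch_create, the toy predictor, and the telemetry fields are invented for the example and are not DoCoreAI's internals or any real SDK's API.

```python
import functools


class FakeCompletions:
    """Hypothetical stand-in for an LLM SDK client; not a real library."""

    def create(self, prompt, max_tokens=2000):
        # Pretend the model produced a short reply.
        return {"text": "ok", "usage": {"completion_tokens": 42}}


telemetry = []  # metadata only: no prompt or response text is recorded


def patch_create(cls, predict_max_tokens):
    """Replace cls.create with a wrapper that adjusts and observes each call."""
    original = cls.create

    @functools.wraps(original)
    def wrapper(self, prompt, max_tokens=2000):
        predicted = predict_max_tokens(len(prompt))
        # Never raise the caller's ceiling, only lower it.
        response = original(self, prompt, max_tokens=min(max_tokens, predicted))
        telemetry.append({
            "requested_max_tokens": max_tokens,
            "predicted_max_tokens": predicted,
            "actual_tokens": response["usage"]["completion_tokens"],
        })
        return response

    cls.create = wrapper
    return original  # keep a handle so the patch can be undone


# Toy predictor (assumption): 4 output tokens per prompt character, capped at 500.
original_create = patch_create(FakeCompletions, lambda n: min(500, 4 * n))

# Application code is unchanged: it still calls create() exactly as before.
reply = FakeCompletions().create("Summarise this ticket in one line.")
```

The application receives the same response object it would have received without the patch, while the telemetry list holds only token counts.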

Why DoCoreAI Is an Agentic Platform

Most observability tools watch and report. DoCoreAI watches and acts.

Multi-step reasoning. Budget decisions weigh spend rate, prediction accuracy, model drift, team quotas, and policy rules simultaneously — not a single threshold trigger.

Autonomous task execution. The pacing engine, soft limit injector, auto-retrain triggers, and governance blocks all fire without human approval, in real time, on every request.

Predictive budget management. DoCoreAI learns your historical spending patterns over 30 days, builds a per-hour prediction model, distributes your budget intelligently, and auto-corrects when actual usage deviates from the learned baseline.

Continuous self-improvement. When prediction accuracy degrades, the system detects drift, retrains the LightGBM model on recent telemetry, runs an A/B test against the previous champion, and promotes the better model automatically — all without any action from your team.
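
The predictive budget management described above can be pictured with a small sketch. It assumes the simplest possible approach: split the daily budget across hours in proportion to historical hourly spend, then compare cumulative spend with the plan. The function names and the 1.25 threshold are illustrative assumptions, not DoCoreAI's actual pacing algorithm.

```python
def hourly_allocation(daily_budget, historical_hourly_spend):
    """Split the daily budget across 24 hours in proportion to past spend."""
    total = sum(historical_hourly_spend)
    if total == 0:
        return [daily_budget / 24] * 24
    return [daily_budget * h / total for h in historical_hourly_spend]


def pace_status(allocation, spent_so_far, hour):
    """Compare cumulative spend with the plan up to the end of `hour` (0-23)."""
    planned = sum(allocation[: hour + 1])
    if planned == 0:
        return "throttle_hard" if spent_so_far > 0 else "on_track"
    ratio = spent_so_far / planned
    if ratio <= 1.0:
        return "on_track"
    if ratio <= 1.25:  # assumed threshold for gentle throttling
        return "throttle_light"
    return "throttle_hard"


# Assumed history: quiet nights, busy working hours (08:00 to 19:59).
history = [1] * 8 + [3] * 12 + [1] * 4
allocation = hourly_allocation(96.0, history)

# By the end of hour 9 the plan allows 8 * 2 + 2 * 6 = 28 units of spend.
status_ok = pace_status(allocation, 28.0, hour=9)
status_light = pace_status(allocation, 33.0, hour=9)
status_hard = pace_status(allocation, 40.0, hour=9)
```

A learned per-hour model would replace the fixed history list, but the comparison between planned and actual cumulative spend stays the same.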

How the Cost Reduction Works

The default max_tokens on most LLM calls is set to 2,000 or higher as a safe ceiling, not a real estimate. Most responses need a fraction of that, but nothing stops them from running long, and you pay for every token generated up to the ceiling.

DoCoreAI replaces that ceiling with a prediction.

At 30,000 requests per month, that single optimisation saves approximately $990/month — before pacing, soft limits, or governance kick in.
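
The text does not break the $990 figure down. As a worked example only, the sketch below shows one combination of inputs that reproduces it: 1,100 fewer output tokens billed per request at $0.03 per 1,000 output tokens. Both numbers are assumptions chosen for illustration, not values taken from DoCoreAI's measurements.

```python
def monthly_saving(requests_per_month, tokens_saved_per_request, usd_per_1k_output_tokens):
    """Monthly saving in USD from billing fewer output tokens per request."""
    tokens_saved = requests_per_month * tokens_saved_per_request
    return tokens_saved * usd_per_1k_output_tokens / 1000


# Assumed inputs: only the 30,000 requests per month comes from the text.
saving = monthly_saving(
    requests_per_month=30_000,
    tokens_saved_per_request=1_100,   # assumption
    usd_per_1k_output_tokens=0.03,    # assumption
)
print(f"Estimated saving: ${saving:,.0f} per month")
```

Substitute your own request volume, token reduction, and provider price to estimate the effect for your workload.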

Quick Start

Requirements: Python 3.12+ · pip · A free org token from docoreai.com

Supported platforms: Actively developed and tested on Windows. macOS and Linux should work without issues, since the codebase is pure Python, but they have not yet been formally tested. Please report any platform issues at docoreai.com/docs.

Step 1 — Install DoCoreAI in the same environment as your application:

pip install docoreai

Step 2 — Generate your org token at docoreai.com and configure DoCoreAI:

docoreai config

Step 3 — Start DoCoreAI alongside your application:

docoreai start

DoCoreAI automatically intercepts all LLM SDK calls in your environment. No changes to your application code are required.

Your terminal then confirms that DoCoreAI is working. [Screenshot: DoCoreAI working]

No prompt content. No response content. Just the signal that matters.

To stop DoCoreAI:

Ctrl+C
# Graceful shutdown — active requests complete before exit

What the Dashboard Will Show

🔧 The cloud dashboard is under active development. The metrics below reflect what is being built toward. Local telemetry is fully operational today.

For the engineering manager or CTO reviewing AI spend:

Daily cost vs. budget — real-time spend curve against your set limit, by hour
Savings percentage — projected vs. actual token consumption across all providers
Budget pace status — on track, ahead, or over pace, with throttling events logged
Provider breakdown — cost split across OpenAI, Anthropic, Groq, and others

For the developer monitoring prediction quality:

Prediction accuracy (MAE) — mean absolute error across recent requests, trending over time
Cutoff rate — percentage of responses that reached the token ceiling, indicating under-prediction
Drift events — when the prediction model degraded and what triggered retraining
A/B test results — champion vs. challenger model performance, promotion or rollback decisions
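
The developer metrics above can be computed from token counts alone. The following sketch shows one plausible way to do it. The function names, the sample numbers, and the 5% promotion margin are assumptions made for the example and may differ from how DoCoreAI calculates them.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute gap between predicted and actual token counts."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def cutoff_rate(ceilings, generated):
    """Share of responses whose token count reached the ceiling they were given."""
    hits = sum(1 for c, g in zip(ceilings, generated) if g >= c)
    return hits / len(generated)


def pick_winner(champion_mae, challenger_mae, min_improvement=0.05):
    """Promote the challenger only if it beats the champion by a relative margin."""
    if challenger_mae < champion_mae * (1 - min_improvement):
        return "challenger"
    return "champion"


# Illustrative token counts for four requests.
actual = [120, 300, 80, 450]
champion_pred = [150, 250, 100, 400]
challenger_pred = [130, 290, 90, 450]

champion_mae = mean_absolute_error(champion_pred, actual)      # (30+50+20+50)/4
challenger_mae = mean_absolute_error(challenger_pred, actual)  # (10+10+10+0)/4
winner = pick_winner(champion_mae, challenger_mae)

# Two of these four responses ran into their ceiling.
rate = cutoff_rate(ceilings=[200, 300, 150, 450], generated=[120, 300, 80, 450])
```

A rising cutoff rate suggests the predictor is setting ceilings too low, which is the under-prediction signal the dashboard description refers to.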

Real-World Use Cases

Preventing budget exhaustion: SaaS platforms
A marketing email triggers a customer surge. Without pacing, the daily budget is gone by 11 AM and the service is blocked for 13 hours. DoCoreAI detects the spike, applies graduated throttling, and keeps the service running for the full 24 hours on the same budget.

Handling seasonal traffic: e-commerce
Black Friday volume runs 5× normal. DoCoreAI's peak-aware pacing strategy recognises the anomaly, allows temporary over-pace during the critical window, and compensates during off-hours, keeping spend within the planned monthly envelope.

Enterprise compliance: regulated industries
A healthcare or financial services team needs full AI observability but cannot log prompt content. DoCoreAI's metadata-only telemetry delivers complete cost and governance visibility with zero prompt retention. Nothing that touches a compliance boundary ever leaves the local environment.
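
To make metadata-only telemetry concrete, here is a minimal sketch of a record type that has no field capable of holding prompt or response text, stored in SQLite (listed under Built With). The RequestMetadata class, the table layout, the model names, and the cost values are all hypothetical and are not DoCoreAI's actual schema.

```python
import sqlite3
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class RequestMetadata:
    """Hypothetical telemetry row: counts and cost only, no text content."""
    provider: str
    model: str
    prompt_tokens: int
    completion_tokens: int
    cost_usd: float


def record(conn, row):
    """Insert one metadata row into the telemetry table."""
    conn.execute("INSERT INTO telemetry VALUES (?, ?, ?, ?, ?)", astuple(row))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE telemetry (provider TEXT, model TEXT, "
    "prompt_tokens INTEGER, completion_tokens INTEGER, cost_usd REAL)"
)

# Illustrative rows; the model names and costs are made up.
record(conn, RequestMetadata("openai", "example-model-a", 350, 120, 0.0177))
record(conn, RequestMetadata("anthropic", "example-model-b", 200, 90, 0.0042))

total_tokens, total_cost = conn.execute(
    "SELECT SUM(prompt_tokens + completion_tokens), SUM(cost_usd) FROM telemetry"
).fetchone()
columns = [info[1] for info in conn.execute("PRAGMA table_info(telemetry)")]
```

Because the schema itself has no text column for content, cost and usage reports can be built from this table without any prompt ever being retained.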

Privacy vs. Visibility — How DoCoreAI Solves Both

Capability	Traditional APM	DoCoreAI
Prompt storage	✗ Required	✓ Never stored
Response storage	✗ Required	✓ Never stored
Cost tracking	✓	✓
Token-level visibility	Partial	✓ Per request
Autonomous budget control	✗	✓
Intelligent pacing	✗	✓
PII detection at edge	✗	✓
Multi-LLM support	Partial	✓
Compliance-ready architecture	✗	✓
Closed source / IP protected	✗	✓ from v2.0
SOC2	Varies	In progress

Supported Providers

Works out of the box with any combination:

OpenAI — GPT-4, GPT-4 Turbo, GPT-3.5, and all variants
Anthropic — Claude 3 family and newer
Google — Gemini Pro and Gemini Ultra
AWS Bedrock — all supported foundation models
Groq — Llama, Mixtral, and Groq-hosted models
Ollama — local model deployments

No provider-specific configuration needed. DoCoreAI detects and wraps all active SDKs automatically at startup.
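
One way automatic detection could work is to check which provider SDKs are importable in the current environment before wrapping them. The sketch below uses the standard library's importlib.util.find_spec. The mapping of providers to import names is an assumption for the example, and DoCoreAI's real detection logic is not documented here.

```python
import importlib.util

# Assumed import names for each provider's Python SDK.
CANDIDATE_SDKS = {
    "OpenAI": "openai",
    "Anthropic": "anthropic",
    "Groq": "groq",
    "Ollama": "ollama",
    "AWS Bedrock": "boto3",
}


def is_installed(module_name):
    """Return True if the module can be found without importing it fully."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised for dotted names with a missing parent, or modules without a spec.
        return False


def detect_sdks(candidates=CANDIDATE_SDKS):
    """List the providers whose SDK is importable in this environment."""
    return sorted(name for name, mod in candidates.items() if is_installed(mod))


found = detect_sdks()
print("Detected SDKs:", ", ".join(found) if found else "none")
```

Note that this only shows which SDKs are installed. Deciding which ones are actually in use would need a further check, for example looking in sys.modules after the application has started.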

Installation

# Install
pip install docoreai

# Configure with your org token (generate at docoreai.com)
docoreai config

# Start
docoreai start

# Stop
Ctrl+C

v2.1.0 is the current release. This is prototype-stage software. APIs may evolve between releases. Production use is welcomed — real usage data helps us improve the prediction models faster. Report issues and get support at docoreai.com/docs.

Documentation

Full documentation, configuration reference, and integration guides:

docoreai.com/docs

Covers: auto-retrain configuration · A/B testing · soft limits · pacing engine · budget modes · retention policies · troubleshooting · developer API reference.

Design Partner Program

We are actively seeking 3–5 enterprise design partners for no-cost pilots with white-glove founder support. If your team is running LLMs in production and wrestling with cost visibility or compliance constraints, this is built for you.

saji.john@docoreai.com

What design partners get: direct founder access · custom configuration support · early access to enterprise features · input into the product roadmap.

What we ask in return: honest feedback · real usage data · a willingness to co-develop the enterprise use case with us.

Built With

Python · FastAPI · LightGBM · SQLite · scikit-learn · Chart.js

DoCoreAI — because your compliance team and your engineering team should both be able to say yes.

Release history

Version	Released
2.1.1	Jun 18, 2026
2.1.0 (this version)	Jun 8, 2026
2.0.0	Jun 5, 2026
2.0.0b2 (pre-release)	Jun 5, 2026
2.0.0b1 (pre-release)	Jun 5, 2026
1.0.1	Aug 4, 2025
1.0.0	Aug 2, 2025
1.0.0b2 (pre-release)	Aug 2, 2025
1.0.0b1 (pre-release)	Aug 2, 2025
0.3.6	Apr 22, 2025
0.3.5	Apr 20, 2025
0.3.4	Apr 14, 2025
0.3.3	Apr 11, 2025
0.3.2	Apr 11, 2025
0.3.1	Apr 10, 2025
0.3.0	Apr 9, 2025
0.2.9	Apr 9, 2025
0.2.8	Apr 2, 2025
0.2.7	Apr 2, 2025
0.2.6	Apr 2, 2025
0.2.5	Apr 2, 2025
0.2.4	Mar 27, 2025
0.2.3	Mar 27, 2025
0.2.2	Mar 27, 2025
0.2.1	Mar 26, 2025
0.2.0	Mar 23, 2025
0.1.9	Mar 20, 2025
0.1.8	Mar 17, 2025
0.1.7	Mar 17, 2025
0.1.6	Mar 17, 2025
0.1.5	Mar 16, 2025
0.1.4	Mar 15, 2025
0.1.3	Mar 15, 2025
0.1.2	Mar 12, 2025
0.1.1	Mar 12, 2025
0.1.0	Mar 12, 2025
