UARC - Unified Adaptive Runtime Core: Production AI inference engine with EADS Speculative Decoding

These details have not been verified by PyPI

UARC — Unified Adaptive Runtime Core 🚀

UARC is a lightweight, production-ready AI inference engine for Python. It provides a single gateway for running Large Language Models locally across backends such as Ollama and llama.cpp.

Stop rewriting your inference code every time you switch backends. With UARC, you get a zero-config CLI and an instant OpenAI-compatible server out of the box.

⚡ Quick Start

Install UARC with pip:

pip install uarc

1. Instant CLI Inference

Run models directly from your terminal. UARC auto-detects your backend (Ollama, local weights, etc.) and streams the response.

uarc run "Explain quantum computing in simple terms" --model llama3.2 --stream

2. Drop-in OpenAI Server

Need an API? Spin up an OpenAI-compatible server in one command. Point tools like AutoGen, LangChain, or your custom apps to localhost:8000.

uarc serve --port 8000 --model llama3.2

Test it immediately:

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.2",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
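
The same request can be sent from Python. The sketch below uses only the standard library and mirrors the curl call above. It assumes the server returns the usual OpenAI chat-completion shape (`choices[0].message.content`); the helper names `build_chat_request`, `extract_reply`, and `chat` are illustrative and not part of UARC.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # matches `uarc serve --port 8000`


def build_chat_request(model, user_message, base_url=BASE_URL):
    """Build a POST request for the /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_body):
    """Pull the assistant text out of an OpenAI-style response (assumed shape)."""
    return response_body["choices"][0]["message"]["content"]


def chat(model, user_message, base_url=BASE_URL, timeout=60):
    """Send one chat message to the running server and return the reply text."""
    request = build_chat_request(model, user_message, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_reply(json.load(response))
```

With the server running, `chat("llama3.2", "Hello!")` should return the model's reply as a string.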

3. Built-in Benchmarking

Benchmark your hardware or compare model quantizations. The report includes P50/P99 latencies with millisecond resolution, tokens/sec, and hardware stats.

uarc bench --requests 100 --model llama3.2
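
For readers unfamiliar with these metrics, here is a minimal sketch of how P50/P99 latency and tokens/sec are commonly derived from per-request measurements. It is not the `uarc bench` implementation: the `percentile` and `summarize` helpers are hypothetical, the nearest-rank method is one of several percentile conventions, and throughput assumes requests ran one after another.

```python
import math


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list (pct between 1 and 100)."""
    if not sorted_values:
        raise ValueError("need at least one sample")
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(latencies_ms, tokens_generated):
    """Summarize per-request latencies (ms) and generated token counts."""
    ordered = sorted(latencies_ms)
    # Assumes sequential requests, so total time is the sum of latencies.
    total_seconds = sum(latencies_ms) / 1000.0
    return {
        "p50_ms": percentile(ordered, 50),
        "p99_ms": percentile(ordered, 99),
        "tokens_per_second": sum(tokens_generated) / total_seconds,
    }
```

P99 is the latency that 99% of requests stayed at or below, so it exposes slow outliers that an average would hide.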

4. Production-Grade EADS Speculative Decoding 🆕

UARC now includes the Entropy-Aware Dynamic Speculator (EADS), a speculative decoding engine. A smaller draft model proposes tokens and the target model verifies them in parallel; the project reports a 3.2x average speedup.

from uarc import UARCRuntime, UARCConfig

cfg = UARCConfig()
cfg.backend = "hf"                   # HuggingFace backend
cfg.model_name = "gpt2"              # target model
cfg.draft_model_name = "distilgpt2"  # smaller draft model used by EADS
cfg.enable_eads = True               # enable EADS speculative decoding

rt = UARCRuntime(cfg)
rt.start()
# ... inference is now accelerated ...
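
To illustrate the general idea of draft-then-verify decoding, here is a toy sketch using greedy decoding with plain Python functions standing in for the two models. It is not UARC's EADS code. The `choose_k` rule is a hypothetical example of how draft-model entropy could drive the number of drafted tokens K, and all names and thresholds are assumptions.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def choose_k(draft_entropy, k_min=1, k_max=8, max_entropy=2.0):
    """Hypothetical rule: draft fewer tokens when the draft model is unsure."""
    confidence = 1.0 - min(draft_entropy / max_entropy, 1.0)
    return k_min + round(confidence * (k_max - k_min))


def speculative_step(context, draft_next, target_next, k):
    """One draft-then-verify step; returns the accepted tokens (at least one).

    draft_next and target_next map a list of tokens to the next token.
    """
    # 1. The draft model proposes k tokens autoregressively.
    proposed = []
    for _ in range(k):
        proposed.append(draft_next(context + proposed))

    # 2. The target model checks each proposed position. A real engine
    #    would score all k positions in one batched forward pass.
    accepted = []
    for token in proposed:
        expected = target_next(context + accepted)
        if token != expected:
            accepted.append(expected)  # replace the first mismatch and stop
            return accepted
        accepted.append(token)

    # 3. Every draft token matched, so take one extra token from the target.
    accepted.append(target_next(context + accepted))
    return accepted


def generate(prompt, draft_next, target_next, max_new_tokens, k=4):
    """Generate tokens by repeating speculative steps until the budget is met."""
    tokens = list(prompt)
    while len(tokens) - len(prompt) < max_new_tokens:
        tokens += speculative_step(tokens, draft_next, target_next, k)
    return tokens[: len(prompt) + max_new_tokens]
```

With greedy decoding the output is identical to running the target model alone. The speedup comes from the target model validating several drafted tokens per pass instead of producing one token per pass.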

🧠 Why UARC? (The Core Architecture)

UARC is more than a wrapper. It includes a pipeline of adaptive modules, at the maturity levels listed in the Status column, designed to maximize efficiency on consumer hardware:

Module	Name	Purpose	Status
EADS	Entropy-Aware Dynamic Speculator	Dynamic speculative decoding with real-time K-adjustment.	Production
TDE	Token Difficulty Estimator	Predicts token difficulty to route between draft/full models.	Beta
AI-VM	Virtual Memory Manager	Intelligent 3-tier memory management (VRAM → RAM → NVMe).	Beta
DPE	Dynamic Precision Engine	Per-layer bit-width allocation for memory constraints.	Research
PLL	Predictive Layer Loader	Async layer loading from NVMe preventing pipeline stalls.	Research
NSC	Neural Semantic Cache	Embedding-based prompt deduplication.	Beta

(Note: Adaptive routing and caching are under active development for the uarc core package.)
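
As an illustration of what embedding-based prompt deduplication (the NSC row above) can look like, here is a small sketch of a similarity-keyed cache. It is not UARC's implementation. `toy_embed` is a stand-in bag-of-words embedding rather than a neural model, and the `SemanticCache` class and its 0.9 threshold are assumptions chosen for the example.

```python
import math
from collections import Counter


def toy_embed(text):
    """Stand-in embedding: lowercase bag-of-words counts (not a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class SemanticCache:
    """Return a cached response when a new prompt is close to a stored one."""

    def __init__(self, embed=toy_embed, threshold=0.9):
        self.embed = embed
        self.threshold = threshold
        self.entries = []  # list of (embedding, response) pairs

    def get(self, prompt):
        """Return the best-matching cached response, or None on a miss."""
        query = self.embed(prompt)
        best_score, best_response = 0.0, None
        for vector, response in self.entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None

    def put(self, prompt, response):
        """Store a response under the prompt's embedding."""
        self.entries.append((self.embed(prompt), response))
```

A cache hit skips inference entirely, so the threshold trades saved compute against the risk of returning an answer to a prompt that only looks similar.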

💻 Python API

Integrate UARC directly into your Python applications for maximum control:

from uarc import UARCRuntime, UARCConfig, InferenceRequest

# 1. Configure
cfg = UARCConfig()
cfg.backend = "auto" # Auto-detects Ollama or llama_cpp
cfg.model_name = "llama3.2:1b"

# 2. Initialize
rt = UARCRuntime(cfg)
rt.start()

# 3. Infer
req = InferenceRequest(
    request_id="req-001",
    prompt="Write a Python script to reverse a string.",
    max_new_tokens=256
)

response = rt.infer(req)

print(f"Output: {response.text}")
print(f"Speed:  {response.tokens_per_second:.1f} tok/s")

rt.stop()
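
If `rt.infer` raises in the example above, `rt.stop()` is never reached. One way to guard against that is a small context manager. The sketch below relies only on the `start()` and `stop()` methods shown above; the `running` helper is hypothetical and not part of the uarc package.

```python
from contextlib import contextmanager


@contextmanager
def running(runtime):
    """Start a runtime and guarantee stop() is called, even on errors."""
    runtime.start()
    try:
        yield runtime
    finally:
        runtime.stop()
```

Usage would then look like `with running(UARCRuntime(cfg)) as rt:` followed by `response = rt.infer(req)` inside the block.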

🛠️ Development & Contributing

Want to help build the ultimate unified inference engine? We'd love your contributions!

git clone https://github.com/Shivay00001/uarc.git
cd uarc
pip install -e ".[dev]"
pytest tests/ -v

📄 License

UARC is licensed under the MIT License.

Release history

0.3.0 (this version), Mar 11, 2026

0.2.0, Mar 9, 2026

Download files

Source Distribution

uarc-0.3.0.tar.gz (44.0 kB)

Uploaded Mar 11, 2026 Source

Built Distribution

uarc-0.3.0-py3-none-any.whl (45.0 kB)

Uploaded Mar 11, 2026 Python 3

File details

Details for the file uarc-0.3.0.tar.gz.

File metadata

Download URL: uarc-0.3.0.tar.gz
Upload date: Mar 11, 2026
Size: 44.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.6

File hashes

Hashes for uarc-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`4573c677d397fae3bc405352c5ea5f0bcdb241ffb4ae3af0b5a410f7888f4232`
MD5	`9094a565e83f0d94060722677900c425`
BLAKE2b-256	`a7fe05c3665b2dfb5309c3607ba24d02dcd0ff1f4e40fc14abd841b788bccd68`

File details

Details for the file uarc-0.3.0-py3-none-any.whl.

File metadata

Download URL: uarc-0.3.0-py3-none-any.whl
Upload date: Mar 11, 2026
Size: 45.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.6

File hashes

Hashes for uarc-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`46e7d45b499ba3dd8c6aa3323923f1da05213506edd7fe38124e5021e4ab1f32`
MD5	`ed064ade0e1f3c54db814fcc4b546331`
BLAKE2b-256	`ffad0d97bb4c019f30075c072e84932dd3cf9801d58d4323c1cf01fe8513a75e`
