A dynamic zero-token semantic router

SynaptoRoute

SynaptoRoute is a high-throughput, local semantic routing engine built for production Python microservices. It is designed as a low-latency alternative to Large Language Model (LLM) routing chains and slower local routers, and it classifies intent without consuming any LLM tokens. In the project's CI benchmarks on a 2-core cloud CPU, amortized per-query latency is under 3 milliseconds and single-query P99 latency is under 4 milliseconds.

Why SynaptoRoute?
Architecture & Optimizations
Performance Benchmarks
Installation & Deployment
Quick Start Guide
System Limitations
Community & Contributing

Why SynaptoRoute?

In modern agentic systems, relying on an external API (such as OpenAI or Anthropic) for simple routing decisions, for example determining whether a user wants to reset a password or check a balance, adds 300 ms or more of latency per request and incurs token costs on every call.

SynaptoRoute solves this by executing intent classification entirely locally using INT8 quantized vector embeddings.

SynaptoRoute was engineered specifically to avoid the $O(N)$ array copy that conventional routers perform on every live route update (hot reload), and to maximize hardware utilization through asynchronous dynamic batching.

Architecture & Optimizations

1. Lazy Memory Compilation

Traditional routers degrade during live updates. Each time a route is added, they immediately call numpy.vstack, which copies the entire vector array ($O(N)$ per update). SynaptoRoute defers this reallocation: new vectors are appended to a lightweight list ($O(1)$), and the matrix is rebuilt only when the next query arrives. A burst of $k$ updates therefore costs one rebuild instead of $k$ full copies, which keeps the server from stalling during bulk updates. The first query after an update pays the one-time rebuild cost.
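
The deferred rebuild described above can be sketched as follows. This is a minimal illustration, not SynaptoRoute's actual implementation: the class name `LazyVectorIndex`, its methods, and the `compile_count` counter are hypothetical, and dot-product scoring is assumed for simplicity.

```python
import numpy as np


class LazyVectorIndex:
    """Hypothetical index that defers matrix rebuilds until query time."""

    def __init__(self, dim: int) -> None:
        self.dim = dim
        self._matrix = np.empty((0, dim), dtype=np.float32)  # compiled rows
        self._pending = []       # vectors added since the last rebuild
        self.compile_count = 0   # illustration only: number of rebuilds

    def add(self, vector) -> None:
        # Amortized O(1): append to a Python list, no array copy.
        row = np.asarray(vector, dtype=np.float32).reshape(self.dim)
        self._pending.append(row)

    def _compile(self) -> None:
        # A single O(N + k) copy absorbs all k pending vectors.
        if self._pending:
            self._matrix = np.vstack([self._matrix, *self._pending])
            self._pending.clear()
            self.compile_count += 1

    def query(self, vector) -> int:
        """Return the index of the row with the highest dot-product score."""
        self._compile()
        if self._matrix.shape[0] == 0:
            raise ValueError("index is empty")
        scores = self._matrix @ np.asarray(vector, dtype=np.float32)
        return int(np.argmax(scores))


index = LazyVectorIndex(dim=3)
for row in np.eye(3, dtype=np.float32):
    index.add(row)  # three updates, no rebuild yet
best = index.query(np.array([0.1, 0.9, 0.2]))  # triggers exactly one rebuild
```

Three additions followed by one query cause a single `np.vstack` call, where an eager design would have copied the array three times.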

2. Dynamic Asynchronous Batching

Hardware accelerators (GPUs, AVX-512 CPUs) are optimized for large matrix multiplications, so encoding queries one at a time pays a fixed per-call overhead for each query and leaves the hardware underused. SynaptoRoute runs a background worker fed by an asyncio.Queue that collects concurrent HTTP requests for 5 milliseconds, groups them into a batch, and processes the batch in a single inference call.
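
A queue-based micro-batcher of this kind might look like the sketch below. It is an assumption-laden illustration rather than the library's code: `MicroBatcher`, `submit`, `batch_fn`, and `max_batch` are hypothetical names, and only the 5 ms window comes from the description above.

```python
import asyncio


class MicroBatcher:
    """Hypothetical batcher: groups concurrent requests into one call."""

    def __init__(self, batch_fn, window_s=0.005, max_batch=64):
        self.batch_fn = batch_fn    # handles a list of inputs in one call
        self.window_s = window_s    # 5 ms collection window
        self.max_batch = max_batch  # assumed upper bound on batch size
        self.queue = None
        self._task = None

    async def start(self):
        self.queue = asyncio.Queue()
        self._task = asyncio.create_task(self._worker())

    async def stop(self):
        self._task.cancel()
        try:
            await self._task
        except asyncio.CancelledError:
            pass

    async def submit(self, item):
        # Each caller waits on its own future until the batch completes.
        fut = asyncio.get_running_loop().create_future()
        await self.queue.put((item, fut))
        return await fut

    async def _worker(self):
        loop = asyncio.get_running_loop()
        while True:
            batch = [await self.queue.get()]  # block until the first request
            deadline = loop.time() + self.window_s
            while len(batch) < self.max_batch:
                remaining = deadline - loop.time()
                if remaining <= 0:
                    break
                try:
                    batch.append(
                        await asyncio.wait_for(self.queue.get(), remaining)
                    )
                except asyncio.TimeoutError:
                    break
            try:
                results = self.batch_fn([item for item, _ in batch])
            except Exception as exc:
                # Propagate the failure to every waiting caller.
                for _, fut in batch:
                    if not fut.done():
                        fut.set_exception(exc)
                continue
            for (_, fut), res in zip(batch, results):
                if not fut.done():
                    fut.set_result(res)


async def demo():
    batch_sizes = []

    def upper_batch(items):
        # Stand-in for one batched encoder call.
        batch_sizes.append(len(items))
        return [s.upper() for s in items]

    batcher = MicroBatcher(upper_batch)
    await batcher.start()
    results = await asyncio.gather(
        *(batcher.submit(f"q{i}") for i in range(10))
    )
    await batcher.stop()
    return results, batch_sizes


results, batch_sizes = asyncio.run(demo())
```

The ten concurrent submissions are normally served by far fewer than ten calls to `batch_fn`. The trade-off is that a lone request waits up to the full window before it is processed.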

3. INT8 Quantization

By default, SynaptoRoute uses the BAAI/bge-small-en-v1.5 model quantized to 8-bit integers and executed with ONNX Runtime. Compared with 32-bit floats, INT8 values cut memory bandwidth requirements by 4x and let more of the working set fit in CPU cache.
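
The 4x figure follows from storing one byte per value instead of four. The sketch below shows the arithmetic on a single 384-dimensional vector (the embedding width of bge-small-en-v1.5) using symmetric per-vector quantization. It illustrates the general technique only: it is not the ONNX Runtime quantization pipeline, and the helper names are hypothetical.

```python
import numpy as np

DIM = 384  # embedding width of bge-small-en-v1.5


def quantize_int8(vec):
    """Symmetric per-vector quantization: float32 -> (int8 codes, scale)."""
    vec = np.asarray(vec, dtype=np.float32)
    max_abs = float(np.max(np.abs(vec)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = np.clip(np.round(vec / scale), -127, 127).astype(np.int8)
    return codes, scale


def dequantize(codes, scale):
    """Map int8 codes back to approximate float32 values."""
    return codes.astype(np.float32) * scale


rng = np.random.default_rng(0)
vec = rng.standard_normal(DIM).astype(np.float32)
codes, scale = quantize_int8(vec)

# 384 float32 values occupy 1536 bytes; 384 int8 codes occupy 384 bytes.
ratio = vec.nbytes / codes.nbytes
# Rounding to the nearest code bounds the error by about scale / 2.
max_err = float(np.max(np.abs(vec - dequantize(codes, scale))))
```

The price of the smaller footprint is a bounded rounding error per component, which is why quantized models are usually validated against classification accuracy as well as latency.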

Performance Benchmarks

The following metrics were captured via automated GitHub Actions CI/CD running on a standard, unaccelerated ubuntu-latest 2-core cloud CPU.

Metric	Cloud CPU Latency	Context
Inference P99	3.94 ms	Single sequential query latency.
Amortized P50	2.69 ms	Per-query latency when processing 1,000 concurrent requests via dynamic batching.
Hot-Reload	5.04 ms	Time required to dynamically inject a new utterance into memory without dropping active API requests.
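
For readers who want to reproduce figures of this kind, the sketch below shows one way to derive P50, P99, and amortized latency from raw timings. The project's benchmark harness is not shown here, so the function names and the definition of "amortized" (batch wall time divided by the number of requests) are assumptions.

```python
import time

import numpy as np


def measure_latencies_ms(fn, n_runs=200):
    """Time n_runs sequential calls to fn; return latencies in milliseconds."""
    samples = []
    for _ in range(n_runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return np.asarray(samples)


def summarize(samples_ms):
    """Median and 99th-percentile latency of a sample array."""
    return {
        "p50_ms": float(np.percentile(samples_ms, 50)),
        "p99_ms": float(np.percentile(samples_ms, 99)),
    }


def amortized_ms(total_batch_seconds, n_requests):
    """Assumed definition: total wall time spread evenly over all requests."""
    return total_batch_seconds * 1000.0 / n_requests


# Example with a trivial workload standing in for a router query.
stats = summarize(measure_latencies_ms(lambda: sum(range(1000))))
```

Reporting P99 alongside P50 matters because tail latency, not the median, usually determines whether a routing step fits inside a request budget.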

📊 View Full Benchmarks: For detailed analysis including Memory Leak Endurance, GPU Scaling, Classification F1-Scores, and Input Poisoning Survival Metrics, see our official BENCHMARKS.md.

Installation & Deployment

Method 1: Docker REST API (Recommended)

SynaptoRoute ships with a fully asynchronous FastAPI wrapper, designed for immediate drop-in deployment as a scalable microservice.

# Build the Docker image
docker build -t synaptoroute .

# Run the container
docker run -p 8000:8000 synaptoroute

You can interface with the router immediately:

curl -X POST http://localhost:8000/route \
     -H "Content-Type: application/json" \
     -d '{"query": "I need help resetting my password"}'
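
The same request can be issued from Python with only the standard library. This is a sketch under assumptions: the helper names are hypothetical, and the response is assumed to be a JSON body whose schema is not documented here.

```python
import json
import urllib.request


def build_route_request(query, base_url="http://localhost:8000"):
    """Build the POST /route request used in the curl example."""
    payload = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/route",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def route(query, base_url="http://localhost:8000", timeout=5.0):
    """Send the query and decode the response (assumed to be JSON)."""
    req = build_route_request(query, base_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the container running, `route("I need help resetting my password")` would return the decoded response body.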

Method 2: Standard Python Package

To embed SynaptoRoute directly in an existing Python pipeline, install it from PyPI with pip (or from git to test the latest main branch):

pip install synaptoroute

Quick Start Guide

import asyncio
from synaptoroute.router import AdaptiveRouter
from synaptoroute.encoder import Encoder
from synaptoroute.storage import SQLiteStorage
from synaptoroute.models import Route

async def main():
    # 1. Initialize Components
    encoder = Encoder()
    storage = SQLiteStorage("data/memory.sqlite")
    router = AdaptiveRouter(encoder, storage)
    
    # 2. Define Routes
    billing_route = Route(
        name="billing", 
        utterances=["I need a refund", "Where is my receipt?", "Cancel my subscription"]
    )
    router.add_route(billing_route)
    
    # 3. Start the Background Batching Worker
    await router.start()
    
    # 4. Execute Async Queries
    result = await router.aquery("How do I get my money back?")
    print(f"Matched Intent: {result.name}") # Output: billing
    
    # 5. Graceful Shutdown
    await router.stop()

if __name__ == "__main__":
    asyncio.run(main())

System Limitations

Horizontal Scaling (Kubernetes Split-Brain)
SynaptoRoute relies on a local, in-memory NumPy matrix to achieve its low-millisecond latency, so route state is bound to a single process. If it is deployed across multiple load-balanced Kubernetes pods, a hot-reload request that reaches Pod A updates only Pod A's memory; Pod B remains unaware of the change and keeps serving stale routes. Scaling horizontally requires an external event bus (e.g., Redis Pub/Sub) to broadcast memory invalidation events across the cluster.
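
The broadcast pattern can be sketched with an in-process stand-in for the event bus. Everything here is hypothetical: `InMemoryBus` plays the role that Redis Pub/Sub would play in a real cluster, and `Replica` models one pod's local route state rather than any SynaptoRoute class.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryBus:
    """Stand-in for an external bus such as Redis Pub/Sub (assumption)."""

    subscribers: list = field(default_factory=list)

    def subscribe(self, handler):
        self.subscribers.append(handler)

    def publish(self, event):
        # Deliver the event to every subscribed replica.
        for handler in list(self.subscribers):
            handler(event)


@dataclass
class Replica:
    """Models one pod holding its own in-memory copy of the routes."""

    name: str
    bus: InMemoryBus
    routes: dict = field(default_factory=dict)

    def __post_init__(self):
        self.bus.subscribe(self._on_event)

    def hot_reload(self, route, utterance):
        # Publish the change instead of mutating only local state.
        self.bus.publish(
            {"origin": self.name, "route": route, "utterance": utterance}
        )

    def _on_event(self, event):
        # Every replica, including the origin, applies the same update.
        self.routes.setdefault(event["route"], []).append(event["utterance"])


bus = InMemoryBus()
pod_a = Replica("pod-a", bus)
pod_b = Replica("pod-b", bus)
pod_a.hot_reload("billing", "I need a refund")
```

After the call, `pod_b.routes` matches `pod_a.routes` even though the request reached only Pod A. A networked bus adds concerns this sketch ignores, such as delivery failures and replicas that start after an event was published.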

Community & Contributing

We welcome contributions of all sizes from the open-source community!

Contributing: Please read our Contributing Guidelines to learn how to set up your development environment, run the test suite, and submit Pull Requests.
Code of Conduct: We are committed to fostering a welcoming environment. Please review our Code of Conduct.
Issues: If you discover a bug or have a feature request, please open an issue.

Release history

Version	Release date
0.3.1	Jun 1, 2026
0.3.0	Jun 1, 2026
0.2.0	May 28, 2026
0.1.0 (this version)	May 27, 2026

Download files

Source Distribution

synaptoroute-0.1.0.tar.gz (26.5 kB)

Uploaded May 27, 2026 Source

Built Distribution

synaptoroute-0.1.0-py2.py3-none-any.whl (11.2 kB)

Uploaded May 27, 2026 Python 2, Python 3

File details

Details for the file synaptoroute-0.1.0.tar.gz.

File metadata

Download URL: synaptoroute-0.1.0.tar.gz
Upload date: May 27, 2026
Size: 26.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for synaptoroute-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`31ebb5901c9e6d9d0c87de50705cefc6d75351e03e79dbf9786efe9448ee30ab`
MD5	`842cb8ec12a00d63cff8d816b165f7d8`
BLAKE2b-256	`8cdf470fce0a5dbb8651ad1966a69521bd2eb66d4a6f21ba3fc50ad85823fb11`

File details

Details for the file synaptoroute-0.1.0-py2.py3-none-any.whl.

File metadata

Download URL: synaptoroute-0.1.0-py2.py3-none-any.whl
Upload date: May 27, 2026
Size: 11.2 kB
Tags: Python 2, Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for synaptoroute-0.1.0-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`44ef568f66ab5a1664a3f74bb71028259b446d7c1624d52c0e8725d3e570634a`
MD5	`93b2bc1879aaf59682eb18e07b6ec602`
BLAKE2b-256	`04478a4c1e7cae1ae725f517d68d5bf822a1d4ac2086220d504b0eac71c948f1`
