Retrieval-first, deterministic RAG infrastructure

Scaraflow

Retrieval-first, deterministic RAG infrastructure for production systems


What is Scaraflow?

Scaraflow is retrieval-first infrastructure for Retrieval-Augmented Generation (RAG), designed for deterministic, low-variance behavior in production systems.

Scaraflow is not:

  • an agent framework
  • a prompt playground
  • a chain-orchestration SDK

Scaraflow focuses on one problem only:

Correct, explicit, and scalable retrieval for LLM systems


Why Scaraflow Exists

Most modern RAG frameworks optimize for:

  • orchestration flexibility
  • feature breadth
  • rapid prototyping

Scaraflow optimizes for:

  • retrieval correctness
  • predictable latency
  • streaming readiness
  • infrastructure consistency

Scaraflow treats retrieval as infrastructure, not glue code.


Design Principles

  • Retrieval before generation
  • Explicit contracts over hidden magic
  • Deterministic behavior
  • Low-variance latency
  • Streaming-ready by design
  • Same semantics in notebooks, services, and production
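
To make "deterministic behavior" concrete, here is a minimal pure-Python sketch of top-k retrieval with a stable tie-break. It is an illustration only, not Scaraflow's implementation: the `cosine` and `top_k` helpers are hypothetical names, and the tie-break on document ID is an assumed policy. The point it shows is that when two documents score equally, an explicit secondary sort key keeps the result independent of insertion order.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, items, k):
    """Return the k best (doc_id, score) pairs.

    `items` is a list of (doc_id, vector) pairs. Ties on score are broken
    by doc_id, so the result does not depend on the order of `items`.
    """
    scored = [(doc_id, cosine(query, vec)) for doc_id, vec in items]
    # Sort by descending score, then ascending doc_id as the tie-break
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


# "a" and "b" have identical vectors, so they tie on score
items = [("b", [1.0, 0.0]), ("a", [1.0, 0.0]), ("c", [0.0, 1.0])]
result = top_k([1.0, 0.0], items, k=2)
print(result)
```

Reversing `items` before the call yields the same result, which is the property a retrieval layer needs if the same query is expected to return the same context every time.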

Architecture Overview

scaraflow/
├── scara-core        # strict contracts & invariants
├── scara-index       # vector store backends (Qdrant)
├── scara-rag         # deterministic RAG engine
├── scara-live        # streaming / temporal RAG (planned)
├── scara-graph       # graph-based RAG (planned)
└── scara-llm         # thin LLM adapters (planned)
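
As a rough illustration of what "strict contracts & invariants" can mean in practice, the sketch below defines an embedder interface and a vector check. Everything here is an assumption made for illustration: `EmbedderContract`, `check_vector`, and `ConstantEmbedder` are hypothetical names and are not taken from scara-core. The only detail borrowed from this README is that an embedder exposes an `embed(text)` method.

```python
import math
from typing import Protocol, Sequence, runtime_checkable


@runtime_checkable
class EmbedderContract(Protocol):
    """Hypothetical contract: anything with an embed(text) method."""

    def embed(self, text: str) -> Sequence[float]: ...


def check_vector(vector: Sequence[float], expected_dim: int) -> None:
    """Raise ValueError if the vector violates the assumed invariants."""
    if len(vector) != expected_dim:
        raise ValueError(f"expected {expected_dim} dimensions, got {len(vector)}")
    if not all(math.isfinite(x) for x in vector):
        raise ValueError("vector contains NaN or infinite values")


class ConstantEmbedder:
    """Toy embedder used only to exercise the contract."""

    def embed(self, text: str) -> list[float]:
        return [0.5, 0.5, 0.5]


embedder = ConstantEmbedder()
# Structural check: the object has the method the contract requires
print(isinstance(embedder, EmbedderContract))
# Invariant check: passes silently for a well-formed vector
check_vector(embedder.embed("hello"), expected_dim=3)
```

Failing early on a wrong dimension or a NaN keeps bad vectors out of the index, where they would otherwise surface later as confusing retrieval results.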

Installation

pip install scaraflow

Dependencies

  • qdrant-client
  • sentence-transformers
  • standard scientific Python stack

Quick Start (30 Seconds)

In-Memory Setup (No Docker)

import uuid
from qdrant_client import QdrantClient
from sentence_transformers import SentenceTransformer
from scara_index.qdrant_store import QdrantVectorStore
from scara_index.config import QdrantConfig
from scara_rag.engine import RAGEngine
from scara_rag.policies import RetrievalPolicy

# In-process Qdrant
client = QdrantClient(":memory:")
store = QdrantVectorStore(
    QdrantConfig(collection="demo", vector_dim=384),
    client=client
)

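# all-MiniLM-L6-v2 produces 384-dimensional vectors, matching vector_dim above
model = SentenceTransformer("all-MiniLM-L6-v2")

# Thin adapter that wraps the model behind an embed() method
class Embedder:
    def embed(self, text):
        return model.encode(text).tolist()

rag = RAGEngine(
    embedder=Embedder(),
    store=store,
    # Stub LLM: ignores its input and returns a fixed string
    llm=lambda _: "Demo answer",
)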

documents = [
    "Scaraflow is retrieval-first.",
    "It prioritizes deterministic behavior.",
    "Qdrant is the reference backend.",
]

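# One UUID per document, used as its ID in the store
ids = [str(uuid.uuid4()) for _ in documents]
# Embed all documents in a single batch call
vectors = model.encode(documents).tolist()

# Store each vector alongside its source text
store.upsert(
    ids=ids,
    vectors=vectors,
    metadata=[{"text": d} for d in documents],
)

# Ask a question, retrieving with top_k=2
response = rag.query(
    "What does Scaraflow prioritize?",
    policy=RetrievalPolicy(top_k=2),
)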

print(response.answer)
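
For tests or CI runs where downloading a model is undesirable, a deterministic stand-in embedder can take the place of the `Embedder` wrapper above. The sketch below is an assumption-laden illustration: `HashEmbedder` is a hypothetical class, not part of Scaraflow, and it assumes the engine only needs an object with an `embed(text)` method, as the quick start suggests. Its vectors carry no semantic meaning, so it is only useful for checking plumbing, not retrieval quality.

```python
import hashlib
import math


class HashEmbedder:
    """Hypothetical test double: maps text to a fixed unit-length vector."""

    def __init__(self, dim=384):
        self.dim = dim

    def embed(self, text):
        values = []
        counter = 0
        # Each SHA-256 digest yields 32 bytes; keep hashing until dim is reached
        while len(values) < self.dim:
            digest = hashlib.sha256(f"{counter}:{text}".encode("utf-8")).digest()
            # Map each byte from [0, 255] to [-1, 1)
            values.extend((b - 128) / 128.0 for b in digest)
            counter += 1
        values = values[: self.dim]
        # Normalize to unit length; fall back to 1.0 to avoid dividing by zero
        norm = math.sqrt(sum(v * v for v in values)) or 1.0
        return [v / norm for v in values]


embedder = HashEmbedder(dim=384)
vector = embedder.embed("Scaraflow is retrieval-first.")
print(len(vector))
```

If this is used in place of the real model, the documents should be embedded with the same `HashEmbedder` before upserting, so that queries and stored vectors come from the same function.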

Production Setup (Docker / Cloud)

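Start a Qdrant server:

docker run -p 6333:6333 qdrant/qdrant

Then point the store at it from Python:

from scara_index.qdrant_store import QdrantVectorStore
from scara_index.config import QdrantConfig

store = QdrantVectorStore(
    QdrantConfig(
        url="http://localhost:6333",
        collection="prod_v1",
        vector_dim=384,
    )
)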

Benchmarks

Documents        : 10000
Queries          : 100
Embedding Time   : 6.47s
Indexing Time    : 0.34s
Avg Latency      : 7.92 ms
P95 Latency      : 11.03 ms
Latency Std Dev  : 1.24 ms

Benchmarks can be run using:

python testing/benchmarks.py
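
For readers who want to reproduce the latency figures on their own retrieval function, the following sketch shows one way the average, P95, and standard deviation could be computed. It is not the contents of `testing/benchmarks.py`: the helper names are hypothetical, and the choices of nearest-rank P95 and population standard deviation are assumptions, since the README does not state which definitions the benchmark uses.

```python
import statistics
import time


def measure_latencies_ms(fn, queries):
    """Call fn once per query and return the wall-clock latencies in milliseconds."""
    latencies = []
    for query in queries:
        start = time.perf_counter()
        fn(query)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarize(latencies_ms):
    """Return average, nearest-rank P95, and population standard deviation."""
    ordered = sorted(latencies_ms)
    n = len(ordered)
    # Nearest-rank percentile: ceil(0.95 * n), computed with integer arithmetic
    rank = max(1, (95 * n + 99) // 100)
    return {
        "avg_ms": statistics.fmean(ordered),
        "p95_ms": ordered[rank - 1],
        "std_ms": statistics.pstdev(ordered),
    }


# Placeholder workload; substitute a real retrieval call here
latencies = measure_latencies_ms(lambda q: sorted(q), ["query one", "query two"])
print(summarize(latencies))
```

A low standard deviation relative to the average is what "low-variance latency" refers to: individual queries stay close to the typical response time instead of occasionally spiking.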

License

MIT License


Author

Built and maintained by Ganesh (K. S. N. Ganesh).
