Faster and smarter Retrieval Augmented Generation using Speculative Retrieval and Context Tetris.

Maintainers

darshmodii

Quira

Lightning-Fast, Context-Dense RAG Framework for Python

Stop waiting. Start predicting.

Quickstart · How It Works · Benchmarks · API · Contributing

🔥 The Problem

Traditional RAG is slow and wasteful:

User types query → Hits Enter → WAIT → Vector search → WAIT → Stuff 10 chunks → WAIT → LLM response
                                 ⏱️ 1.5s avg latency, 65% of context is noise

✨ The Quira Solution

Quira predicts what users need before they finish typing, compresses context to maximize density, and tracks conversation state to eliminate redundant fetches:

User starts typing → Quira searches speculatively → User hits Enter → Context already cached!
                     → Differential fetch (only new chunks) → Context Tetris (compress + score)
                                 ⏱️ 210ms avg latency, 94% context density

📦 Quickstart

Install

pip install quira

Usage

import asyncio
from quira import quiraPipeline, UserSession

async def main():
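    # Initialize with your own clients. `qdrant`, `redis`, `groq`,
    # `my_embed_func`, and `my_spacy_model` are placeholders for objects
    # you create beforehand; they are not defined in this snippet.
    pipeline = quiraPipeline(
        qdrant_client=qdrant,
        redis_client=redis,
        groq_client=groq,
        embed_func=my_embed_func,
        spacy_model=my_spacy_model,
    )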

    session = UserSession(user_id="user_123")

    # 🏎️ Speculative fetch while user types
    await pipeline.handle_typing_event(session, "What is the re")

    # 🎯 Submit — context is already warm!
    answer = await pipeline.process_submission(
        session, "What is the return policy?"
    )
    print(answer)

asyncio.run(main())

Ingest PDFs

# Parse, chunk, embed, and store in one call.
# Run this inside an async function (for example, main() above), since it uses await.
chunks = await pipeline.ingestor.ingest_pdf("user_123", "docs/return_policy.pdf")
print(f"Indexed {chunks} chunks into Qdrant")

⚙️ How It Works

Quira is built on four core modules that work together as a single pipeline:

🏎️ Module 1 — Speculative Retrieval

Listens to user keystrokes via WebSocket. Uses adaptive debouncing (250ms–600ms based on typing speed) to fire Qdrant searches before the user submits. Results are cached in Redis with SHA-256 hashed keys.
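As a rough illustration of the two mechanisms described above, the sketch below clamps a debounce delay to the 250–600 ms range and builds a SHA-256 cache key. The function names, the "twice the average keystroke gap" heuristic, the `spec:` key prefix, and the query normalization are all assumptions made for illustration; Quira's actual implementation may differ.

```python
import hashlib

MIN_DEBOUNCE_MS = 250  # lower bound stated in the README
MAX_DEBOUNCE_MS = 600  # upper bound stated in the README


def adaptive_debounce_ms(avg_key_interval_ms: float) -> int:
    """Pick a debounce delay from the typist's average gap between keys.

    Assumed heuristic: wait about twice the average gap, clamped to
    the [250, 600] ms range.
    """
    candidate = 2.0 * avg_key_interval_ms
    return int(min(MAX_DEBOUNCE_MS, max(MIN_DEBOUNCE_MS, candidate)))


def cache_key(user_id: str, partial_query: str) -> str:
    """Build a cache key from a SHA-256 digest of the user and query.

    Assumption: queries are lowercased and whitespace-collapsed first,
    so trivially different inputs share one cache entry.
    """
    normalized = " ".join(partial_query.lower().split())
    digest = hashlib.sha256(f"{user_id}:{normalized}".encode("utf-8")).hexdigest()
    return f"spec:{digest}"
```

Under this heuristic, a fast typist (50 ms between keys) gets the 250 ms floor, while a slow typist (1000 ms between keys) gets the 600 ms ceiling.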

🧩 Module 2 — Context Tetris

Scores every chunk on 4 dimensions: Relevance, Recency, Uniqueness, and Density. Uses Groq LLM to compress filler text. Orders chunks in a U-shape (best chunks at start and end) to combat "Lost in the Middle" syndrome.
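The scoring and U-shaped ordering described above could look roughly like the following. This is a minimal sketch: the `Chunk` dataclass, the weights, and the alternating placement strategy are assumptions for illustration, and the LLM compression step is left out entirely.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    relevance: float
    recency: float
    uniqueness: float
    density: float


# Hypothetical weights; the README does not say how the four scores combine.
WEIGHTS = {"relevance": 0.4, "recency": 0.2, "uniqueness": 0.2, "density": 0.2}


def score(chunk: Chunk) -> float:
    """Weighted sum of the four per-chunk dimensions."""
    return sum(weight * getattr(chunk, name) for name, weight in WEIGHTS.items())


def u_shape(chunks: list[Chunk]) -> list[Chunk]:
    """Order chunks so the strongest sit at both ends and the weakest in the middle."""
    ranked = sorted(chunks, key=score, reverse=True)
    front: list[Chunk] = []
    back: list[Chunk] = []
    for index, chunk in enumerate(ranked):
        # Alternate: even ranks fill from the start, odd ranks from the end.
        (front if index % 2 == 0 else back).append(chunk)
    return front + back[::-1]
```

With five chunks ranked 1 (best) to 5 (worst), this ordering yields 1, 3, 5, 4, 2, so the two best chunks land at the first and last positions.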

🔄 Module 3 — Differential Retrieval

Maintains a stateful Context Pool across conversation turns and measures cosine similarity between consecutive queries. If the similarity exceeds 0.6, it fetches only delta chunks (new chunks not already in the pool). When the topic shifts, it garbage-collects stale context.
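The decision between a delta fetch and a full fetch can be sketched as below. Only the 0.6 threshold comes from the description above; the function names, the use of chunk IDs, and the choice to clear the whole pool on a topic shift are assumptions made for illustration.

```python
import math
from typing import Optional, Sequence

SIMILARITY_THRESHOLD = 0.6  # threshold stated in the README


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def plan_fetch(
    prev_query_vec: Optional[Sequence[float]],
    curr_query_vec: Sequence[float],
    pool_ids: set[str],
    candidate_ids: list[str],
) -> tuple[list[str], set[str]]:
    """Return (chunk IDs to fetch, chunk IDs to keep in the pool)."""
    same_topic = (
        prev_query_vec is not None
        and cosine_similarity(prev_query_vec, curr_query_vec) > SIMILARITY_THRESHOLD
    )
    if same_topic:
        # Delta fetch: only chunks the pool does not already hold.
        return [cid for cid in candidate_ids if cid not in pool_ids], set(pool_ids)
    # Topic shift (or first turn): fetch everything and drop the old pool.
    return list(candidate_ids), set()
```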

📄 Module 4 — Document Ingestion

Parses PDFs with PyMuPDF. Splits text into overlapping chunks (1000 characters with a 200-character overlap by default) to reduce sentence fragmentation at chunk boundaries. Generates embeddings and upserts directly into Qdrant.
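The overlapping split can be illustrated with a simple character-window chunker. This is a sketch using the default sizes quoted above; the function name and the handling of the final short chunk are assumptions, and PDF parsing, embedding, and the Qdrant upsert are not shown.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into windows of chunk_size characters that overlap by `overlap`."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far each window advances
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

With the defaults, a 2500-character text produces three chunks starting at offsets 0, 800, and 1600, and the last 200 characters of each chunk repeat as the first 200 of the next.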

Architecture

┌──────────────────────────────────────────────────────────────┐
│                        QUIRA PIPELINE                        │
│                                                              │
│  ┌─────────────┐    ┌──────────────┐    ┌────────────────┐  │
│  │  Speculative │───▶│ Differential │───▶│ Context Tetris │  │
│  │  Retriever   │    │  Retriever   │    │  (Compress +   │  │
│  │  (Predict)   │    │  (Delta)     │    │   Score + Pack)│  │
│  └──────┬───────┘    └──────┬───────┘    └───────┬────────┘  │
│         │                   │                    │           │
│    ┌────▼────┐         ┌────▼────┐          ┌────▼────┐     │
│    │  Redis  │         │ Qdrant  │          │  Groq   │     │
│    │ (Cache) │         │(Vectors)│          │  (LLM)  │     │
│    └─────────┘         └─────────┘          └─────────┘     │
└──────────────────────────────────────────────────────────────┘

📊 Benchmarks

Metric	Traditional RAG	Quira	Improvement
Avg Latency	1,450 ms	210 ms	🚀 85% faster
Context Density	35%	94%	🧠 ~2.7× denser
Token Cost	Baseline	-40%	💰 40% cheaper
Redundant Fetches	Every turn	Delta only	♻️ ~70% fewer
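The arithmetic behind the Improvement column can be checked from the raw numbers in the table. The helper names below are illustrative only, and the snippet says nothing about how the measurements themselves were taken.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative drop from `before` to `after`, as a percentage."""
    return (before - after) / before * 100.0


def improvement_ratio(before: float, after: float) -> float:
    """How many times larger `after` is than `before`."""
    return after / before


latency_drop = percent_reduction(1450, 210)  # about 85.5 (percent lower latency)
density_gain = improvement_ratio(35, 94)     # about 2.69 (times denser)
```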

📚 API Reference

`quiraPipeline(qdrant_client, redis_client, groq_client, embed_func, spacy_model)`

The main pipeline class. Accepts your own client instances.

Method	Description
`handle_typing_event(session, keystrokes)`	Trigger speculative retrieval on keystrokes
`process_submission(session, query)`	Full retrieval + compression pipeline
`ingestor.ingest_pdf(user_id, path)`	Parse, chunk, embed, and store a PDF
`ingestor.ingest_text(user_id, text)`	Chunk, embed, and store raw text

`UserSession(user_id, websocket=None)`

Tracks per-user conversation state, context pools, and turn history.

🔒 Security

Quira is regularly audited with Bandit (Python AST security linter):

✅ 0 vulnerabilities across all severity levels
✅ SHA-256 hashing for all cache keys (no weak hashes)
✅ No hardcoded secrets or credentials
✅ Safe file I/O with proper exception handling

🤝 Contributing

Contributions are welcome! Please open an issue or submit a pull request.

# Clone the repo
git clone https://github.com/DevDarsh26/quira.git
cd quira

# Create a virtual environment
python -m venv .venv

# Activate it (run only the line for your OS)
.venv\Scripts\activate   # Windows
source .venv/bin/activate  # macOS/Linux

# Install in editable mode with dev dependencies
pip install -e ".[dev]"

Built with ❤️ by darshmodii.in

If you like Quira, drop a ⭐ on GitHub. It means the world!

Release history

0.2.0

Jun 16, 2026

This version

0.1.0

Jun 16, 2026

Download files

Source Distribution

quira-0.1.0.tar.gz (19.0 kB)

Uploaded Jun 16, 2026 Source

Built Distribution

quira-0.1.0-py3-none-any.whl (18.0 kB)

Uploaded Jun 16, 2026 Python 3

File details

Details for the file quira-0.1.0.tar.gz.

File metadata

Download URL: quira-0.1.0.tar.gz
Upload date: Jun 16, 2026
Size: 19.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for quira-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`c24309fbfdcaf8fe32f91fafb60c5a56a5a0959cda4f0dddc697b39cf298dd85`
MD5	`7ec483aa543891339bfde2c481efc64d`
BLAKE2b-256	`eaebfb0c005c447eb759f24b222fb0bcb103273184c0cce98301d9d7e3f6ce98`

Provenance

The following attestation bundles were made for quira-0.1.0.tar.gz:

Publisher: publish.yml on DevDarsh26/Quira

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: quira-0.1.0.tar.gz
- Subject digest: c24309fbfdcaf8fe32f91fafb60c5a56a5a0959cda4f0dddc697b39cf298dd85
- Sigstore transparency entry: 1838028800
- Sigstore integration time: Jun 16, 2026
Source repository:
- Permalink: DevDarsh26/Quira@30b8c8608ada7e3aca33562a36ae1bd6b7678786
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/DevDarsh26
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@30b8c8608ada7e3aca33562a36ae1bd6b7678786
- Trigger Event: push

File details

Details for the file quira-0.1.0-py3-none-any.whl.

File metadata

Download URL: quira-0.1.0-py3-none-any.whl
Upload date: Jun 16, 2026
Size: 18.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for quira-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1bb78165062bc963c5f2cdb93f48e9d8b6e796bd366d4737592cb99f03086639`
MD5	`96336654a9658ac1b78255da2fb6ca30`
BLAKE2b-256	`13821772e34414555d391b29fae80f4839e592ceec029ddcaa9efbc8185ed7b7`

Provenance

The following attestation bundles were made for quira-0.1.0-py3-none-any.whl:

Publisher: publish.yml on DevDarsh26/Quira

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: quira-0.1.0-py3-none-any.whl
- Subject digest: 1bb78165062bc963c5f2cdb93f48e9d8b6e796bd366d4737592cb99f03086639
- Sigstore transparency entry: 1838028974
- Sigstore integration time: Jun 16, 2026
Source repository:
- Permalink: DevDarsh26/Quira@30b8c8608ada7e3aca33562a36ae1bd6b7678786
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/DevDarsh26
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@30b8c8608ada7e3aca33562a36ae1bd6b7678786
- Trigger Event: push

quira 0.1.0
