A robust middleware for hybrid semantic caching, text normalization, and vector search optimization.

These details have not been verified by PyPI

Project description

Hybrid Semantic Cache Middleware

A robust middleware designed to optimize Large Language Model (LLM) query processing through hybrid semantic caching and advanced text normalization.

Built to accelerate responses from cloud LLMs (like Google Gemini 1.5 Flash), this library intercepts typo-heavy or colloquial queries, normalizes them, and retrieves cached responses locally using FAISS and all-MiniLM-L6-v2 embeddings. Queries answered from the local cache never reach the cloud API, which cuts response latency and reduces API usage costs.

🚀 Key Features

Query Normalization: Automatically handles typos and slang terms before embedding, increasing cache hit rates.
Local Vector Caching: Utilizes FAISS for lightning-fast similarity search and retrieval.
LLM Latency Reduction: Bypasses cloud LLM API calls for recurring or semantically similar queries.
FastAPI Ready: Designed to be easily integrated into modern asynchronous Python backend systems.
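
To illustrate why normalizing before embedding can raise the cache hit rate, here is a minimal sketch of dictionary-based normalization. It is not the package's implementation: `SLANG_MAP` and `normalize_sketch` are hypothetical names, and the real vocabulary and rules used by the library are not shown in this README.

```python
import re

# Hypothetical slang/typo table, for illustration only.
# The package's real vocabulary is not documented here.
SLANG_MAP = {
    "tlong": "tolong",
    "bkinin": "bikinin",
    "srt": "surat",
}


def normalize_sketch(text: str) -> str:
    """Lowercase, drop punctuation, collapse whitespace, expand known slang."""
    tokens = re.findall(r"\w+", text.lower())
    return " ".join(SLANG_MAP.get(token, token) for token in tokens)


print(normalize_sketch("Tlong bkinin srt resign dong"))
# prints: tolong bikinin surat resign dong
```

Because differently typed variants of the same request collapse to one string, they also map to the same embedding, so a single cached answer can serve all of them.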

📦 Installation

Install the package directly via pip:

pip install hybrid-semantic-cache


💻 Prerequisites
Python 3.8 or higher.

Google Gemini API Key (if using the default fallback LLM).

Set your API key as an environment variable before running your application:

# Windows (Command Prompt; do not add quotes, they would become part of the value)
set GEMINI_API_KEY=YOUR_API_KEY

# Mac/Linux
export GEMINI_API_KEY="YOUR_API_KEY"
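
On the Python side, an application can read the variable and fail early with a clear message if it is missing. This is a small sketch using only the standard library; `load_gemini_api_key` is a hypothetical helper, not a function provided by the package.

```python
import os


def load_gemini_api_key() -> str:
    """Return the key from GEMINI_API_KEY, or raise a clear error if it is unset."""
    key = os.environ.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GEMINI_API_KEY is not set; export it before starting the application."
        )
    return key
```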


🛠️ Quick Start
Here is a basic example of how to integrate the middleware into your existing Python application:

# Note: the import paths below are illustrative; check the installed package for the actual module and function names.
from hybrid_semantic_cache.main import app  # application object exposed by the package (unused in this snippet)
from hybrid_semantic_cache.normalizer import normalize_text

# Example: a typo-heavy, colloquial Indonesian query
# (roughly "please write me a resignation letter")
user_query = "Tlong bkinin srt resign dong"

# 1. The middleware normalizes the text
clean_query = normalize_text(user_query)
print(clean_query)

# 2. It then checks the local FAISS cache for a semantic match using all-MiniLM-L6-v2 embeddings
# 3. It falls back to the cloud LLM (Gemini) only if no local match is found, saving latency and cost
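
Steps 2 and 3 in the comments above can be sketched as a lookup-then-fallback loop. To keep the sketch dependency-free, it swaps in a toy character-trigram embedding for all-MiniLM-L6-v2 and a linear scan for the FAISS index. `toy_embed`, `SketchCache`, `answer`, and the 0.8 threshold are all assumptions made for illustration, not the package's API or defaults.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Optional, Tuple

Vector = Dict[str, float]


def toy_embed(text: str) -> Vector:
    """Stand-in embedding: L2-normalized character-trigram counts (not MiniLM)."""
    padded = f"  {text.lower()}  "
    counts = Counter(padded[i:i + 3] for i in range(len(padded) - 2))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {gram: c / norm for gram, c in counts.items()}


def cosine(a: Vector, b: Vector) -> float:
    """Dot product of two unit vectors, which equals their cosine similarity."""
    return sum(weight * b.get(gram, 0.0) for gram, weight in a.items())


class SketchCache:
    """Linear-scan semantic cache; a real deployment would use a FAISS index."""

    def __init__(self, threshold: float = 0.8) -> None:
        self.threshold = threshold  # assumed value, tune for your data
        self.entries: List[Tuple[Vector, str]] = []

    def lookup(self, query: str) -> Optional[str]:
        """Return the best cached answer if its similarity clears the threshold."""
        q = toy_embed(query)
        best_score, best_answer = -1.0, None
        for vector, cached_answer in self.entries:
            score = cosine(q, vector)
            if score > best_score:
                best_score, best_answer = score, cached_answer
        return best_answer if best_score >= self.threshold else None

    def store(self, query: str, response: str) -> None:
        self.entries.append((toy_embed(query), response))


def answer(query: str, cache: SketchCache, call_llm: Callable[[str], str]) -> str:
    """Serve from the cache when possible; otherwise call the LLM and cache the result."""
    cached = cache.lookup(query)
    if cached is not None:
        return cached
    fresh = call_llm(query)
    cache.store(query, fresh)
    return fresh
```

In this sketch `call_llm` is any callable that takes the normalized query and returns a string, so the cloud client can be swapped or mocked in tests without touching the cache logic.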


📄 License
This project is licensed under the MIT License.

Project details

Release history

0.1.1 (this version): Jun 1, 2026
0.1.0: Jun 1, 2026

Download files

Source Distribution

hybrid_semantic_cache-0.1.1.tar.gz (7.9 kB)

Uploaded Jun 1, 2026 Source

Built Distribution

hybrid_semantic_cache-0.1.1-py3-none-any.whl (8.6 kB)

Uploaded Jun 1, 2026 Python 3

File details

Details for the file hybrid_semantic_cache-0.1.1.tar.gz.

File metadata

Download URL: hybrid_semantic_cache-0.1.1.tar.gz
Upload date: Jun 1, 2026
Size: 7.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for hybrid_semantic_cache-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`9c0c3123185ca2d730a46f12e6f182e752dd73cd5e0bf30bd289fcc5a0296ab6`
MD5	`55592ff94c30b58cdacb722ada67f3fd`
BLAKE2b-256	`1cb9b6d5dd828f837f256f09671fb6137728fd943696f2875945bf35e68498d1`

File details

Details for the file hybrid_semantic_cache-0.1.1-py3-none-any.whl.

File metadata

Download URL: hybrid_semantic_cache-0.1.1-py3-none-any.whl
Upload date: Jun 1, 2026
Size: 8.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for hybrid_semantic_cache-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`016efa8cef85b94559d635e91ffe01f12c18901bc3d7b92da99da1dfeb177462`
MD5	`cc09340280bc706c8a4de716e629bb88`
BLAKE2b-256	`61333eefca2b232051ec0b74d769537072bf86e8285daba84a8bab8cfda191bd`
