Unified Hyperbolic Spectral Retrieval (UHSR) - a novel text retrieval algorithm combining lexical and semantic search.

Project description

Unified Hyperbolic Spectral Retrieval (UHSR)

Unified Hyperbolic Spectral Retrieval (UHSR) is a hybrid text retrieval model that combines lexical search (BM25) with semantic vector search (via FAISS or Pinecone) and applies spectral re-ranking to produce interpretable relevance scores normalized to the [0,1] range.

🚀 Key Features

  • 🔍 Hybrid Retrieval: Combines BM25 for lexical scoring and dense vector semantic similarity for contextual understanding.
  • 🎯 Multi-Metric Similarity: Supports cosine, euclidean, mahalanobis, manhattan, chebyshev, jaccard, and hamming metrics.
  • 🔬 Spectral Re-Ranking: Uses graph Laplacian & Fiedler vector to boost highly relevant candidates.
  • ⚡ AI-powered Reranking: Supports Hugging Face Cross-Encoders & OpenAI API-based Reranking.
  • 📈 Interpretable Scores: Final relevance scores are logistic-normalized in [0,1] for easy ranking.
  • 🚀 Scalable & Efficient: Works with FAISS (local) for fast retrieval and Pinecone (cloud-based) for large-scale vector search.
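
The spectral re-ranking feature above refers to the graph Laplacian and its Fiedler vector (the eigenvector belonging to the second-smallest eigenvalue). The following is a minimal pure-Python sketch of how such a vector can be approximated for a small candidate-similarity graph. It is not UHSR's implementation: the function name `fiedler_vector`, the shifted power iteration, and the example weights are assumptions for illustration, and how UHSR turns the vector into a score boost is not specified here.

```python
import math

def fiedler_vector(weights, iters=500):
    """Approximate the Fiedler vector of a connected, undirected graph.

    weights: symmetric n x n list of lists (n >= 2) with non-negative
    edge weights and a zero diagonal. Hypothetical helper, not UHSR's API.
    """
    n = len(weights)
    degree = [sum(row) for row in weights]
    # Gershgorin bound: every eigenvalue of L = D - W lies in [0, 2 * max degree].
    shift = 2.0 * max(degree)

    # Deterministic start vector; its constant component is removed in the loop.
    x = [float(i) for i in range(n)]
    for _ in range(iters):
        # Project out the constant vector (the eigenvector for eigenvalue 0).
        mean = sum(x) / n
        x = [v - mean for v in x]
        # Apply M = shift * I - L. Its dominant remaining eigenvector is the
        # Fiedler vector, so power iteration converges towards it.
        y = [
            (shift - degree[i]) * x[i]
            + sum(weights[i][j] * x[j] for j in range(n))
            for i in range(n)
        ]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    return x

# Two tightly connected groups of candidates joined by one weak edge (2-3).
example_weights = [
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
]
example_fiedler = fiedler_vector(example_weights)
```

In this example the entries for candidates 0 to 2 share one sign and those for candidates 3 to 5 share the opposite sign, which is the cluster structure a spectral step can exploit. For larger graphs a dense eigensolver from a numerical library would normally replace the hand-written iteration.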

🛠️ How It Works

UHSR enhances traditional retrieval by blending BM25-based keyword matching with semantic vector representations using the following pipeline:

Step | Description
1️⃣ Lexical Filtering | Uses BM25 to rank documents by keyword relevance
2️⃣ Semantic Scoring | Computes similarity using FAISS or Pinecone
3️⃣ Fusion Process | Blends scores via logistic normalization & harmonic fusion
4️⃣ Spectral Re-Ranking | Uses graph Laplacian analysis to boost central candidates
5️⃣ (Optional) AI Reranking | Uses OpenAI API or Hugging Face Cross-Encoders
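
The fusion step in the pipeline above can be illustrated with a short sketch. It assumes that "logistic normalization" means passing standardized scores through a sigmoid and that "harmonic fusion" means taking the harmonic mean of the two normalized scores. Both readings, the function names, and the `temperature` and `eps` parameters are assumptions, not UHSR's documented behavior.

```python
import math

def logistic_normalize(scores, temperature=1.0):
    """Map raw scores into (0, 1) with a sigmoid centered on the mean score."""
    mean = sum(scores) / len(scores)
    var = sum((s - mean) ** 2 for s in scores) / len(scores)
    std = math.sqrt(var) or 1.0  # fall back to 1.0 when all scores are equal
    return [
        1.0 / (1.0 + math.exp(-(s - mean) / (std * temperature)))
        for s in scores
    ]

def harmonic_fusion(lexical, semantic, eps=1e-12):
    """Harmonic mean of two normalized score lists, element by element."""
    return [2.0 * a * b / (a + b + eps) for a, b in zip(lexical, semantic)]

# Illustrative inputs: raw BM25 scores and cosine similarities for 3 documents.
bm25_raw = [12.0, 3.5, 0.0]
cosine_raw = [0.82, 0.40, 0.75]

fused = harmonic_fusion(
    logistic_normalize(bm25_raw),
    logistic_normalize(cosine_raw),
)
ranking = sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)
```

The harmonic mean is low unless both inputs are high, so a document needs support from the lexical and the semantic signal to rank near the top. Here the first document, which scores well on both, ends up first.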

🌍 Supported Retrieval Methods

  • BM25 (Lexical Matching)
  • FAISS (Local Vector Search)
  • Pinecone (Cloud Vector Search)
  • Hugging Face Rerankers
  • OpenAI API-based Reranking
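
For readers unfamiliar with the lexical side of the list above, here is a minimal pure-Python sketch of Okapi BM25 scoring over pre-tokenized documents. It uses the common non-negative IDF variant and the conventional defaults `k1=1.5` and `b=0.75`. The function name and parameter values are assumptions for illustration, and UHSR may rely on a different BM25 implementation or settings.

```python
import math
from collections import Counter

def bm25_scores(query_tokens, corpus_tokens, k1=1.5, b=0.75):
    """Score every document in a non-empty tokenized corpus against a query."""
    n_docs = len(corpus_tokens)
    avgdl = sum(len(doc) for doc in corpus_tokens) / n_docs

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for doc in corpus_tokens:
        df.update(set(doc))

    scores = []
    for doc in corpus_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            # Non-negative IDF variant: ln((N - n + 0.5) / (n + 0.5) + 1).
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            denom = tf[term] + k1 * (1.0 - b + b * len(doc) / avgdl)
            score += idf * tf[term] * (k1 + 1.0) / denom
        scores.append(score)
    return scores

corpus = [
    ["hyperbolic", "retrieval", "model"],
    ["spectral", "graph", "laplacian"],
    ["retrieval", "retrieval", "with", "bm25"],
]
scores = bm25_scores(["retrieval"], corpus)
```

The third document mentions the query term twice and therefore outranks the first, while the second document, which never mentions it, scores zero.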

📌 Why UHSR?

  • Better Search Results: Combines exact keyword matching (BM25) with contextual embeddings (Semantic Search).
  • Fast & Scalable: Uses FAISS for local retrieval or Pinecone for cloud-based vector search.
  • Interpretable Ranking: Outputs normalized scores in [0,1], making it easy to interpret.
  • Multi-Metric Similarity: Supports cosine, euclidean, mahalanobis, manhattan, chebyshev, jaccard, and hamming.
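
To make the metric names in the list above concrete, the sketch below gives textbook definitions for six of them in plain Python. Mahalanobis is omitted because it additionally needs an inverse covariance matrix. The function names and the `1 / (1 + d)` conversion from distance to similarity are assumptions for illustration; how UHSR maps distances to similarities is not stated here.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def manhattan_distance(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))

def chebyshev_distance(u, v):
    return max(abs(a - b) for a, b in zip(u, v))

def hamming_distance(u, v):
    """Number of positions at which two equal-length sequences differ."""
    return sum(a != b for a, b in zip(u, v))

def jaccard_similarity(a, b):
    """Size of the intersection over size of the union of two sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def distance_to_similarity(d):
    """Assumed conversion: distance 0 maps to 1.0, large distances approach 0."""
    return 1.0 / (1.0 + d)
```

Cosine and Jaccard are already similarities, whereas the other four are distances that must be inverted in some way before they can be blended with a relevance score.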

🎯 Intended Use

UHSR is designed for:

  • Information Retrieval Research
  • Search Engines & Recommendation Systems
  • NLP Applications in AI & Machine Learning
  • Academic & Industry-scale Document Ranking

📂 Code & Documentation

For complete documentation, usage examples, and implementation details, visit the GitHub repository.


🔥 Try UHSR today and revolutionize your search engine! 🚀

Download files

Source Distribution

uhsr-0.2.7.tar.gz (13.8 kB view details)

Uploaded Source

Built Distribution

uhsr-0.2.7-py3-none-any.whl (12.4 kB view details)

Uploaded Python 3

File details

Details for the file uhsr-0.2.7.tar.gz.

File metadata

  • Download URL: uhsr-0.2.7.tar.gz
  • Upload date:
  • Size: 13.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.4

File hashes

Hashes for uhsr-0.2.7.tar.gz
Algorithm | Hash digest
SHA256 | 1239e2fa3568232483c035793ce62824598877308c996e177573cbfd15698c2d
MD5 | 9efc48a65bc364c04d708627f461a778
BLAKE2b-256 | 48a68c6adf594c38d9cabcf081b6050fac4e4e103f6195c3d4f43f55475d8813

File details

Details for the file uhsr-0.2.7-py3-none-any.whl.

File metadata

  • Download URL: uhsr-0.2.7-py3-none-any.whl
  • Upload date:
  • Size: 12.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.4

File hashes

Hashes for uhsr-0.2.7-py3-none-any.whl
Algorithm | Hash digest
SHA256 | 39ebd8ee096f1c8c1f5915d3c99655f07c4810391e072d80f0458f1ab331370a
MD5 | 0916504d81374375b294411e13b4fc18
BLAKE2b-256 | 6023a9602fc6ed6adc93237bd11217a69bdf1be7dc979554b5e41b5c8f4ae30c