Unified Hyperbolic Spectral Retrieval (UHSR) - a novel text retrieval algorithm combining lexical and semantic search.

Project description

Unified Hyperbolic Spectral Retrieval (UHSR)

Unified Hyperbolic Spectral Retrieval (UHSR) is a hybrid text retrieval model that combines lexical search (BM25) with dense-vector semantic search (FAISS or Pinecone) and then applies spectral re-ranking. Final relevance scores are normalized to the [0,1] range, so they can be compared and interpreted directly.

🚀 Key Features

  • 🔍 Hybrid Retrieval: Combines BM25 for lexical scoring and dense vector semantic similarity for contextual understanding.
  • 🎯 Multi-Metric Similarity: Supports cosine, Euclidean, Mahalanobis, Manhattan, Chebyshev, Jaccard, and Hamming metrics.
  • 🔬 Spectral Re-Ranking: Uses the graph Laplacian and its Fiedler vector to boost highly relevant candidates.
  • ⚡ AI-powered Re-Ranking (optional): Supports Hugging Face Cross-Encoders and OpenAI API-based re-ranking.
  • 📈 Interpretable Scores: Final relevance scores are logistic-normalized to [0,1] for easy ranking.
  • 🚀 Scalable & Efficient: Works with FAISS (local) for fast retrieval and Pinecone (cloud-based) for large-scale vector search.
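
To make the spectral re-ranking idea concrete, here is a minimal sketch using NumPy. It is not UHSR's actual implementation: the function names, the `strength` parameter, and the boosting rule (lift every candidate that falls on the same side of the Fiedler vector as the top-scored candidate) are assumptions chosen for illustration.

```python
import numpy as np


def fiedler_vector(similarity):
    """Eigenvector of the second-smallest eigenvalue of L = D - W."""
    weights = np.asarray(similarity, dtype=float).copy()
    np.fill_diagonal(weights, 0.0)                 # ignore self-loops
    laplacian = np.diag(weights.sum(axis=1)) - weights
    _, eigenvectors = np.linalg.eigh(laplacian)    # eigenvalues ascending
    return eigenvectors[:, 1]


def spectral_boost(scores, similarity, strength=0.2):
    """Hypothetical rule: boost candidates clustered with the top hit.

    Scores are assumed to lie in [0, 1] already; the update
    s + strength * (1 - s) keeps them inside that range.
    """
    scores = np.asarray(scores, dtype=float)
    vector = fiedler_vector(similarity)
    top = int(np.argmax(scores))
    if vector[top] < 0:                            # eigenvector sign is arbitrary
        vector = -vector
    same_side = vector > 0
    return np.where(same_side, scores + strength * (1.0 - scores), scores)


# Toy candidate graph: documents 0-2 are mutually similar, as are 3-4.
similarity_graph = np.array([
    [0.00, 0.90, 0.90, 0.05, 0.05],
    [0.90, 0.00, 0.90, 0.05, 0.05],
    [0.90, 0.90, 0.00, 0.05, 0.05],
    [0.05, 0.05, 0.05, 0.00, 0.90],
    [0.05, 0.05, 0.05, 0.90, 0.00],
])
boosted = spectral_boost([0.9, 0.5, 0.5, 0.5, 0.5], similarity_graph)
```

In this toy graph, documents 1 and 2 sit in the same cluster as the top hit and are lifted above documents 3 and 4, even though all four started with the same score.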

🛠️ How It Works

UHSR enhances traditional retrieval by blending BM25-based keyword matching with semantic vector representations using the following pipeline:

Step | Description
1️⃣ Lexical Filtering | Uses BM25 to rank documents by keyword relevance
2️⃣ Semantic Scoring | Computes similarity using FAISS or Pinecone
3️⃣ Fusion Process | Blends scores via logistic normalization & harmonic fusion
4️⃣ Spectral Re-Ranking | Uses graph Laplacian analysis to boost central candidates
5️⃣ (Optional) AI Reranking | Uses OpenAI API or Hugging Face Cross-Encoders
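
The fusion step (step 3) can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the package's code: it assumes "logistic normalization" means z-scoring each score list and passing it through a sigmoid, and that "harmonic fusion" means the harmonic mean of the two normalized scores. The function names and sample scores are hypothetical.

```python
import math
from statistics import mean, pstdev


def logistic_normalize(raw_scores):
    """Z-score the inputs, then squash them into (0, 1) with a sigmoid."""
    mu = mean(raw_scores)
    sigma = pstdev(raw_scores) or 1.0   # avoid dividing by zero on constant input
    return [1.0 / (1.0 + math.exp(-(s - mu) / sigma)) for s in raw_scores]


def harmonic_fusion(lexical, semantic, eps=1e-12):
    """Harmonic mean per document; a low score on either side pulls it down."""
    return [2.0 * a * b / (a + b + eps) for a, b in zip(lexical, semantic)]


# Hypothetical raw scores for three candidate documents.
bm25_raw = [12.0, 3.5, 0.4]         # unbounded lexical scores
cosine_raw = [0.82, 0.75, 0.10]     # semantic similarities

fused = harmonic_fusion(logistic_normalize(bm25_raw),
                        logistic_normalize(cosine_raw))
```

The harmonic mean rewards documents that score well on both signals: a candidate with strong keyword overlap but weak semantic similarity (or the reverse) ends up closer to its weaker score than an arithmetic average would place it.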

🌍 Supported Retrieval Methods

  • BM25 (Lexical Matching)
  • FAISS (Local Vector Search)
  • Pinecone (Cloud Vector Search)
  • Hugging Face Rerankers
  • OpenAI API-based Reranking
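
For readers unfamiliar with the lexical side, the sketch below implements the standard Okapi BM25 formula in plain Python. It shows the general technique only; UHSR's own BM25 implementation, tokenizer, and parameter defaults may differ. The whitespace tokenizer, the `k1` and `b` values, and the non-negative IDF variant are assumptions.

```python
import math
from collections import Counter


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    if not documents:
        return []
    tokenized = [doc.lower().split() for doc in documents]  # naive tokenizer
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Number of documents that contain each term at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            n_t = doc_freq[term]
            idf = math.log(1.0 + (n_docs - n_t + 0.5) / (n_t + 0.5))
            tf = term_freq[term]
            length_norm = k1 * (1.0 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1.0) / (tf + length_norm)
        scores.append(score)
    return scores


docs = [
    "spectral graph ranking",
    "keyword search with bm25",
    "cooking pasta at home",
]
lexical = bm25_scores("bm25 keyword search", docs)
```

Only the second document shares terms with the query, so it is the only one with a non-zero score. Raw BM25 scores are unbounded, which is why a normalization step is needed before they can be blended with semantic similarities.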

📌 Why UHSR?

  • Better Search Results: Combines exact keyword matching (BM25) with contextual embeddings (Semantic Search).
  • Fast & Scalable: Uses FAISS for fast local retrieval or Pinecone for large-scale, cloud-based vector search.
  • Interpretable Ranking: Outputs normalized scores in [0,1], making it easy to interpret.
  • Multi-Metric Similarity: Supports cosine, Euclidean, Mahalanobis, Manhattan, Chebyshev, Jaccard, and Hamming metrics.
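
Several of the listed metrics are distances rather than similarities, so they need a conversion before they can be ranked on a common scale. The sketch below shows one way to do that in plain Python. The `similarity` function, the metric names as strings, and the `1 / (1 + d)` conversion are assumptions for illustration, not UHSR's API. Mahalanobis is left out because it needs a covariance matrix estimated from the corpus.

```python
import math


def similarity(u, v, metric="cosine"):
    """Return a similarity for two equal-length numeric vectors.

    Distances are mapped to (0, 1] with 1 / (1 + d). Cosine is returned
    as-is, so it can be negative for vectors pointing in opposite directions.
    """
    if len(u) != len(v) or not u:
        raise ValueError("vectors must be non-empty and of equal length")

    if metric == "cosine":
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm if norm else 0.0
    if metric == "jaccard":
        # Treat non-zero positions as set members.
        on_u = {i for i, a in enumerate(u) if a}
        on_v = {i for i, b in enumerate(v) if b}
        union = on_u | on_v
        return len(on_u & on_v) / len(union) if union else 1.0
    if metric == "hamming":
        # Fraction of positions where the two vectors agree.
        return sum(a == b for a, b in zip(u, v)) / len(u)

    diffs = [abs(a - b) for a, b in zip(u, v)]
    if metric == "euclidean":
        distance = math.sqrt(sum(d * d for d in diffs))
    elif metric == "manhattan":
        distance = sum(diffs)
    elif metric == "chebyshev":
        distance = max(diffs)
    else:
        raise ValueError(f"unsupported metric: {metric}")
    return 1.0 / (1.0 + distance)


u = [1, 0, 1, 1]
v = [1, 1, 0, 1]
results = {m: similarity(u, v, m) for m in
           ("cosine", "euclidean", "manhattan", "chebyshev", "jaccard", "hamming")}
```

Different metrics can rank the same pair of vectors quite differently, so the choice should match how the embeddings were produced: cosine for normalized dense embeddings, Jaccard or Hamming for binary or set-like features.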

🎯 Intended Use

UHSR is designed for:

  • Information Retrieval Research
  • Search Engines & Recommendation Systems
  • NLP Applications in AI & Machine Learning
  • Academic & Industry-scale Document Ranking

📂 Code & Documentation

For complete documentation, usage examples, and implementation details, visit the GitHub repository.


🔥 Try UHSR today and revolutionize your search engine! 🚀

Download files

Source Distribution

uhsr-0.2.6.tar.gz (13.8 kB view details)

Uploaded Source

Built Distribution

uhsr-0.2.6-py3-none-any.whl (12.4 kB view details)

Uploaded Python 3

File details

Details for the file uhsr-0.2.6.tar.gz.

File metadata

  • Download URL: uhsr-0.2.6.tar.gz
  • Upload date:
  • Size: 13.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.4

File hashes

Hashes for uhsr-0.2.6.tar.gz
Algorithm | Hash digest
SHA256 | b036d842ed37edb9cd8f4ba1aa0b588108f69c3091614931ddd981ae89e4cf9d
MD5 | 40081be4b51b3756d587acda204e5a5c
BLAKE2b-256 | e25ce13c4cc2ce6367c9d253f38b75c60f7968a74d6e16ae9111b2a86aceecee

File details

Details for the file uhsr-0.2.6-py3-none-any.whl.

File metadata

  • Download URL: uhsr-0.2.6-py3-none-any.whl
  • Upload date:
  • Size: 12.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.4

File hashes

Hashes for uhsr-0.2.6-py3-none-any.whl
Algorithm | Hash digest
SHA256 | ed6a979ee08129cc7c72fa850512f3e8c115ade1e39b7eee6cfc050bf4861f34
MD5 | 8c2bf4d57d89f339a62a9914f2deb6da
BLAKE2b-256 | c86a48b6f4406af0626d353d43b5896dddaa99ce4b5f30ff7b267ab6609215e4
