Unified Hyperbolic Spectral Retrieval (UHSR) - a novel text retrieval algorithm combining lexical and semantic search.

Unified Hyperbolic Spectral Retrieval (UHSR)

Unified Hyperbolic Spectral Retrieval (UHSR) is a hybrid text retrieval model that integrates lexical search (BM25) with semantic search (FAISS or Pinecone) and applies spectral re-ranking. Final relevance scores are normalized to the [0,1] range so that they are easy to interpret.

🚀 Key Features

  • 🔍 Hybrid Retrieval: Combines BM25 for lexical scoring and dense vector semantic similarity for contextual understanding.
  • 🎯 Multi-Metric Similarity: Supports cosine, Euclidean, Mahalanobis, Manhattan, Chebyshev, Jaccard, and Hamming metrics.
  • 🔬 Spectral Re-Ranking: Uses the graph Laplacian and its Fiedler vector to boost highly relevant candidates.
  • ⚡ AI-powered Reranking: Supports Hugging Face Cross-Encoders & OpenAI API-based Reranking.
  • 📈 Interpretable Scores: Final relevance scores are logistic-normalized to [0,1] for easy ranking.
  • 🚀 Scalable & Efficient: Works with FAISS (local) for fast retrieval and Pinecone (cloud-based) for large-scale vector search.
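
As a rough illustration of the metrics named in the list above, the following sketch implements several of them in plain Python. The function names are hypothetical and are not taken from the UHSR package. Mahalanobis distance is left out because it needs an inverse covariance matrix, and turning a distance into a similarity with 1 / (1 + d) is an assumption made here for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_distances(a, b):
    """Manhattan, Euclidean, and Chebyshev distances between two equal-length vectors."""
    diffs = [abs(x - y) for x, y in zip(a, b)]
    return {
        "manhattan": sum(diffs),
        "euclidean": math.sqrt(sum(d * d for d in diffs)),
        "chebyshev": max(diffs),
    }


def jaccard_similarity(a, b):
    """Size of the intersection over size of the union of two token collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def distance_to_similarity(distance):
    """Assumed conversion: map a non-negative distance to a similarity in (0, 1]."""
    return 1.0 / (1.0 + distance)
```

Cosine and Jaccard are similarities already, while the other metrics are distances, so some conversion like the last helper is needed before they can be blended with other scores.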

🛠️ How It Works

UHSR enhances traditional retrieval by blending BM25-based keyword matching with semantic vector representations using the following pipeline:

Step | Description
1️⃣ Lexical Filtering | Uses BM25 to rank documents by keyword relevance
2️⃣ Semantic Scoring | Computes similarity using FAISS or Pinecone
3️⃣ Fusion Process | Blends scores via logistic normalization & harmonic fusion
4️⃣ Spectral Re-Ranking | Uses graph Laplacian analysis to boost central candidates
5️⃣ (Optional) AI Reranking | Uses OpenAI API or Hugging Face Cross-Encoders
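
Steps 3 and 4 of the pipeline above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the sigmoid is centred on the mean score, fusion is the harmonic mean of the two normalized scores, the Fiedler vector is approximated by shifted power iteration, and the boost rule (raise every candidate on the same side of the Fiedler split as the top candidate) is only one plausible reading of "boost central candidates". All function names and the `boost` parameter are hypothetical.

```python
import math


def logistic_normalize(scores):
    """Assumed scheme: z-score each raw score, then squash it into (0, 1) with a sigmoid."""
    mean = sum(scores) / len(scores)
    std = math.sqrt(sum((s - mean) ** 2 for s in scores) / len(scores)) or 1.0
    return [1.0 / (1.0 + math.exp(-(s - mean) / std)) for s in scores]


def harmonic_fusion(lexical, semantic):
    """Harmonic mean of paired scores; it is high only when both inputs are high."""
    return [2 * l * s / (l + s) if (l + s) > 0 else 0.0
            for l, s in zip(lexical, semantic)]


def fiedler_vector(weights, iterations=500):
    """Approximate the Fiedler vector of L = D - W for a symmetric, non-negative W.

    Power iteration on (shift * I - L), projecting out the constant vector at
    each step, converges to the eigenvector of the second-smallest eigenvalue
    when the graph is connected and that eigenvalue is simple.
    """
    n = len(weights)
    degrees = [sum(row) for row in weights]
    shift = 2.0 * max(degrees) + 1e-9  # Gershgorin upper bound on eigenvalues of L
    v = [i - (n - 1) / 2.0 for i in range(n)]  # deterministic start with zero mean
    for _ in range(iterations):
        w = [shift * v[i] - degrees[i] * v[i]
             + sum(weights[i][j] * v[j] for j in range(n)) for i in range(n)]
        mean = sum(w) / n
        w = [x - mean for x in w]  # remove the constant (eigenvalue 0) component
        norm = math.sqrt(sum(x * x for x in w)) or 1.0
        v = [x / norm for x in w]
    return v


def spectral_rerank(fused, weights, boost=0.25):
    """Boost candidates on the same side of the Fiedler split as the top candidate."""
    v = fiedler_vector(weights)
    top = max(range(len(fused)), key=fused.__getitem__)
    return [min(1.0, s * (1.0 + boost)) if v[i] * v[top] > 0 else s
            for i, s in enumerate(fused)]
```

With two tightly linked pairs of candidates, for example, the pair that contains the best fused score is lifted together, which can move its weaker member above an unrelated candidate with a slightly higher fused score.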

🌍 Supported Retrieval Methods

  • BM25 (Lexical Matching)
  • FAISS (Local Vector Search)
  • Pinecone (Cloud Vector Search)
  • Hugging Face Rerankers
  • OpenAI API-based Reranking
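
For readers unfamiliar with the lexical side, the sketch below scores documents with the standard Okapi BM25 formula in plain Python. It is a generic illustration and does not show how UHSR itself computes BM25; the function name, the whitespace tokenizer, and the defaults `k1=1.5` and `b=0.75` are assumptions, and the IDF term uses the common non-negative variant with `+ 1` inside the logarithm.

```python
import math
from collections import Counter


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per document for a whitespace-tokenized query."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(doc) for doc in tokenized) / n_docs
    # Document frequency: in how many documents each term appears at least once.
    doc_freq = Counter(term for doc in tokenized for term in set(doc))

    scores = []
    for doc in tokenized:
        term_freq = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Length normalization: longer-than-average documents are penalized.
            denom = term_freq[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * term_freq[term] * (k1 + 1) / denom
        scores.append(score)
    return scores
```

Because matching is exact on tokens, a query for "cat" does not match a document containing only "cats"; this is the kind of gap that the semantic stage is meant to cover.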

📌 Why UHSR?

  • Better Search Results: Combines exact keyword matching (BM25) with contextual embeddings (Semantic Search).
  • Fast & Scalable: Uses FAISS for local retrieval or Pinecone for cloud-based vector search.
  • Interpretable Ranking: Outputs normalized scores in [0,1], making it easy to interpret.
  • Multi-Metric Similarity: Supports cosine, Euclidean, Mahalanobis, Manhattan, Chebyshev, Jaccard, and Hamming metrics.

🎯 Intended Use

UHSR is designed for:

  • Information Retrieval Research
  • Search Engines & Recommendation Systems
  • NLP Applications in AI & Machine Learning
  • Academic & Industry-scale Document Ranking

📂 Code & Documentation

For complete documentation, usage examples, and implementation details, visit the GitHub repository.

Learn more about this package on Medium.


🔥 Try UHSR today and revolutionize your search engine! 🚀
