symsearch is a search engine for research and development built on a Dense Passage Retrieval pipeline that first retrieves passages and then reranks them.

Project description

SYMSEARCH

Installation

Python 3.8 or higher is recommended for symsearch.

Install with pip

pip install symsearch
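
After installing, it can be useful to confirm that the interpreter meets the recommended version and that the package is visible to it. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of symsearch.

```python
import sys
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# The README recommends Python 3.8 or higher.
python_ok = sys.version_info >= (3, 8)
print("Python version OK:", python_ok)
print("symsearch version:", installed_version("symsearch"))
```

If the second line prints `None`, the package is not installed in the environment that is running the script.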

Training

Updating . . .

Inference

Retrieval

  1. Query embedding
from symsearch import RetrieveInference

sentence = "Why does water heated to room temperature feel colder than the air around it?"

# Load the retriever model on CPU.
retrieve = RetrieveInference(
    model_name_or_path="caskcsg/cotmae_base_msmarco_retriever",
    q_max_length=128,
    device_type='cpu'
)
# Encode the query into an embedding.
query_embd = retrieve.encode_question(sentence)
print(query_embd)
  2. Passage embedding
from symsearch import RetrieveInference

sentence = "Water transfers heat more efficiently than air. When something feels cold it's " \
           "because heat is being transferred from your skin to whatever you're touching. " \
           "Since water absorbs the heat more readily than air, it feels colder."

# Load the retriever model on CPU.
retrieve = RetrieveInference(
    model_name_or_path="caskcsg/cotmae_base_msmarco_retriever",
    p_max_length=384,
    device_type='cpu'
)
# Encode the passage into an embedding.
passage_embd = retrieve.encode_context(sentence)
print(passage_embd)
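
In Dense Passage Retrieval, a query and a passage are typically compared by the inner product of their embeddings, with a higher score meaning a closer match. The sketch below illustrates that step with small hand-made vectors in place of real model output. It assumes the embeddings can be converted to flat lists of floats, which the symsearch return type may or may not allow directly. The helper names `dot` and `top_k` are hypothetical and not part of symsearch.

```python
from typing import List, Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    return sum(a * b for a, b in zip(u, v))


def top_k(query: Sequence[float],
          passages: List[Sequence[float]],
          k: int = 2) -> List[Tuple[int, float]]:
    """Return (passage index, score) pairs for the k highest inner products."""
    scored = [(i, dot(query, p)) for i, p in enumerate(passages)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional vectors standing in for real embeddings (assumption).
query_vec = [1.0, 0.0, 1.0]
passage_vecs = [
    [0.5, 0.5, 0.0],
    [1.0, 0.0, 2.0],
    [0.0, 1.0, 1.0],
]

print(top_k(query_vec, passage_vecs, k=2))  # [(1, 3.0), (2, 1.0)]
```

With real embeddings the passage vectors would usually be computed once and stored, so that only the query needs to be encoded at search time.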

Reranker

from symsearch import RerankInference

sentence1 = "If I hypothetically built a fully functioning rocket and were able to "\
            "fund the trip myself, would it be legal for me to leave earth?"
sentence2 = "crew, who have become the first humans to travel into space. The rocket is at first thought to be lost, " \
            "having dramatically overshot its planned orbit, but eventually it is detected by radar and returns to Earth, " \
            "crash-landing in Wimbledon, London.\nWhen Quatermass and his team reach the crash area and succeed in opening " \
            "the rocket, they discover that only one of the three crewmen, Victor Carroon, remains inside."

rerank = RerankInference(
        model_name_or_path="caskcsg/cotmae_base_msmarco_reranker",
        q_max_length=128,
        p_max_length=384,
        device_type='cpu'
      )
score = rerank.encode_pair(query=sentence1, passage=sentence2)[0]
print(score)
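
A reranker scores one query and passage pair at a time, so ranking a set of retrieved candidates amounts to scoring each pair and sorting the results. The sketch below shows that loop with a toy word-overlap scorer standing in for the model. It assumes a higher score means a more relevant passage, and the names `rerank_candidates` and `word_overlap` are hypothetical, not part of symsearch.

```python
from typing import Callable, List, Tuple


def rerank_candidates(query: str,
                      candidates: List[str],
                      score_fn: Callable[[str, str], float]) -> List[Tuple[str, float]]:
    """Score each (query, passage) pair and sort from highest to lowest score."""
    scored = [(passage, score_fn(query, passage)) for passage in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def word_overlap(query: str, passage: str) -> float:
    """Toy scorer: fraction of query words that also appear in the passage."""
    q_words = set(query.lower().split())
    p_words = set(passage.lower().split())
    return len(q_words & p_words) / len(q_words) if q_words else 0.0


ranked = rerank_candidates(
    "is it legal to leave earth",
    ["the rocket returns to earth", "it is legal to travel", "water feels cold"],
    word_overlap,
)
for passage, score in ranked:
    print(f"{score:.2f}  {passage}")
```

In practice `score_fn` would wrap a call to the reranker model rather than a word count, and the candidate list would come from the retrieval step.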

Contacts

If you have any questions or suggestions, feel free to open an issue or send general ideas by email.

This repository contains experimental research and development work intended to give additional background on Dense Passage Retrieval.

Download files

Source Distribution

symsearch-0.0.2.tar.gz (13.2 kB)

Built Distribution

symsearch-0.0.2-py3-none-any.whl (14.4 kB, Python 3)
