SYMSEARCH

symsearch is a search engine for research and development that uses a search-and-ranking pipeline based on Dense Passage Retrieval.

Installation

symsearch recommends Python 3.8 or higher.

Install with pip

pip install symsearch

Training

Updating . . .

Inference

Retrieval

  1. Query embedding
from symsearch import RetrieveInference

sentence = "Why does water heated to room temperature feel colder than the air around it?"

# Load the retriever model for encoding queries on CPU.
retrieve = RetrieveInference(
    model_name_or_path="caskcsg/cotmae_base_msmarco_retriever",
    q_max_length=128,
    device_type='cpu'
)
# Encode the query with the object created above.
query_embd = retrieve.encode_question(sentence)
print(query_embd)
  2. Passage embedding
from symsearch import RetrieveInference

sentence = "Water transfers heat more efficiently than air. When something feels cold it's " \
           "because heat is being transferred from your skin to whatever you're touching. " \
           "Since water absorbs the heat more readily than air, it feels colder."

# Load the retriever model for encoding passages on CPU.
retrieve = RetrieveInference(
    model_name_or_path="caskcsg/cotmae_base_msmarco_retriever",
    p_max_length=384,
    device_type='cpu'
)
# Encode the passage with the object created above.
passage_embd = retrieve.encode_context(sentence)
print(passage_embd)
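
The README does not say how a query embedding is compared with passage embeddings. In Dense Passage Retrieval the usual choice is the inner product, so the following is a minimal sketch under that assumption. It also assumes the embeddings can be converted to flat lists of floats; the helper names `inner_product` and `rank_passages` are hypothetical and not part of symsearch, and the toy vectors stand in for real model output.

```python
from typing import List, Sequence, Tuple


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def rank_passages(
    query_vec: Sequence[float],
    passage_vecs: Sequence[Sequence[float]],
) -> List[Tuple[int, float]]:
    """Return (passage_index, score) pairs sorted from best to worst."""
    scores = [(i, inner_product(query_vec, p)) for i, p in enumerate(passage_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors standing in for real model output (assumption).
query_vec = [0.5, 0.1, 0.4]
passage_vecs = [
    [0.1, 0.9, 0.0],  # passage 0
    [0.6, 0.0, 0.5],  # passage 1
]
ranking = rank_passages(query_vec, passage_vecs)
print(ranking[0][0])  # index of the highest-scoring passage
```

For a large collection, the same scoring is normally delegated to a vector index rather than a Python loop; the loop here only illustrates the ranking step.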

Reranker

from symsearch import RerankInference

sentence1 = "If I hypothetically built a fully functioning rocket and were able to "\
            "fund the trip myself, would it be legal for me to leave earth?"
sentence2 = "crew, who have become the first humans to travel into space. The rocket is at first thought to be lost, " \
            "having dramatically overshot its planned orbit, but eventually it is detected by radar and returns to Earth, " \
            "crash-landing in Wimbledon, London.\nWhen Quatermass and his team reach the crash area and succeed in opening " \
            "the rocket, they discover that only one of the three crewmen, Victor Carroon, remains inside."

rerank = RerankInference(
        model_name_or_path="caskcsg/cotmae_base_msmarco_reranker",
        q_max_length=128,
        p_max_length=384,
        device_type='cpu'
      )
score = rerank.encode_pair(query=sentence1, passage=sentence2)[0]
print(score)
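
To rerank several candidate passages for one query, the pair scorer can be applied to each candidate and the results sorted. The sketch below is one possible way to do this, not a symsearch API: `rerank_candidates` and `word_overlap` are hypothetical names, and `word_overlap` is a stand-in scorer used only so the example runs without downloading a model.

```python
from typing import Callable, List, Sequence, Tuple


def rerank_candidates(
    query: str,
    candidates: Sequence[str],
    score_fn: Callable[[str, str], float],
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Score every (query, passage) pair and return the top_k best passages."""
    scored = [(passage, float(score_fn(query, passage))) for passage in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def word_overlap(query: str, passage: str) -> float:
    """Stand-in scorer for the demo: counts shared lowercase words."""
    return float(len(set(query.lower().split()) & set(passage.lower().split())))


candidates = [
    "The rocket returns to Earth and crash-lands in London.",
    "Water transfers heat more efficiently than air.",
]
top = rerank_candidates(
    "would it be legal to leave earth in a rocket",
    candidates,
    word_overlap,
    top_k=1,
)
print(top[0][0])
```

With symsearch, `score_fn` could wrap the call shown above, for example `lambda q, p: rerank.encode_pair(query=q, passage=p)[0]`. That the returned value converts cleanly to a float is an assumption.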

Contacts

If you have any questions or suggestions, feel free to open an issue or send general ideas by email.

This repository contains experimental research and development work intended to give additional background details on Dense Passage Retrieval.
