SYMSEARCH
symsearch is a search engine for research and development, built on a retrieve-and-rerank pipeline based on Dense Passage Retrieval.
Installation
symsearch recommends Python 3.8 or higher.
Install with pip
pip install symsearch
Training
Updating . . .
Inference
Retrieval
- Query embedding
from symsearch import RetrieveInference
sentence = "Why does water heated to room temperature feel colder than the air around it?"
# Load the retriever model on CPU.
retrieve = RetrieveInference(
    model_name_or_path="caskcsg/cotmae_base_msmarco_retriever",
    q_max_length=128,
    device_type='cpu'
)
# Encode the question into a dense embedding.
query_embd = retrieve.encode_question(sentence)
print(query_embd)
- Passage embedding
from symsearch import RetrieveInference
sentence = "Water transfers heat more efficiently than air. When something feels cold it's " \
           "because heat is being transferred from your skin to whatever you're touching. " \
           "Since water absorbs the heat more readily than air, it feels colder."
# Load the retriever model on CPU.
retrieve = RetrieveInference(
    model_name_or_path="caskcsg/cotmae_base_msmarco_retriever",
    p_max_length=384,
    device_type='cpu'
)
# Encode the passage into a dense embedding.
passage_embd = retrieve.encode_context(sentence)
print(passage_embd)
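In Dense Passage Retrieval, a query and a passage are usually compared by the inner product of their embeddings. The following is a minimal sketch of that scoring step, assuming the embeddings have already been converted to flat lists of floats (the exact return type of encode_question and encode_context is not documented here). The helper names dot_score and rank_passages are hypothetical and are not part of symsearch.

```python
from typing import List, Sequence, Tuple


def dot_score(query_vec: Sequence[float], passage_vec: Sequence[float]) -> float:
    """Inner product of two equal-length vectors (hypothetical helper)."""
    if len(query_vec) != len(passage_vec):
        raise ValueError("embedding dimensions differ")
    return sum(q * p for q, p in zip(query_vec, passage_vec))


def rank_passages(
    query_vec: Sequence[float],
    passage_vecs: Sequence[Sequence[float]],
) -> List[Tuple[int, float]]:
    """Return (passage index, score) pairs sorted by descending score."""
    scored = [(i, dot_score(query_vec, vec)) for i, vec in enumerate(passage_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors standing in for real embeddings.
query = [1.0, 0.0, 2.0]
passages = [[0.5, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 0.0, 0.5]]
ranking = rank_passages(query, passages)
print(ranking)
```

With real embeddings, the passage vectors would typically be computed once and stored, so that only the query has to be encoded at search time.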
Reranker
from symsearch import RerankInference
sentence1 = "If I hypothetically built a fully functioning rocket and were able to "\
"fund the trip myself, would it be legal for me to leave earth?"
sentence2 = "crew, who have become the first humans to travel into space. The rocket is at first thought to be lost, " \
"having dramatically overshot its planned orbit, but eventually it is detected by radar and returns to Earth, " \
"crash-landing in Wimbledon, London.\nWhen Quatermass and his team reach the crash area and succeed in opening " \
"the rocket, they discover that only one of the three crewmen, Victor Carroon, remains inside."
# Load the reranker model on CPU.
rerank = RerankInference(
    model_name_or_path="caskcsg/cotmae_base_msmarco_reranker",
    q_max_length=128,
    p_max_length=384,
    device_type='cpu'
)
# Score the (query, passage) pair; [0] selects the first returned value.
score = rerank.encode_pair(query=sentence1, passage=sentence2)[0]
print(score)
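The retriever and reranker are commonly chained: the retriever narrows a collection down to a few candidates, and the reranker reorders only those candidates. The following is a minimal sketch of that two-stage flow, not symsearch's own implementation. The function rerank_top_k and the word-overlap toy_scorer are hypothetical stand-ins, so the example runs without downloading any model.

```python
from typing import Callable, List, Sequence, Tuple


def rerank_top_k(
    query: str,
    passages: Sequence[str],
    retrieval_scores: Sequence[float],
    pair_scorer: Callable[[str, str], float],
    top_k: int = 2,
) -> List[Tuple[str, float]]:
    """Keep the top_k passages by retrieval score, then reorder them by pair score."""
    # Stage 1: indices of the best top_k passages according to the retriever.
    order = sorted(range(len(passages)), key=lambda i: retrieval_scores[i], reverse=True)
    candidates = [passages[i] for i in order[:top_k]]
    # Stage 2: score each (query, passage) pair and sort by that score.
    scored = [(p, pair_scorer(query, p)) for p in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def toy_scorer(query: str, passage: str) -> float:
    """Stand-in scorer: number of distinct query words that appear in the passage."""
    passage_words = set(passage.lower().split())
    return float(sum(1 for w in set(query.lower().split()) if w in passage_words))


passages = ["water feels cold", "rockets leave earth", "water transfers heat well"]
results = rerank_top_k("why does water transfer heat", passages, [0.9, 0.1, 0.8], toy_scorer)
print(results)
```

In practice, pair_scorer could wrap a call such as rerank.encode_pair(query=..., passage=...)[0] from the example above, assuming that value is a number or can be converted to one.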
Contacts
If you have any questions or suggestions, feel free to open an issue or send general ideas by email.
- Contact person: tien.ngnvan@gmail.com
This repository contains experimental research and development work intended to give additional background on Dense Passage Retrieval.