SYMSEARCH
symsearch is a search engine for research and development, built on a retrieve-then-rerank pipeline based on Dense Passage Retrieval.
Installation
Python 3.8 or higher is recommended.
Install with pip
pip install symsearch
Training
Updating . . .
Inference
Retrieval
- Query embedding
from symsearch import RetrieveInference

sentence = "Why does water heated to room temperature feel colder than the air around it?"

# Load the retriever checkpoint; q_max_length sets the maximum query length.
retrieve = RetrieveInference(
    model_name_or_path="caskcsg/cotmae_base_msmarco_retriever",
    q_max_length=128,
    device_type='cpu'
)

# Encode the query into a dense embedding.
query_embd = retrieve.encode_question(sentence)
print(query_embd)
- Passage embedding
from symsearch import RetrieveInference

sentence = "Water transfers heat more efficiently than air. When something feels cold it's " \
           "because heat is being transferred from your skin to whatever you're touching. " \
           "Since water absorbs the heat more readily than air, it feels colder."

# Load the retriever checkpoint; p_max_length sets the maximum passage length.
retrieve = RetrieveInference(
    model_name_or_path="caskcsg/cotmae_base_msmarco_retriever",
    p_max_length=384,
    device_type='cpu'
)

# Encode the passage into a dense embedding.
passage_embd = retrieve.encode_context(sentence)
print(passage_embd)
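In Dense Passage Retrieval, a query and a passage are usually compared by the inner product of their embeddings, and passages are ranked by that score. The sketch below illustrates the ranking step with plain Python lists. It does not call symsearch: the helper names `dot` and `rank_passages` are hypothetical, the vectors are toy values, and it assumes the embeddings returned by `encode_question` and `encode_context` can be converted to flat sequences of floats, which this README does not specify.

```python
from typing import List, Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("embedding dimensions differ")
    return sum(a * b for a, b in zip(u, v))


def rank_passages(
    query_vec: Sequence[float],
    passage_vecs: Sequence[Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (passage_index, score) pairs, highest inner product first."""
    scored = [(i, dot(query_vec, p)) for i, p in enumerate(passage_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional vectors standing in for real embeddings (assumption).
query = [1.0, 0.0, 2.0]
passages = [[0.5, 0.5, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
print(rank_passages(query, passages, top_k=2))
```

With real embeddings, the top-ranked passages would typically be passed on to the reranker.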
Reranker
from symsearch import RerankInference

# Query
sentence1 = "If I hypothetically built a fully functioning rocket and were able to "\
            "fund the trip myself, would it be legal for me to leave earth?"

# Candidate passage
sentence2 = "crew, who have become the first humans to travel into space. The rocket is at first thought to be lost, " \
            "having dramatically overshot its planned orbit, but eventually it is detected by radar and returns to Earth, " \
            "crash-landing in Wimbledon, London.\nWhen Quatermass and his team reach the crash area and succeed in opening " \
            "the rocket, they discover that only one of the three crewmen, Victor Carroon, remains inside."

# Load the reranker checkpoint with separate length limits for query and passage.
rerank = RerankInference(
    model_name_or_path="caskcsg/cotmae_base_msmarco_reranker",
    q_max_length=128,
    p_max_length=384,
    device_type='cpu'
)

# Score the (query, passage) pair.
score = rerank.encode_pair(query=sentence1, passage=sentence2)[0]
print(score)
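The example above scores a single pair. To rerank several retrieved candidates, each one can be scored against the query and the results sorted. The following is a minimal sketch under stated assumptions: `rerank_candidates` and `word_overlap` are hypothetical helpers, not part of symsearch, and the toy word-overlap scorer only stands in for a real model so that the snippet runs on its own.

```python
from typing import Callable, List, Tuple


def rerank_candidates(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Score every candidate against the query and sort by descending score."""
    scored = [(passage, score_fn(query, passage)) for passage in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def word_overlap(query: str, passage: str) -> float:
    """Toy scorer (assumption): number of distinct lowercase words shared."""
    shared = set(query.lower().split()) & set(passage.lower().split())
    return float(len(shared))


ranked = rerank_candidates(
    "is it legal to leave earth",
    ["the rocket returns to earth", "it is legal to fish here"],
    word_overlap,
)
for passage, score in ranked:
    print(score, passage)
```

In practice, `score_fn` could wrap the reranker shown above, for example `lambda q, p: rerank.encode_pair(query=q, passage=p)[0]`, assuming that value can be compared as a number.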
Contacts
If you have any questions or suggestions, feel free to open an issue or send general ideas by email.
- Contact person: tien.ngnvan@gmail.com
This repository contains experimental research and development work intended to provide additional background on Dense Passage Retrieval.