Semantic search to query COVID-related papers
Semantic search with FAISS
The idea of this project is to build a semantic search engine that searches across multiple COVID-19 research papers and returns the most relevant results for a query. It is meant to help people who want to keep up with ongoing COVID-19 research.
We use a retrieval-ranking method with a FAISS index to retrieve data for the query.
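The retrieval-ranking idea can be illustrated with a minimal, dependency-free sketch. This is not the package's implementation: `embed` is a toy bag-of-words stand-in for a bi-encoder, `pair_score` is a toy stand-in for a cross-encoder, and a plain Python list takes the place of the FAISS index. All function names and sample documents here are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    """Toy stand-in for a bi-encoder: a bag-of-words count vector."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pair_score(query, doc):
    """Toy stand-in for a cross-encoder: scores the (query, doc) pair jointly.

    Here the score is the fraction of query tokens found in the document.
    """
    q = set(tokenize(query))
    d = set(tokenize(doc))
    return len(q & d) / len(q) if q else 0.0


def retrieve_and_rerank(query, documents, doc_vectors, top_k=3):
    # Stage 1 (retrieval): cheap vector similarity against precomputed vectors.
    q_vec = embed(query)
    candidates = sorted(
        range(len(documents)),
        key=lambda i: cosine(q_vec, doc_vectors[i]),
        reverse=True,
    )[:top_k]
    # Stage 2 (ranking): apply the costlier pair scorer to the shortlist only.
    reranked = sorted(
        candidates,
        key=lambda i: pair_score(query, documents[i]),
        reverse=True,
    )
    return [documents[i] for i in reranked]


documents = [
    "Mortality and death rates among hospitalized covid patients",
    "Vaccine efficacy trials for covid",
    "Weather patterns in coastal regions",
    "Transmission rates of influenza in schools",
]
doc_vectors = [embed(d) for d in documents]  # built once, like an index
results = retrieve_and_rerank("death rates of covid", documents, doc_vectors, top_k=2)
print(results)
```

The point of the split is cost: the first stage compares the query against every document using vectors computed ahead of time, while the second stage runs the more expensive pairwise scorer on only `top_k` candidates.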
Installation
pip install semantic-search-faiss
Inference example
from semanticsearch import search, utils, config
from semanticsearch.pretrained import get_model
from sentence_transformers import CrossEncoder

# Load the bi-encoder, the index, and the documents.
bi_encoder, index, documents = get_model(config.BI_ENCODER, config.INDEX, config.DATA)
# Load the cross-encoder.
cross_encoder = CrossEncoder(config.CROSS_ENCODER)

query = 'death rates of covid'
results = search.search(query, index, bi_encoder, cross_encoder, documents)
Training pipeline
Synthetic query generation using T5
Fine-tuning the bi-encoder on the synthetic queries
Indexing the data using the fine-tuned bi-encoder
Bi-encoder + cross-encoder with FAISS search
Try out the code on Google Colab.
Kaggle
A detailed walkthrough of the solution can be found in the Kaggle notebook.
Acknowledgements
We would like to thank the Kaggle community as a whole for providing an avenue to learn about and discuss the latest data science and machine learning advancements, with a hat tip to those whose code we used or who inspired us:
- Vladimir Iglovikov for his wonderful article "I trained a model. What is next?"
- Xhululu for the dataset.