Semantic search to query covid related papers
Semantic search with FAISS
The goal of this project is to build a semantic search engine that searches across research papers related to COVID-19 and returns the most relevant results for a query. It is meant to help people who want to follow ongoing COVID-19 research.
I have used a retrieve-and-re-rank method, with a FAISS index for fast retrieval of candidate documents for a given query.
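To make the two-stage idea concrete, here is a minimal, dependency-free sketch of retrieve-then-re-rank. It is not the package's implementation: `embed` is a toy bag-of-words stand-in for the bi-encoder, the brute-force loop in `retrieve` stands in for the FAISS index lookup, and `rerank_score` is a toy token-overlap stand-in for the cross-encoder. All function names and the sample documents are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a bi-encoder: an L2-normalised bag-of-words vector."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {tok: c / norm for tok, c in counts.items()}


def dot(a, b):
    """Inner product of two sparse vectors stored as dicts."""
    return sum(w * b.get(tok, 0.0) for tok, w in a.items())


def retrieve(query, doc_vectors, top_k):
    """Stage 1: cheap similarity search over every document.
    Brute force here; an index such as FAISS would do this lookup at scale."""
    q = embed(query)
    scored = [(dot(q, v), i) for i, v in enumerate(doc_vectors)]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [i for _, i in scored[:top_k]]


def rerank_score(query, doc):
    """Toy stand-in for a cross-encoder: fraction of query tokens found in the document."""
    q_tokens = set(query.lower().split())
    d_tokens = set(doc.lower().split())
    return len(q_tokens & d_tokens) / len(q_tokens)


def search(query, docs, doc_vectors, top_k=3, final_k=2):
    """Stage 2: re-score only the retrieved candidates and keep the best ones."""
    candidates = retrieve(query, doc_vectors, top_k)
    reranked = sorted(candidates, key=lambda i: (-rerank_score(query, docs[i]), i))
    return [docs[i] for i in reranked[:final_k]]


docs = [
    "covid death rates by age group",
    "vaccine trial results for covid",
    "influenza death rates in winter",
    "protein folding with deep learning",
]
doc_vectors = [embed(d) for d in docs]
results = search("death rates of covid", docs, doc_vectors)
print(results)
```

The point of the split is cost: the first stage compares the query against precomputed document vectors, so it scales to a large corpus, while the more expensive scorer only sees the handful of candidates that survive.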
pip install semantic-search-faiss
from semanticsearch import search, utils, config
from semanticsearch.pretrained import get_model
from sentence_transformers import CrossEncoder

bi_encoder, index, documents = get_model(config.BI_ENCODER, config.INDEX, config.DATA)
cross_encoder = CrossEncoder(config.CROSS_ENCODER)

query = 'death rates of covid'
results = search.search(query, index, bi_encoder, cross_encoder, documents)
1. Synthetic query generation using T5
2. Fine-tuning the bi-encoder using the synthetic queries
3. Indexing the data with FAISS using the fine-tuned bi-encoder
4. Bi-encoder + cross-encoder with FAISS search
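The README does not say which objective was used in step 2. One common choice when training a bi-encoder on (synthetic query, passage) pairs is a ranking loss with in-batch negatives, which sentence-transformers provides as `MultipleNegativesRankingLoss`. The sketch below computes that kind of loss in plain Python on hand-made 2-d vectors, purely as an illustration; the function name, the `scale` value, and the toy vectors are assumptions, not taken from the project.

```python
import math


def in_batch_ranking_loss(query_vecs, passage_vecs, scale=1.0):
    """Mean cross-entropy where passage i is the positive for query i
    and every other passage in the batch acts as a negative."""
    losses = []
    for i, q in enumerate(query_vecs):
        # Scaled dot-product score of this query against every passage in the batch.
        scores = [scale * sum(a * b for a, b in zip(q, p)) for p in passage_vecs]
        # Numerically stable log-sum-exp over the scores.
        m = max(scores)
        log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
        # Negative log-probability assigned to the matching passage.
        losses.append(log_sum - scores[i])
    return sum(losses) / len(losses)


passages = [[1.0, 0.0], [0.0, 1.0]]
aligned_queries = [[1.0, 0.0], [0.0, 1.0]]   # each query points at its own passage
swapped_queries = [[0.0, 1.0], [1.0, 0.0]]   # each query points at the wrong passage

aligned_loss = in_batch_ranking_loss(aligned_queries, passages, scale=5.0)
swapped_loss = in_batch_ranking_loss(swapped_queries, passages, scale=5.0)
print(aligned_loss, swapped_loss)
```

The loss is small when each synthetic query embeds closest to the passage it was generated from and large otherwise, which is the signal that pulls query and passage embeddings together during fine-tuning.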
Try out the code on Google Colab.
A detailed walkthrough of the solution can be found in the Kaggle notebook below.
I would like to thank the Kaggle community as a whole for providing an avenue to learn about and discuss the latest advancements in data science and machine learning.