An easy-to-use Elasticsearch BM25 interface
Easy Elasticsearch
This repository provides a high-level wrapper for using Elasticsearch with Python in just a few lines.
Installation
Via pip:
pip install easy-elasticsearch
Via git repo:
git clone https://github.com/kwang2049/easy-elasticsearch
cd easy-elasticsearch
pip install -e .
As the final step, download the official Elasticsearch distribution, which provides the backend server (choose the build for your OS if you are not on Linux x86_64):
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.12.1-linux-x86_64.tar.gz
tar -xf elasticsearch-7.12.1-linux-x86_64.tar.gz
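Before handing the extracted directory to the library, it can help to confirm that the bin folder really contains the Elasticsearch launcher. The helper below is a minimal sketch, not part of easy-elasticsearch; the name check_es_bin is hypothetical, and it only assumes the standard archive layout (bin/elasticsearch on Linux/macOS, bin/elasticsearch.bat on Windows).

```python
from pathlib import Path


def check_es_bin(bin_dir):
    """Return the resolved bin directory if it contains an Elasticsearch launcher.

    Hypothetical helper for illustration; not provided by easy-elasticsearch.
    """
    bin_path = Path(bin_dir).expanduser().resolve()
    # Linux/macOS archives ship bin/elasticsearch; Windows archives ship bin/elasticsearch.bat.
    launchers = [bin_path / "elasticsearch", bin_path / "elasticsearch.bat"]
    if not any(p.is_file() for p in launchers):
        raise FileNotFoundError(f"No Elasticsearch launcher found in {bin_path}")
    return str(bin_path)


# Example (assumes the tarball above was extracted in the current directory):
# es_bin = check_es_bin("elasticsearch-7.12.1/bin")
```

The returned string can then be passed wherever a local path to elasticsearch-xx.xx.xx/bin is expected.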
Usage
First create an ElasticSearchBM25 object, passing the text pool to be indexed and your local path to elasticsearch-xx.xx.xx/bin; then call its query function to retrieve the top-ranked documents, or its score function to compute BM25 scores for specific documents.
from easy_elasticsearch import ElasticSearchBM25
pool = {
'id1': 'What is Python? Is it a programming language',
'id2': 'Which Python version is the best?',
'id3': 'Using easy-elasticsearch in Python is really convenient!'
}
bm25 = ElasticSearchBM25(pool, 'elasticsearch-7.12.1/bin') # remember to use your local path to elasticsearch/bin
query = "What is Python?"
rank = bm25.query(query, topk=10) # topk should be <= 10000
scores = bm25.score(query, document_ids=['id2', 'id3'])
print(query, rank, scores)
Another example for retrieving Quora questions can be found in example/quora.py.
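To give a feel for what a BM25 score expresses, the following is a minimal pure-Python sketch of the textbook BM25 formula applied to the same pool. It is an illustration only and not how easy-elasticsearch or Elasticsearch computes scores internally: the tokenizer is a simplistic assumption, the function names are hypothetical, and k1=1.2 and b=0.75 are the commonly used defaults, so the absolute values will likely differ from what the server returns.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Simplistic assumption: lowercase and split on non-alphanumeric characters.
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, pool, k1=1.2, b=0.75):
    """Score every document in `pool` against `query` with textbook BM25."""
    docs = {doc_id: tokenize(text) for doc_id, text in pool.items()}
    n_docs = len(docs)
    avg_len = sum(len(tokens) for tokens in docs.values()) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in docs.values() for term in set(tokens))
    scores = {}
    for doc_id, tokens in docs.items():
        tf = Counter(tokens)
        score = 0.0
        for term in set(tokenize(query)):
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates with k1 and is normalized by document length via b.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[doc_id] = score
    return scores


pool = {
    'id1': 'What is Python? Is it a programming language',
    'id2': 'Which Python version is the best?',
    'id3': 'Using easy-elasticsearch in Python is really convenient!'
}
scores = bm25_scores("What is Python?", pool)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking, scores)
```

In this sketch, id1 ranks first because it is the only document containing the rare term "what", while id2 edges out id3 because it is shorter than the average document.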
Hashes for easy_elasticsearch-0.0.2-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | b019fba8d5125a28e80ebb5d6b65ef5b22bac1ed0eaf470f0a3223dc00af6a3b
MD5 | 028fc45f9855beccafffb01dbaa01cf5
BLAKE2b-256 | 97090197a4287b2627e4bd81754812657c0074a2e2c0491936a8efe542f339dc