Python interface to the Anserini IR toolkit built on Lucene
Project description
Pyserini provides a simple Python interface to the Anserini IR toolkit via pyjnius.
Installation
Install via PyPI
pip install pyserini
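To confirm which release pip actually installed, the version can be read back from the package metadata. This is a minimal sketch using only the standard library (Python 3.8 or later); the helper name `installed_version` is illustrative and not part of Pyserini.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent.

    Hypothetical helper for illustration; not part of Pyserini.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the installed Pyserini version, or None if it is not installed.
print(installed_version('pyserini'))
```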
Usage
Here's a sample pre-built index on TREC Disks 4 & 5 to play with (used in the TREC 2004 Robust Track):
wget https://git.uwaterloo.ca/jimmylin/anserini-indexes/raw/master/index-robust04-20191213.tar.gz
tar xvfz index-robust04-20191213.tar.gz
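For scripts that prefer to stay in Python, the extraction step can be sketched with the standard library's `tarfile` module. The helper name `extract_index` and its arguments are assumptions made for illustration, not part of Pyserini, and the sketch assumes the archive has already been downloaded (for example with the wget command above).

```python
import tarfile
from pathlib import Path


def extract_index(archive_path, dest_dir="."):
    """Unpack a gzipped tar archive and return its top-level entry names.

    Hypothetical helper for illustration; not part of Pyserini.
    Only use it on archives from a source you trust, since extractall
    writes whatever paths the archive contains.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        # Collect the first path component of every member.
        top_level = sorted({name.split("/")[0] for name in archive.getnames() if name})
        archive.extractall(dest)
    return top_level
```

Calling `extract_index('index-robust04-20191213.tar.gz')` would unpack the archive into the current directory, mirroring the tar command.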
Use the SimpleSearcher for searching:
from pyserini.search import pysearch
searcher = pysearch.SimpleSearcher('index-robust04-20191213/')
hits = searcher.search('hubble space telescope')
# Prints the first 10 hits
for i in range(0, 10):
    print('{} {} {}'.format(i+1, hits[i].docid, hits[i].score))
# Grab the actual text
hits[0].content
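The loop above assumes the query returns at least 10 hits; a shorter result list would raise an IndexError. A minimal sketch of a more defensive version follows. The helper `format_hits` and the stand-in `Hit` tuple are hypothetical: they only mimic the `docid` and `score` attributes used above so that the snippet runs without an index, and the sketch assumes the real result list supports `len()` and indexing.

```python
from collections import namedtuple


def format_hits(hits, k=10):
    """Return one 'rank docid score' line for each of the first k hits.

    Hypothetical helper for illustration; not part of Pyserini.
    """
    lines = []
    # Stop at the end of the list if fewer than k hits came back.
    for i in range(min(k, len(hits))):
        lines.append('{} {} {}'.format(i + 1, hits[i].docid, hits[i].score))
    return lines


# Stand-in results; real hits would come from searcher.search(...).
Hit = namedtuple('Hit', ['docid', 'score'])
sample_hits = [Hit('doc-a', 12.5), Hit('doc-b', 11.0)]

for line in format_hits(sample_hits):
    print(line)
```

With a real searcher, `format_hits(searcher.search('hubble space telescope'))` would take the place of the sample list.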
For additional information, please refer to the Pyserini repository.
Download files
Source Distribution
pyserini-0.7.2.0.tar.gz (53.7 MB)
Built Distribution
pyserini-0.7.2.0-py3-none-any.whl (53.7 MB)
File details
Details for the file pyserini-0.7.2.0.tar.gz.
File metadata
- Download URL: pyserini-0.7.2.0.tar.gz
- Upload date:
- Size: 53.7 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.6.0 requests-toolbelt/0.9.1 tqdm/4.37.0 CPython/3.6.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | bbb8e3889fe192bcf5433035eafdfbbb9017eebd7f4eb20ed0d56875d29598ab
MD5 | ae47910caa0f85a5379ffb1657ac1194
BLAKE2b-256 | 805107936d50d879c9fd78188865c63da7dc1bfa9bafdaa73896d4e36b9cc36d
File details
Details for the file pyserini-0.7.2.0-py3-none-any.whl.
File metadata
- Download URL: pyserini-0.7.2.0-py3-none-any.whl
- Upload date:
- Size: 53.7 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.6.0 requests-toolbelt/0.9.1 tqdm/4.37.0 CPython/3.6.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | 53b505dc2ec51221c9f91f78289c90ad24278ed1fcaa7eddbe9ada446bb42579
MD5 | 1c681e7cbe1acf2fbef76e695e2a11b7
BLAKE2b-256 | a93c6402b3aceb2a8c69b2409e5d66f39fcabc99a12cbb5b75400778d41f35e3