Python interface to the Anserini IR toolkit built on Lucene
Project description
Pyserini provides a simple Python interface to the Anserini IR toolkit via pyjnius.
Installation
Install via PyPI:
pip install pyserini
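To confirm that the install succeeded, one option is to query the installed version through Python's standard library (`importlib.metadata`, Python 3.8+). This is a minimal sketch; the helper name `installed_version` is hypothetical and not part of Pyserini.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


if __name__ == "__main__":
    # Shows the version string if pyserini is installed, otherwise None.
    print(installed_version("pyserini"))
```

Returning `None` instead of raising keeps the check easy to use in a setup script that decides whether to run `pip install`.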
Usage
As a quick start, use the SimpleSearcher for searching, with a pre-built index on TREC Disks 4 & 5 (used in the TREC 2004 Robust Track):
from pyserini.search import SimpleSearcher
searcher = SimpleSearcher.from_prebuilt_index('robust04')
hits = searcher.search('hubble space telescope')
# Print the first 10 hits:
for i in range(0, 10):
    print(f'{i+1:2} {hits[i].docid:15} {hits[i].score:.5f}')
# Grab the actual text:
hits[0].raw
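The loop above indexes `hits[0]` through `hits[9]`, so it assumes the query returns at least 10 results; a shorter result list would raise an `IndexError`. A more defensive formatter might look like the sketch below. The helper `format_hits` and the `FakeHit` stand-in are hypothetical names introduced for illustration; the only assumption about real hits is that they expose `docid` and `score`, as in the example above.

```python
from collections import namedtuple
from itertools import islice

# Hypothetical stand-in for a search hit; real hits come from searcher.search().
FakeHit = namedtuple("FakeHit", ["docid", "score"])


def format_hits(hits, limit=10):
    """Format up to `limit` hits as ranked lines, tolerating short result lists."""
    lines = []
    # islice stops early if fewer than `limit` hits are available.
    for rank, hit in enumerate(islice(hits, limit), start=1):
        lines.append(f"{rank:2} {hit.docid:15} {hit.score:.5f}")
    return lines


if __name__ == "__main__":
    # Illustrative data only, not real search output.
    demo = [FakeHit("doc-a", 1.5), FakeHit("doc-b", 1.25)]
    for line in format_hits(demo):
        print(line)
```

With a real searcher, the call would presumably be `format_hits(searcher.search('hubble space telescope'))`, using the same format string as the quick-start loop.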
For additional information, please refer to the Pyserini repository.
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
pyserini-0.10.1.0.tar.gz (63.3 MB)
Built Distribution
Hashes for pyserini-0.10.1.0-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 92df86ae25d15d6abcd1cceb1f019525ac5a71d8ae8be8be601d187c2b00b0bc |
MD5 | 59a1d5bf9d55a7c55c44b63bf3765088 |
BLAKE2b-256 | edea0f5f9f8cb7d0df47314aa6c94f6a30756bb08fbcd4c06b3da2910d0a0ff5 |