Python interface to the Anserini IR toolkit built on Lucene

Project description

Pyserini provides a simple Python interface to the Anserini IR toolkit via pyjnius.

Installation

Install via PyPI

pip install pyserini

Usage

Here's a sample pre-built index on TREC Disks 4 & 5 to play with (used in the TREC 2004 Robust Track):

wget https://git.uwaterloo.ca/jimmylin/anserini-indexes/raw/master/index-robust04-20191213.tar.gz
tar xvfz index-robust04-20191213.tar.gz
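
On systems without wget or tar, the same download-and-extract step can be done from Python. The following is a minimal sketch using only the standard library; the helper name fetch_and_extract is hypothetical and not part of Pyserini, and it assumes the archive comes from a source you trust.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL of the pre-built Robust04 index, as given in the wget command above.
INDEX_URL = ('https://git.uwaterloo.ca/jimmylin/anserini-indexes/'
             'raw/master/index-robust04-20191213.tar.gz')


def fetch_and_extract(url, dest_dir='.'):
    """Download a .tar.gz archive into dest_dir (unless already there) and unpack it.

    Hypothetical helper for illustration; not part of Pyserini.
    Returns the path of the downloaded archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # Name the local file after the last path component of the URL.
    archive = dest / url.rsplit('/', 1)[-1]
    if not archive.exists():
        urllib.request.urlretrieve(url, str(archive))

    # Only unpack archives from sources you trust.
    with tarfile.open(str(archive), 'r:gz') as tar:
        tar.extractall(path=str(dest))
    return archive


# Usage (downloads a large file, so it is left commented out):
# fetch_and_extract(INDEX_URL)
```

After extraction, the index-robust04-20191213/ directory can be passed to SimpleSearcher in the same way as with the shell commands.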

Use the SimpleSearcher for searching:

from pyserini.search import pysearch

searcher = pysearch.SimpleSearcher('index-robust04-20191213/')
hits = searcher.search('hubble space telescope')

# Print the first 10 hits (or fewer, if the query returns fewer than 10):
for i, hit in enumerate(hits[:10], start=1):
    print(f'{i} {hit.docid} {hit.score}')

# Grab the actual text of the top hit:
if hits:
    print(hits[0].content)

For additional information, please refer to the Pyserini repository.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pyserini-0.8.0.0.tar.gz (55.5 MB)

Uploaded Source

Built Distribution

pyserini-0.8.0.0-py3-none-any.whl (55.5 MB)

Uploaded Python 3

File details

Details for the file pyserini-0.8.0.0.tar.gz.

File metadata

  • Download URL: pyserini-0.8.0.0.tar.gz
  • Size: 55.5 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.6.0 requests-toolbelt/0.9.1 tqdm/4.37.0 CPython/3.6.7

File hashes

Hashes for pyserini-0.8.0.0.tar.gz
Algorithm    Hash digest
SHA256       207e7309fce59a578636611c2eb2d09aa44c1190b14c879dbfd1a4e6474d29b6
MD5          30fa0320a0352e89f6a179ea721ad857
BLAKE2b-256  5f73098dd0b93e14ede5b09db134c0db5e63a712da1cc92ad13e0854184f3089

File details

Details for the file pyserini-0.8.0.0-py3-none-any.whl.

File metadata

  • Download URL: pyserini-0.8.0.0-py3-none-any.whl
  • Size: 55.5 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.6.0 requests-toolbelt/0.9.1 tqdm/4.37.0 CPython/3.6.7

File hashes

Hashes for pyserini-0.8.0.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       8f8c6aba1b2a1eb3d6b15998a748668d0cc0b17ee75dfd2e014285775777512d
MD5          56c44fa4129f5f38447daa372ab1360b
BLAKE2b-256  bc0acfec98a2555b2a5d0de406758b036118b39362b140c30477c851647ebda6
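
The digests above can be used to check that a downloaded file was not corrupted or altered. The following is a minimal sketch using Python's standard hashlib module; the helper names sha256_of and verify are hypothetical, and the sketch assumes the files sit in the current directory under the names listed above.

```python
import hashlib

# SHA256 digests copied from the "File hashes" tables above.
EXPECTED_SHA256 = {
    'pyserini-0.8.0.0.tar.gz':
        '207e7309fce59a578636611c2eb2d09aa44c1190b14c879dbfd1a4e6474d29b6',
    'pyserini-0.8.0.0-py3-none-any.whl':
        '8f8c6aba1b2a1eb3d6b15998a748668d0cc0b17ee75dfd2e014285775777512d',
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Return True if the file's SHA256 digest matches expected_hex (case-insensitive)."""
    return sha256_of(path) == expected_hex.lower()


# Usage (assumes the file has already been downloaded):
# name = 'pyserini-0.8.0.0.tar.gz'
# print(name, 'OK' if verify(name, EXPECTED_SHA256[name]) else 'MISMATCH')
```

SHA256 is used here rather than MD5 because MD5 is no longer considered collision-resistant; the MD5 digest is still adequate for detecting accidental corruption.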
