Python interface to the Anserini IR toolkit built on Lucene

Project description

Pyserini provides a simple Python interface to the Anserini IR toolkit via pyjnius.

Installation

Install via PyPI

pip install pyserini

Usage

Here's a sample pre-built index on TREC Disks 4 & 5 to play with (used in the TREC 2004 Robust Track):

wget https://git.uwaterloo.ca/jimmylin/anserini-indexes/raw/master/index-robust04-20191213.tar.gz
tar xvfz index-robust04-20191213.tar.gz
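
On systems without `tar` on the command line, the extraction step can also be done from Python. The following is a minimal sketch using only the standard library; `extract_index` is a hypothetical helper name and not part of Pyserini, and it assumes the archive comes from a source you trust.

```python
import tarfile
from pathlib import Path, PurePosixPath


def extract_index(archive_path, dest_dir="."):
    """Extract a .tar.gz archive and return its top-level entry names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        names = archive.getnames()
        # Only extract archives you trust: members are written as-is.
        archive.extractall(dest)
    # Reduce every member path to its first component, e.g. the index directory.
    top_level = set()
    for name in names:
        parts = PurePosixPath(name).parts
        if parts:
            top_level.add(parts[0])
    return sorted(top_level)
```

Calling `extract_index('index-robust04-20191213.tar.gz')` after the download should leave the index directory in the current working directory, which is the path passed to the searcher.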

Use the SimpleSearcher for searching:

from pyserini.search import pysearch

searcher = pysearch.SimpleSearcher('index-robust04-20191213/')
hits = searcher.search('hubble space telescope')

# Print up to the first 10 hits (a query may return fewer)
for i in range(min(10, len(hits))):
    print('{} {} {}'.format(i+1, hits[i].docid, hits[i].score))

# Print the text of the top-ranked hit
print(hits[0].content)
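
The ranked-list printing above can be wrapped in a small helper. This is a sketch under assumptions: `format_hits` and `FakeHit` are hypothetical names introduced here for illustration, and the helper only assumes that `hits` can be sliced and that each hit exposes `docid` and `score` as in the example above. The stand-in hits let the code run without an index.

```python
from collections import namedtuple


def format_hits(hits, k=10):
    """Return one 'rank docid score' line for each of the first k hits."""
    lines = []
    for rank, hit in enumerate(hits[:k], start=1):
        lines.append('{} {} {:.5f}'.format(rank, hit.docid, hit.score))
    return lines


# Stand-in for a search result, used only so this sketch runs without an index.
FakeHit = namedtuple('FakeHit', ['docid', 'score'])
demo_hits = [FakeHit('DOC-A', 2.5), FakeHit('DOC-B', 1.25)]
print('\n'.join(format_hits(demo_hits)))
```

With a real searcher, `format_hits(searcher.search('hubble space telescope'))` would take the place of the stand-in list.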

For additional information, please refer to the Pyserini repository.

Download files

Source Distribution

pyserini-0.7.2.0.tar.gz (53.7 MB)

Uploaded Source

Built Distribution

pyserini-0.7.2.0-py3-none-any.whl (53.7 MB)

Uploaded Python 3

File details

Details for the file pyserini-0.7.2.0.tar.gz.

File metadata

  • Download URL: pyserini-0.7.2.0.tar.gz
  • Upload date:
  • Size: 53.7 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.6.0 requests-toolbelt/0.9.1 tqdm/4.37.0 CPython/3.6.7

File hashes

Hashes for pyserini-0.7.2.0.tar.gz:

Algorithm    Hash digest
SHA256       bbb8e3889fe192bcf5433035eafdfbbb9017eebd7f4eb20ed0d56875d29598ab
MD5          ae47910caa0f85a5379ffb1657ac1194
BLAKE2b-256  805107936d50d879c9fd78188865c63da7dc1bfa9bafdaa73896d4e36b9cc36d
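
A downloaded file can be checked against the SHA256 digest listed above. The following is a minimal sketch using Python's standard `hashlib` module; `sha256_of` and `matches_sha256` are hypothetical helper names, and the file is read in chunks so that a distribution of this size does not need to fit in memory at once.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path, expected_hex):
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the source distribution, the expected value is the SHA256 digest in the listing above, for example `matches_sha256('pyserini-0.7.2.0.tar.gz', 'bbb8e3889fe192bcf5433035eafdfbbb9017eebd7f4eb20ed0d56875d29598ab')`.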

File details

Details for the file pyserini-0.7.2.0-py3-none-any.whl.

File metadata

  • Download URL: pyserini-0.7.2.0-py3-none-any.whl
  • Upload date:
  • Size: 53.7 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.6.0 requests-toolbelt/0.9.1 tqdm/4.37.0 CPython/3.6.7

File hashes

Hashes for pyserini-0.7.2.0-py3-none-any.whl:

Algorithm    Hash digest
SHA256       53b505dc2ec51221c9f91f78289c90ad24278ed1fcaa7eddbe9ada446bb42579
MD5          1c681e7cbe1acf2fbef76e695e2a11b7
BLAKE2b-256  a93c6402b3aceb2a8c69b2409e5d66f39fcabc99a12cbb5b75400778d41f35e3
