Python interface to the Anserini IR toolkit built on Lucene

Project description

Pyserini provides a simple Python interface to the Anserini IR toolkit via pyjnius.

Installation

Install via PyPI

pip install pyserini

Usage

Here's a sample pre-built index on TREC Disks 4 & 5 to play with (used in the TREC 2004 Robust Track):

wget https://git.uwaterloo.ca/jimmylin/anserini-indexes/raw/master/index-robust04-20191213.tar.gz
tar xvfz index-robust04-20191213.tar.gz

Use the SimpleSearcher for searching:

from pyserini.search import pysearch

searcher = pysearch.SimpleSearcher('index-robust04-20191213/')
hits = searcher.search('hubble space telescope')

# Print up to the first 10 hits (a query may return fewer than 10):
for i in range(min(10, len(hits))):
    print(f'{i+1:2} {hits[i].docid:15} {hits[i].score:.5f}')

# Grab the raw text of the top hit:
print(hits[0].raw)
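
The printing step can also be wrapped in a small helper. The following is only a sketch: `format_hits` and `FakeHit` are hypothetical names that are not part of Pyserini, and the helper assumes nothing beyond the `docid` and `score` attributes used above. The stand-in objects let it run without an index or a JVM.

```python
from collections import namedtuple


def format_hits(hits, k=10):
    """Return up to k lines of the form 'rank docid score'.

    `hits` is any sequence of objects exposing `docid` and `score`,
    such as the results returned by the searcher above.
    """
    lines = []
    # Slicing a copy keeps this safe when fewer than k hits come back.
    for rank, hit in enumerate(list(hits)[:k], start=1):
        lines.append(f'{rank:2} {hit.docid:15} {hit.score:.5f}')
    return lines


# Hypothetical stand-in for a search result, for illustration only.
FakeHit = namedtuple('FakeHit', ['docid', 'score'])

sample = [FakeHit('DOC-A', 1.5), FakeHit('DOC-B', 1.25)]
for line in format_hits(sample):
    print(line)
```

With a real index, `format_hits(searcher.search('hubble space telescope'))` should produce the same layout as the loop above.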

For additional information, please refer to the Pyserini repository.

Download files

Source Distribution

pyserini-0.9.0.0.tar.gz (57.7 MB)

Uploaded Source

Built Distribution

pyserini-0.9.0.0-py3-none-any.whl (57.7 MB)

Uploaded Python 3

File details

Details for the file pyserini-0.9.0.0.tar.gz.

File metadata

  • Download URL: pyserini-0.9.0.0.tar.gz
  • Upload date:
  • Size: 57.7 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.6.0 requests-toolbelt/0.9.1 tqdm/4.37.0 CPython/3.6.7

File hashes

Hashes for pyserini-0.9.0.0.tar.gz

Algorithm    Hash digest
SHA256       c91f5d1f6cf914a6ba62af37f7d1485208f0c5fe32b76a1cbe0d253ad51eb639
MD5          caa65d34f0ca70fe416b5aca4736a83e
BLAKE2b-256  26999520dc01803387e1a364f06315c708ea3b485409d7a55fdf92ed2276578d
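
A downloaded file can be checked against a published digest with Python's standard `hashlib` module. This is a minimal sketch, not part of Pyserini: the helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the commented-out call assumes the archive sits in the current directory.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`.

    The file is read in chunks so that large archives do not need
    to fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example call (assumes the file was downloaded to the current directory):
# matches_published_hash(
#     'pyserini-0.9.0.0.tar.gz',
#     'c91f5d1f6cf914a6ba62af37f7d1485208f0c5fe32b76a1cbe0d253ad51eb639',
# )
```

A result of `False` means the local file differs from the one that was uploaded and should not be installed.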

File details

Details for the file pyserini-0.9.0.0-py3-none-any.whl.

File metadata

  • Download URL: pyserini-0.9.0.0-py3-none-any.whl
  • Upload date:
  • Size: 57.7 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.6.0 requests-toolbelt/0.9.1 tqdm/4.37.0 CPython/3.6.7

File hashes

Hashes for pyserini-0.9.0.0-py3-none-any.whl

Algorithm    Hash digest
SHA256       deadb7ce25841f4a99f1752abd73b273004f6ca169ab94abdeb0dfcfcd49e750
MD5          cc25a9f7c15ac02ec9a2b56516d93315
BLAKE2b-256  1967f11bd9c9afcef667b816864a38cea950d1245fc56e4617714530ba4fdccc
