Python interface to the Anserini IR toolkit built on Lucene

Project description

Pyserini provides a simple Python interface to the Anserini IR toolkit via pyjnius.

Installation

Install via PyPI

pip install pyserini

Usage

Here's a sample pre-built index on TREC Disks 4 & 5 to play with (used in the TREC 2004 Robust Track):

wget https://git.uwaterloo.ca/jimmylin/anserini-indexes/raw/master/index-robust04-20191213.tar.gz
tar xvfz index-robust04-20191213.tar.gz
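On systems without wget or tar, the same two steps can be approximated with the Python standard library. This is only a sketch: the function names `extract_index` and `download_index` are hypothetical helpers, not part of Pyserini, and the archive is assumed to unpack into a single top-level directory.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL copied from the wget command above.
INDEX_URL = (
    "https://git.uwaterloo.ca/jimmylin/anserini-indexes/raw/master/"
    "index-robust04-20191213.tar.gz"
)


def extract_index(archive_path, dest_dir="."):
    """Unpack a .tar.gz archive into dest_dir; return its top-level entry names."""
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
        # Newer Python versions provide a "data" filter that rejects unsafe members.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(dest_dir, **kwargs)
    return sorted({name.split("/")[0] for name in names})


def download_index(url=INDEX_URL, dest_dir="."):
    """Download the archive (skipped if already present), then unpack it."""
    archive_path = Path(dest_dir) / url.rsplit("/", 1)[-1]
    if not archive_path.exists():
        urllib.request.urlretrieve(url, archive_path)
    return extract_index(archive_path, dest_dir)
```

Calling `download_index()` should leave the unpacked index directory next to the archive, ready to be passed to the searcher.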

Use the SimpleSearcher for searching:

from pyserini.search import pysearch

searcher = pysearch.SimpleSearcher('index-robust04-20191213/')
hits = searcher.search('hubble space telescope')

# Print up to the first 10 hits (a query may return fewer):
for i in range(min(10, len(hits))):
    print(f'{i+1:2} {hits[i].docid:15} {hits[i].score:.5f}')

# Grab the actual text:
hits[0].raw

For additional information, please refer to the Pyserini repository.

Download files

Source Distribution

pyserini-0.9.2.0.tar.gz (57.8 MB)

Uploaded Source

Built Distribution

pyserini-0.9.2.0-py3-none-any.whl (57.8 MB)

Uploaded Python 3

File details

Details for the file pyserini-0.9.2.0.tar.gz.

File metadata

  • Download URL: pyserini-0.9.2.0.tar.gz
  • Upload date:
  • Size: 57.8 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.6.0 requests-toolbelt/0.9.1 tqdm/4.37.0 CPython/3.6.7

File hashes

Hashes for pyserini-0.9.2.0.tar.gz
Algorithm    Hash digest
SHA256       6bb4e22d7cb0a83a6a8f73559afe3530cc925ea75c42c821f0c6116db24cbd1b
MD5          1ee5f2cdcf29d516453fff5b7fca8dbf
BLAKE2b-256  b0c196c9c9bf0cd8c95a4c3c540a6eb93a4a04adb3361e4a626819aed9f81d55
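
A downloaded file can be checked against the published SHA256 digest before installing it. The following is a minimal sketch using only the standard library; `sha256_of` and `matches_published_digest` are hypothetical helper names, and the file is assumed to have been saved locally under its original name.

```python
import hashlib

# SHA256 digest listed above for pyserini-0.9.2.0.tar.gz.
EXPECTED_SHA256 = "6bb4e22d7cb0a83a6a8f73559afe3530cc925ea75c42c821f0c6116db24cbd1b"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of(path) == expected.lower()
```

For example, `matches_published_digest("pyserini-0.9.2.0.tar.gz")` should return True for an intact download; a False result suggests the file is corrupted or is a different release.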

File details

Details for the file pyserini-0.9.2.0-py3-none-any.whl.

File metadata

  • Download URL: pyserini-0.9.2.0-py3-none-any.whl
  • Upload date:
  • Size: 57.8 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.6.0 requests-toolbelt/0.9.1 tqdm/4.37.0 CPython/3.6.7

File hashes

Hashes for pyserini-0.9.2.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       ca49b732d2116c39a36bc8effc86a6d8639d43b303f48a3e45cb5a7c6f90fd04
MD5          cfe9dc9a9a3b0cb28b53d013e2778dff
BLAKE2b-256  e43859314dc075292dc5ca4bc2435ae1937316cad5a0bc9c4f9af1d0704eb65a
