Use the ChatNoir search engine in PyTerrier.


🔍 chatnoir-pyterrier

Use the ChatNoir REST-API in PyTerrier for retrieval/re-ranking against large corpora such as ClueWeb09, ClueWeb12, ClueWeb22, or MS MARCO.

Powered by the chatnoir-api package.

Installation

Install the package from PyPI:

pip install chatnoir-pyterrier

Usage

You can use the ChatNoirRetrieve PyTerrier module in any PyTerrier pipeline, just as you would use BatchRetrieve.

from chatnoir_pyterrier import ChatNoirRetrieve, Feature

chatnoir = ChatNoirRetrieve(index="msmarco-document-v2.1", features=Feature.SNIPPET_TEXT)
chatnoir.search("python library")

Features

ChatNoir provides an extensive set of extra features, such as the full text or, for some indices, the page rank and spam rank. These can be included in the response data frame for use in subsequent PyTerrier re-ranking stages:

from chatnoir_pyterrier import ChatNoirRetrieve, Feature

chatnoir_msmarco_snippet = ChatNoirRetrieve(index="msmarco-document-v2.1", features=Feature.SNIPPET_TEXT)
chatnoir_msmarco_snippet.search("python library")

chatnoir_cw09_page_spam_rank = ChatNoirRetrieve(index="clueweb09", features=Feature.PAGE_RANK | Feature.SPAM_RANK)
chatnoir_cw09_page_spam_rank.search("python library")
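
The `|` in the example above combines several features into one value, much like bit flags. The following minimal sketch illustrates that idea with `enum.Flag` from the standard library. `DemoFeature` and `selected` are hypothetical names for illustration only, and whether the package's `Feature` is implemented this way is an assumption.

```python
from enum import Flag, auto


class DemoFeature(Flag):
    """Hypothetical stand-in for chatnoir_pyterrier.Feature (illustration only)."""

    SNIPPET_TEXT = auto()
    PAGE_RANK = auto()
    SPAM_RANK = auto()


def selected(features: DemoFeature) -> list[str]:
    """List the names of the individual flags contained in a combined value."""
    return [flag.name for flag in DemoFeature if flag in features]


# Combine two flags with `|`, as in the ClueWeb09 example above.
combined = DemoFeature.PAGE_RANK | DemoFeature.SPAM_RANK
print(selected(combined))  # prints ['PAGE_RANK', 'SPAM_RANK']
```

Because each flag occupies its own bit, a single `features` argument can carry any combination without a separate parameter per feature.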

Advanced usage

Please check out our sample notebook or open it in Google Colab.

We also provide a hands-on guide for the Touché 2023 shared tasks here.

Development

To build this package and contribute to its development, you need to install the build, setuptools, and wheel packages:

pip install build setuptools wheel

(On most systems, these packages are already pre-installed.)

Development installation

Install the package and its test dependencies:

pip install -e ".[test]"

Testing

Configure the API key for testing:

export CHATNOIR_API_KEY="<API_KEY>"
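
Before running tests that call the ChatNoir API, it can help to check that the key is actually set. The sketch below reads the `CHATNOIR_API_KEY` variable from the environment; the helper name `get_chatnoir_api_key` is hypothetical and not part of this package, and how the test suite itself reads the key is not shown here.

```python
import os


def get_chatnoir_api_key(env=os.environ):
    """Return the configured API key, or None if it is missing or a placeholder.

    Hypothetical helper for illustration; not part of chatnoir-pyterrier.
    """
    key = env.get("CHATNOIR_API_KEY", "").strip()
    # Treat an empty value or the literal placeholder from the docs as "not set".
    if not key or key == "<API_KEY>":
        return None
    return key


if get_chatnoir_api_key() is None:
    print("CHATNOIR_API_KEY is not set; tests that need the API may fail.")
```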

Verify your changes against the test suite:

ruff check .                   # Code format and lint
mypy .                         # Static typing
bandit -c pyproject.toml -r .  # Security
pytest .                       # Unit tests
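
The four checks above can also be chained from a small script that stops at the first failure. This is a minimal sketch, not a script shipped with the repository: `CHECKS` and `run_checks` are hypothetical names, and the commands are copied from the list above.

```python
import shlex
import subprocess

# Commands copied from the list above, run in the same order.
CHECKS = [
    "ruff check .",
    "mypy .",
    "bandit -c pyproject.toml -r .",
    "pytest .",
]


def run_checks(commands=CHECKS, run=subprocess.call):
    """Run each command in order; return the first failing command, or None.

    `run` receives the argument list and must return the exit code.
    It is a parameter so the function can be exercised without the tools installed.
    """
    for command in commands:
        if run(shlex.split(command)) != 0:
            return command
    return None
```

Calling `run_checks()` from the repository root returns `None` when every check passes and the failing command string otherwise.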

Please also add tests for your newly developed code.

Build wheels

Wheels for this package can be built with:

python -m build

Support

If you hit any problems using this package, please file an issue. We're happy to help!

License

This repository is released under the MIT license.
