
A conversational passage retrieval toolkit


Chatty Goose

Multi-stage Conversational Passage Retrieval: An Approach to Fusing Term Importance Estimation and Neural Query Rewriting



Installation

  1. Make sure Java 11+ and Python 3.7+ are installed

  2. Install the chatty-goose PyPI module

pip install chatty-goose
  3. If you are using T5 or BERT, install PyTorch 1.4.0 to 1.7.1 following the instructions for your platform. PyTorch 1.8 is currently incompatible with the transformers version we use. Also install the corresponding torchtext version.

  4. Download the English model for spaCy

python -m spacy download en_core_web_sm
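
As a quick sanity check on the version requirements above, the following is a minimal sketch that tests a Python version and a PyTorch version string against the stated ranges (Python 3.7+, PyTorch 1.4.0 to 1.7.1). The helper names are hypothetical and not part of chatty-goose; the sketch only parses version strings and does not check Java, torchtext, or spaCy.

```python
import re
import sys


def parse_version(text):
    """Return the leading major.minor.patch of a version string as a tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    # A missing patch component (e.g. "1.8") is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def torch_version_supported(text):
    """True if the version lies in the 1.4.0 to 1.7.1 range given in the install steps."""
    return (1, 4, 0) <= parse_version(text) <= (1, 7, 1)


def python_version_supported(version_info=sys.version_info):
    """True if the interpreter is Python 3.7 or newer."""
    return tuple(version_info[:2]) >= (3, 7)
```

With PyTorch installed, `torch_version_supported(torch.__version__)` reports whether the installed build falls in the supported range; local build suffixes such as `+cu110` are ignored by the parser.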

Quickstart Guide

The following example shows how to initialize a searcher and build a ConversationalQueryRewriter agent from scratch, using HQE and T5 as first-stage retrievers and a BERT reranker. For a working example agent, see chatty_goose/agents/chat.py.

First, load a searcher

from pyserini.search import SimpleSearcher

# Option 1: load a prebuilt index
searcher = SimpleSearcher.from_prebuilt_index("INDEX_NAME_HERE")
# Option 2: load a local Lucene index
searcher = SimpleSearcher("PATH_TO_INDEX")

searcher.set_bm25(0.82, 0.68)
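
The two arguments to set_bm25 are the BM25 parameters k1 and b. To give a feel for what they control, here is a minimal sketch of the textbook BM25 score for a single query term. It is an illustration only: the function name and its arguments are hypothetical, and the scoring inside Lucene may differ in constant factors.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=0.82, b=0.68):
    """Textbook BM25 contribution of one term to one document's score.

    k1 controls how quickly repeated occurrences of the term saturate;
    b controls how strongly the score is normalized by document length.
    """
    # Smoothed inverse document frequency, always positive.
    idf = math.log(1.0 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Equals 1.0 for an average-length document, larger for longer ones.
    length_norm = 1.0 - b + b * (doc_len / avg_doc_len)
    return idf * (tf * (k1 + 1.0)) / (tf + k1 * length_norm)
```

With b = 0 the document length has no effect, and with larger k1 additional occurrences of a term keep adding to the score for longer before leveling off.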

Next, initialize one or more first-stage CQR retrievers

from chatty_goose.cqr import Hqe, Ntr
from chatty_goose.settings import HqeSettings, NtrSettings

hqe = Hqe(searcher, HqeSettings())
ntr = Ntr(NtrSettings())

Load a reranker

from chatty_goose.util import build_bert_reranker

reranker = build_bert_reranker()

Create a new RetrievalPipeline

from chatty_goose.pipeline import RetrievalPipeline

rp = RetrievalPipeline(searcher, [hqe, ntr], searcher_num_hits=50, reranker=reranker)

And we're done! Simply call rp.retrieve(query) to retrieve passages, or call rp.reset_history() to reset the conversational history of the retrievers.
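
A multi-turn session built on these two calls might look like the sketch below. It relies only on retrieve(query) and reset_history() as described above; run_conversation and EchoPipeline are hypothetical names, and EchoPipeline is a stand-in that makes the sketch runnable without an index. The return type of retrieve is treated as opaque here.

```python
def run_conversation(pipeline, utterances):
    """Start a fresh conversation, then retrieve passages for each turn in order."""
    pipeline.reset_history()  # drop context left over from any earlier conversation
    results = []
    for utterance in utterances:
        # Each call sees the history accumulated from the previous turns.
        results.append((utterance, pipeline.retrieve(utterance)))
    return results


class EchoPipeline:
    """Hypothetical stand-in for RetrievalPipeline, for illustration only."""

    def __init__(self):
        self.history = []

    def retrieve(self, query):
        self.history.append(query)
        # Pretend the history-aware query is simply all turns joined together.
        return [" ".join(self.history)]

    def reset_history(self):
        self.history = []
```

Passing the rp object from the previous step in place of EchoPipeline() should work the same way, since only those two methods are used.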

Running Experiments

  1. Clone the repo and all submodules (git submodule update --init --recursive)

  2. Clone and build Anserini for evaluation tools

  3. Install dependencies

pip install -r requirements.txt
  4. Follow the instructions under docs/cqr_experiments.md to run experiments using HQE, T5, or fusion.
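
For readers unfamiliar with fusion, one common way to combine the ranked lists produced by two retrievers is reciprocal rank fusion. The sketch below is a generic illustration under that assumption, not necessarily the exact procedure used in the experiments; the function name and the constant k=60 are choices made for this example, so consult docs/cqr_experiments.md for the actual settings.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    so documents ranked highly by several retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending fused score, breaking ties by document id.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

For example, fusing `["d1", "d2", "d3"]` with `["d2", "d4", "d1"]` places d2 and d1 ahead of d4 and d3, because the first two appear in both lists.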

Example Agent

To run an interactive conversational search agent with ParlAI, simply run chat.py. By default, we use the CAsT 2019 pre-built Pyserini index, but it is possible to specify other indexes using the --from_prebuilt flag. See the file for other possible arguments:

python -m chatty_goose.agents.chat

Alternatively, run the agent using ParlAI's command line interface:

python -m parlai interactive --model chatty_goose.agents.chat:ChattyGooseAgent

We also provide instructions to deploy the agent to Facebook Messenger using ParlAI under examples/messenger.
