A conversational passage retrieval toolkit
Chatty Goose
Multi-stage Conversational Passage Retrieval: An Approach to Fusing Term Importance Estimation and Neural Query Rewriting
Installation
- Make sure Java 11+ and Python 3.7+ are installed
- Install the chatty-goose PyPI module
  pip install chatty-goose
- If you are using T5 or BERT, make sure to install PyTorch 1.4.0 - 1.7.1 using your specific platform instructions. Note that PyTorch 1.8 is currently incompatible due to the transformers version we currently use. Also make sure to install the corresponding torchtext version.
- Download the English model for spaCy
  python -m spacy download en_core_web_sm
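The version requirements above can be checked before running anything heavier. The following is a minimal sketch, not part of chatty-goose: the helper names (`parse_version`, `torch_version_supported`, `python_supported`, `report`) are hypothetical, and the bounds are copied from the installation notes. It does not check Java or torchtext.

```python
import re
import sys

# Supported PyTorch range, taken from the installation notes.
MIN_TORCH = (1, 4, 0)
MAX_TORCH = (1, 7, 1)


def parse_version(version):
    """Turn a version string such as '1.7.1+cu110' into a tuple of three ints."""
    core = version.split("+")[0]  # drop local build tags like '+cu110'
    parts = []
    for piece in core.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        parts.append(int(match.group()) if match else 0)
    parts += [0] * (3 - len(parts))  # pad '1.4' to (1, 4, 0)
    return tuple(parts)


def torch_version_supported(version):
    """Return True if the version string falls inside the supported range."""
    return MIN_TORCH <= parse_version(version) <= MAX_TORCH


def python_supported(version_info=sys.version_info):
    """Return True for Python 3.7 or newer."""
    return tuple(version_info[:2]) >= (3, 7)


def report():
    """Print a short summary of the local environment."""
    print("Python OK:", python_supported())
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        print("PyTorch", torch.__version__, "OK:",
              torch_version_supported(torch.__version__))
```

Calling `report()` prints one line per check; PyTorch is only imported inside that function, so the helpers can be used on their own.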
Quickstart Guide
The following example shows how to initialize a searcher and build a ConversationalQueryRewriter agent from scratch using HQE and T5 as first-stage retrievers, and a BERT reranker. To see a working example agent, see chatty_goose/agents/chat.py.
First, load a searcher
from pyserini.search import SimpleSearcher
# Option 1: load a prebuilt index
searcher = SimpleSearcher.from_prebuilt_index("INDEX_NAME_HERE")
# Option 2: load a local Lucene index
searcher = SimpleSearcher("PATH_TO_INDEX")
# Set the BM25 parameters (k1=0.82, b=0.68)
searcher.set_bm25(0.82, 0.68)
Next, initialize one or more first-stage CQR retrievers
from chatty_goose.cqr import Hqe, Ntr
from chatty_goose.settings import HqeSettings, NtrSettings
hqe = Hqe(searcher, HqeSettings())
ntr = Ntr(NtrSettings())
Load a reranker
from chatty_goose.util import build_bert_reranker
reranker = build_bert_reranker()
Create a new RetrievalPipeline
from chatty_goose.pipeline import RetrievalPipeline
rp = RetrievalPipeline(searcher, [hqe, ntr], searcher_num_hits=50, reranker=reranker)
And we're done! Simply call rp.retrieve(query) to retrieve passages, or call rp.reset_history() to reset the conversational history of the retrievers.
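As a rough illustration of how these two calls fit together over a multi-turn conversation, here is a minimal sketch. The `run_conversation` helper and the `EchoPipeline` stand-in are hypothetical and not part of chatty-goose; the stand-in only mimics the `retrieve` and `reset_history` methods so the example runs without an index or models. In practice you would pass the `rp` object built above instead.

```python
def run_conversation(pipeline, utterances):
    """Feed each utterance to the pipeline in order and collect the results.

    `pipeline` is any object with `retrieve(query)` and `reset_history()`.
    """
    pipeline.reset_history()  # start from a clean conversational history
    results = []
    for utterance in utterances:
        hits = pipeline.retrieve(utterance)
        results.append((utterance, hits))
    return results


class EchoPipeline:
    """Hypothetical stand-in with the same two methods as the pipeline."""

    def __init__(self):
        self.history = []

    def retrieve(self, query):
        self.history.append(query)
        # Pretend the "hits" are the queries seen so far in this conversation.
        return list(self.history)

    def reset_history(self):
        self.history = []


if __name__ == "__main__":
    turns = ["What is throat cancer?", "Is it treatable?"]
    for utterance, hits in run_conversation(EchoPipeline(), turns):
        print(utterance, "->", hits)
```

Resetting the history at the start of each conversation keeps earlier, unrelated turns from influencing how later queries are rewritten.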
Running Experiments
- Clone the repo and all submodules (git submodule update --init --recursive)
- Clone and build Anserini for evaluation tools
- Install dependencies
  pip install -r requirements.txt
- Follow the instructions under docs/cqr_experiments.md to run experiments using HQE, T5, or fusion.
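For readers unfamiliar with fusion, the idea is to merge the ranked lists produced by different query rewriters into one ranking. The sketch below uses reciprocal rank fusion, one common technique for this; it is shown only as an illustration, and the toolkit's own fusion code may differ. The function name, the constant `k=60`, and the document ids are assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    where rank starts at 1. Higher total scores rank first; ties are
    broken by document id so the output is deterministic.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    # Illustrative ranked lists, e.g. one per query rewriter.
    hqe_ranking = ["d1", "d2", "d3"]
    ntr_ranking = ["d2", "d4", "d1"]
    print(reciprocal_rank_fusion([hqe_ranking, ntr_ranking]))
```

Documents that appear near the top of both lists accumulate the highest scores, which is why fusion can outperform either rewriter alone.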
Example Agent
To run an interactive conversational search agent with ParlAI, simply run chat.py. By default, we use the CAsT 2019 pre-built Pyserini index, but it is possible to specify other indexes using the --from_prebuilt flag. See the file for other possible arguments:
python -m chatty_goose.agents.chat
Alternatively, run the agent using ParlAI's command line interface:
python -m parlai interactive --model chatty_goose.agents.chat:ChattyGooseAgent
We also provide instructions to deploy the agent to Facebook Messenger using ParlAI under examples/messenger.