A persistent implementation of BM25 retrieval.
bm25_retriever
bm25_retriever is a persistent BM25 retriever for use with LangChain, built on top of rank_bm25.
Features
- Save and load BM25 retriever to/from disk
- Easily integrate with the LangChain Document format
- Simple API for creating, persisting, and querying retrievers
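To make the idea of a persistent BM25 index concrete, here is a minimal standard-library sketch. It is not the package's implementation: the class name TinyBM25, the whitespace tokenizer, the smoothed IDF variant, the k1 and b defaults, and the use of pickle for storage are all assumptions chosen for illustration.

```python
import math
import pickle
import tempfile
from collections import Counter
from pathlib import Path


class TinyBM25:
    """Illustrative BM25 index (hypothetical; not the bm25_retriever code)."""

    def __init__(self, texts, k1=1.5, b=0.75):
        self.texts = list(texts)
        self.k1, self.b = k1, b
        # Naive lowercase whitespace tokenization (an assumption).
        self.tokens = [t.lower().split() for t in self.texts]
        self.avgdl = sum(len(t) for t in self.tokens) / len(self.tokens)
        # Document frequency: how many documents contain each term.
        df = Counter(term for toks in self.tokens for term in set(toks))
        n = len(self.tokens)
        # Smoothed IDF variant that stays positive for every term.
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def score(self, query, idx):
        """BM25 score of document idx for a whitespace-separated query."""
        tf = Counter(self.tokens[idx])
        dl = len(self.tokens[idx])
        total = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Term frequency saturation with document-length normalization.
            norm = tf[term] + self.k1 * (1 - self.b + self.b * dl / self.avgdl)
            total += self.idf[term] * tf[term] * (self.k1 + 1) / norm
        return total

    def top_k(self, query, k=3):
        """Return the k highest-scoring texts, best first."""
        order = sorted(range(len(self.texts)),
                       key=lambda i: self.score(query, i), reverse=True)
        return [self.texts[i] for i in order[:k]]

    def save(self, path):
        with open(path, "wb") as fh:
            pickle.dump(self, fh)

    @staticmethod
    def load(path):
        # Only unpickle files you created yourself; pickle can run code.
        with open(path, "rb") as fh:
            return pickle.load(fh)


texts = [
    "The quick brown fox jumps over the lazy dog",
    "A fox fled from danger",
    "Dogs are loyal companions",
    "Foxes are cunning and agile",
]
index = TinyBM25(texts)
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "index.pkl"
    index.save(path)               # write the index to disk
    restored = TinyBM25.load(path)  # read it back without re-indexing
print(restored.top_k("fox", k=2))
```

With exact-match tokenization, "Foxes" does not match the query "fox", and the shorter matching document ranks above the longer one because BM25 normalizes by document length.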
Installation
pip install bm25_retriever
Usage
from langchain_core.documents import Document
from bm25_retriever.retriever import PersistentBM25Retriever

# Sample documents
docs = [
    Document(page_content="The quick brown fox jumps over the lazy dog"),
    Document(page_content="A fox fled from danger"),
    Document(page_content="Dogs are loyal companions"),
    Document(page_content="Foxes are cunning and agile"),
]

# Step 1: Create a retriever with a specified save directory
save_directory = "bm25_storage"
retriever = PersistentBM25Retriever(documents=docs, save_dir=save_directory, persist=True)

# Step 2 (optional): Persist the retriever explicitly.
# Left commented out here because persist=True is passed above.
# retriever.persist()

# Step 3: Load the retriever from the directory with a custom k value
loaded_retriever = PersistentBM25Retriever.from_persist_dir(save_dir=save_directory, k=3)

# Step 4: Use the loaded retriever to retrieve documents
query = "fox"
results = loaded_retriever.get_relevant_documents(query)

# Print the retrieved documents
print(f"Retrieved {len(results)} documents for query '{query}':")
for i, doc in enumerate(results, 1):
    print(f"{i}. {doc.page_content}")
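A common pattern around the steps above is to build the index once and reload it on later runs. The helper below is a hypothetical sketch, not part of bm25_retriever: load_or_build and the stand-in callables are invented for illustration, and it assumes that persisted state shows up as files inside the save directory.

```python
import os
import tempfile


def load_or_build(save_dir, build, load):
    """Call load(save_dir) if the directory has content, else build(save_dir)."""
    if os.path.isdir(save_dir) and os.listdir(save_dir):
        return load(save_dir)
    return build(save_dir)


# Stand-in callables so the sketch runs without bm25_retriever installed.
def fake_build(path):
    os.makedirs(path, exist_ok=True)
    with open(os.path.join(path, "index.bin"), "w") as fh:
        fh.write("stub")
    return "built"


def fake_load(path):
    return "loaded"


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "bm25_storage")
    first = load_or_build(target, fake_build, fake_load)   # directory missing
    second = load_or_build(target, fake_build, fake_load)  # directory populated
print(first, second)
```

With the real package, build could wrap the PersistentBM25Retriever(documents=docs, save_dir=..., persist=True) call from Step 1 and load could wrap PersistentBM25Retriever.from_persist_dir(save_dir=..., k=3) from Step 3.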
Download files
Source Distribution
bm25_retriever-0.3.tar.gz (2.7 kB)
Built Distribution
bm25_retriever-0.3-py3-none-any.whl (3.0 kB)
File details
Details for the file bm25_retriever-0.3.tar.gz.
File metadata
- Download URL: bm25_retriever-0.3.tar.gz
- Upload date:
- Size: 2.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3d3a45f74b3a79e287bb7ebf6d4ed22626ad79f432c2478656d5e42ae32ab8ea |
| MD5 | 9c2193d3343a2f9a5022835ad62b78dc |
| BLAKE2b-256 | 50ddf6cd246dd44931b2dc8787a7ec45a32c6b10bf795a41095d69c20796edb2 |
File details
Details for the file bm25_retriever-0.3-py3-none-any.whl.
File metadata
- Download URL: bm25_retriever-0.3-py3-none-any.whl
- Upload date:
- Size: 3.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a057534856fa4e856d46c9ffc8997e49732d7d46021f5bf3b034b5cbe26dc2e1 |
| MD5 | e6793b1f36bc575b259bf69df0c1fcb9 |
| BLAKE2b-256 | 39e81c8d48015b7e757c618cdfe6f3739b82f1b26e70d45fd701564a0262c7c6 |