This is a temporary project while I wait for my langchain [pull-request](https://github.com/langchain-ai/langchain/pull/7278) to be validated.

Project description

We believe that hallucinations pose a major problem for the adoption of LLMs (Large Language Models). It is imperative to provide a simple and quick way for users to verify that the answers to their questions are consistent with the source material.

The conventional approach is to provide a list of URLs of the documents that helped in answering (see qa_with_source). However, this approach is unsatisfactory in several scenarios:

  1. The question is asked about a PDF of over 100 pages. Every fragment comes from the same document, so a single URL does not say where in the document each one was found.
  2. Some documents do not have URLs (data retrieved from a database or other loaders).

It appears essential to have a means of retrieving all references to the actual data sources used by the model to answer the question.

This includes:

  • The precise list of documents used for the answer (the Documents, along with their metadata that may contain page numbers, slide numbers, or any other information allowing the retrieval of the fragment in the original document).
  • The excerpts of text used for the answer in each fragment. Even if a fragment is used, the LLM only utilizes a small portion to generate the answer. Access to these verbatim excerpts helps to quickly ascertain the validity of the answer.

We propose two pipelines for this purpose: qa_with_reference and qa_with_reference_and_verbatims. Both are Question/Answer pipelines that return the list of documents used and, in the metadata of each document, the list of verbatim excerpts used to produce the answer.
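
To make this return shape concrete, here is a minimal sketch in plain Python. It does not use the package's real classes or API: `SourceDocument`, the `verbatims` metadata key, and `format_references` are assumptions made for illustration, modelled only on the description above.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SourceDocument:
    """Hypothetical stand-in for a LangChain Document (text plus metadata)."""
    page_content: str
    metadata: dict[str, Any] = field(default_factory=dict)


def format_references(documents: list[SourceDocument]) -> list[str]:
    """Build one human-readable line per verbatim excerpt, with its location."""
    lines = []
    for doc in documents:
        source = doc.metadata.get("source", "unknown source")
        page = doc.metadata.get("page")
        location = f"{source}, page {page}" if page is not None else source
        # "verbatims" is an assumed metadata key holding the quoted excerpts.
        for excerpt in doc.metadata.get("verbatims", []):
            lines.append(f'{location}: "{excerpt}"')
    return lines


docs = [
    SourceDocument(
        page_content="The warranty lasts two years from the purchase date.",
        metadata={"source": "manual.pdf", "page": 42,
                  "verbatims": ["The warranty lasts two years"]},
    ),
    SourceDocument(
        page_content="Row fetched from a database, no URL available.",
        metadata={"source": "orders table", "verbatims": []},
    ),
]

for line in format_references(docs):
    print(line)
```

A document with no URL, such as the database row, can still be cited through whatever its loader put in the metadata. A document that contributed no excerpt produces no reference line.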

If a verbatim excerpt cannot be found in the original document, it is removed.
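
The filtering step can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: the helper names `normalize` and `keep_genuine_verbatims` are hypothetical, and the matching rule (ignoring case and whitespace differences) is an assumption, since the exact comparison used by the pipeline is not described here.

```python
def normalize(text: str) -> str:
    """Collapse whitespace and lowercase, so line breaks do not block a match."""
    return " ".join(text.split()).lower()


def keep_genuine_verbatims(page_content: str, verbatims: list[str]) -> list[str]:
    """Return only the excerpts that really occur in page_content."""
    haystack = normalize(page_content)
    # Empty excerpts are dropped; every other excerpt must be a substring.
    return [v for v in verbatims if normalize(v) and normalize(v) in haystack]


fragment = "The warranty lasts two years\nfrom the purchase date."
candidates = [
    "lasts two years from the purchase date",  # spans a line break: kept
    "The warranty lasts five years",           # not in the fragment: removed
]
print(keep_genuine_verbatims(fragment, candidates))
```

A strict substring check like this one rejects paraphrases, which is the point: an excerpt that survives can be located word for word in the source fragment.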

Install

pip install langchain-qa_with_references

Sample notebook

See here

Download files

Source Distribution

langchain_qa_with_references-0.0.281.tar.gz (18.0 kB view details)

Uploaded Source

Built Distribution

File details

Details for the file langchain_qa_with_references-0.0.281.tar.gz.

File hashes

Hashes for langchain_qa_with_references-0.0.281.tar.gz
Algorithm    Hash digest
SHA256       9dc88ece7c458f27d0313702cb90cdbc05f3f6008790686eeebf800e2e80485a
MD5          693cc296a81f5d9c1e5a183f2a0a6258
BLAKE2b-256  5b6ee4a6ac4632183a92c3881923af12374cc01de4af7fdf3e8c65a44a91bd8d

File details

Details for the file langchain_qa_with_references-0.0.281-py3-none-any.whl.

File hashes

Hashes for langchain_qa_with_references-0.0.281-py3-none-any.whl
Algorithm    Hash digest
SHA256       833aa1a7c496219969e461b0c48fac844c9d847a127d88d3f8f2d279c0d6a3dd
MD5          623f340c62e3ad7a412759295a863e2f
BLAKE2b-256  b9506bf3d51975f25fd32583ec77fcee0438aca9de95a2d9a89864d404bf9fad
