Skip to main content

CLI app to quickly chat with your PDFs locally

Project description

pdfchat

CLI app to quickly chat with your PDFs locally

PyPI version License GitHub Page


Overview

pdfchat is a Python-based CLI app that utilizes open-source LLMs using Ollama to quickly open a chat session with a PDF file. The app emphasizes speed as you can start chatting directly from the command line without needing to open a GUI or web interface and security as it runs locally on your machine and your data is not sent to any external servers.

Features

  • Parse and extract content from PDF files.
  • Chat with the content of the PDF using a conversational interface.
  • Select specific pages of a PDF for focused interaction.
  • Supports multiple language models via the Ollama platform.

Use Cases

  • Students: Quickly find information in textbooks or lecture notes. Have a tutor teach you content specific to your study material.
  • Researchers: Extract data from research papers or articles. Ask questions about specific sections of a paper.
  • Educators: Create interactive learning materials. Use the app to generate quizzes and questions based on the content of a PDF.
  • Business Professionals: Review contracts, reports, or any other documents. Get clarifications on specific sections.
  • General Users: Quickly find information in any PDF document. Reduce hallucinations by grounding the model with the content of the PDF.

Installation

Prerequisites

pdfchat assumes you have Ollama installed. If you haven't installed it yet, follow the instructions on the Ollama website. Make sure you have the required models installed. You can check the available models on the Ollama website.

You also need to have marker-pdf installed. You can install it using pip:

pip install marker-pdf

You can also install marker-pdf using pipx.

Install pdfchat

To install pdfchat using pipx, run:

pipx install pdfchat

Usage

Basic Command

To start a chat session with a PDF file, run:

pdfchat <path_to_pdf>

Note: If it is your first time running the app, it will take a few seconds (depending on your internet speed) to download PDF parsing and OCR models.

Options

  • --model or -m: Ollama model to use (must be installed). Defaults to the first model returned by ollama list. ( ex: llama3.1:8b)
  • --url or -u: Ollama base URL. Defaults to 'http://localhost:11434'
  • --pages or -p: The pdf pages to parse. Defaults to all pages. Ex: 1,2,3 or 1-3 or 1-3,5-7. Defaults to all pages.

Help

To see all available options, run:

pdfchat --help

This will display the help message with all available options and their descriptions.

Example

pdfchat example.pdf --model llama3.1:8b

Example

pdfchat example.pdf --model llama3.1:8b --pages 1-5

License

This project is licensed under the MIT License. See the LICENSE file for details.

Author

Developed by Ibrahim Habib. You can contact me through LinkedIn or Email.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pdfchat-0.1.1.tar.gz (140.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pdfchat-0.1.1-py3-none-any.whl (6.4 kB view details)

Uploaded Python 3

File details

Details for the file pdfchat-0.1.1.tar.gz.

File metadata

  • Download URL: pdfchat-0.1.1.tar.gz
  • Upload date:
  • Size: 140.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.6.6

File hashes

Hashes for pdfchat-0.1.1.tar.gz
Algorithm Hash digest
SHA256 0fbf0e577bd3c825aee3b47a15615cc369fc88860921ff0be7f43eec7d35a6c0
MD5 ba3dee0a16f9655efd6af67c5762cb5d
BLAKE2b-256 41a303a928f0b0443b9427da94cd492bf64902c06aa29aab7a9181456f1e2729

See more details on using hashes here.

File details

Details for the file pdfchat-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: pdfchat-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 6.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.6.6

File hashes

Hashes for pdfchat-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 5585cefdf7b65e8c7b8182591189d9551a6ac5fe8213311708644f627cc1be3c
MD5 35800996efe700c731b5b2c19aa42984
BLAKE2b-256 23599e0c9b8c1b1f376f70de2efe7918e36eadd8f881b845312fb16a1f55f473

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page