Chat with your PDFs using local embedding search and OpenAI.
📄 Chat with PDF
Chat with your PDF documents easily using local embeddings and powerful LLMs like OpenAI's GPT models.
Upload any PDF and ask natural language questions about its content — powered by semantic search and AI.
🛠️ Installation
pip install chat-with-pdf
Or using Poetry:
poetry add chat-with-pdf
✨ Quickstart Example
from chat_with_pdf import PDFChat
chat = PDFChat('path/to/your/document.pdf')
response = chat.ask("Summarize the introduction section.")
print(response)
You can pass a file path, URL, or binary bytes of the PDF to PDFChat.
Example:
chat = PDFChat("path/to/file.pdf")
chat = PDFChat("https://example.com/file.pdf")
chat = PDFChat(binary_pdf_data)
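If you want to check up front which of the three input forms a value takes, a small helper along these lines can do it. This is only a sketch: `classify_pdf_source` is a hypothetical name and is not part of chat-with-pdf, and the library's own detection logic is not documented here.

```python
from pathlib import Path
from typing import Union
from urllib.parse import urlparse

PDFSource = Union[str, Path, bytes, bytearray]


def classify_pdf_source(source: PDFSource) -> str:
    """Return 'bytes', 'url', or 'path' for a PDF source.

    Hypothetical helper for illustration; not part of chat-with-pdf.
    """
    # Raw PDF content already held in memory.
    if isinstance(source, (bytes, bytearray)):
        return "bytes"
    # pathlib objects always refer to the local filesystem.
    if isinstance(source, Path):
        return "path"
    if isinstance(source, str):
        # Only http(s) strings are treated as URLs; anything else is a path.
        scheme = urlparse(source).scheme.lower()
        return "url" if scheme in ("http", "https") else "path"
    raise TypeError(f"Unsupported PDF source type: {type(source).__name__}")
```

To get binary data from a local file, `Path("path/to/file.pdf").read_bytes()` returns a `bytes` object.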
⚙️ Configuration Options
You can configure chat-with-pdf through constructor arguments or environment variables, or let it fall back to the library defaults.
Priority (highest first):
- Arguments passed to PDFChat
- Environment variables
- Library defaults
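The priority order above can be sketched as a single lookup function. This is a minimal illustration, assuming an explicit argument of `None` means "not provided"; `resolve_setting` is a hypothetical helper, not the library's actual implementation.

```python
import os
from typing import Any, Callable


def resolve_setting(
    explicit: Any,
    env_var: str,
    default: Any,
    cast: Callable[[str], Any] = str,
) -> Any:
    """Pick a value by priority: argument, then environment, then default.

    Hypothetical helper for illustration; not part of chat-with-pdf.
    """
    # 1. An explicitly passed argument always wins.
    if explicit is not None:
        return explicit
    # 2. Otherwise use the environment variable, if it is set and non-empty.
    raw = os.environ.get(env_var)
    if raw:
        return cast(raw)  # environment values are strings, so convert them
    # 3. Finally fall back to the default.
    return default
```

For example, `resolve_setting(None, "TOP_K_RETRIEVAL", 5, int)` yields the environment value as an integer when the variable is set, and `5` otherwise.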
Supported Environment Variables:
| Variable | Purpose | Default |
|---|---|---|
| OPENAI_API_KEY | Your OpenAI API key | "" (empty) |
| OPENAI_MODEL | GPT model name to use | "gpt-3.5-turbo" |
| EMBEDDING_MODEL | Embedding model for vector search | "all-MiniLM-L6-v2" |
| DEFAULT_CHUNK_SIZE | Number of characters per text chunk | 500 |
| TOP_K_RETRIEVAL | Number of similar chunks to retrieve per question | 5 |
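To make the chunk size and top-k settings concrete, here is a rough sketch of what they control. It assumes fixed-width character chunks with no overlap and uses simple word-count vectors in place of a real embedding model, so that it runs with the standard library alone. `chunk_text` and `top_k_chunks` are hypothetical names, and the library's internals may differ.

```python
import math
from collections import Counter
from typing import List


def chunk_text(text: str, chunk_size: int = 500) -> List[str]:
    """Split text into consecutive chunks of at most chunk_size characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def _bag_of_words(text: str) -> Counter:
    # Toy stand-in for an embedding: lowercase word counts.
    return Counter(text.lower().split())


def _cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_chunks(question: str, chunks: List[str], k: int = 5) -> List[str]:
    """Return the k chunks most similar to the question, best first."""
    query = _bag_of_words(question)
    ranked = sorted(
        chunks,
        key=lambda chunk: _cosine(query, _bag_of_words(chunk)),
        reverse=True,
    )
    return ranked[:k]
```

In this sketch, a smaller chunk size produces more and narrower chunks, while a larger top-k returns more of them as context for each question.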
Example .env file:
OPENAI_API_KEY=sk-xxxxx
OPENAI_MODEL=gpt-4
DEFAULT_CHUNK_SIZE=600
TOP_K_RETRIEVAL=8
EMBEDDING_MODEL=all-mpnet-base-v2
If you have a .env file at your project root, chat-with-pdf will automatically load it.
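For readers curious what loading a .env file involves, the following is a simplified, standard-library sketch. It is not how chat-with-pdf does it (the mechanism is not stated here); `load_env_file` is a hypothetical helper, and it assumes that variables already set in the environment should not be overwritten by default.

```python
import os
from pathlib import Path
from typing import Dict


def load_env_file(path: str = ".env", override: bool = False) -> Dict[str, str]:
    """Parse KEY=VALUE lines from a file and export them to os.environ.

    Hypothetical helper for illustration; not part of chat-with-pdf.
    Returns the key/value pairs found in the file.
    """
    env_path = Path(path)
    parsed: Dict[str, str] = {}
    if not env_path.is_file():
        return parsed
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        # Drop surrounding whitespace and quote characters from the value.
        value = value.strip().strip('"').strip("'")
        parsed[key] = value
        # Existing environment variables are kept unless override is True.
        if override or key not in os.environ:
            os.environ[key] = value
    return parsed
```

Calling `load_env_file()` before constructing `PDFChat` would make the values from the example file above visible through `os.environ`.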
🔥 Advanced Usage Example
Explicitly passing all settings:
from chat_with_pdf import PDFChat
chat = PDFChat(
    'path/to/your/document.pdf',
    openai_api_key="sk-your-openai-key",
    model="gpt-4",
    embedding_model="all-mpnet-base-v2",
    chunk_size=600,
    top_k_retrieval=8,
)
response = chat.ask("Summarize the key points.")
print(response)
📝 License
This project is licensed under the MIT License.
🌟 Acknowledgements
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file chat_with_pdf-0.3.2.tar.gz.
File metadata
- Download URL: chat_with_pdf-0.3.2.tar.gz
- Upload date:
- Size: 5.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c4db10c3a58669736bc69c26e7b91b5ef9c41bd6ff320570c50a98cd9130c469 |
| MD5 | cc774e031ebfc742461ab1087ed3b351 |
| BLAKE2b-256 | b084a58b19670babcca4aeef1a1014f173892194bcfb59e869f2489608ddbc92 |
File details
Details for the file chat_with_pdf-0.3.2-py3-none-any.whl.
File metadata
- Download URL: chat_with_pdf-0.3.2-py3-none-any.whl
- Upload date:
- Size: 7.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b05399259b19c2607af36571d867809e0f35e9a00195685614dfb4abf6857b10 |
| MD5 | 06514d552473f13ca5d47060b488e9b9 |
| BLAKE2b-256 | 88bf9afa8c249e780ac88919ff76abc42078172bdb38df227e7c2748f4e34799 |