PDF Quiz Generator
PDFxact
A PDF quiz generator that extracts content from PDFs and automatically generates educational questions, with built-in fact checking of the results.
Installation
pip install pdfxact
Features
- PDF text extraction with OCR support
- AI-powered question generation from PDF content
- Automated fact checking and validation
- Text preprocessing and quality analysis
- Batch processing for efficient question generation
- Customizable number of questions
Usage
Command Line Interface
# Generate 5 questions from a PDF file
qz path/to/your/document.pdf
# Generate a specific number of questions
qz path/to/your/document.pdf --num_questions 10
# Control batch processing size
qz path/to/your/document.pdf --num_questions 20 --batch_size 8
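To make the interaction between the two flags concrete, here is a minimal sketch of how a question count could be split into batches. It assumes that --batch_size caps the number of questions generated per batch; the helper name plan_batches is hypothetical and is not part of pdfxact, and the tool's real scheduling may differ.

```python
def plan_batches(num_questions: int, batch_size: int) -> list[int]:
    """Split a question count into batch sizes no larger than batch_size.

    Hypothetical helper for illustration only; not part of pdfxact.
    """
    if num_questions < 0 or batch_size < 1:
        raise ValueError("num_questions must be >= 0 and batch_size >= 1")
    full, remainder = divmod(num_questions, batch_size)
    # Full batches first, then one smaller batch for whatever is left over.
    return [batch_size] * full + ([remainder] if remainder else [])


if __name__ == "__main__":
    # Mirrors the last command above: 20 questions, at most 8 per batch.
    print(plan_batches(20, 8))  # [8, 8, 4]
```

Under that assumption, the last command above would run three batches of 8, 8, and 4 questions.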
Python API
from qz.text_extractor import TextExtractor
from qz.question_generator import QuestionGenerator
# Extract text from PDF
extractor = TextExtractor()
text = extractor.extract_text_from_pdf("path/to/your/document.pdf")
# Generate quiz questions
generator = QuestionGenerator()
questions = generator.generate_question_from_text(text, num_questions=5)
# Process the questions
for i, question in enumerate(questions, 1):
    print(f"Question {i}: {question['text']}")
    print("Options:", question['options'])
    print("Correct Answer:", question['options'][question['correct_answer']])
    print()
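The loop above suggests that each question is a dictionary with the keys text, options, and correct_answer. The sketch below builds on that shape to render a question as a lettered multiple-choice block. It assumes options is a list and correct_answer is an integer index into it, which matches the lookup above but is not confirmed by the package; format_question is a hypothetical helper, not part of the qz API.

```python
import string


def format_question(question: dict, number: int) -> str:
    """Render one question dict as a lettered multiple-choice block.

    Assumes question["options"] is a list and question["correct_answer"]
    is an integer index into that list (an assumption, not a documented
    guarantee of pdfxact).
    """
    letters = string.ascii_uppercase
    lines = [f"Question {number}: {question['text']}"]
    for letter, option in zip(letters, question["options"]):
        lines.append(f"  {letter}. {option}")
    lines.append(f"Answer: {letters[question['correct_answer']]}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hand-written sample in the assumed shape; not real generator output.
    sample = {
        "text": "What does OCR stand for?",
        "options": [
            "Optical Code Reader",
            "Optical Character Recognition",
            "Online Content Retrieval",
        ],
        "correct_answer": 1,
    }
    print(format_question(sample, 1))
```

Keeping the formatting in a separate function makes it easy to write the same questions to a file or a web page instead of the console.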
Advanced Features
- OCR Support: Automatically extracts text from scanned PDFs using EasyOCR
- Fact Checking: Validates generated questions against source content
- Quality Analysis: Ensures generated questions meet educational standards
- Batch Processing: Efficiently processes multiple questions in parallel
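As a rough illustration of what validating a question against source content can mean, the sketch below keeps only questions whose correct answer is lexically supported by the extracted text. This is a deliberately naive token-overlap check written for this README, not the fact-checking method pdfxact actually uses; the function names and the 0.8 threshold are assumptions, and the question shape (options list plus correct_answer index) is inferred from the Python API example.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase a string and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support_score(answer: str, source_text: str) -> float:
    """Fraction of the answer's tokens that also occur in the source text."""
    answer_tokens = tokens(answer)
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & tokens(source_text)) / len(answer_tokens)


def filter_supported(questions: list[dict], source_text: str,
                     threshold: float = 0.8) -> list[dict]:
    """Keep questions whose correct answer overlaps enough with the source.

    Naive stand-in for illustration; the threshold is an arbitrary choice.
    """
    kept = []
    for question in questions:
        answer = question["options"][question["correct_answer"]]
        if support_score(answer, source_text) >= threshold:
            kept.append(question)
    return kept


if __name__ == "__main__":
    source = "The mitochondria is the powerhouse of the cell."
    print(support_score("powerhouse of the cell", source))
    print(support_score("chloroplast", source))
```

A check like this only catches answers that never appear in the source; it cannot tell whether a question is well posed, so it complements quality analysis and does not replace it.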
Requirements
- Python 3.11
- Dependencies (automatically installed):
  - torch: Deep learning support
  - spacy: NLP processing
  - transformers: Question generation
  - pypdf: PDF processing
  - easyocr: OCR support
  - reportlab: PDF generation
  - ollama: LLM integration
See pyproject.toml for full dependency list.
Development
- Clone the repository
- Install development dependencies:
pip install -r requirements-dev.txt
- Install pre-commit hooks:
pre-commit install
- Run tests:
pytest
License
MIT License - See LICENSE file for details