PDF Quiz Generator

PDFxact

PDFxact is a PDF quiz generator: it extracts text from PDFs, automatically generates educational questions from that text, and fact-checks the questions against the source content.

Installation

pip install pdfxact

Features

  • PDF text extraction with OCR support
  • AI-powered question generation from PDF content
  • Automated fact checking and validation
  • Text preprocessing and quality analysis
  • Batch processing for efficient question generation
  • Customizable number of questions

Usage

Command Line Interface

# Generate questions from a PDF file (5 when --num_questions is not given)
qz path/to/your/document.pdf

# Generate a specific number of questions
qz path/to/your/document.pdf --num_questions 10

# Control batch processing size
qz path/to/your/document.pdf --num_questions 20 --batch_size 8
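The last command asks for 20 questions with a batch size of 8. How pdfxact schedules its batches internally is not documented here, so the sketch below only illustrates one plausible reading: the total is split into full batches plus a smaller final one. `plan_batches` is a hypothetical helper and not part of the package.

```python
def plan_batches(num_questions: int, batch_size: int) -> list[int]:
    """Split a total question count into per-batch counts.

    Hypothetical helper: it illustrates the arithmetic behind
    --num_questions and --batch_size, not pdfxact's actual scheduler.
    """
    if num_questions < 0:
        raise ValueError("num_questions must be non-negative")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    # Number of full batches, plus whatever is left over.
    full_batches, remainder = divmod(num_questions, batch_size)
    sizes = [batch_size] * full_batches
    if remainder:
        sizes.append(remainder)
    return sizes


# 20 questions in batches of 8: two full batches and one batch of 4.
print(plan_batches(20, 8))
```

Under this reading, a larger batch size means fewer passes through the generator, presumably at the cost of more memory per pass.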

Python API

from qz.text_extractor import TextExtractor
from qz.question_generator import QuestionGenerator

# Extract text from PDF
extractor = TextExtractor()
text = extractor.extract_text_from_pdf("path/to/your/document.pdf")

# Generate quiz questions
generator = QuestionGenerator()
questions = generator.generate_question_from_text(text, num_questions=5)

# Process the questions
for i, question in enumerate(questions, 1):
    print(f"Question {i}: {question['text']}")
    print("Options:", question['options'])
    print("Correct Answer:", question['options'][question['correct_answer']])
    print()
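The loop above suggests that each question is a dictionary with a `text` string, an `options` sequence, and a `correct_answer` that indexes into `options`. The following sketch assumes exactly that shape (with `correct_answer` as an integer position) and shows how a question could be validated and rendered with lettered options. `format_question` and the sample data are illustrative and do not come from the package.

```python
from string import ascii_uppercase


def format_question(number: int, question: dict) -> str:
    """Render one question with lettered options.

    Assumes question has the keys 'text', 'options' and 'correct_answer',
    where 'correct_answer' is an integer index into 'options'.
    """
    options = question["options"]
    correct = question["correct_answer"]

    # Reject indexes that cannot point at one of the options.
    is_index = isinstance(correct, int) and not isinstance(correct, bool)
    if not is_index or not 0 <= correct < len(options):
        raise ValueError(f"correct_answer {correct!r} is not a valid option index")

    lines = [f"Question {number}: {question['text']}"]
    for letter, option in zip(ascii_uppercase, options):
        lines.append(f"  {letter}. {option}")
    lines.append(f"Answer: {ascii_uppercase[correct]}")
    return "\n".join(lines)


# Made-up sample in the assumed shape, so the sketch runs without a PDF.
sample = {
    "text": "What does OCR stand for?",
    "options": [
        "Optical Character Recognition",
        "Open Code Repository",
        "Object Class Reference",
    ],
    "correct_answer": 0,
}
print(format_question(1, sample))
```

Validating the index before printing avoids an `IndexError` halfway through a quiz if a generated question turns out to be malformed.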

Advanced Features

  • OCR Support: Automatically extracts text from scanned PDFs using EasyOCR
  • Fact Checking: Validates generated questions against source content
  • Quality Analysis: Ensures generated questions meet educational standards
  • Batch Processing: Efficiently processes multiple questions in parallel

Requirements

  • Python 3.11
  • Dependencies (automatically installed):
    • torch: Deep learning support
    • spacy: NLP processing
    • transformers: Question generation
    • pypdf: PDF processing
    • easyocr: OCR support
    • reportlab: PDF generation
    • ollama: LLM integration

See pyproject.toml for full dependency list.
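If question generation fails at import time, a quick way to see which of the listed dependencies is missing is to probe for each module without importing it. The sketch below uses only the standard library. The module names are taken from the list above, on the assumption that each package's import name matches its listed name.

```python
import sys
from importlib.util import find_spec

# Import names assumed to match the dependency names listed above.
REQUIRED_MODULES = [
    "torch",
    "spacy",
    "transformers",
    "pypdf",
    "easyocr",
    "reportlab",
    "ollama",
]


def missing_modules(names: list[str]) -> list[str]:
    """Return the names that cannot be found in the current environment."""
    # find_spec locates a top-level module without executing it.
    return [name for name in names if find_spec(name) is None]


print("Python version:", ".".join(str(part) for part in sys.version_info[:3]))
print("Missing modules:", missing_modules(REQUIRED_MODULES) or "none")
```

Probing with `find_spec` avoids paying the import cost of heavy packages such as torch just to confirm that they are installed.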

Development

  1. Clone the repository
  2. Install development dependencies:
    pip install -r requirements-dev.txt
  3. Install pre-commit hooks:
    pre-commit install
  4. Run tests:
    pytest

License

MIT License - See LICENSE file for details


