Skip to main content

PDF Quiz Generator

Project description

PDFxact

A powerful PDF Quiz Generator that extracts content from PDFs and automatically generates educational questions with fact checking capabilities.

Installation

pip install pdfxact

Features

  • PDF text extraction with OCR support
  • AI-powered question generation from PDF content
  • Automated fact checking and validation
  • Text preprocessing and quality analysis
  • Batch processing for efficient question generation
  • Customizable number of questions

Usage

Command Line Interface

# Generate 5 questions from a PDF file
qz path/to/your/document.pdf

# Generate a specific number of questions
qz path/to/your/document.pdf --num_questions 10

# Control batch processing size
qz path/to/your/document.pdf --num_questions 20 --batch_size 8

Python API

from qz.text_extractor import TextExtractor
from qz.question_generator import QuestionGenerator

# Extract text from PDF
extractor = TextExtractor()
text = extractor.extract_text_from_pdf("path/to/your/document.pdf")

# Generate quiz questions
generator = QuestionGenerator()
questions = generator.generate_question_from_text(text, num_questions=5)

# Process the questions
for i, question in enumerate(questions, 1):
    print(f"Question {i}: {question['text']}")
    print("Options:", question['options'])
    print("Correct Answer:", question['options'][question['correct_answer']])
    print()

Advanced Features

  • OCR Support: Automatically extracts text from scanned PDFs using EasyOCR
  • Fact Checking: Validates generated questions against source content
  • Quality Analysis: Ensures generated questions meet educational standards
  • Batch Processing: Efficiently processes multiple questions in parallel

Requirements

  • Python 3.11
  • Dependencies (automatically installed):
    • torch: Deep learning support
    • spacy: NLP processing
    • transformers: Question generation
    • pypdf: PDF processing
    • easyocr: OCR support
    • reportlab: PDF generation
    • ollama: LLM integration

See pyproject.toml for full dependency list.

Development

  1. Clone the repository
  2. Install development dependencies:
    pip install -r requirements-dev.txt
    
  3. Install pre-commit hooks:
    pre-commit install
    
  4. Run tests:
    pytest
    

License

MIT License - See LICENSE file for details

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pdfxact-0.1.1-py3-none-any.whl (14.0 kB view details)

Uploaded Python 3

File details

Details for the file pdfxact-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: pdfxact-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 14.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.11

File hashes

Hashes for pdfxact-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 cc36eedc47421cb218b1b43f2981d503c66bfe1292afbc0cf5248af048442863
MD5 320dc5c8bdc299dd6b28b2ed23ca43d7
BLAKE2b-256 dd8830827e8bbaa9f24a54e18e10111ea82d458c8d8240600a7444b3c3158a71

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page