AI-powered quiz generator for regulatory, certification, and educational documentation
Project description
quiz-gen
AI-powered quiz generator for regulatory documentation. Extract structured content from complex legal and technical documents to create comprehensive teaching and certification materials.
Features
- Multi-Agent Quiz Generation: Generate, validate, refine, and judge questions using configurable providers/models.
- EUR-Lex Document Parser: Parse and structure EU legal documents with full table of contents extraction
- Human-in-the-Loop: Integrate human input throughout the workflow.
Tech Stack
Backend
Python — core package language
FastAPI — serves the web UI and REST API from within the package
AI Providers
OpenAI
Anthropic
Google (Gemini)
Mistral
Cohere
Web UI
React — interactive frontend
Vite — fast dev server and production bundler (outputs to
quiz_gen/ui/static)Tailwind CSS — utility-first styling
JavaScript (JSX) — component and API code
CLI
argparse — flag-based CLI (
input,--output,--chunks,--toc,--print-toc,--no-save,--verbose,--version)
Packaging
PyPI — distributed as an installable Python package
Installation
pip install quiz-gen
Quick Start
Multi-Agent Quiz Generation
Quiz generation uses four specialized agents (conceptual, practical, validator, refiner, and judge). Providers are configurable per agent, with supported providers: Anthropic, Cohere, Google, Mistral, and OpenAI. Any text-generation model name from these providers can be passed directly. The package relies on provider defaults for generation parameters.
Multi-Agent Architecture and Configuration
from quiz_gen.agents.workflow import QuizGenerationWorkflow
from quiz_gen.agents.config import AgentConfig
config = AgentConfig(
conceptual_provider="cohere",
conceptual_model="command-a-03-2025",
practical_provider="google",
practical_model="gemini-3-pro-preview",
validator_provider="openai",
validator_model="gpt-5.2-2025-12-11",
refiner_provider="anthropic",
refiner_model="claude-sonnet-4-5-20250929",
judge_provider="mistral",
judge_model="mistral-large-latest",
)
workflow = QuizGenerationWorkflow(config)
result = workflow.run(chunk)
Parsing EUR-Lex Documents
from quiz_gen import EURLexParser
# Parse a regulation document
url = "https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=OJ:L_202401689"
parser = EURLexParser(url=url)
chunks, toc = parser.parse()
# Access structured content
print(f"Extracted {len(chunks)} content chunks")
print(f"Document has {len(toc['sections'])} major sections")
# Save results
parser.save_chunks('output_chunks.json')
parser.save_toc('output_toc.json')
Working with Chunks
# Iterate through extracted chunks
for chunk in chunks:
print(f"{chunk.title}")
print(f"Type: {chunk.section_type.value}")
print(f"Number: {chunk.number}")
print(f"Content: {chunk.content[:200]}...")
print(f"Hierarchy: {' > '.join(chunk.hierarchy_path)}")
print()
Displaying Table of Contents
# Print formatted TOC
parser.print_toc()
# Output:
# PREAMBLE
# Citation
# Recital 1
# Recital 2
# ...
#
# ENACTING TERMS
# CHAPTER I - PRINCIPLES
# Article 1 - Subject matter and objectives
# Article 2 - Scope
Development
Setting up Development Environment
# Clone the repository
git clone https://github.com/yauheniya-ai/quiz-gen.git
cd quiz-gen
# Install with development dependencies
pip install -e ".[dev]"
# Run tests
pytest --cov=src --cov-report=term-missing
# Run linting
ruff check .
black .
Project Structure
quiz-gen/
├── data/ # Local data files
│ ├── raw/ # Source HTML documents
│ ├── processed/ # Parsed chunks and TOC JSON
│ └── quizzes/ # Generated quiz output
├── docs/ # MkDocs documentation source
├── examples/ # Runnable example scripts
│ ├── eur_lex_html_file.py
│ ├── eur_lex_html_url.py
│ └── quiz_gen_multi_model.py
├── src/
│ └── quiz_gen/ # Package source
│ ├── agents/ # Multi-agent system
│ │ ├── config.py # AgentConfig dataclass
│ │ ├── conceptual_generator.py
│ │ ├── practical_generator.py
│ │ ├── validator.py
│ │ ├── refiner.py
│ │ ├── judge.py
│ │ └── workflow.py # LangGraph orchestration
│ ├── parsers/
│ │ └── html/
│ │ └── eur_lex_parser.py
│ ├── ui/ # FastAPI + React static bundle
│ │ ├── server.py
│ │ ├── api.py
│ │ └── static/
│ ├── utils/
│ │ └── helpers.py
│ └── cli.py
├── tests/
│ ├── test_agents/
│ ├── test_cli/
│ ├── test_parsers/
│ └── test_utils/
├── pyproject.toml
├── README.md
├── CHANGELOG.md
└── .env
API Reference
AgentConfig
Dataclass that configures every agent in the multi-agent pipeline. API keys and base URLs are loaded automatically from environment variables when not provided directly.
Provider / model settings (per agent – defaults shown):
| Parameter | Default provider | Default model |
|---|---|---|
conceptual_provider / conceptual_model |
openai |
gpt-4o |
practical_provider / practical_model |
anthropic |
claude-sonnet-4-20250514 |
validator_provider / validator_model |
openai |
gpt-4o |
refiner_provider / refiner_model |
openai |
gpt-4o |
judge_provider / judge_model |
anthropic |
claude-sonnet-4-20250514 |
Supported provider values: openai, anthropic, google, mistral, cohere.
Workflow settings:
auto_accept_valid: bool = False— skip judge when validation already passessave_intermediate_results: bool = Trueoutput_directory: str = "data/quizzes"min_validation_score: int = 6— minimum score (out of 10) to pass validationstrict_validation: bool = Truemax_retries: int = 3verbose: bool = True
Methods:
validate()— raisesValueErrorif config is invalidsave(filepath, verbose=False)— write config to JSONload(filepath)(classmethod) — load config from JSONprint_summary()— print a human-readable config table
QuizGenerationWorkflow
LangGraph-based orchestration of the five-agent pipeline.
from quiz_gen.agents.workflow import QuizGenerationWorkflow
from quiz_gen.agents.config import AgentConfig
config = AgentConfig() # reads API keys from environment
workflow = QuizGenerationWorkflow(config)
# Single chunk
result = workflow.run(chunk)
# Batch
results = workflow.run_batch(chunks, save_output=True, output_dir="data/quizzes")
Methods:
run(chunk, improvement_feedback=None)→Dict— run the full pipeline for one chunk; returns full state includingfinal_questions,judge_decision,validation_results, anderrorsrun_batch(chunks, save_output=True, output_dir="data/quizzes")→List[Dict]— run for multiple chunks, optionally saving each result to JSON
Individual Agents
Agents can be used standalone outside of the workflow:
from quiz_gen.agents.conceptual_generator import ConceptualGenerator
from quiz_gen.agents.practical_generator import PracticalGenerator
from quiz_gen.agents.validator import Validator
from quiz_gen.agents.refiner import Refiner
from quiz_gen.agents.judge import Judge
| Class | Key method | Returns |
|---|---|---|
ConceptualGenerator |
generate(chunk, improvement_feedback=None) |
Dict question |
PracticalGenerator |
generate(chunk, improvement_feedback=None) |
Dict question |
Validator |
validate(qa, chunk) / validate_batch(qas, chunk) |
Dict / List[Dict] |
Refiner |
refine(qa, validation_result, chunk) / refine_batch(qas, validation_results, chunk) |
Dict / List[Dict] |
Judge |
judge(conceptual_qa, practical_qa, chunk) |
Dict with decision and reasoning |
EURLexParser
Main parser class for EUR-Lex documents.
Methods:
parse()->tuple[List[RegulationChunk], Dict]: Parse document and return chunks and TOCfetch()->str: Fetch HTML content from URLsave_chunks(filepath: str): Save chunks to JSON filesave_toc(filepath: str): Save table of contents to JSON fileprint_toc(): Display formatted table of contents
RegulationChunk
Represents a parsed content chunk (article or recital).
Attributes:
section_type: Type of section (ARTICLE, RECITAL, etc.)number: Section number (e.g., "1", "42")title: Full title including subtitlecontent: Text contenthierarchy_path: List of parent sectionsmetadata: Additional structured data
SectionType
Enumeration of document section types.
Values:
PREAMBLE: Preamble sectionENACTING_TERMS: Main regulatory contentCITATION: Citation in preambleRECITAL: Recital in preambleCHAPTER: Chapter divisionSECTION: Section within chapterARTICLE: Article (main content unit)ANNEX: Annex section
Use Cases
Compliance and Legal
- Analyze regulatory requirements systematically
- Support automated document analysis workflows
- Build searchable knowledge bases from legal texts
Education and Training
- Generate study materials from regulatory documents
- Create structured learning paths for certification programs
- Extract key concepts for examination preparation
Supported Document Types
Currently supports:
- EUR-Lex HTML Documents: European Union regulations, directives, decisions
Document Format Requirements
- Documents must use EUR-Lex HTML format
- Must contain
eli-subdivisionelements for proper structure identification - Supports multi-level hierarchies with chapters, sections, and articles
TODOs
- [] Support for additional document formats (PDF, DOCX, PPTX)
- [] Save results by project in a local database
- [] Multi-language support for UI
- [] Light/Dark scheme for UI
License
This project is licensed under the MIT License. See the LICENSE file for details.
Citation
If you use this software in academic work, please cite:
Varabyova, Y. (2026). Quiz Gen AI: AI-powered quiz generator for professional certification.
GitHub repository: https://github.com/yauheniya-ai/quiz-gen
Support
- Documentation: https://quiz-gen.readthedocs.io
- Issue Tracker: https://github.com/yauheniya-ai/quiz-gen/issues
Contributing
Contributions are welcome! Please ensure:
- Code follows PEP 8 style guidelines
- All tests pass:
pytest --cov=src --cov-report=term-missing - New features include appropriate tests
- Documentation is updated
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file quiz_gen-0.5.2.tar.gz.
File metadata
- Download URL: quiz_gen-0.5.2.tar.gz
- Upload date:
- Size: 43.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.11
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
6c1e72ea981de607b25cf36caf9180528f572ad08ee965b9d275ad0098bfd4d1
|
|
| MD5 |
2bb13200e8593d679d834f0da263a5f1
|
|
| BLAKE2b-256 |
f6a11d6dab23cb66876a52626ef07a683176d2331193c28e1a8d4e60a1ee16b2
|
File details
Details for the file quiz_gen-0.5.2-py3-none-any.whl.
File metadata
- Download URL: quiz_gen-0.5.2-py3-none-any.whl
- Upload date:
- Size: 46.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.11
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
a0d84e63a336d2ac3d713dbda1c7c15599eee2cef847534f1bf5c462f16160f3
|
|
| MD5 |
eb78c9d579e3f7b235600a0c37a356ae
|
|
| BLAKE2b-256 |
944c5c54695c18e83093bbf7f358229eabd888e372741865c10f81ae36cb2363
|