Questions-Gen: AI-Powered Mathematical Competition Problem Generation
Questions-Gen is a professional mathematical competition problem generation system based on Qwen3-14B. It implements a three-stage training strategy (Basic Pretraining → RL GRPO Optimization → Knowledge Distillation) and is designed specifically for generating high-quality mathematical competition problems.
Key Features
- Three-Stage Training Pipeline: Basic Pretraining → RL GRPO Optimization → Knowledge Distillation
- Intelligent Problem Variation Generation: Create smart variations of existing problems
- Multi-Dimensional Quality Assessment: Comprehensive problem quality evaluation system
- Teacher Model Integration: Knowledge distillation from DeepSeek-R1
- Ollama Integration: Convenient local inference deployment
- Batch Validation Tools: Large-scale model comparison testing
- Full Precision Models: Original FP16 precision without quantization loss
- Professional CLI Tools: Advanced command-line interface
Available Models
| Training Stage | HuggingFace Model | Description | Downloads |
|---|---|---|---|
| Stage 1 | xingqiang/QuestionsGen-Qwen3-14b-stage1-fp-merged | Basic mathematical problem generation | 4+ |
| Stage 2 | xingqiang/questions-gen-qwen3-14b-stage2-merged-16bit | GRPO optimization + variation generation | 3+ |
| Final | xingqiang/questions-gen-qwen3-14b-final-merged-16bit | Complete knowledge distillation version | 3+ |
Quick Start
Installation
# Install from PyPI
pip install questions-gen
# Install from source (development version)
git clone https://github.com/xingqiang/questions-gen.git
cd questions-gen
pip install -e .
Basic Usage
Quick Demonstrations
# Run complete functionality demo
python examples/quick_demo.py
# Model validation demo
python examples/demo_model_validation.py
# Ollama deployment demo
python examples/demo_ollama_push.py
Command Line Interface
# Validate final model
questions-gen validate --model final --tests 5
# Batch validation across categories
questions-gen batch --category algebra --tests 3 --export-csv
# Quality assessment
questions-gen quality "Find the derivative of f(x) = x³ + 2x² - 5x + 1" --detailed
# Ollama integration
questions-gen ollama --push-all
questions-gen ollama --test questions-gen-final
# HuggingFace tools
questions-gen hf --verify --compare
Model Training
# Custom training (requires GPU)
python scripts/questions_gen_training.py
Deployment Tools
# Quick import to Ollama
python tools/ollama_import.py
# Complete download and conversion
python tools/download_and_convert.py
Python API
from questions_gen import QuestionsGenTrainer
from questions_gen.validation import ModelValidator, BatchValidator, QualityEvaluator
from questions_gen.utils import OllamaManager, HuggingFaceUtils
# Model validation
validator = ModelValidator()
results = validator.validate_single_model(
    "xingqiang/questions-gen-qwen3-14b-final-merged-16bit",
    num_tests=5
)
# Batch validation
batch_validator = BatchValidator()
batch_results = batch_validator.comparative_batch_validation(
    category="calculus",
    tests_per_category=5
)
# Quality evaluation
evaluator = QualityEvaluator()
evaluation = evaluator.comprehensive_evaluation(
    "Prove that the square root of 2 is irrational."
)
print(f"Quality Score: {evaluation['overall_score']:.3f}")
# Ollama integration
ollama = OllamaManager()
ollama.push_all_models()
Performance Results
Model Comparison
Based on comprehensive mathematical domain testing:
| Model | Avg Quality Score | Generation Speed | Teacher Rating | Best Use Case |
|---|---|---|---|---|
| Final | 0.847 | 2.1s | 4.2/5.0 | Professional competitions |
| Stage 2 | 0.782 | 1.8s | 3.8/5.0 | Problem variations |
| Stage 1 | 0.695 | 1.5s | 3.4/5.0 | Basic problem generation |
Category Performance
| Mathematical Domain | Final Model | Stage 2 | Stage 1 |
|---|---|---|---|
| Algebra | 0.863 | 0.798 | 0.712 |
| Calculus | 0.891 | 0.815 | 0.698 |
| Geometry | 0.824 | 0.763 | 0.681 |
| Number Theory | 0.859 | 0.785 | 0.704 |
Advanced Features
Custom Model Training
from questions_gen import QuestionsGenTrainer, TrainingConfig
# Configure training parameters
config = TrainingConfig()
config.MAX_STEPS_STAGE1 = 100
config.PRESERVE_FULL_PRECISION = True
# Complete three-stage training
trainer = QuestionsGenTrainer()
trainer.train_full_pipeline()
Quality Assessment System
from questions_gen.validation import QualityEvaluator
evaluator = QualityEvaluator()
# Comprehensive evaluation
question = "Find all solutions to x⁴ - 5x² + 6 = 0"
evaluation = evaluator.comprehensive_evaluation(question)
print(f"Overall Score: {evaluation['overall_score']:.3f}")
print(f"Grade: {evaluation['grade']}")
print(f"Recommendations: {evaluation['recommendations']}")
Batch Processing
from questions_gen.validation import BatchValidator
batch_validator = BatchValidator()
# Multi-model comparison testing
results = batch_validator.comparative_batch_validation(
    models=[
        "xingqiang/questions-gen-qwen3-14b-stage2-merged-16bit",
        "xingqiang/questions-gen-qwen3-14b-final-merged-16bit"
    ],
    category="all",
    tests_per_category=3
)
# Export results
batch_validator.export_results_to_csv(results)
Ollama Local Deployment
Convenient local inference deployment:
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# Push Questions-Gen models
questions-gen ollama --push-all
# Use the model
ollama run questions-gen-final "Generate a calculus competition problem:"
API Usage
import requests

def generate_problem(prompt):
    response = requests.post(
        'http://localhost:11434/api/generate',
        json={
            'model': 'questions-gen-final',
            'prompt': prompt,
            'stream': False
        },
        timeout=120  # local generation of a full problem can take a while
    )
    response.raise_for_status()
    return response.json()['response']

problem = generate_problem("Create a number theory competition problem:")
print(problem)
Testing and Validation
Comprehensive Testing
# Model validation
questions-gen validate --model all --tests 5 --save
# Category-specific testing
questions-gen batch --category geometry --tests 5 --parallel
# Quality evaluation
questions-gen quality "Prove that √2 is irrational" --detailed
Model Comparison
# Compare all models
questions-gen compare --all-models --tests 5
# HuggingFace status check
questions-gen hf --compare --health
Evaluation Dimensions
Quality Assessment Metrics
- Mathematical Content: Concept diversity and complexity
- Clarity: Problem statement clarity and structure
- Difficulty: Appropriate challenge level
- Completeness: Problem setup and constraints
- Originality: Innovation and creativity
- Educational Value: Learning objectives and pedagogy
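The six dimensions above feed the single `overall_score` reported by `QualityEvaluator`. A minimal sketch of one way to combine them, assuming equal weights; the dimension names and the weighting here are illustrative, not the package's documented internals:

```python
# Hypothetical combination of per-dimension quality scores into one
# overall score. Each score is assumed to lie in [0, 1].

DIMENSIONS = ["content", "clarity", "difficulty",
              "completeness", "originality", "educational_value"]

def overall_quality(scores, weights=None):
    """Weighted average of per-dimension scores (equal weights by default)."""
    if weights is None:
        weights = {d: 1 / len(DIMENSIONS) for d in DIMENSIONS}
    return sum(scores[d] * weights[d] for d in DIMENSIONS)

example = {"content": 0.9, "clarity": 0.8, "difficulty": 0.7,
           "completeness": 0.85, "originality": 0.6,
           "educational_value": 0.75}
print(round(overall_quality(example), 3))  # 0.767
```

Unequal weights can be passed as a dict to emphasize, say, originality for competition use.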
Validation Categories
- Algebra: Equations, polynomials, abstract algebra
- Geometry: Euclidean, coordinate, solid geometry
- Calculus: Derivatives, integrals, optimization
- Number Theory: Primes, modular arithmetic, Diophantine equations
- Combinatorics: Counting, permutations, graph theory
- Analysis: Real analysis, sequences, convergence
Development Guide
Project Structure
questions-gen/
├── questions_gen/    # Main package code
│   ├── cli/          # Command-line interface
│   ├── core/         # Core functionality
│   ├── data/         # Data processing
│   ├── models/       # Model components
│   ├── utils/        # Utility functions
│   └── validation/   # Validation system
├── docs/             # Complete documentation
│   ├── guides/       # User guides
│   ├── technical/    # Technical documentation
│   └── training/     # Training-related docs
├── examples/         # Demo scripts
├── scripts/          # Training scripts
├── tools/            # Utility scripts
└── tests/            # Test files
Development Environment Setup
git clone https://github.com/xingqiang/questions-gen.git
cd questions-gen
pip install -e ".[dev]"
Running Tests
pytest tests/ -v --cov=questions_gen
Code Formatting
black questions_gen/
isort questions_gen/
flake8 questions_gen/
System Architecture
Three-Stage Training Pipeline
Questions-Gen Training System
├── Basic Pretraining (Stage 1)
│   ├── Historical competition problems (50%)
│   ├── Conditional variations (30%)
│   └── Innovative problem types (20%)
├── RL GRPO Optimization (Stage 2)
│   ├── Group policy generation (8 problems/group)
│   ├── Multi-dimensional reward function
│   └── Novelty constraint layer
└── Knowledge Distillation (Stage 3)
    ├── DeepSeek-R1 (difficulty prediction)
    ├── Logic rigor checking
    ├── Innovation assessment
    └── Educational value scoring
Reward Function System
reward = 0.4 * difficulty + 0.3 * novelty + 0.2 * rigor + 0.1 * diversity
- Difficulty (40%): Based on keywords and text complexity
- Novelty (30%): Difference from historical problems
- Logic Rigor (20%): Reasoning vocabulary density
- Diversity (10%): Within-group problem variance
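The weighted combination above is straightforward to express in code. A minimal sketch; the component scores passed in are hypothetical, since the actual scoring functions live inside the package's GRPO training code:

```python
# Weighted reward combining the four signals described above,
# each assumed to be normalized to [0, 1].

def combine_reward(difficulty, novelty, rigor, diversity):
    """reward = 0.4*difficulty + 0.3*novelty + 0.2*rigor + 0.1*diversity"""
    return 0.4 * difficulty + 0.3 * novelty + 0.2 * rigor + 0.1 * diversity

# Example: a moderately hard, fairly novel problem from one GRPO group.
score = combine_reward(difficulty=0.8, novelty=0.6, rigor=0.7, diversity=0.5)
print(round(score, 2))  # 0.69
```

Because difficulty carries the largest weight, a problem that is novel but too easy is penalized more than one that is hard but conventional.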
Documentation
- Complete Documentation: Documentation center entrance
- User Guide: Complete user manual
- Training Guide: Custom model training
- Technical Documentation: In-depth technical implementation
- API Reference: See docstrings in source code
- Example Code: Demo scripts and usage examples
Contributing
Contributions are welcome! Please see our Contributing Guidelines for details.
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a Pull Request
License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Acknowledgments
- Unsloth: Efficient fine-tuning optimization
- HuggingFace: Model hosting and transformers library
- DeepSeek: Teacher model in knowledge distillation
- Qwen Team: Base Qwen3-14B model
Support
- Issue Reports: GitHub Issues
- Model Downloads: HuggingFace Models
- Discussions: GitHub Discussions
Related Projects
- Unsloth - Fast LLM fine-tuning
- Transformers - State-of-the-art machine learning
- Ollama - Local LLM deployment
Questions-Gen - Advancing mathematical education through AI-powered problem generation.