Generate realistic AI/ML test data: 130+ models, 30+ companies, 20+ frameworks, 30+ datasets, architectures, tasks, and parameters
faker-ai-provider
Faker provider for generating AI/ML-related fake data with correlated relationships between models, companies, architectures, and capabilities.
Model Data Updated: April 2026 (includes GPT-5.3-Codex, Claude Opus 4.7, Gemini 3, Llama 4, Mistral Small 4, Grok 4.20, and more)
Installation
pip install faker-ai-provider
Quick Start
from faker import Faker
from faker_ai import AiProvider
fake = Faker()
fake.add_provider(AiProvider)
# Generate correlated AI data
fake.ai_model() # 'Claude Opus 4.7'
fake.ai_company() # 'Anthropic'
fake.full_ai_model_spec() # 'gpt-oss-120b by OpenAI: Transformer architecture, 120B parameters, for reasoning.'
Seeding for Reproducibility
Use Faker's seeding to generate consistent, reproducible data across runs:
fake = Faker()
fake.add_provider(AiProvider)
fake.seed_instance(42)
# With seed 42 these return the same values on every run (for a fixed package version)
print(fake.ai_model()) # 'LLaMA 3 70B'
print(fake.ai_company()) # 'Apple'
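One way to check the reproducibility claim in a test suite is to build two freshly seeded generators and compare their output. The sketch below is only an illustration: `StubFake` and `sample_models` are hypothetical names, and the stub stands in for a `Faker()` instance with `AiProvider` added so the snippet runs without the package installed. Its model names are placeholders, not real provider data.

```python
import random


class StubFake:
    """Hypothetical stand-in for a Faker instance with AiProvider added."""

    _MODELS = ["model-a", "model-b", "model-c", "model-d"]  # placeholder names

    def __init__(self):
        self._rng = random.Random()

    def seed_instance(self, seed):
        # Mirrors the seed_instance() call used in the section above.
        self._rng.seed(seed)

    def ai_model(self):
        return self._rng.choice(self._MODELS)


def sample_models(make_fake, seed, n=5):
    """Seed a fresh generator and return n values from ai_model()."""
    fake = make_fake()
    fake.seed_instance(seed)
    return [fake.ai_model() for _ in range(n)]


first = sample_models(StubFake, seed=42)
second = sample_models(StubFake, seed=42)
print(first == second)  # the two runs are compared element by element
```

With the real library, `make_fake` would presumably be a small factory that creates `Faker()` and calls `add_provider(AiProvider)` before returning it.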
Available Methods
Basic Methods
| Method | Example |
|---|---|
| ai_model() | GPT-5.3-Codex, Claude Opus 4.7, Gemini 3 Pro Preview |
| ai_company() | OpenAI, Anthropic, Google DeepMind |
| ai_architecture() | Transformer, Diffusion, Mixture of Experts |
| ai_task() | text-generation, code-generation, reasoning |
| ai_modality() | text, image, audio, video |
| ml_framework() | PyTorch, TensorFlow, LangChain |
| ai_dataset() | ImageNet, COCO, MMLU, FineWeb |
Correlation Methods
| Method | Description |
|---|---|
| ai_model_for_company(company) | Get a model from a specific company |
| ai_company_for_model(model) | Get the company that created a model |
| ai_tasks_for_model(model) | Get tasks supported by a model |
| ai_models_for_task(task) | Get models that support a task |
| ai_models_by_architecture(arch) | Filter models by architecture |
| ai_models_by_modality(modality) | Filter models by modality |
| model_scenario(model=None) | Get complete correlated model data |
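The correlation methods suggest a simple consistency check: going from a company to one of its models and back should land on the same company. The sketch below illustrates that round trip. `StubFake` and `company_round_trips` are hypothetical, the data is placeholder data, and the assumption that `ai_company_for_model()` returns the same string that was passed to `ai_model_for_company()` is not confirmed by the description above.

```python
class StubFake:
    """Hypothetical stand-in exposing two of the correlation methods."""

    _BY_COMPANY = {
        "Company A": ["model-a1", "model-a2"],  # placeholder data
        "Company B": ["model-b1"],
    }

    def ai_model_for_company(self, company):
        # The stub always returns the first model; the real method may pick one at random.
        return self._BY_COMPANY[company][0]

    def ai_company_for_model(self, model):
        for company, models in self._BY_COMPANY.items():
            if model in models:
                return company
        raise KeyError(model)


def company_round_trips(fake, company):
    """Return True if company -> model -> company yields the same company."""
    model = fake.ai_model_for_company(company)
    return fake.ai_company_for_model(model) == company


fake = StubFake()
results = {c: company_round_trips(fake, c) for c in ("Company A", "Company B")}
print(results)
```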
Composite Methods
| Method | Description |
|---|---|
| full_ai_model_spec() | Formatted spec: "Model by Company: arch, params, for task." |
| ai_training_run() | Dict with model, framework, dataset, task |
| ai_deployment() | Dict with model, endpoint, version, status |
| ai_experiment() | Dict with experiment_id, accuracy, loss, epochs |
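Because the composite methods return plain dicts, they can be written out with the standard library. The sketch below serializes experiment records to CSV using the four keys listed for `ai_experiment()`. The function name `experiments_to_csv` and the sample records are illustrative assumptions. The real dicts may carry additional keys, which this version simply ignores.

```python
import csv
import io

# Field names taken from the ai_experiment() row in the table above.
FIELDS = ["experiment_id", "accuracy", "loss", "epochs"]


def experiments_to_csv(records):
    """Serialize experiment dicts to CSV text, ignoring any extra keys."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hand-written placeholder records; real ones would come from fake.ai_experiment().
sample = [
    {"experiment_id": "exp-001", "accuracy": 0.91, "loss": 0.23, "epochs": 10},
    {"experiment_id": "exp-002", "accuracy": 0.87, "loss": 0.31, "epochs": 5},
]
csv_text = experiments_to_csv(sample)
print(csv_text)
```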
Advanced Usage
Populate a Database with AI Records
from faker import Faker
from faker_ai import AiProvider
fake = Faker()
fake.add_provider(AiProvider)
# Generate 100 AI deployment records
deployments = [fake.ai_deployment() for _ in range(100)]
# Generate experiment tracking data
experiments = [fake.ai_experiment() for _ in range(50)]
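The lists above still need to be written somewhere. As a minimal sketch, assuming each deployment dict has the `model`, `endpoint`, `version`, and `status` keys listed under Composite Methods, the records could be loaded into SQLite as follows. The table layout, the `store_deployments` helper, and the sample rows are illustrative choices, not part of the package.

```python
import sqlite3


def store_deployments(conn, records):
    """Create a deployments table if needed and insert one row per record."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS deployments "
        "(model TEXT, endpoint TEXT, version TEXT, status TEXT)"
    )
    # Named placeholders pull values out of each dict by key.
    conn.executemany(
        "INSERT INTO deployments VALUES (:model, :endpoint, :version, :status)",
        records,
    )
    conn.commit()


# Placeholder records; real ones would come from fake.ai_deployment().
sample = [
    {"model": "model-a", "endpoint": "https://example.com/a",
     "version": "1.0", "status": "active"},
    {"model": "model-b", "endpoint": "https://example.com/b",
     "version": "2.1", "status": "inactive"},
]

conn = sqlite3.connect(":memory:")
store_deployments(conn, sample)
row_count = conn.execute("SELECT COUNT(*) FROM deployments").fetchone()[0]
print(row_count)
```

An in-memory database keeps the example free of side effects; pointing `sqlite3.connect()` at a file path would persist the rows instead.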
Generate ML Pipeline Configuration
fake.seed_instance(42) # Reproducible pipeline
pipeline = {
"name": f"pipeline-{fake.random_int(1000, 9999)}",
"training": fake.ai_training_run(),
"deployment": fake.ai_deployment(),
"experiment": fake.ai_experiment(),
}
Filter Models by Capability
# Get all models that support code generation
code_models = fake.ai_models_for_task("code-generation")
# Get all diffusion models
diffusion_models = fake.ai_models_by_architecture("Diffusion")
# Get all models that support the video modality
video_models = fake.ai_models_by_modality("video")
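The filters can also be combined. The sketch below intersects a task filter with a modality filter, assuming both methods return an iterable of model names (the examples above suggest a list, but the return type is not stated). `StubFake` and `models_with_task_and_modality` are hypothetical, and the stub holds placeholder data so the snippet runs on its own.

```python
class StubFake:
    """Hypothetical stand-in exposing two of the filter methods."""

    _BY_TASK = {"code-generation": ["model-a", "model-b"]}  # placeholder data
    _BY_MODALITY = {"video": ["model-b", "model-c"]}

    def ai_models_for_task(self, task):
        return list(self._BY_TASK.get(task, []))

    def ai_models_by_modality(self, modality):
        return list(self._BY_MODALITY.get(modality, []))


def models_with_task_and_modality(fake, task, modality):
    """Return the sorted names of models that match both filters."""
    by_task = set(fake.ai_models_for_task(task))
    by_modality = set(fake.ai_models_by_modality(modality))
    return sorted(by_task & by_modality)


matches = models_with_task_and_modality(StubFake(), "code-generation", "video")
print(matches)
```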
Model Scenario
Get complete, correlated model information:
scenario = fake.model_scenario()
# {
# 'model': 'GPT-5.2',
# 'company': 'OpenAI',
# 'architecture': 'Transformer',
# 'modality': ['text', 'image', 'audio', 'video'],
# 'tasks': ['text-generation', 'reasoning', 'code-generation', ...],
# 'parameters': 'undisclosed',
# 'release_year': 2026
# }
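When scenarios feed into other test fixtures, it can help to verify their shape first. The following sketch checks a dict against the keys and value types visible in the example output above. The type mapping is inferred from that single example (for instance, `parameters` is assumed to always be a string), and `scenario_problems` is a hypothetical helper rather than part of the package.

```python
# Keys and value types inferred from the example output above.
EXPECTED_TYPES = {
    "model": str,
    "company": str,
    "architecture": str,
    "modality": list,
    "tasks": list,
    "parameters": str,
    "release_year": int,
}


def scenario_problems(scenario):
    """Return a list of problems; an empty list means the shape matches."""
    problems = []
    for key, expected in EXPECTED_TYPES.items():
        if key not in scenario:
            problems.append(f"missing key: {key}")
        elif not isinstance(scenario[key], expected):
            problems.append(f"{key} should be {expected.__name__}")
    return problems


# Copied from the example above, with the elided tasks left out.
sample = {
    "model": "GPT-5.2",
    "company": "OpenAI",
    "architecture": "Transformer",
    "modality": ["text", "image", "audio", "video"],
    "tasks": ["text-generation", "reasoning", "code-generation"],
    "parameters": "undisclosed",
    "release_year": 2026,
}
problems = scenario_problems(sample)
print(problems)
```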
License
MIT
Download files
File details
Details for the file faker_ai_provider-2.1.0.tar.gz.
File metadata
- Download URL: faker_ai_provider-2.1.0.tar.gz
- Upload date:
- Size: 11.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | aac296d40ac1da57e05bb3f800927e1f25eb3c8c0ed327bea53d161aba3f240d |
| MD5 | 56d0d522862ce418b511590c6d7cc471 |
| BLAKE2b-256 | 11e0fcc35497497c8e780bb9817ddcaea523636b0b5607163c369a7e06435b5a |
File details
Details for the file faker_ai_provider-2.1.0-py3-none-any.whl.
File metadata
- Download URL: faker_ai_provider-2.1.0-py3-none-any.whl
- Upload date:
- Size: 11.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | be63669a7616a509dce823ee8bd42cb94ba510edacc3dec8da3f9d496415daed |
| MD5 | a6633b08bce8a1195061e15d91105d2a |
| BLAKE2b-256 | cf2fbc5dfc42ac45d9b03c2a65860b15d22201295f57c3ff7f79625158fdfa30 |