Generate realistic AI/ML test data: 130+ models, 30+ companies, 20+ frameworks, 30+ datasets, architectures, tasks, and parameters

faker-ai-provider

Faker provider for generating AI/ML-related fake data with correlated relationships between models, companies, architectures, and capabilities.

Model Data Updated: April 2026 (includes GPT-5.3-Codex, Claude Opus 4.7, Gemini 3, Llama 4, Mistral Small 4, Grok 4.20, and more)

Installation

pip install faker-ai-provider

Quick Start

from faker import Faker
from faker_ai import AiProvider

fake = Faker()
fake.add_provider(AiProvider)

# Generate correlated AI data
fake.ai_model()           # 'Claude Opus 4.7'
fake.ai_company()         # 'Anthropic'
fake.full_ai_model_spec() # 'gpt-oss-120b by OpenAI: Transformer architecture, 120B parameters, for reasoning.'

Seeding for Reproducibility

Use Faker's seeding to generate consistent, reproducible data across runs:

from faker import Faker
from faker_ai import AiProvider

fake = Faker()
fake.add_provider(AiProvider)
fake.seed_instance(42)

# With seed 42 and the same package version, these return the same values on every run
print(fake.ai_model())    # e.g. 'LLaMA 3 70B'
print(fake.ai_company())  # e.g. 'Apple'
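
Reproducibility is easy to check in a test suite. The following is a minimal sketch of such a check, not part of the package: `draw`, `is_reproducible`, and `make_seeded_stub` are hypothetical helpers, and the stub uses the standard-library `random.Random` as a stand-in for a seeded Faker instance so the example runs without any installed packages.

```python
import random
from typing import Callable, List

# A "generator factory" builds a fresh, seeded generator each time it is called.
GeneratorFactory = Callable[[], Callable[[], str]]


def draw(make_generator: GeneratorFactory, n: int = 5) -> List[str]:
    """Build a fresh generator and draw n values from it."""
    generate = make_generator()
    return [generate() for _ in range(n)]


def is_reproducible(make_generator: GeneratorFactory, n: int = 5) -> bool:
    """Return True if two freshly built generators yield the same sequence."""
    return draw(make_generator, n) == draw(make_generator, n)


# Stand-in data: a few model names mentioned in this README (assumption: not the real dataset).
MODEL_NAMES = ["Claude Opus 4.7", "Gemini 3", "Llama 4"]


def make_seeded_stub() -> Callable[[], str]:
    """Hypothetical stand-in for a Faker instance seeded with 42."""
    rng = random.Random(42)
    return lambda: rng.choice(MODEL_NAMES)


reproducible = is_reproducible(make_seeded_stub)
```

With the real provider, the factory would presumably create a `Faker()`, call `add_provider(AiProvider)` and `seed_instance(42)`, and return `fake.ai_model`.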

Available Methods

Basic Methods

Method               Example values
ai_model()           GPT-5.3-Codex, Claude Opus 4.7, Gemini 3 Pro Preview
ai_company()         OpenAI, Anthropic, Google DeepMind
ai_architecture()    Transformer, Diffusion, Mixture of Experts
ai_task()            text-generation, code-generation, reasoning
ai_modality()        text, image, audio, video
ml_framework()       PyTorch, TensorFlow, LangChain
ai_dataset()         ImageNet, COCO, MMLU, FineWeb

Correlation Methods

Method                             Description
ai_model_for_company(company)      Get a model from a specific company
ai_company_for_model(model)        Get the company that created a model
ai_tasks_for_model(model)          Get tasks supported by a model
ai_models_for_task(task)           Get models that support a task
ai_models_by_architecture(arch)    Filter models by architecture
ai_models_by_modality(modality)    Filter models by modality
model_scenario(model=None)         Get complete correlated model data

Composite Methods

Method                  Description
full_ai_model_spec()    Formatted spec: "Model by Company: arch, params, for task."
ai_training_run()       Dict with model, framework, dataset, task
ai_deployment()         Dict with model, endpoint, version, status
ai_experiment()         Dict with experiment_id, accuracy, loss, epochs

Advanced Usage

Populate a Database with AI Records

from faker import Faker
from faker_ai import AiProvider

fake = Faker()
fake.add_provider(AiProvider)

# Generate 100 AI deployment records
deployments = [fake.ai_deployment() for _ in range(100)]

# Generate experiment tracking data
experiments = [fake.ai_experiment() for _ in range(50)]
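
The lists above are plain Python dicts, so writing them to a database is a separate step. A minimal sketch using the standard-library `sqlite3` module follows. It assumes each deployment dict has exactly the keys `model`, `endpoint`, `version`, and `status` listed under Composite Methods; the `deployments` table, the `insert_deployments` helper, and the sample records are hypothetical and stand in for real `fake.ai_deployment()` output.

```python
import sqlite3
from typing import Dict, Iterable

# Assumed key set, taken from the Composite Methods table; the real dict may differ.
DEPLOYMENT_KEYS = ("model", "endpoint", "version", "status")


def insert_deployments(conn: sqlite3.Connection, records: Iterable[Dict[str, object]]) -> int:
    """Create a (hypothetical) deployments table if needed, insert the records, return how many."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS deployments "
        "(model TEXT, endpoint TEXT, version TEXT, status TEXT)"
    )
    # Store every value as text so the sketch does not depend on the provider's value types.
    rows = [tuple(str(record[key]) for key in DEPLOYMENT_KEYS) for record in records]
    conn.executemany("INSERT INTO deployments VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# Hand-written sample records standing in for [fake.ai_deployment() for _ in range(100)].
sample_records = [
    {"model": "Claude Opus 4.7", "endpoint": "https://example.com/v1/a", "version": "1.0.0", "status": "active"},
    {"model": "Llama 4", "endpoint": "https://example.com/v1/b", "version": "2.1.0", "status": "inactive"},
]

conn = sqlite3.connect(":memory:")
inserted = insert_deployments(conn, sample_records)
stored = conn.execute("SELECT COUNT(*) FROM deployments").fetchone()[0]
```

Passing the `deployments` list from the snippet above instead of `sample_records` should work unchanged if the key assumption holds.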

Generate ML Pipeline Configuration

fake.seed_instance(42)  # Reproducible pipeline

pipeline = {
    "name": f"pipeline-{fake.random_int(1000, 9999)}",
    "training": fake.ai_training_run(),
    "deployment": fake.ai_deployment(),
    "experiment": fake.ai_experiment(),
}
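
A configuration like this is typically saved to disk or passed to another tool. The sketch below serializes a pipeline dict to JSON and reads it back. It assumes the composite methods return only JSON-serializable values (strings, numbers, lists, dicts), which this README does not confirm; `dump_pipeline` is a hypothetical helper and `sample_pipeline` is hand-written stand-in data.

```python
import json


def dump_pipeline(pipeline: dict) -> str:
    """Serialize with sorted keys so identical pipelines produce identical text."""
    return json.dumps(pipeline, indent=2, sort_keys=True)


# Hand-written stand-in for the pipeline dict built above (values are illustrative only).
sample_pipeline = {
    "name": "pipeline-1234",
    "training": {"model": "Llama 4", "framework": "PyTorch", "dataset": "FineWeb", "task": "text-generation"},
    "deployment": {"model": "Llama 4", "endpoint": "https://example.com/v1/models", "version": "1.0.0", "status": "active"},
    "experiment": {"experiment_id": "exp-0001", "accuracy": 0.91, "loss": 0.23, "epochs": 10},
}

text = dump_pipeline(sample_pipeline)
restored = json.loads(text)
```

Because the output is deterministic for a given dict, a seeded pipeline can be compared against a stored snapshot in tests.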

Filter Models by Capability

# Get all models that support code generation
code_models = fake.ai_models_for_task("code-generation")

# Get all diffusion models
diffusion_models = fake.ai_models_by_architecture("Diffusion")

# Get all models that support the video modality
video_models = fake.ai_models_by_modality("video")
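
Filters can be combined by intersecting their results. This is a minimal sketch, assuming each filter method returns an iterable of model-name strings (the README does not state the return type); `models_matching_all` is a hypothetical helper and the placeholder names below are not real capability data.

```python
from typing import Iterable, List


def models_matching_all(*model_lists: Iterable[str]) -> List[str]:
    """Return the models present in every list, in the order of the first list."""
    if not model_lists:
        return []
    first, *rest = model_lists
    rest_sets = [set(models) for models in rest]
    return [model for model in first if all(model in s for s in rest_sets)]


# Placeholder results standing in for the two filter calls above.
code_results = ["model-a", "model-b", "model-c"]
video_results = ["model-c", "model-a", "model-d"]

both = models_matching_all(code_results, video_results)
```

With the real provider, the arguments would be the `code_models` and `video_models` lists from the snippet above.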

Model Scenario

Get complete, correlated model information:

scenario = fake.model_scenario()
# {
#     'model': 'GPT-5.2',
#     'company': 'OpenAI',
#     'architecture': 'Transformer',
#     'modality': ['text', 'image', 'audio', 'video'],
#     'tasks': ['text-generation', 'reasoning', 'code-generation', ...],
#     'parameters': 'undisclosed',
#     'release_year': 2026
# }
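
For downstream code it can help to wrap the scenario dict in a typed object. The sketch below is one way to do that, assuming the dict has exactly the seven keys shown in the example output; `ModelScenario` and `scenario_from_dict` are hypothetical names, and the sample `tasks` list contains only the three entries visible above, since the rest is elided.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ModelScenario:
    """Typed view of a scenario dict (field names assumed from the example output)."""
    model: str
    company: str
    architecture: str
    modality: List[str]
    tasks: List[str]
    parameters: str
    release_year: int

    def supports(self, task: str) -> bool:
        """Return True if the task appears in the scenario's task list."""
        return task in self.tasks

    def is_multimodal(self) -> bool:
        """Return True if the scenario lists more than one modality."""
        return len(self.modality) > 1


def scenario_from_dict(data: dict) -> ModelScenario:
    """Build a ModelScenario; raises TypeError if the keys do not match the fields."""
    return ModelScenario(**data)


# Sample copied from the example output above; the tasks list is truncated there.
sample = {
    "model": "GPT-5.2",
    "company": "OpenAI",
    "architecture": "Transformer",
    "modality": ["text", "image", "audio", "video"],
    "tasks": ["text-generation", "reasoning", "code-generation"],
    "parameters": "undisclosed",
    "release_year": 2026,
}

scenario = scenario_from_dict(sample)
```

With the real provider, `scenario_from_dict(fake.model_scenario())` would be the equivalent call if the key assumption holds.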

License

MIT

Download files

Source Distribution

faker_ai_provider-2.1.0.tar.gz (11.7 kB view details)

Uploaded Source

Built Distribution

faker_ai_provider-2.1.0-py3-none-any.whl (11.0 kB view details)

Uploaded Python 3

File details

Details for the file faker_ai_provider-2.1.0.tar.gz.

File metadata

  • Download URL: faker_ai_provider-2.1.0.tar.gz
  • Size: 11.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for faker_ai_provider-2.1.0.tar.gz
Algorithm      Hash digest
SHA256         aac296d40ac1da57e05bb3f800927e1f25eb3c8c0ed327bea53d161aba3f240d
MD5            56d0d522862ce418b511590c6d7cc471
BLAKE2b-256    11e0fcc35497497c8e780bb9817ddcaea523636b0b5607163c369a7e06435b5a

File details

Details for the file faker_ai_provider-2.1.0-py3-none-any.whl.

File hashes

Hashes for faker_ai_provider-2.1.0-py3-none-any.whl
Algorithm      Hash digest
SHA256         be63669a7616a509dce823ee8bd42cb94ba510edacc3dec8da3f9d496415daed
MD5            a6633b08bce8a1195061e15d91105d2a
BLAKE2b-256    cf2fbc5dfc42ac45d9b03c2a65860b15d22201295f57c3ff7f79625158fdfa30
