
Generate realistic AI/ML test data: 65+ models, 45+ companies, 25+ frameworks, 30+ datasets, architectures, tasks, and parameters


faker-ai-provider

Faker provider for generating AI/ML-related fake data with correlated relationships between models, companies, architectures, and capabilities.

Model Data Updated: February 2026 (includes Claude Opus 4.6, GPT-5.2, Gemini 3, LLaMA 4, and more)

Installation

pip install faker-ai-provider

Quick Start

from faker import Faker
from faker_ai import AiProvider

fake = Faker()
fake.add_provider(AiProvider)

# Generate AI/ML data (comments show example outputs)
fake.ai_model()           # 'Claude Opus 4.6'
fake.ai_company()         # 'Anthropic'
fake.full_ai_model_spec() # 'GPT-5.2 by OpenAI: Transformer architecture, 1.5T parameters, for reasoning.'

Seeding for Reproducibility

Use Faker's seeding to generate consistent, reproducible data across runs:

from faker import Faker
from faker_ai import AiProvider

fake = Faker()
fake.add_provider(AiProvider)
fake.seed_instance(42)

# With seed 42, these return the same values on every run (for a given package version)
print(fake.ai_model())    # e.g. 'LLaMA 3 70B'
print(fake.ai_company())  # e.g. 'Apple'
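
One way to check this property in a test suite is to reseed and compare two runs. The sketch below is an illustration, not part of the package: `snapshot` is a hypothetical helper, and `StubFake` is a stand-in with made-up data so that the block runs without the package installed. With the real provider you would pass a `Faker` instance that has `AiProvider` added.

```python
import random


def snapshot(fake, seed, method_names):
    """Reseed `fake`, then call each named method once and collect the results."""
    fake.seed_instance(seed)
    return [getattr(fake, name)() for name in method_names]


class StubFake:
    """Hypothetical stand-in for a Faker instance with AiProvider added."""

    _MODELS = ["Model A", "Model B", "Model C"]  # made-up values
    _COMPANIES = ["Company X", "Company Y"]      # made-up values

    def __init__(self):
        self._rng = random.Random()

    def seed_instance(self, seed):
        self._rng.seed(seed)

    def ai_model(self):
        return self._rng.choice(self._MODELS)

    def ai_company(self):
        return self._rng.choice(self._COMPANIES)


fake = StubFake()
methods = ["ai_model", "ai_company", "ai_model"]
first = snapshot(fake, 42, methods)
second = snapshot(fake, 42, methods)
assert first == second  # same seed and same call order give the same values
```

Reproducibility depends on call order as well as the seed: the same seed with the calls made in a different order can produce different values.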

Available Methods

Basic Methods

| Method | Example values |
|---|---|
| ai_model() | GPT-5.2, Claude Opus 4.6, Gemini 3 Flash |
| ai_company() | OpenAI, Anthropic, Google DeepMind |
| ai_architecture() | Transformer, Diffusion, Mixture of Experts |
| ai_task() | text-generation, code-generation, reasoning |
| ai_modality() | text, image, audio, video |
| ml_framework() | PyTorch, TensorFlow, LangChain |
| ai_dataset() | ImageNet, COCO, MMLU, FineWeb |

Correlation Methods

| Method | Description |
|---|---|
| ai_model_for_company(company) | Get a model from a specific company |
| ai_company_for_model(model) | Get the company that created a model |
| ai_tasks_for_model(model) | Get tasks supported by a model |
| ai_models_for_task(task) | Get models that support a task |
| ai_models_by_architecture(arch) | Filter models by architecture |
| ai_models_by_modality(modality) | Filter models by modality |
| model_scenario(model=None) | Get complete correlated model data |
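
Because the lookups run in both directions, a round trip makes a simple consistency check. The sketch below assumes `ai_model_for_company` and `ai_company_for_model` each return a single name as a string; `round_trip_ok` is a hypothetical helper, and `StubFake` is a stand-in with a made-up catalog so the block runs on its own.

```python
import random


def round_trip_ok(fake, company):
    """Check that a model picked for `company` maps back to the same company."""
    model = fake.ai_model_for_company(company)
    return fake.ai_company_for_model(model) == company


class StubFake:
    """Hypothetical stand-in for the provider; the catalog is made up."""

    _BY_COMPANY = {
        "Company X": ["Model A", "Model B"],
        "Company Y": ["Model C"],
    }

    def __init__(self, seed=0):
        self._rng = random.Random(seed)

    def ai_model_for_company(self, company):
        return self._rng.choice(self._BY_COMPANY[company])

    def ai_company_for_model(self, model):
        for company, models in self._BY_COMPANY.items():
            if model in models:
                return company
        raise KeyError(model)


stub = StubFake()
all_ok = all(round_trip_ok(stub, c) for c in ["Company X", "Company Y"])
print(all_ok)  # True
```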

Composite Methods

| Method | Description |
|---|---|
| full_ai_model_spec() | Formatted spec: "Model by Company: arch, params, for task." |
| ai_training_run() | Dict with model, framework, dataset, task |
| ai_deployment() | Dict with model, endpoint, version, status |
| ai_experiment() | Dict with experiment_id, accuracy, loss, epochs |

Advanced Usage

Populate a Database with AI Records

from faker import Faker
from faker_ai import AiProvider

fake = Faker()
fake.add_provider(AiProvider)

# Generate 100 AI deployment records
deployments = [fake.ai_deployment() for _ in range(100)]

# Generate experiment tracking data
experiments = [fake.ai_experiment() for _ in range(50)]
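
The lists above stop short of touching a database. As a minimal sketch, the deployment records could be loaded into SQLite with the standard library. The key names (`model`, `endpoint`, `version`, `status`) are assumed from the method descriptions, `load_deployments` is a hypothetical helper, and `stub_deployment` is a stand-in for `fake.ai_deployment()` with made-up values so the block runs on its own.

```python
import sqlite3


def stub_deployment(i):
    """Hypothetical stand-in for fake.ai_deployment(); values are made up."""
    return {
        "model": f"model-{i % 3}",
        "endpoint": f"https://example.com/v1/models/{i}",
        "version": f"1.0.{i}",
        "status": "active" if i % 2 == 0 else "inactive",
    }


def load_deployments(conn, records):
    """Create the table if needed and bulk-insert the deployment dicts."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS deployments ("
        "model TEXT, endpoint TEXT, version TEXT, status TEXT)"
    )
    # Named placeholders pull values from each dict by key.
    conn.executemany(
        "INSERT INTO deployments VALUES (:model, :endpoint, :version, :status)",
        records,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
load_deployments(conn, [stub_deployment(i) for i in range(100)])
row_count = conn.execute("SELECT COUNT(*) FROM deployments").fetchone()[0]
print(row_count)  # 100
```

With the real provider, the list comprehension would call `fake.ai_deployment()` instead of the stub, provided the returned dicts carry those four keys.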

Generate ML Pipeline Configuration

fake.seed_instance(42)  # Reproducible pipeline

pipeline = {
    "name": f"pipeline-{fake.random_int(1000, 9999)}",
    "training": fake.ai_training_run(),
    "deployment": fake.ai_deployment(),
    "experiment": fake.ai_experiment(),
}
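
A configuration like this is usually written to a file. The sketch below serializes a dict of the same shape to JSON and reads it back. The `pipeline` here is a stand-in with made-up values, `dump_config` is a hypothetical helper, and it is an assumption that the real dicts contain only JSON-friendly values.

```python
import json

# Stand-in for the `pipeline` dict built above; all values are made up.
pipeline = {
    "name": "pipeline-1234",
    "training": {"model": "Model A", "framework": "PyTorch",
                 "dataset": "MMLU", "task": "reasoning"},
    "deployment": {"model": "Model A", "endpoint": "https://example.com/v1",
                   "version": "1.0.0", "status": "active"},
    "experiment": {"experiment_id": "exp-001", "accuracy": 0.91,
                   "loss": 0.27, "epochs": 10},
}


def dump_config(config):
    """Serialize to stable, human-readable JSON.

    default=str is a fallback for values json cannot encode natively
    (for example dates); whether the real dicts contain any is not known.
    """
    return json.dumps(config, indent=2, sort_keys=True, default=str)


text = dump_config(pipeline)
restored = json.loads(text)
assert restored == pipeline  # plain strings and numbers survive the round trip
```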

Filter Models by Capability

# Get all models that support code generation
code_models = fake.ai_models_for_task("code-generation")

# Get all diffusion models
diffusion_models = fake.ai_models_by_architecture("Diffusion")

# Get all models that support the video modality
video_models = fake.ai_models_by_modality("video")
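
The filters can also be combined. The sketch below intersects two results to find models that support a task and a modality at once. It assumes each filter method returns a list of model names; `models_matching` is a hypothetical helper, and `StubFake` is a stand-in with a made-up catalog so the block runs on its own.

```python
def models_matching(fake, task, modality):
    """Return models that support `task` and also handle `modality`."""
    by_task = set(fake.ai_models_for_task(task))
    by_modality = set(fake.ai_models_by_modality(modality))
    return sorted(by_task & by_modality)


class StubFake:
    """Hypothetical stand-in for the provider; the catalog is made up."""

    _CATALOG = {
        "Model A": {"tasks": {"code-generation", "reasoning"}, "modality": {"text"}},
        "Model B": {"tasks": {"code-generation"}, "modality": {"text", "video"}},
        "Model C": {"tasks": {"text-generation"}, "modality": {"video"}},
    }

    def ai_models_for_task(self, task):
        return [m for m, spec in self._CATALOG.items() if task in spec["tasks"]]

    def ai_models_by_modality(self, modality):
        return [m for m, spec in self._CATALOG.items() if modality in spec["modality"]]


result = models_matching(StubFake(), "code-generation", "video")
print(result)  # ['Model B']
```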

Model Scenario

Get complete, correlated model information:

scenario = fake.model_scenario()
# {
#     'model': 'GPT-5.2',
#     'company': 'OpenAI',
#     'architecture': 'Transformer',
#     'modality': ['text', 'image', 'audio', 'video'],
#     'tasks': ['text-generation', 'reasoning', 'code-generation', ...],
#     'parameters': '1.5T',
#     'release_year': 2026
# }
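
When scenarios feed into tests, a small shape check can catch surprises early. The sketch below validates a dict against the keys and value types shown in the example above. `validate_scenario` is a hypothetical helper, the expected types are inferred from that example rather than documented, and `sample` copies the example with its task list left truncated.

```python
EXPECTED_KEYS = {
    "model", "company", "architecture", "modality",
    "tasks", "parameters", "release_year",
}


def validate_scenario(scenario):
    """Return a list of problems; an empty list means the shape looks right."""
    problems = [f"missing key: {key}" for key in sorted(EXPECTED_KEYS - set(scenario))]
    # Types below are assumptions taken from the README example.
    for key in ("modality", "tasks"):
        if key in scenario and not isinstance(scenario[key], list):
            problems.append(f"{key} should be a list")
    if "release_year" in scenario and not isinstance(scenario["release_year"], int):
        problems.append("release_year should be an int")
    return problems


sample = {
    "model": "GPT-5.2",
    "company": "OpenAI",
    "architecture": "Transformer",
    "modality": ["text", "image", "audio", "video"],
    "tasks": ["text-generation", "reasoning", "code-generation"],  # truncated, as in the example
    "parameters": "1.5T",
    "release_year": 2026,
}

print(validate_scenario(sample))  # []
```

With the real provider, `validate_scenario(fake.model_scenario())` would run the same check on generated data.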

License

MIT

