
Mock data generation for pydantic based models




This library offers powerful mock data generation capabilities for pydantic based models and dataclasses. It can also be used with other libraries that use pydantic as a foundation, for example SQLModel and Beanie.



from datetime import date, datetime
from typing import List, Union

from pydantic import BaseModel, UUID4

from pydantic_factories import ModelFactory

class Person(BaseModel):
    id: UUID4
    name: str
    hobbies: List[str]
    age: Union[float, int]
    birthday: Union[datetime, date]

class PersonFactory(ModelFactory):
    __model__ = Person

result = PersonFactory.build()

That's it - with almost no work, we are able to create a mock data object fitting the Person class model definition.

This is possible because of the typing information available on the pydantic model and model-fields, which are used as a source of truth for data generation.

The factory parses the information stored in the pydantic model and generates a dictionary of kwargs that are passed to the Person class' init method.
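To make the idea concrete, here is a minimal standard-library sketch of type-driven generation. It is not the library's implementation: a dataclass stands in for the pydantic model, and the mock_value and build helpers are hypothetical names that cover only a handful of types.

```python
import random
import uuid
from dataclasses import dataclass
from typing import List, get_args, get_origin, get_type_hints


@dataclass
class Person:  # stand-in for a pydantic model (assumption for this sketch)
    id: uuid.UUID
    name: str
    hobbies: List[str]
    age: int


def mock_value(tp):
    """Return a random value for a small set of supported types."""
    if get_origin(tp) is list:
        (item_type,) = get_args(tp)
        return [mock_value(item_type) for _ in range(random.randint(1, 3))]
    providers = {
        uuid.UUID: uuid.uuid4,
        str: lambda: "".join(random.choices("abcdefgh", k=8)),
        int: lambda: random.randint(0, 100),
    }
    return providers[tp]()


def build(model, **overrides):
    """Generate kwargs from the model's annotations, then instantiate it."""
    kwargs = {name: mock_value(tp) for name, tp in get_type_hints(model).items()}
    kwargs.update(overrides)  # explicitly passed kwargs win over generated values
    return model(**kwargs)


person = build(Person, name="Moishe")
```

The type hints are the only input: every field without an explicit override gets a generated value that matches its annotation.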


  • ✅ supports both built-in and pydantic types
  • ✅ supports pydantic field constraints
  • ✅ supports complex field types
  • ✅ supports custom model fields

Why This Library?

  • 💯 powerful
  • 💯 extensible
  • 💯 simple
  • 💯 rigorously tested

See Frequently Asked Questions (FAQs) for more information.


Installation

Using your package manager of choice:

pip install pydantic-factories


poetry add --dev pydantic-factories


pipenv install --dev pydantic-factories

Aside from pydantic, pydantic-factories has very few dependencies: typing-extensions, which provides typing support on older versions of Python, and faker and xeger, both of which are used for generating mock data.


Build Methods

The ModelFactory class exposes two build methods:

  • .build(**kwargs) - builds a single instance of the factory's model
  • .batch(size: int, **kwargs) - builds a list of n instances, where n is the value of size

from pydantic import BaseModel

from pydantic_factories import ModelFactory

class Person(BaseModel):
    ...

class PersonFactory(ModelFactory):
    __model__ = Person

single_result = PersonFactory.build()  # a single Person instance

batch_result = PersonFactory.batch(
    size=5
)  # list[Person, Person, Person, Person, Person]

Any kwargs you pass to .build, .batch or any of the persistence methods take precedence over whatever defaults are defined on the factory class itself.

By default, kwargs are validated when building a pydantic class. To skip input validation, use the factory_use_construct param.

from pydantic import BaseModel

from pydantic_factories import ModelFactory

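class Person(BaseModel):
    ...  # fields elided; assumes an id field that rejects 5, such as the UUID4 id in the first example

class PersonFactory(ModelFactory):
    __model__ = Person

PersonFactory.build(id=5)  # Raises a validation error

result = PersonFactory.build(
    factory_use_construct=True, id=5
)  # Build a Person with invalid id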

Nested Models and Complex types

The automatic generation of mock data works for all types supported by pydantic, as well as nested classes that derive from BaseModel (including for 3rd party libraries) and complex types. Let's look at another example:

from datetime import date, datetime
from enum import Enum
from pydantic import BaseModel, UUID4
from typing import Any, Dict, List, Union

from pydantic_factories import ModelFactory

class Species(str, Enum):
    CAT = "Cat"
    DOG = "Dog"
    PIG = "Pig"
    MONKEY = "Monkey"

class Pet(BaseModel):
    name: str
    sound: str
    species: Species

class Person(BaseModel):
    id: UUID4
    name: str
    hobbies: List[str]
    age: Union[float, int]
    birthday: Union[datetime, date]
    pets: List[Pet]
    assets: List[Dict[str, Dict[str, Any]]]

class PersonFactory(ModelFactory):
    __model__ = Person

result = PersonFactory.build()

This example also works out of the box. Although no factory was defined for the Pet class, that is not a problem: a factory is generated for it dynamically.

The complex typing under the assets attribute is a bit more tricky, but the factory will generate a python object fitting this signature, therefore passing validation.

Please note: the one thing factories cannot handle is self-referencing models, because these can lead to recursion errors. In this case you will need to handle the particular field by setting defaults for it.

Models and Dataclasses

This library works with any class that inherits the pydantic BaseModel class, including GenericModel and classes from 3rd party libraries, and also with dataclasses - both those from the python standard library and pydantic's dataclasses. In fact, you can use them interchangeably as you like:

import dataclasses
from typing import Dict, List

import pydantic
from pydantic_factories import ModelFactory

@pydantic.dataclasses.dataclass
class MyPydanticDataClass:
    name: str

class MyFirstModel(pydantic.BaseModel):
    dataclass: MyPydanticDataClass

@dataclasses.dataclass
class MyPythonDataClass:
    id: str
    complex_type: Dict[str, Dict[int, List[MyFirstModel]]]

class MySecondModel(pydantic.BaseModel):
    dataclasses: List[MyPythonDataClass]

class MyFactory(ModelFactory):
    __model__ = MySecondModel

result = MyFactory.build()

The above example will build correctly.

Note Regarding Nested Optional Types in Dataclasses

When generating mock values for fields typed as Optional, if the factory is defined with __allow_none_optionals__ = True, the field value will be either a value or None, depending on a random decision. This works even when the Optional typing is deeply nested, except for dataclasses: typing is only shallowly evaluated for dataclasses, and as such they are always assumed to require a value. If you wish to have a None value in this particular case, you should set it manually by configuring a Use callback for the particular field.
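The random decision described above can be pictured with a small standard-library sketch. The helpers is_optional and mock_int_field are hypothetical and only illustrate the idea; they are not how the library evaluates types internally.

```python
import random
from typing import Optional, Union, get_args, get_origin


def is_optional(annotation) -> bool:
    """True when the annotation is Optional[X], i.e. Union[X, None]."""
    return get_origin(annotation) is Union and type(None) in get_args(annotation)


def mock_int_field(annotation, allow_none: bool, rng: random.Random):
    """Return None for roughly half of the Optional fields when allowed, else an int."""
    if allow_none and is_optional(annotation) and rng.random() < 0.5:
        return None
    return rng.randint(0, 100)


rng = random.Random(0)
with_none = [mock_int_field(Optional[int], True, rng) for _ in range(50)]
without_none = [mock_int_field(Optional[int], False, rng) for _ in range(50)]
```

With allow_none set to False the field always receives a value, which mirrors the effect of __allow_none_optionals__ = False.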

Factory Configuration

Configuration of ModelFactory is done using class variables:

  • __model__: a required variable specifying the model for the factory. It accepts any class that extends pydantic's BaseModel, including classes from other libraries. If this variable is not set, a ConfigurationException will be raised.

  • __faker__: an optional variable specifying a user configured instance of faker. If this variable is not set, the factory will default to using vanilla faker.

  • __sync_persistence__: an optional variable specifying the handler for synchronously persisting data. If this variable is not set, the .create_sync and .create_batch_sync methods of the factory cannot be used. See: persistence methods

  • __async_persistence__: an optional variable specifying the handler for asynchronously persisting data. If this variable is not set, the .create_async and .create_batch_async methods of the factory cannot be used. See: persistence methods

  • __allow_none_optionals__: an optional variable specifying whether the factory should randomly set None values for optional fields, or always set a value for them. This is True by default.

from faker import Faker
from pydantic_factories import ModelFactory

from app.models import Person
from .persistence import AsyncPersistenceHandler, SyncPersistenceHandler

my_faker = Faker("en_US")  # any locale supported by Faker

class PersonFactory(ModelFactory):
    __model__ = Person
    __faker__ = my_faker
    __sync_persistence__ = SyncPersistenceHandler
    __async_persistence__ = AsyncPersistenceHandler
    __allow_none_optionals__ = False

Generating deterministic objects

To generate deterministic data, use the ModelFactory.seed_random method. It passes the seed value to both Faker and random method calls, guaranteeing that the generated data is the same between calls. This is especially useful for testing.
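The principle behind seeding can be shown with the standard library alone. The sketch below does not call seed_random; make_person is a hypothetical helper that draws every value from one seeded random source, which is what makes repeated runs reproducible.

```python
import random


def make_person(rng: random.Random) -> dict:
    """Build a mock record, drawing every value from the given random source."""
    return {
        "name": "".join(rng.choices("abcdefghij", k=6)),
        "age": rng.randint(0, 100),
    }


# Two sources created with the same seed produce identical data.
first = make_person(random.Random(42))
second = make_person(random.Random(42))
```

In a test suite this means a failing case can be reproduced exactly by reusing the seed.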

Defining Factory Attributes

The factory API is designed to be as semantic and simple as possible. Let's look at several examples that assume we have the following models:

from datetime import date, datetime
from enum import Enum
from pydantic import BaseModel, UUID4
from typing import Any, Dict, List, Union
from pydantic_factories import ModelFactory

class Species(str, Enum):
    CAT = "Cat"
    DOG = "Dog"

class Pet(BaseModel):
    name: str
    species: Species

class Person(BaseModel):
    id: UUID4
    name: str
    hobbies: List[str]
    age: Union[float, int]
    birthday: Union[datetime, date]
    pets: List[Pet]
    assets: List[Dict[str, Dict[str, Any]]]

pet = Pet(name="Roxy", species=Species.DOG)

class PersonFactory(ModelFactory):
    __model__ = Person

    pets = [pet]

In this case, when we call PersonFactory.build() the result will be randomly generated, except for the pets list, which will be the hardcoded default we defined.

Use (field)

This though is often not desirable. We could instead, define a factory for Pet where we restrict the choices to a range we like. For example:

from datetime import date, datetime
from pydantic import BaseModel, UUID4
from typing import Any, Dict, List, Union
from enum import Enum
from pydantic_factories import ModelFactory, Use
from random import choice

class Species(str, Enum):
    CAT = "Cat"
    DOG = "Dog"

class Pet(BaseModel):
    name: str
    species: Species

class Person(BaseModel):
    id: UUID4
    name: str
    hobbies: List[str]
    age: Union[float, int]
    birthday: Union[datetime, date]
    pets: List[Pet]
    assets: List[Dict[str, Dict[str, Any]]]

class PetFactory(ModelFactory):
    __model__ = Pet

    name = Use(choice, ["Ralph", "Roxy"])
    species = Use(choice, list(Species))

class PersonFactory(ModelFactory):
    __model__ = Person

    pets = Use(PetFactory.batch, size=2)

The signature for Use is: cb: Callable, *args, **defaults. It can receive any sync callable. In the above example, we used the choice function from the standard library's random package, and the batch method of PetFactory.

You do not need to use the Use field, you can place callables (including classes) as values for a factory's attribute directly, and these will be invoked at build-time. Thus, you could for example re-write the above PetFactory like so:

from enum import Enum
from pydantic import BaseModel
from random import choice
from pydantic_factories import ModelFactory

class Species(str, Enum):
    CAT = "Cat"
    DOG = "Dog"

class Pet(BaseModel):
    name: str
    species: Species

class PetFactory(ModelFactory):
    __model__ = Pet

    name = lambda: choice(["Ralph", "Roxy"])  # noqa: E731
    species = lambda: choice(list(Species))  # noqa: E731

Use is merely a semantic abstraction that makes the factory cleaner and simpler to understand.
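As a rough illustration of that abstraction, the following standard-library sketch stores a callable with its arguments and invokes it only at build time. DeferredCall and resolve are hypothetical names for this sketch, not the library's classes.

```python
from random import choice
from typing import Any, Callable


class DeferredCall:
    """Hypothetical stand-in for a Use-like field: stores a callable and its arguments."""

    def __init__(self, cb: Callable[..., Any], *args: Any, **defaults: Any) -> None:
        self.cb = cb
        self.args = args
        self.defaults = defaults

    def to_value(self) -> Any:
        # Called once per build, so every built instance gets a fresh value.
        return self.cb(*self.args, **self.defaults)


def resolve(attribute: Any) -> Any:
    """Resolve a factory attribute: deferred calls and bare callables run at build time."""
    if isinstance(attribute, DeferredCall):
        return attribute.to_value()
    if callable(attribute):
        return attribute()
    return attribute  # plain values are used as-is


name_field = DeferredCall(choice, ["Ralph", "Roxy"])
```

Wrapping the callable keeps its arguments next to it, which is why the wrapped form reads more clearly than a lambda.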

PostGenerated (field)

PostGenerated allows for generating fields based on the already generated values of other (non post generated) fields. In most cases this pattern is best avoided, but for the few valid cases the PostGenerated helper is provided. For example:

from pydantic import BaseModel
from pydantic_factories import ModelFactory, PostGenerated
from random import randint
from datetime import datetime, timedelta

def add_timedelta(name: str, values: dict, *args, **kwds):
    delta = timedelta(days=randint(0, 12), seconds=randint(13, 13000))
    return values["from_dt"] + delta

class MyModel(BaseModel):
    from_dt: datetime
    to_dt: datetime

class MyFactory(ModelFactory):
    __model__ = MyModel

    to_dt = PostGenerated(add_timedelta)

The signature for PostGenerated is: cb: Callable, *args, **defaults. It can receive any sync callable. The signature for the callable should be: name: str, values: dict[str, Any], *args, **defaults. The already generated values are mapped by name in the values dictionary.
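One way to picture the ordering is a two-pass build, sketched here with the standard library only. build_values is a hypothetical helper, not the library's code: ordinary fields are generated first, and each post-generation callback then receives its field name and those values.

```python
from datetime import datetime, timedelta
from random import randint
from typing import Any, Callable, Dict


def add_timedelta(name: str, values: Dict[str, Any]) -> datetime:
    """Post-generation callback: derive a later datetime from from_dt."""
    delta = timedelta(days=randint(0, 12), seconds=randint(13, 13000))
    return values["from_dt"] + delta


def build_values(
    generators: Dict[str, Callable[[], Any]],
    post_generated: Dict[str, Callable[[str, Dict[str, Any]], Any]],
) -> Dict[str, Any]:
    # First pass: ordinary fields.
    generated = {name: make() for name, make in generators.items()}
    # Second pass: callbacks see only the values from the first pass.
    derived = {name: cb(name, generated) for name, cb in post_generated.items()}
    return {**generated, **derived}


values = build_values(
    generators={"from_dt": lambda: datetime(2022, 1, 1)},
    post_generated={"to_dt": add_timedelta},
)
```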

Ignore (field)

Ignore is another field exported by this library, and it is used, as its name implies, to designate a given attribute as ignored:

from typing import TypeVar

from odmantic import EmbeddedModel, Model
from pydantic_factories import ModelFactory, Ignore

T = TypeVar("T", Model, EmbeddedModel)

class OdmanticModelFactory(ModelFactory[T]):
    id = Ignore()

The above example is basically the extension included in pydantic-factories for the library ODMantic, which is a pydantic based mongo ODM.

For ODMantic models, the id attribute should not be set by the factory, but rather handled by the odmantic logic itself. Thus, the id field is marked as ignored.

When you ignore an attribute using Ignore, it will be completely ignored by the factory - that is, it will not be set as a kwarg passed to pydantic at all.

Require (field)

The Require field in turn specifies that a particular attribute is a required kwarg. That is, if a kwarg with a value for this particular attribute is not passed when calling factory.build(), a MissingBuildKwargError will be raised.

What is the use case for this? For example, let's say we have a document called Article which we store in some DB and is represented using a non-pydantic model, say, an elastic-dsl document. We then need to store in our pydantic object a reference to an id for this article. This value should not be some mock value, but must rather be an actual id passed to the factory. Thus, we can define this attribute as required:

from pydantic import BaseModel
from pydantic_factories import ModelFactory, Require
from uuid import UUID

class ArticleProxy(BaseModel):
    article_id: UUID

class ArticleProxyFactory(ModelFactory):
    __model__ = ArticleProxy

    article_id = Require()

If we call factory.build() without passing a value for article_id, an error will be raised.
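The behaviour of ignored and required attributes can be sketched with plain marker classes. Everything below is hypothetical and standard-library only (IgnoreMarker, RequireMarker, MissingKwargError, collect_kwargs); in particular, letting an explicitly passed kwarg win over an ignore marker is an assumption of this sketch, not a statement about the library.

```python
from typing import Any, Dict


class IgnoreMarker:
    """Hypothetical marker: the field is left out of the generated kwargs."""


class RequireMarker:
    """Hypothetical marker: the caller must supply the field explicitly."""


class MissingKwargError(Exception):
    """Hypothetical error raised when a required kwarg is missing."""


def collect_kwargs(fields: Dict[str, Any], **kwargs: Any) -> Dict[str, Any]:
    """Combine factory defaults with caller kwargs, honouring the markers."""
    result: Dict[str, Any] = {}
    for name, default in fields.items():
        if name in kwargs:
            result[name] = kwargs[name]  # assumption: explicit values always win
        elif isinstance(default, RequireMarker):
            raise MissingKwargError(f"a value for {name!r} must be passed")
        elif isinstance(default, IgnoreMarker):
            continue  # never passed on to the model
        else:
            result[name] = default
    return result


fields = {"id": IgnoreMarker(), "article_id": RequireMarker(), "title": "draft"}
kwargs_for_model = collect_kwargs(fields, article_id=7)
```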


Persistence

ModelFactory has four persistence methods:

  • .create_sync(**kwargs) - builds and persists a single instance of the factory's model synchronously
  • .create_batch_sync(size: int, **kwargs) - builds and persists a list of size n instances synchronously
  • .create_async(**kwargs) - builds and persists a single instance of the factory's model asynchronously
  • .create_batch_async(size: int, **kwargs) - builds and persists a list of size n instances asynchronously

To use these methods, you must first specify a sync and/or async persistence handler for the factory:

from pydantic_factories import ModelFactory
from typing import TypeVar, List

from pydantic import BaseModel
from pydantic_factories import SyncPersistenceProtocol, AsyncPersistenceProtocol

T = TypeVar("T", bound=BaseModel)

class SyncPersistenceHandler(SyncPersistenceProtocol[T]):
    def save(self, data: T) -> T:
        ...  # do stuff

    def save_many(self, data: List[T]) -> List[T]:
        ...  # do stuff

class AsyncPersistenceHandler(AsyncPersistenceProtocol[T]):
    async def save(self, data: T) -> T:
        ...  # do stuff

    async def save_many(self, data: List[T]) -> List[T]:
        ...  # do stuff

class PersonFactory(ModelFactory):
    __sync_persistence__ = SyncPersistenceHandler
    __async_persistence__ = AsyncPersistenceHandler

Or create your own base factory and reuse it in your various factories:

from pydantic_factories import ModelFactory
from typing import TypeVar, List

from pydantic import BaseModel
from pydantic_factories import SyncPersistenceProtocol, AsyncPersistenceProtocol

T = TypeVar("T", bound=BaseModel)

class SyncPersistenceHandler(SyncPersistenceProtocol[T]):
    def save(self, data: T) -> T:
        ...  # do stuff

    def save_many(self, data: List[T]) -> List[T]:
        ...  # do stuff

class AsyncPersistenceHandler(AsyncPersistenceProtocol[T]):
    async def save(self, data: T) -> T:
        ...  # do stuff

    async def save_many(self, data: List[T]) -> List[T]:
        ...  # do stuff

class BaseModelFactory(ModelFactory):
    __sync_persistence__ = SyncPersistenceHandler
    __async_persistence__ = AsyncPersistenceHandler

class PersonFactory(BaseModelFactory):
    ...

With the persistence handlers in place, you can now use all persistence methods. Please note that you do not need to define both persistence handlers. If you will only use sync or async persistence, you only need to define the respective handler to use these methods.
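For a feel of what a handler body might do, here is a hedged in-memory sketch using only the standard library. InMemorySyncHandler and the two module-level helper functions are hypothetical; they mimic the save / save_many shape of the handlers above without touching a real database or the library itself.

```python
from typing import Any, Callable, Generic, List, TypeVar

T = TypeVar("T")


class InMemorySyncHandler(Generic[T]):
    """Hypothetical sync handler that stores objects in a list."""

    def __init__(self) -> None:
        self.saved: List[T] = []

    def save(self, data: T) -> T:
        self.saved.append(data)
        return data

    def save_many(self, data: List[T]) -> List[T]:
        self.saved.extend(data)
        return data


def create_sync(handler: InMemorySyncHandler, build: Callable[[], Any]) -> Any:
    """Build one object, then hand it to the handler."""
    return handler.save(build())


def create_batch_sync(handler: InMemorySyncHandler, build: Callable[[], Any], size: int) -> List[Any]:
    """Build size objects, then persist them in one call."""
    return handler.save_many([build() for _ in range(size)])


handler: InMemorySyncHandler = InMemorySyncHandler()
one = create_sync(handler, lambda: {"name": "mock"})
many = create_batch_sync(handler, lambda: {"name": "mock"}, size=3)
```

An async handler would follow the same shape with async def methods.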

Create Factory Method

If you prefer to create a factory imperatively, you can do so using the ModelFactory.create_factory method. This method receives the following arguments:

  • model - the model for the factory.
  • base - an optional base factory class. Defaults to the factory class on which the method is called.
  • kwargs - a dictionary of arguments correlating to the class vars accepted by ModelFactory, e.g. __faker__.
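Creating a class imperatively is something Python supports through the built-in type() function. The sketch below shows that general technique with a hypothetical BaseFactory and a dataclass model; it is an assumption-laden illustration, not the library's implementation of create_factory.

```python
from dataclasses import dataclass
from typing import Any, Optional, Type


@dataclass
class Pet:  # stand-in model for this sketch
    name: str
    species: str


class BaseFactory:
    """Hypothetical minimal factory base."""

    __model__: Optional[type] = None

    @classmethod
    def build(cls, **kwargs: Any) -> Any:
        # Public class attributes act as defaults; call-time kwargs override them.
        defaults = {k: v for k, v in vars(cls).items() if not k.startswith("_")}
        return cls.__model__(**{**defaults, **kwargs})

    @classmethod
    def create_factory(
        cls, model: type, base: Optional[Type["BaseFactory"]] = None, **kwargs: Any
    ) -> Type["BaseFactory"]:
        # type(name, bases, namespace) creates the subclass at runtime.
        namespace = {"__model__": model, **kwargs}
        return type(f"{model.__name__}Factory", (base or cls,), namespace)


PetFactory = BaseFactory.create_factory(Pet, name="Fido", species="Dog")
fido = PetFactory.build()
rex = PetFactory.build(name="Rex")
```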

You could also override the child factory's __model__ attribute to specify the model to use, along with the default kwargs, as shown in the BUILDPet class below:

from datetime import date, datetime
from enum import Enum
from pydantic import BaseModel, UUID4
from typing import Any, Dict, List, TypeVar, Union, Generic, Optional
from pydantic_factories import ModelFactory

class Species(str, Enum):
    CAT = "Cat"
    DOG = "Dog"

class PetBase(BaseModel):
    name: str
    species: Species

class Pet(PetBase):
    id: UUID4

class PetCreate(PetBase):
    pass

class PetUpdate(PetBase):
    pass

class PersonBase(BaseModel):
    name: str
    hobbies: List[str]
    age: Union[float, int]
    birthday: Union[datetime, date]
    pets: List[Pet]
    assets: List[Dict[str, Dict[str, Any]]]

class PersonCreate(PersonBase):
    pass

class Person(PersonBase):
    id: UUID4

class PersonUpdate(PersonBase):
    pass

def test_factory():
    class PersonFactory(ModelFactory):
        __model__ = Person

    person = PersonFactory.build()

    assert person.pets != []

ModelType = TypeVar("ModelType", bound=BaseModel)
CreateSchemaType = TypeVar("CreateSchemaType", bound=BaseModel)
UpdateSchemaType = TypeVar("UpdateSchemaType", bound=BaseModel)

class BUILDBase(Generic[ModelType, CreateSchemaType, UpdateSchemaType]):
    def __init__(
        self,
        model: ModelType = None,
        create_schema: Optional[CreateSchemaType] = None,
        update_schema: Optional[UpdateSchemaType] = None,
    ):
        self.model = model
        self.create_model = create_schema
        self.update_model = update_schema

    def build_object(self) -> ModelType:
        object_Factory = ModelFactory.create_factory(self.model)
        return object_Factory.build()

    def build_create_object(self) -> CreateSchemaType:
        object_Factory = ModelFactory.create_factory(self.create_model)
        return object_Factory.build()

    def build_update_object(self) -> UpdateSchemaType:
        object_Factory = ModelFactory.create_factory(self.update_model)
        return object_Factory.build()

class BUILDPet(BUILDBase[Pet, PetCreate, PetUpdate]):
    def build_object(self) -> Pet:
        object_Factory = ModelFactory.create_factory(self.model, name="Fido")
        return object_Factory.build()

    def build_create_object(self) -> PetCreate:
        object_Factory = ModelFactory.create_factory(self.create_model, name="Rover")
        return object_Factory.build()

    def build_update_object(self) -> PetUpdate:
        object_Factory = ModelFactory.create_factory(self.update_model, name="Spot")
        return object_Factory.build()

def test_factory_create():
    person_factory = BUILDBase(Person, PersonCreate, PersonUpdate)

    pet_factory = BUILDPet(Pet, PetCreate, PetUpdate)

    create_person = person_factory.build_create_object()
    update_person = person_factory.build_update_object()

    pet = pet_factory.build_object()
    create_pet = pet_factory.build_create_object()
    update_pet = pet_factory.build_update_object()

    assert create_person is not None
    assert update_person is not None

    assert pet.name == "Fido"
    assert create_pet.name == "Rover"
    assert update_pet.name == "Spot"

Extensions and Third Party Libraries

Any class that is derived from pydantic's BaseModel can be used as the __model__ of a factory. For most 3rd party libraries, e.g. SQLModel, this library will work as is out of the box.

Currently, this library also includes the following extensions:


ODMantic

This extension includes a class called OdmanticModelFactory, which can be imported from pydantic_factories.extensions. This class is meant to be used with the Model and EmbeddedModel classes exported by ODMantic, but it will also work with regular instances of pydantic's BaseModel.


Beanie

This extension includes a class called BeanieDocumentFactory as well as a BeaniePersistenceHandler. Both of these can be imported from pydantic_factories.extensions. The BeanieDocumentFactory is meant to be used with the Beanie Document class, and it includes async persistence built in.


ormar

This extension includes a class called OrmarModelFactory. This class is meant to be used with the Model class exported by ormar.

Adding Factory Values

If your model has an attribute that is not supported by pydantic-factories and depends on third party libraries, you can create a custom extension by subclassing ModelFactory and overriding the get_mock_value method to add your logic.

from typing import Any
from pydantic_factories import ModelFactory

class CustomFactory(ModelFactory[Any]):
    """Tweak the ModelFactory to add our custom mocks."""

    @classmethod
    def get_mock_value(cls, field_type: Any) -> Any:
        """Add our custom mock value."""
        if str(field_type) == "my_super_rare_datetime_field":
            return cls._get_faker().date_time_between()

        return super().get_mock_value(field_type)

Where cls._get_faker() is a faker instance that you can use to build your returned value.
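The override-then-delegate pattern used above can be tried in isolation with the standard library. In this sketch BaseMocker, CustomMocker and Money are all hypothetical names: the subclass handles the one type it knows about and defers everything else to the parent via super().

```python
import random
from typing import Any


class Money:
    """Hypothetical third-party type the base class knows nothing about."""

    def __init__(self, cents: int) -> None:
        self.cents = cents


class BaseMocker:
    """Hypothetical base that supports a couple of built-in types."""

    @classmethod
    def get_mock_value(cls, field_type: Any) -> Any:
        if field_type is int:
            return random.randint(0, 100)
        if field_type is str:
            return "mock"
        raise TypeError(f"unsupported type: {field_type!r}")


class CustomMocker(BaseMocker):
    """Adds support for Money and falls back to the base for everything else."""

    @classmethod
    def get_mock_value(cls, field_type: Any) -> Any:
        if field_type is Money:
            return Money(cents=random.randint(0, 10_000))
        return super().get_mock_value(field_type)


price = CustomMocker.get_mock_value(Money)
label = CustomMocker.get_mock_value(str)
```

Comparing the type object itself, rather than its string form, is a design choice of this sketch; the example above matches on str(field_type) instead.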

Partial parameters random generation

Pydantic factories can randomly generate missing parameters for child factories. For example, given the following models and PersonFactory factory:

from pydantic_factories import ModelFactory
from pydantic import BaseModel

class Pet(BaseModel):
    name: str
    age: int

class Person(BaseModel):
    name: str
    pets: list[Pet]
    age: int

class PersonFactory(ModelFactory[Person]):
    __model__ = Person

When building a person without specifying the person's and the pets' ages, all of these age fields are randomly generated:

from pydantic_factories import ModelFactory
from pydantic import BaseModel

class Pet(BaseModel):
    name: str
    age: int

class Person(BaseModel):
    name: str
    pets: list[Pet]
    age: int

class PersonFactory(ModelFactory[Person]):
    __model__ = Person

data = {
    "name": "John",
    "pets": [
        {"name": "dog"},
        {"name": "cat"},
    ],
}

person = PersonFactory.build(**data)

print(person.json(indent=2))

{
  "name": "John",
  "pets": [
    {
      "name": "dog",
      "age": 9005
    },
    {
      "name": "cat",
      "age": 2455
    }
  ],
  "age": 975
}

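The filling of partially specified nested data can be pictured as a recursive merge. The sketch below works on plain dictionaries with the standard library only; fill_pet and fill_person are hypothetical helpers, not part of pydantic-factories.

```python
import random
from typing import Any, Dict


def fill_pet(partial: Dict[str, Any]) -> Dict[str, Any]:
    """Generate every pet field, then let the supplied keys override."""
    generated = {"name": "mock-pet", "age": random.randint(0, 10000)}
    return {**generated, **partial}


def fill_person(partial: Dict[str, Any]) -> Dict[str, Any]:
    """Generate every person field, keep supplied keys, and fill each nested pet."""
    generated = {"name": "mock-person", "pets": [], "age": random.randint(0, 10000)}
    merged = {**generated, **partial}
    merged["pets"] = [fill_pet(pet) for pet in merged["pets"]]
    return merged


person = fill_person({"name": "John", "pets": [{"name": "dog"}, {"name": "cat"}]})
```

Only the missing keys are generated, so the names passed in survive unchanged at every level.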
Frequently Asked Questions (FAQs)

  • How does this differ from using the Hypothesis plugin for Pydantic? This library is closer to FactoryBoy than Hypothesis (intended for property based testing) by enabling you to define reusable factories. It is possible to use Hypothesis strategies to build your model via st.builds(Model), and even wrap these strategies for re-use, but if you want to share these outside of your codebase you will need to make a library.

  • Why doesn't this library use Hypothesis? Hypothesis is a large dependency with many features this library does not need and only supports a subset of Pydantic types.

Pydantic Factories uses Faker to handle some data mocking, while also incorporating its own mock data generation logic to go beyond what Faker is capable of and providing more complete support for Pydantic types than both Hypothesis and the official Pydantic Hypothesis plugin.

Finally, this library allows for more granular control of model generation and persistence.


Contributing

This library is open to contributions, and in fact we welcome them. Please see the contribution guide!
