Generate files with fake data.
Create files with fake data. In many formats. With no efforts.
Prerequisites
All core dependencies of this package are MIT licensed. Most optional dependencies are MIT licensed as well, while a few are BSD, Apache 2 or GPL licensed. The license of each dependency is given in parentheses below.
Core package requires Python 3.7, 3.8, 3.9, 3.10 or 3.11.
Faker (MIT) is the only required dependency.
Django (BSD) integration with factory_boy (MIT) has been tested with Django versions 2.2 through 4.2 (although only maintained versions of Django are currently tested).
DOCX file support requires python-docx (MIT).
ICO, JPEG, PNG, SVG and WEBP file support requires imgkit (MIT) and wkhtmltopdf (LGPLv3).
PDF file support requires either combination of pdfkit (MIT) and wkhtmltopdf (LGPLv3), or reportlab (BSD).
PPTX file support requires python-pptx (MIT).
ODP file support requires odfpy (Apache 2).
ODS file support requires tablib (MIT) and odfpy (Apache 2).
ODT file support requires odfpy (Apache 2).
PathyFileSystemStorage storage support requires pathy (Apache 2).
AWSS3Storage storage support requires pathy (Apache 2) and boto3 (Apache 2).
AzureCloudStorage storage support requires pathy (Apache 2) and azure-storage-blob (MIT).
GoogleCloudStorage storage support requires pathy (Apache 2) and google-cloud-storage (Apache 2).
SFTPStorage storage support requires paramiko (LGPLv2.1).
AugmentFileFromDirProvider provider requires nlpaug (MIT), PyTorch (BSD), transformers (Apache 2), numpy (BSD), pandas (BSD), tika (Apache 2) and Apache Tika (Apache 2).
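Since most of the dependencies above are optional, it can help to check which ones are installed before requesting a given file type. The following is a minimal sketch that uses only the standard library. The feature names, the OPTIONAL_MODULES mapping and both helper functions are illustrative assumptions and not part of faker-file; only the import names of the listed packages (docx for python-docx, pptx for python-pptx, odf for odfpy) are taken as given.

```python
from importlib.util import find_spec

# Import names of some optional packages listed above, keyed by feature.
# The grouping and the helpers below are hypothetical, not faker-file API.
OPTIONAL_MODULES = {
    "docx": ["docx"],          # python-docx
    "pptx": ["pptx"],          # python-pptx
    "odt": ["odf"],            # odfpy
    "ods": ["tablib", "odf"],  # tablib and odfpy
    "sftp": ["paramiko"],
    "aws_s3": ["pathy", "boto3"],
}


def is_available(feature, modules=OPTIONAL_MODULES):
    """Return True if every module required by `feature` can be found."""
    return all(find_spec(name) is not None for name in modules[feature])


def available_features(modules=OPTIONAL_MODULES):
    """Map each feature name to whether its dependencies are installed."""
    return {feature: is_available(feature, modules) for feature in modules}


# Print a short report for the current environment.
for feature, ok in available_features().items():
    print(f"{feature}: {'available' if ok else 'missing dependencies'}")
```

The check only looks up top-level modules and does not import them, so it stays cheap even when heavy packages are installed.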
Documentation
Documentation is available on Read the Docs.
For bootstrapping check the Quick start.
For various ready to use code examples see the Recipes.
For CLI options see the CLI.
For guidelines on contributing check the Contributor guidelines.
Online demos
Check the demo(s):
REST API demo (based on faker-file-api REST API)
UI frontend demo (based on faker-file-ui UI frontend)
WASM frontend demo (based on faker-file-wasm WASM frontend)
Installation
Latest stable version from PyPI
With all dependencies
pip install faker-file[all]
Only core
pip install faker-file
With most common dependencies
Everything except the ML libraries, which are required for data augmentation only
pip install faker-file[common]
With DOCX support
pip install faker-file[docx]
With EPUB support
pip install faker-file[epub]
With images support
pip install faker-file[images]
With PDF support
pip install faker-file[pdf]
With MP3 support
pip install faker-file[mp3]
With XLSX support
pip install faker-file[xlsx]
With ODS support
pip install faker-file[ods]
With ODT support
pip install faker-file[odt]
With data augmentation support
pip install faker-file[data-augmentation]
Or development version from GitHub
pip install https://github.com/barseghyanartur/faker-file/archive/main.tar.gz
Features
Supported file types
BIN
CSV
DOCX
EML
EPUB
ICO
JPEG
MP3
ODS
ODT
ODP
PDF
PNG
RTF
PPTX
SVG
TAR
TXT
WEBP
XLSX
XML
ZIP
Additional providers
AugmentFileFromDirProvider: Make an augmented copy of a randomly picked file from the given directory. The following types are supported: DOCX, EML, EPUB, ODT, PDF, RTF and TXT.
GenericFileProvider: Create files in any format from raw bytes or a predefined template.
RandomFileFromDirProvider: Pick a random file from the given directory.
FileFromPathProvider: Use the file from the given path.
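To make the idea behind picking a random file from a directory concrete, here is a rough standard-library sketch. It is not the implementation of RandomFileFromDirProvider; the function name pick_random_file and its parameters are hypothetical and exist only for illustration.

```python
import random
import tempfile
from pathlib import Path


def pick_random_file(source_dir, rng=random):
    """Return a randomly chosen regular file from `source_dir`."""
    # Sort the candidates so that a seeded `rng` gives reproducible picks.
    candidates = sorted(p for p in Path(source_dir).iterdir() if p.is_file())
    if not candidates:
        raise FileNotFoundError(f"No files found in {source_dir}")
    return rng.choice(candidates)


# Demo: create two sample files in a temporary directory and pick one.
with tempfile.TemporaryDirectory() as tmp_dir:
    for name in ("a.txt", "b.txt"):
        (Path(tmp_dir) / name).write_text("sample")
    picked_name = pick_random_file(tmp_dir).name

print(picked_name)
```

Passing a seeded random.Random instance as rng makes the choice repeatable, which is convenient in tests.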
Supported file storages
Native file system storage
AWS S3 storage
Azure Cloud Storage
Google Cloud Storage
SFTP storage
Usage examples
With Faker
One way
from faker import Faker
from faker_file.providers.txt_file import TxtFileProvider
FAKER = Faker()
file = TxtFileProvider(FAKER).txt_file()
If you just need bytes back (instead of creating the file), provide the raw=True argument (works with all provider classes and inner functions):
raw = TxtFileProvider(FAKER).txt_file(raw=True)
Or another
from faker import Faker
from faker_file.providers.txt_file import TxtFileProvider
FAKER = Faker()
FAKER.add_provider(TxtFileProvider)
file = FAKER.txt_file()
If you just need bytes back:
raw = FAKER.txt_file(raw=True)
With factory_boy
upload/models.py
from django.db import models

class Upload(models.Model):
    # ...
    file = models.FileField()
upload/factories.py
Note that when using faker-file with Django and native file system storages, you need to pass your MEDIA_ROOT setting as the root_path value to the chosen file storage, as shown below.
import factory
from django.conf import settings
from factory import Faker
from factory.django import DjangoModelFactory
from faker_file.providers.docx_file import DocxFileProvider
from faker_file.storages.filesystem import FileSystemStorage
from upload.models import Upload

FS_STORAGE = FileSystemStorage(
    root_path=settings.MEDIA_ROOT,
    rel_path="tmp"
)
factory.Faker.add_provider(DocxFileProvider)

class UploadFactory(DjangoModelFactory):
    # ...
    file = Faker("docx_file", storage=FS_STORAGE)

    class Meta:
        model = Upload
File storages
All file operations are delegated to a separate abstraction layer of storages.
The following storages are implemented:
FileSystemStorage: Does not have additional requirements.
PathyFileSystemStorage: Requires pathy.
AzureCloudStorage: Requires pathy and Azure related dependencies.
GoogleCloudStorage: Requires pathy and Google Cloud related dependencies.
AWSS3Storage: Requires pathy and AWS S3 related dependencies.
SFTPStorage: Requires paramiko and related dependencies.
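The value of delegating file operations to a storage layer is that providers never touch the file system directly. The sketch below illustrates that design with a hypothetical in-memory storage. The class names, the method set and the create_txt_file helper are assumptions made for illustration and should not be read as faker-file's actual base class or provider code.

```python
import uuid
from abc import ABC, abstractmethod


class SketchStorage(ABC):
    """Hypothetical minimal storage interface (not faker-file's base class)."""

    @abstractmethod
    def generate_filename(self, extension):
        """Return a new, unique file name with the given extension."""

    @abstractmethod
    def write_bytes(self, filename, data):
        """Persist `data` under `filename`."""

    @abstractmethod
    def exists(self, filename):
        """Return True if `filename` has been written."""


class InMemoryStorage(SketchStorage):
    """Keeps files in a dict, which is handy for illustrating the contract."""

    def __init__(self, rel_path="tmp"):
        self.rel_path = rel_path
        self._files = {}

    def generate_filename(self, extension):
        return f"{self.rel_path}/{uuid.uuid4().hex}.{extension}"

    def write_bytes(self, filename, data):
        self._files[filename] = data
        return len(data)

    def exists(self, filename):
        return filename in self._files


def create_txt_file(storage, content):
    """Stand-in for a provider: it only talks to the storage interface."""
    filename = storage.generate_filename("txt")
    storage.write_bytes(filename, content.encode("utf-8"))
    return filename


storage = InMemoryStorage()
filename = create_txt_file(storage, "hello")
print(filename, storage.exists(filename))
```

Because create_txt_file depends only on the interface, swapping the in-memory storage for a file system or cloud backend would not require changing it.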
Usage example with storages
FileSystemStorage example
Native file system storage. Does not have dependencies.
root_path: Path to the root directory. Given the example of Django, this would be the path to the MEDIA_ROOT directory. Note that root_path is not embedded into the string representation of the file; only rel_path is.
rel_path: Relative path from the root directory. Given the example of Django, this would be the rest of the path to the file.
import tempfile
from faker import Faker
from faker_file.providers.txt_file import TxtFileProvider
from faker_file.storages.filesystem import FileSystemStorage
FS_STORAGE = FileSystemStorage(
    root_path=tempfile.gettempdir(),  # Use settings.MEDIA_ROOT for Django
    rel_path="tmp",
)
FAKER = Faker()
file = TxtFileProvider(FAKER).txt_file(storage=FS_STORAGE)
FS_STORAGE.exists(file)
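The split between root_path and rel_path described above can be illustrated with plain pathlib. This is only a sketch of the concept, assuming a hypothetical helper named split_location and a made-up file name; it does not reproduce how FileSystemStorage builds its paths internally.

```python
import tempfile
from pathlib import Path, PurePosixPath


def split_location(root_path, rel_path, basename):
    """Return (absolute path on disk, relative name without root_path)."""
    # The relative name is what a string representation would carry.
    relative = PurePosixPath(rel_path, basename).as_posix()
    # The absolute path additionally includes root_path.
    absolute = Path(root_path, rel_path, basename)
    return absolute, relative


absolute, relative = split_location(tempfile.gettempdir(), "tmp", "example.txt")
print(relative)
print(absolute)
```

With Django, a relative name of this form is what would typically end up in a FileField, while the absolute path is only needed to locate the file on disk.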
PathyFileSystemStorage example
Native file system storage. Requires pathy.
import tempfile
from pathy import use_fs
from faker import Faker
from faker_file.providers.txt_file import TxtFileProvider
from faker_file.storages.cloud import PathyFileSystemStorage
use_fs(tempfile.gettempdir())
PATHY_FS_STORAGE = PathyFileSystemStorage(
    bucket_name="bucket_name",
    root_path="tmp",
    rel_path="sub-tmp",
)
FAKER = Faker()
file = TxtFileProvider(FAKER).txt_file(storage=PATHY_FS_STORAGE)
PATHY_FS_STORAGE.exists(file)
AWSS3Storage example
AWS S3 storage. Requires pathy and boto3.
from faker import Faker
from faker_file.providers.txt_file import TxtFileProvider
from faker_file.storages.aws_s3 import AWSS3Storage
S3_STORAGE = AWSS3Storage(
    bucket_name="bucket_name",
    root_path="tmp",  # Optional
    rel_path="sub-tmp",  # Optional
    # Credentials are optional too. If your AWS credentials are properly
    # set in the ~/.aws/credentials, you don't need to send them
    # explicitly.
    credentials={
        "key_id": "YOUR KEY ID",
        "key_secret": "YOUR KEY SECRET"
    },
)
FAKER = Faker()
file = TxtFileProvider(FAKER).txt_file(storage=S3_STORAGE)
S3_STORAGE.exists(file)
Testing
Simply type:
pytest -vrx
Or use tox:
tox
Or use tox to check a specific environment:
tox -e py310-django41
Writing documentation
Keep the following hierarchy.
=====
title
=====
header
======
sub-header
----------
sub-sub-header
~~~~~~~~~~~~~~
sub-sub-sub-header
^^^^^^^^^^^^^^^^^^
sub-sub-sub-sub-header
++++++++++++++++++++++
sub-sub-sub-sub-sub-header
**************************
License
MIT
Support
For security issues contact me at the e-mail given in the Author section.
For overall issues, go to GitHub.
Citation
Please use the following entry when citing faker-file in your research:
@software{faker-file,
author = {Artur Barseghyan},
title = {faker-file: Create files with fake data. In many formats. With no efforts.},
year = {2023},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {https://github.com/barseghyanartur/faker-file},
}