Generate fake data conforming to a Table Schema
Generate tabular fake data conforming to a Table Schema.
The tsfaker library is available on PyPI.
This library was originally developed to generate a synthetic version of the SNDS database, which contains hundreds of tables, so tsfaker handles foreign keys efficiently.
Notes:
We aim to generate fake data conforming to a schema, not fake data with realistic statistical properties (see the Related work section).
This library is in beta and subject to frequent changes (see the Release notes section).
Usage
Installation
$ pip3 install tsfaker
Simple usage
Generate 3 rows of fake data from a single table schema file.
$ tsfaker https://gitlab.com/healthdatahub/tsfaker/raw/master/tests/schemas/implemented_types.json --nrows 3 --pretty
   boolean         string            number      integer        date              datetime  year  yearmonth
0        1  haHoKysholbSI    9780230269.512  -7061309068  1914-10-03  1902-04-11T11:21:11Z  1939     196405
1        0      rLugGhNek    990894536.8945   2529879443  2026-09-08  2015-11-27T16:21:54Z  1932     192909
2        1         ipqVXm  -4371053960.8987   -529880373  1994-09-27  1937-01-12T18:40:15Z  2021     193303
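The schema file passed to the command is a Table Schema descriptor, that is, a JSON object with a `fields` array. As a minimal sketch, the following Python script writes a small descriptor to disk. The file name `example_schema.json`, the field names, and the `write_descriptor` helper are hypothetical; the field types are among those shown in the output above.

```python
import json
from pathlib import Path

# Hypothetical descriptor: the field names are made up for illustration,
# the types are among those listed in the sample output.
descriptor = {
    "fields": [
        {"name": "patient_id", "type": "integer"},
        {"name": "is_active", "type": "boolean"},
        {"name": "birth_date", "type": "date"},
    ]
}


def write_descriptor(path):
    """Write the descriptor as JSON and return the resulting path."""
    path = Path(path)
    path.write_text(json.dumps(descriptor, indent=2), encoding="utf-8")
    return path


if __name__ == "__main__":
    print(write_descriptor("example_schema.json"))
```

The resulting file should then be usable in place of the URL above, for example `tsfaker example_schema.json --nrows 3 --pretty`.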
Advanced usage
Show help message.
$ tsfaker --help
Usage: tsfaker [OPTIONS] [SCHEMA_DESCRIPTORS]...
...
Download example schemas from the schema-snds project.
$ git clone https://gitlab.com/healthdatahub/schema-snds && cd schema-snds
Generate fake data for all schemas in the schemas folder, using the CSV files in the nomenclatures folder, and write the output to the fake_data folder.
$ mkdir fake_data
$ tsfaker schemas -o fake_data -r nomenclatures
2019-01-01 00:00:00 :: INFO :: Data generated from descriptor 'schemas/PMSI/PMSI MCO/T_MCOaa_nnE.json' will be written on 'fake_data/PMSI/PMSI MCO/T_MCOaa_nnE.csv'
2019-01-01 00:00:00 :: INFO :: Data generated from descriptor 'schemas/PMSI/PMSI MCO/T_MCOaa_nnFASTC.json' will be written on 'fake_data/PMSI/PMSI MCO/T_MCOaa_nnFASTC.csv'
2019-01-01 00:00:00 :: INFO :: Data generated from descriptor 'schemas/PMSI/PMSI SSR/T_SSRaa_nnE.json' will be written on 'fake_data/PMSI/PMSI SSR/T_SSRaa_nnE.csv'
...
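To make the foreign key handling more concrete: a foreign key constraint means that values in the child column must be drawn from values present in the referenced table. The sketch below illustrates that idea using only the standard library. It is not tsfaker's implementation; the `sample_foreign_key` helper and the code values are hypothetical, and it assumes the CSV files in the nomenclatures folder play the role of referenced tables.

```python
import random


def sample_foreign_key(reference_values, nrows, seed=None):
    """Draw nrows values from the referenced column, so that every
    generated value satisfies the foreign key constraint."""
    if not reference_values:
        raise ValueError("referenced column is empty")
    rng = random.Random(seed)
    return [rng.choice(reference_values) for _ in range(nrows)]


# Hypothetical nomenclature column: the list of allowed codes.
nomenclature_codes = ["A01", "B02", "C03"]

child_column = sample_foreign_key(nomenclature_codes, nrows=5, seed=0)

# Every generated value exists in the referenced table.
assert set(child_column) <= set(nomenclature_codes)
print(child_column)
```

Sampling with replacement keeps the child table valid for any number of rows, which matters when a small nomenclature is referenced by a large table.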
Release notes
Version 0.14
[Fix] Update command line default value to match Click library version >=8.0
Version 0.13
[Fix] Adapt maximum default integer value to local system
Version 0.12
trueValues and falseValues can be specified for the boolean type (in accordance with the Table Schema standard)
Only one item is accepted in each of the trueValues and falseValues arrays
A format can be specified for the date and datetime types
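As an illustration of these properties, the sketch below shows field descriptors using `trueValues`, `falseValues`, and `format`, together with a plain-Python rendering of what such settings imply for the generated text. The field names and the `render_boolean` and `render_date` helpers are hypothetical and are not part of tsfaker; the `format` value is assumed to be a strftime-style pattern, as in the Table Schema standard.

```python
from datetime import date

# Hypothetical field descriptors using the properties named above.
boolean_field = {
    "name": "is_active",
    "type": "boolean",
    "trueValues": ["yes"],   # a single item, per the note above
    "falseValues": ["no"],
}
date_field = {"name": "visit_date", "type": "date", "format": "%d/%m/%Y"}


def render_boolean(value, field):
    """Serialize a bool with the single trueValues/falseValues item,
    falling back to the defaults 1 and 0 noted for version 0.10."""
    true_text = field.get("trueValues", ["1"])[0]
    false_text = field.get("falseValues", ["0"])[0]
    return true_text if value else false_text


def render_date(value, field):
    """Serialize a date with the field's pattern, or ISO 8601 by default."""
    fmt = field.get("format", "default")
    if fmt == "default":
        return value.isoformat()
    return value.strftime(fmt)


print(render_boolean(True, boolean_field))        # yes
print(render_date(date(2020, 3, 7), date_field))  # 07/03/2020
```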
Version 0.11
The yearmonth type is now generated without a dash, as ‘YYYYMM’, so it does not follow the ISO 8601 format ‘YYYY-MM’
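The difference between the two yearmonth layouts can be shown with the standard library. This is only a sketch of the formatting, with a hypothetical `format_yearmonth` helper, not code taken from tsfaker.

```python
from datetime import date


def format_yearmonth(value, iso=False):
    """Format a date as 'YYYYMM', or as ISO 8601 'YYYY-MM' when iso=True."""
    return value.strftime("%Y-%m" if iso else "%Y%m")


d = date(1964, 5, 17)
print(format_yearmonth(d))            # 196405
print(format_yearmonth(d, iso=True))  # 1964-05
```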
Version 0.10
The boolean type is implemented; its default values are 0 for False and 1 for True