Health data faker
Headfake
What is Headfake?
Health data faker (Headfake) is a Python package that lets you create fake or test data sets declaratively, using either Python code or a YAML or JSON template file.
The package can be embedded directly into Python scripts, or it can be used through a command-line script.
It takes ideas from other declarative packages (e.g. pydbgen) and adds support for a number of further features, including statistically distributed random values, dependent fields, custom fields, and ways of transforming generated fields before or after the generation process.
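To make the declarative idea concrete, here is a minimal standard-library sketch. It does not use Headfake's API: the `TEMPLATE` structure, the `generate_rows` helper, and the chosen field names and distribution parameters are all hypothetical, invented only for illustration. Each field is declared as a function of the row built so far, which is one simple way a later field can depend on an earlier one.

```python
import random

# Hypothetical declarative template (NOT Headfake's format):
# field name -> function(row, rng) returning that field's value.
# Fields are evaluated in declaration order, so later fields can
# read the values of earlier ones.
TEMPLATE = {
    # Probability-based option value.
    "gender": lambda row, rng: rng.choices(["M", "F"], weights=[0.5, 0.5])[0],
    # Statistically distributed value: normal distribution, clamped to 0..100.
    "age": lambda row, rng: max(0, min(100, round(rng.gauss(45, 18)))),
    # Dependent field: the value is derived from the "gender" field.
    "title": lambda row, rng: "Mr" if row["gender"] == "M" else "Ms",
}


def generate_rows(template, num_rows, seed=None):
    """Generate num_rows dictionaries by evaluating each template field in order."""
    rng = random.Random(seed)
    rows = []
    for _ in range(num_rows):
        row = {}
        for name, make_value in template.items():
            row[name] = make_value(row, rng)
        rows.append(row)
    return rows


rows = generate_rows(TEMPLATE, num_rows=100, seed=1)
```

Passing a fixed `seed` makes the output repeatable, which is usually what you want for test fixtures.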
How do I install and use Headfake?
To get started quickly, you can use pip to install it:
pip install headfake
Then test it out using one of the example YAML templates:
headfake examples/patients.yaml --no-rows=100
You should get 100 rows of generated data.
For further information on using Headfake, head over to the Tutorials or the Usage page.
Why would I use Headfake?
Headfake makes it simple and straightforward to generate fake or test data. It has a number of features which make this easier:
- Support for shareable template-based config or direct Python implementation to set up and perform the data generation.
- Embeddable data generation into projects (either using a YAML or JSON config or using Python data structures/classes).
- Generation of names and contact details through use of the Python package Faker.
- Randomised names can be output based on a gender field.
- More realistic simulated data, using statistical distributions to create dates of birth and probability-based option values. Other approaches to simulating real data are also being investigated.
- Clinical data support, including random NHS numbers and deceased flags/date of death based on age-based odds of death.
- Dependent fields (e.g. one field's values depend on the values of one or more other fields).
- Operation fields (e.g. combine generated values using specific operations such as add or subtract).
- Field data can be looked up from another file using a key field, allowing re-use of patient details in a different field set.
- A selection of fields to handle generation of different types of data.
- Ability to create and use custom fields to generate your own data types and values.
- Support for transformers which pre- or post-process data around the generation step.
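As an illustration of what generating a random NHS number involves, the sketch below builds ten-digit numbers whose final digit satisfies the standard Modulus 11 check. It is a standalone standard-library example, not Headfake's own implementation, and the function names are hypothetical.

```python
import random


def nhs_check_digit(first_nine):
    """Return the Modulus 11 check digit for nine digits.

    Returns None when the result would be 10, which means the nine
    digits cannot form a valid NHS number.
    """
    # Weights run from 10 (first digit) down to 2 (ninth digit).
    total = sum(int(d) * w for d, w in zip(first_nine, range(10, 1, -1)))
    check = 11 - (total % 11)
    if check == 11:
        return 0
    if check == 10:
        return None
    return check


def is_valid_nhs_number(number):
    """Check the length, digits, and check digit of an NHS number string."""
    digits = number.replace(" ", "")
    if len(digits) != 10 or not digits.isdigit():
        return False
    return nhs_check_digit(digits[:9]) == int(digits[9])


def random_nhs_number(rng):
    """Draw random nine-digit prefixes until one has a usable check digit."""
    while True:
        first_nine = "".join(str(rng.randint(0, 9)) for _ in range(9))
        check = nhs_check_digit(first_nine)
        if check is not None:
            return first_nine + str(check)


rng = random.Random(0)
sample = [random_nhs_number(rng) for _ in range(50)]
```

The rejection loop is the simplest way to handle prefixes whose check value comes out as 10; roughly one prefix in eleven is discarded.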
Is Headfake being actively maintained?
Yes. We use Headfake in our own projects, so we keep it maintained and add new features as we need them.
Is Headfake suitable for my project?
The library has been released under an MIT license, so it can be embedded into your own tools with minimal restrictions on use.
If I use Headfake to generate data in my research project, which source should I cite?
We are working on a journal paper; for now, please cite the Zenodo record.
Where can I get more information?
The documentation for the package can be found on the documentation site.
File details
Details for the file headfake-1.1.1.tar.gz.
File metadata
- Download URL: headfake-1.1.1.tar.gz
- Upload date:
- Size: 56.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.11.3
File hashes
Algorithm | Hash digest
---|---
SHA256 | 30884b29d880187fd7bfe8447f5b46167f7c7d406ba27a2c40cbf5478c4e77f1
MD5 | 426a53862b198d8d7592cdb34037f661
BLAKE2b-256 | ef136058cba54b0ea81e769496ae557c5f93ef817853c0a97b4ad10d1392aeb5
File details
Details for the file headfake-1.1.1-py3-none-any.whl.
File metadata
- Download URL: headfake-1.1.1-py3-none-any.whl
- Upload date:
- Size: 47.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.11.3
File hashes
Algorithm | Hash digest
---|---
SHA256 | 868e51007853d43def82cc6ed5e069fc004d8973ab16d5d26245673b69fd5822
MD5 | 3b095a173dc91c22a0b4fd662847ce8f
BLAKE2b-256 | bed24a0a85757292c92d72544f7b951c12c589f8b24d7daab7aea1f24799ca92