Framework for creating synthetic data with realistic errors for refining data science pipelines.
Noisify
Noisify is a simple, lightweight library for augmenting and modifying data by adding realistic noise.
Introduction
Add some human noise (typos, values entered in the wrong fields, etc.):
>>> from noisify.recipes import human_error
>>> test_data = {'this': 1.0, 'is': 2, 'a': 'test!'}
>>> human_noise = human_error(5)
>>> print(list(human_noise(test_data)))
[{'a': 'tset!', 'this': 2, 'is': 1.0}]
>>> print(list(human_noise(test_data)))
[{'a': 0.0, 'this': 'test!', 'is': 2}]
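To make the idea of "human" faults concrete, here is a minimal standard-library sketch of two such faults: a typo (two adjacent characters swapped) and values landing under the wrong keys. It does not use or reproduce Noisify's internals; the helper names `swap_adjacent_chars` and `shuffle_values` are hypothetical and exist only for illustration.

```python
import random


def swap_adjacent_chars(text, rng):
    """Return text with one random pair of adjacent characters swapped."""
    if len(text) < 2:
        return text
    i = rng.randrange(len(text) - 1)
    chars = list(text)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def shuffle_values(record, rng):
    """Return a copy of record with its values reassigned to random keys."""
    keys = list(record)
    values = list(record.values())
    rng.shuffle(values)
    return dict(zip(keys, values))


rng = random.Random(0)  # seeded so that runs are repeatable
record = {'this': 1.0, 'is': 2, 'a': 'test!'}

# First put values in the "wrong boxes", then add a typo to any string value.
noisy = shuffle_values(record, rng)
noisy = {key: swap_adjacent_chars(value, rng) if isinstance(value, str) else value
         for key, value in noisy.items()}
print(noisy)
```

The keys are preserved while the values move and strings pick up a transposition, which is the same kind of change visible in the outputs above.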
Add some machine noise (Gaussian noise, data collection interruptions, etc.):
>>> from noisify.recipes import machine_error
>>> machine_noise = machine_error(5)
>>> print(list(machine_noise(test_data)))
[{'this': 1.12786393038729, 'is': 2.1387080616716307, 'a': 'test!'}]
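As a rough illustration of Gaussian "machine" noise, the sketch below perturbs only the numeric values of a record and leaves everything else untouched, matching the shape of the output above. It is written with the standard library only; `add_gaussian_noise` and its `sigma` parameter are assumptions for this example, not part of Noisify's API.

```python
import random


def add_gaussian_noise(record, sigma, rng):
    """Return a copy of record with zero-mean Gaussian noise added to numbers."""
    noisy = {}
    for key, value in record.items():
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            noisy[key] = value + rng.gauss(0.0, sigma)
        else:
            noisy[key] = value  # leave non-numeric values untouched
    return noisy


rng = random.Random(42)  # seeded so that runs are repeatable
record = {'this': 1.0, 'is': 2, 'a': 'test!'}
noisy_record = add_gaussian_noise(record, sigma=0.1, rng=rng)
print(noisy_record)
```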
If you want both, just add them together:
>>> combined_noise = machine_error(5) + human_error(5)
>>> print(list(combined_noise(test_data)))
[{'this': 1.23854334573554, 'is': 20.77848220943227, 'a': 'tst!'}]
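One way such addition could work is by overloading `+` so that two pipelines merge into one that applies every step in order. The sketch below shows that pattern with a hypothetical `NoiseStep` class and two deterministic stand-in faults; it is an assumption about the general technique, not a description of how Noisify implements it.

```python
class NoiseStep:
    """Hypothetical wrapper around a list of record -> record functions."""

    def __init__(self, *functions):
        self.functions = list(functions)

    def __add__(self, other):
        # Adding two steps yields a new step that runs both, left to right.
        return NoiseStep(*self.functions, *other.functions)

    def __call__(self, record):
        # Yield one noisy copy, mirroring the iterable-returning calls above.
        result = dict(record)
        for function in self.functions:
            result = function(result)
        yield result


def scale_numbers(record):
    """Deterministic stand-in for a machine fault: multiply numbers by 10."""
    return {k: v * 10 if isinstance(v, (int, float)) else v
            for k, v in record.items()}


def drop_last_char(record):
    """Deterministic stand-in for a human fault: lose the last character."""
    return {k: v[:-1] if isinstance(v, str) else v
            for k, v in record.items()}


combined = NoiseStep(scale_numbers) + NoiseStep(drop_last_char)
result = list(combined({'this': 1.0, 'is': 2, 'a': 'test!'}))
print(result)
# → [{'this': 10.0, 'is': 20, 'a': 'test'}]
```

Returning a new object from `__add__` keeps the original steps reusable, so the same building block can appear in several pipelines.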
Add noise to numpy arrays
>>> import numpy as np
>>> test_array = np.arange(10)
>>> print(test_array)
[0 1 2 3 4 5 6 7 8 9]
>>> print(list(combined_noise(test_array)))
[[0.09172393 2.52539794 1.38823741 2.85571154 2.85571154 6.37596668
4.7135771 7.28358719 6.83600156 9.40973018]]
Read an image
>>> from PIL import Image
>>> test_image = Image.open('noisify.jpg')
>>> test_image.show()
And now with noise
>>> from noisify.recipes import human_error, machine_error
>>> combined_noise = machine_error(5) + human_error(5)
>>> for out_image in combined_noise(test_image):
...     out_image.show()
Noisify allows you to build flexible data augmentation pipelines for arbitrary objects. All pipelines are built from simple, high-level objects that plug together like Lego. Use Noisify to stress-test application interfaces, verify data cleaning pipelines, and make your ML algorithms more robust to real-world conditions.
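As an example of the "verify data cleaning pipelines" use case, the sketch below feeds deliberately corrupted readings into a small cleaning function and checks that it never raises and never lets an out-of-range value through. Both `corrupt` and `clean` are hypothetical helpers written with the standard library; in practice the corruption step would be replaced by a Noisify pipeline.

```python
import random


def corrupt(reading, rng):
    """Hypothetical noise source: return the reading with one random fault."""
    fault = rng.choice(["jitter", "missing", "as_text", "spike"])
    if fault == "jitter":
        return reading + rng.gauss(0.0, 0.5)
    if fault == "missing":
        return None
    if fault == "as_text":
        return " {} ".format(reading)  # a number typed into a text box
    return reading * 100  # implausible spike


def clean(value, low=-50.0, high=60.0):
    """Example cleaner: return a float within [low, high], or None."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return number if low <= number <= high else None


rng = random.Random(1)  # seeded so that runs are repeatable
readings = [12.5, 18.0, 21.3, 7.9]

# Run each reading through many random faults and clean the result.
cleaned = [clean(corrupt(r, rng)) for r in readings for _ in range(50)]

# The cleaner must either reject a value or return something in range.
assert all(c is None or -50.0 <= c <= 60.0 for c in cleaned)
print("rejected", cleaned.count(None), "of", len(cleaned))
```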
Installation
Prerequisites
Noisify requires Python 3.5 or later.
Installation from PyPI
$ pip install noisify
Additional Information
Full documentation is available at TODO ReadTheDocs Link.
Licence
Dstl (c) Crown Copyright 2019
Noisify is released under the MIT licence