
Framework for creating synthetic data with realistic errors for refining data science pipelines.


Noisify

Noisify is a simple, lightweight library for augmenting and modifying data by adding realistic noise.

Introduction

Add some human noise (typos, things in the wrong boxes, etc.):

>>> from noisify.recipes import human_error
>>> test_data = {'this': 1.0, 'is': 2, 'a': 'test!'}
>>> human_noise = human_error(5)
>>> print(list(human_noise(test_data)))
[{'a': 'tset!', 'this': 2, 'is': 1.0}]
>>> print(list(human_noise(test_data)))
[{'a': 0.0, 'this': 'test!', 'is': 2}]
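
To make the idea of "human noise" concrete, here is a minimal standalone sketch of two such faults: a typo made by swapping neighbouring characters, and two values landing in each other's fields. It does not use Noisify and is not how the library implements `human_error`; the helper names `swap_adjacent_chars` and `swap_two_values` are hypothetical and exist only for illustration.

```python
import random


def swap_adjacent_chars(text, rng):
    """Return text with one randomly chosen pair of neighbouring characters swapped."""
    if len(text) < 2:
        return text
    i = rng.randrange(len(text) - 1)
    chars = list(text)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def swap_two_values(record, rng):
    """Return a copy of record with the values of two randomly chosen keys exchanged."""
    noisy = dict(record)
    if len(noisy) < 2:
        return noisy
    key_a, key_b = rng.sample(sorted(noisy), 2)
    noisy[key_a], noisy[key_b] = noisy[key_b], noisy[key_a]
    return noisy


rng = random.Random(0)  # seeded so that repeated runs give the same result
test_data = {'this': 1.0, 'is': 2, 'a': 'test!'}

typo = swap_adjacent_chars(test_data['a'], rng)   # same letters, different order
shuffled = swap_two_values(test_data, rng)        # same keys, two values exchanged
print(typo)
print(shuffled)
```

Both helpers return new objects, so the original record stays untouched and can be reused to generate further noisy variants.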

Add some machine noise (Gaussian noise, data collection interruptions, etc.):

>>> from noisify.recipes import machine_error
>>> machine_noise = machine_error(5)
>>> print(list(machine_noise(test_data)))
[{'this': 1.12786393038729, 'is': 2.1387080616716307, 'a': 'test!'}]
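
The output above shows numeric values being perturbed while the string is left alone. A minimal sketch of that behaviour, written without Noisify, might look like the following. The function `add_gaussian_noise` and its `sigma` parameter are assumptions made for this illustration, not part of the library's API.

```python
import random


def add_gaussian_noise(record, sigma, rng):
    """Return a copy of record with zero-mean Gaussian noise added to numeric values."""
    noisy = {}
    for key, value in record.items():
        # bool is a subclass of int, so it is excluded explicitly
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            noisy[key] = value + rng.gauss(0.0, sigma)
        else:
            noisy[key] = value  # non-numeric values pass through unchanged
    return noisy


rng = random.Random(0)  # seeded so that repeated runs give the same result
test_data = {'this': 1.0, 'is': 2, 'a': 'test!'}
noisy = add_gaussian_noise(test_data, sigma=0.1, rng=rng)
print(noisy)
```

Note that integer inputs come back as floats in this sketch, which matches the `'is': 2.1387...` value in the example output above.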

If you want both, just add them together:

>>> combined_noise = machine_error(5) + human_error(5)
>>> print(list(combined_noise(test_data)))
[{'this': 1.23854334573554, 'is': 20.77848220943227, 'a': 'tst!'}]
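
Combining pipelines with `+` is a common Python pattern built on `__add__`. The sketch below shows one way such composition could work, using a hypothetical `NoiseStep` class and two deterministic stand-in transformations so the result is easy to check. It is not Noisify's implementation, and unlike the recipes above it returns a single record rather than an iterable.

```python
class NoiseStep:
    """Hypothetical wrapper around an ordered list of record -> record functions."""

    def __init__(self, *functions):
        self.functions = list(functions)

    def __add__(self, other):
        # Concatenate the function lists; the left operand's functions run first.
        return NoiseStep(*(self.functions + other.functions))

    def __call__(self, record):
        for function in self.functions:
            record = function(record)
        return record


def double_numbers(record):
    """Stand-in for a numeric fault: double every int or float value."""
    return {k: v * 2 if isinstance(v, (int, float)) else v for k, v in record.items()}


def upper_strings(record):
    """Stand-in for a text fault: upper-case every string value."""
    return {k: v.upper() if isinstance(v, str) else v for k, v in record.items()}


combined = NoiseStep(double_numbers) + NoiseStep(upper_strings)
result = combined({'this': 1.0, 'is': 2, 'a': 'test!'})
print(result)  # {'this': 2.0, 'is': 4, 'a': 'TEST!'}
```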

Add noise to numpy arrays

>>> import numpy as np
>>> test_array = np.arange(10)
>>> print(test_array)
[0 1 2 3 4 5 6 7 8 9]
>>> print(list(combined_noise(test_array)))
[[0.09172393 2.52539794 1.38823741 2.85571154 2.85571154 6.37596668
  4.7135771  7.28358719 6.83600156 9.40973018]]

Read an image

>>> from PIL import Image
>>> test_image = Image.open('noisify.jpg')
>>> test_image.show()

And now with noise

>>> from noisify.recipes import human_error, machine_error
>>> combined_noise = machine_error(5) + human_error(5)
>>> for out_image in combined_noise(test_image):
...     out_image.show()

Noisify lets you build flexible data augmentation pipelines for arbitrary objects. All pipelines are built from simple, high-level objects that plug together like Lego. Use Noisify to stress-test application interfaces, verify data cleaning pipelines, and make your ML algorithms more robust to real-world conditions.
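
As an illustration of the "verify data cleaning pipelines" use case, the sketch below feeds deliberately corrupted records to a cleaning function and checks that it never crashes and never returns a wrong number. Everything here is hypothetical: `clean_reading` stands in for your own cleaning code, and `corrupt` stands in for a noise pipeline such as the ones shown above.

```python
import random


def clean_reading(record):
    """Hypothetical cleaner: coerce 'value' to float, or None if that is impossible."""
    try:
        return {'value': float(record.get('value'))}
    except (TypeError, ValueError):
        return {'value': None}


def corrupt(record, rng):
    """Hypothetical stand-in for a noise pipeline: apply one randomly chosen fault."""
    fault = rng.choice(['typo', 'missing', 'none'])
    noisy = dict(record)
    if fault == 'typo':
        # decimal comma instead of decimal point, e.g. '3,5'
        noisy['value'] = str(noisy['value']).replace('.', ',')
    elif fault == 'missing':
        del noisy['value']
    return noisy


rng = random.Random(0)  # seeded so that repeated runs give the same result
cleaned = [clean_reading(corrupt({'value': 3.5}, rng)) for _ in range(100)]

# The cleaner should either recover the true value or flag the record as unusable.
assert all(item['value'] in (3.5, None) for item in cleaned)
```

The same loop structure applies when the corrupted records come from a real noise pipeline: generate many noisy variants of known-good input and assert on the properties the cleaner must preserve.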

Installation

Prerequisites

Noisify requires Python 3.5 or later.

Installation from PyPI

$ pip install noisify

Additional Information

Full documentation is available at TODO ReadTheDocs Link.

Licence

Dstl (c) Crown Copyright 2019

Noisify is released under the MIT licence


This version

1.0

Download files


Source Distribution

noisify-1.0.tar.gz (15.6 kB view details)

Uploaded Source

Built Distribution

noisify-1.0-py3-none-any.whl (28.3 kB view details)

Uploaded Python 3

File details

Details for the file noisify-1.0.tar.gz.

File metadata

  • Download URL: noisify-1.0.tar.gz
  • Upload date:
  • Size: 15.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.4.2 requests/2.19.1 setuptools/40.5.0 requests-toolbelt/0.9.1 tqdm/4.30.0 CPython/3.6.6

File hashes

Hashes for noisify-1.0.tar.gz
Algorithm Hash digest
SHA256 325d56b8760016ca14f848e6e51b365c0d8bb2cb57b75f5fd3c00229c6b04356
MD5 2ba26f2f13e86c24fe96c8ad260db1ec
BLAKE2b-256 01d34eb7a7bdff6f276d193b02cea731956406401c6a24a23b9edad20a0623fd
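
A downloaded archive can be checked against the SHA256 digest listed above with the standard library's `hashlib`. The sketch below is one way to do that; the helper name `sha256_of_file` and the local file path are assumptions for illustration.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


# Digest published for noisify-1.0.tar.gz in the table above.
EXPECTED_SHA256 = '325d56b8760016ca14f848e6e51b365c0d8bb2cb57b75f5fd3c00229c6b04356'

# Usage, assuming the archive has been downloaded to the current directory:
# if sha256_of_file('noisify-1.0.tar.gz') != EXPECTED_SHA256:
#     raise SystemExit('Checksum mismatch: do not install this file.')
```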


File details

Details for the file noisify-1.0-py3-none-any.whl.

File metadata

  • Download URL: noisify-1.0-py3-none-any.whl
  • Upload date:
  • Size: 28.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.4.2 requests/2.19.1 setuptools/40.5.0 requests-toolbelt/0.9.1 tqdm/4.30.0 CPython/3.6.6

File hashes

Hashes for noisify-1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 d15065b12d7e2d913b48b896988ffadb6f7a3afd73792930fe5746eb30bbf98d
MD5 faf72c20f225443f424480a658f15f25
BLAKE2b-256 38c8e8e20087b119c569332a1d3c5994355734985ea298ceb9df5315b3804278

