Badgers: bad data generators
badgers is a Python library for generating bad data. More precisely, it augments existing data with data quality deficits such as outliers, missing values, and noise. It is built on a simple API and provides a set of generator objects that produce these deficits from existing data.
The full documentation is hosted here: https://fraunhofer-iese.github.io/badgers/.
For a quick start, you can install badgers with pip:
pip install badgers
Import badgers like any other library and start using it:
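from sklearn.datasets import make_blobs
from badgers.generators.tabular_data.noise import GaussianNoiseGenerator
# Create a toy dataset: X holds the features, y the cluster labels
X, y = make_blobs()
# Configure a Gaussian noise generator (noise_std=0.5)
trf = GaussianNoiseGenerator(noise_std=0.5)
# Generate the transformed features and labels from the original data
Xt, yt = trf.generate(X, y)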
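To make the idea of a noise deficit concrete, here is a minimal, dependency-free sketch of what adding Gaussian noise to tabular data can look like. It is not the badgers implementation: the function name `add_gaussian_noise` and its `seed` parameter are hypothetical, and the noise is applied directly in the units of the data, which may differ from how `GaussianNoiseGenerator` interprets `noise_std`.

```python
import random


def add_gaussian_noise(rows, noise_std, seed=None):
    """Return a noisy copy of `rows` (hypothetical helper, not part of badgers).

    Each value receives independent zero-mean Gaussian noise with standard
    deviation `noise_std`. The input rows are left untouched.
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    return [[value + rng.gauss(0.0, noise_std) for value in row] for row in rows]


# A tiny table: three samples with two features each
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
Xt = add_gaussian_noise(X, noise_std=0.5, seed=0)

for original, noisy in zip(X, Xt):
    print(original, "->", noisy)
```

The shape of the data is preserved and only the values are perturbed, which mirrors the pattern in the quick-start snippet where `generate` takes `X` and `y` and returns `Xt` and `yt`.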
More examples are available in the tutorials section.
The API documentation is also available in the API section.
Interested developers will find relevant information in the CONTRIBUTING.md page.