a tool for quickly generating dummy data
DataBuilder
Have you ever needed dummy data to demonstrate basic data analysis or machine learning topics?
DataBuilder can save you time by creating customized dummy data sets within minutes.
Installation
pip install databuilder
Quick Example
import databuilder as db
# make a dummy dataset about "our employees"
config = {
    'fields': {
        'empID': db.ID(),
        'first_name': db.Name(first_only=True),
        'last_name': db.Name(last_only=True),
        'department': db.Group(["Sales", "Acct", "Mktg", "IT"]),
        'salary': db.NormalDist(50000, 10000),
        'hire_date': db.Date("1990-01-01", "2020-12-31")
    }
}
# create a Pandas DataFrame with
# the fields defined in `config`
df = db.create_df(config, n=200)
print(df.head(2))
#
# Example output:
#    empID first_name last_name department  salary   hire_date
# 0      1      Frank      Ward         IT   69210  2004-05-05
# 1      2    Barbara    George       Mktg   46744  2019-05-20
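To make the field types above more concrete, here is a rough standard-library sketch of the kind of values such a config describes. It is not DataBuilder's implementation and uses none of its API. The helper `make_rows` is hypothetical, and the sketch assumes that `NormalDist(50000, 10000)` means a mean of 50000 with a standard deviation of 10000, that the date bounds are inclusive, and that IDs count up from 1 as in the example output. Name fields are left out because they would need a name list.

```python
import random
from datetime import date, timedelta


def make_rows(n, seed=None):
    """Build n employee-like rows (hypothetical helper, not part of DataBuilder)."""
    rng = random.Random(seed)  # a seeded generator makes the output reproducible
    start, end = date(1990, 1, 1), date(2020, 12, 31)
    span_days = (end - start).days
    departments = ["Sales", "Acct", "Mktg", "IT"]

    rows = []
    for i in range(1, n + 1):
        rows.append({
            # Sequential identifier, assumed to start at 1.
            "empID": i,
            # One label picked uniformly at random from the group.
            "department": rng.choice(departments),
            # Normally distributed value (assumed mean, std), rounded to an integer.
            "salary": round(rng.gauss(50000, 10000)),
            # Uniformly random day between start and end, both inclusive.
            "hire_date": start + timedelta(days=rng.randint(0, span_days)),
        })
    return rows


rows = make_rows(200, seed=0)
print(rows[0])
```

Passing a fixed `seed` yields the same rows on every run, which is convenient when the dummy data is used in a demonstration that others need to reproduce.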
Complete Usage Guide
Detailed documentation on how to use DataBuilder can be found in the docs/
folder of this repo.