Generate realistic raw datasets with optional DQ issues
To install, run:
pip install rawdata
Basic Usage
Create a random table
import rawdata.generate
colLabel = ['Year', 'Name', 'Born']
colTypes = ['DATE', 'PEOPLE', 'PLACE']
tbl = rawdata.generate.random_table(4, colTypes, colLabel)
rawdata.generate.show_table(tbl)
> Year,Name,Born
> 2013,Douglas,Scandinavia
> 1999,Hunter,Sierra Leone
> 2005,Shubha,Madagascar
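The idea of driving each column from a type label can be sketched with the standard library alone. This is not rawdata's implementation: the function random_table_sketch, the GENERATORS mapping, the tiny sample pools, and the year range are all hypothetical, and it assumes the row count refers to data rows below the header.

```python
import random

# Hypothetical sample pools; rawdata ships its own, much larger lists.
PEOPLE = ['Douglas', 'Hunter', 'Shubha']
PLACES = ['Scandinavia', 'Sierra Leone', 'Madagascar']

# Map each column type label to a function that produces one value.
# The year range for 'DATE' is an assumption made for this sketch.
GENERATORS = {
    'DATE': lambda rng: str(rng.randint(1990, 2020)),
    'PEOPLE': lambda rng: rng.choice(PEOPLE),
    'PLACE': lambda rng: rng.choice(PLACES),
}

def random_table_sketch(num_rows, col_types, col_labels, seed=None):
    """Return a list of rows: the header first, then num_rows data rows."""
    rng = random.Random(seed)  # seeded generator for repeatable output
    rows = [list(col_labels)]
    for _ in range(num_rows):
        rows.append([GENERATORS[t](rng) for t in col_types])
    return rows

sketch_tbl = random_table_sketch(3, ['DATE', 'PEOPLE', 'PLACE'],
                                 ['Year', 'Name', 'Born'], seed=0)
for row in sketch_tbl:
    print(','.join(row))
```

Keeping the generators in a dictionary means a new column type only needs one new entry, and the seed makes a generated table reproducible in tests.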
Adding Errors to a table
import rawdata.create
bad_string = rawdata.generate.random_letters(6)
t = rawdata.create.Table(tbl, bad_string)
t.add_errors(2)
print(t.tbl)
After adding 2 random errors, the table has extra spaces in the Douglas row, and the Born value is missing for Hunter:
Year Name Born
----- --------- ----------
2013 Douglas Scandinavia
1999 Hunter
2005 Shubha Madagascar
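The two kinds of error described above (stray whitespace and a missing value) can be reproduced with a small standard-library sketch. It is only an illustration of the idea and does not reflect how rawdata.create.Table.add_errors works internally; the function add_errors_sketch and its seed parameter are hypothetical.

```python
import copy
import random

def add_errors_sketch(table, num_errors, seed=None):
    """Return a copy of table with num_errors distinct data cells corrupted.

    Each chosen cell is either blanked (missing value) or padded with
    stray spaces. Row 0 is treated as the header and is never touched.
    """
    rng = random.Random(seed)
    corrupted = copy.deepcopy(table)  # leave the caller's table unchanged
    cells = [(r, c) for r in range(1, len(table))
             for c in range(len(table[r]))]
    # sample() picks distinct cells, so no cell is corrupted twice
    for r, c in rng.sample(cells, min(num_errors, len(cells))):
        if rng.random() < 0.5:
            corrupted[r][c] = ''                                  # missing value
        else:
            corrupted[r][c] = '  ' + str(corrupted[r][c]) + ' '   # extra spaces
    return corrupted

clean = [
    ['Year', 'Name', 'Born'],
    ['2013', 'Douglas', 'Scandinavia'],
    ['1999', 'Hunter', 'Sierra Leone'],
    ['2005', 'Shubha', 'Madagascar'],
]
dirty = add_errors_sketch(clean, 2, seed=1)

# List the cells that differ, which is handy when checking a cleaning routine.
changed = [(r, c) for r in range(len(clean)) for c in range(len(clean[r]))
           if clean[r][c] != dirty[r][c]]
print(changed)
```

Comparing the clean and corrupted copies cell by cell gives a ground-truth list of injected errors that a data-cleaning routine can be scored against.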
Other functions
import rawdata.generate as generate
print('Random Number = ', generate.random_int(1,100))
> Random Number = 84
print('Random Letters = ', generate.random_letters(40))
> Random Letters = T1CElkRAGPAmWSavbDItDbFmQIvUh26SyJE58x49
print('Random Password = ', generate.generate_password())
> Random Password = peujlsmbf19966YKCX
words = generate.get_list_words()
print(len(words), ' words : ', words[500:502])
> 10739 words : ['architeuthis', 'arcsine']
places = generate.get_list_places()
print(len(places), ' places : ', places[58:60])
> 262 places : ['Brazil', 'British Virgin Islands']
More information is at https://github.com/acutesoftware/rawdata