Package to generate random transactional data

RandomDataGen - Random Data Generator Package

This package generates random transactional data. You can use it to practice Pandas operations or customer segmentation methods such as RFM.

With this package you can create a table with transactional data containing:

  • consumer_id: ID of the customer who made the transaction;
  • transaction_created_at: Date of the transaction;
  • transaction_payment_value: Monetary value of the transaction.

All the fields are customizable.

How the data is generated

The consumer_id field is generated by a range function, returning a sequence of integers from 1 to n_consumers:

consumer_ids = range(1, n_consumers + 1)

The transaction_created_at field is generated with the Pandas date_range function, which returns n_rows evenly spaced timestamps between first_transaction_date and last_transaction_date:

created_at_list = list(pd.date_range(start=first_transaction_date, end=last_transaction_date, periods=n_rows))
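
To make the even spacing concrete, here is a minimal sketch that calls date_range on a one-day window. The dates and the variable names timestamps and step are chosen for illustration only and are not part of the package.

```python
import pandas as pd

# Five timestamps spread evenly over one day, both endpoints included.
timestamps = list(pd.date_range(start="2020-01-01", end="2020-01-02", periods=5))

# Consecutive timestamps are separated by the same step (6 hours here).
step = timestamps[1] - timestamps[0]
print(timestamps[0], timestamps[-1], step)
```

Applying the same rule to a one-year window with n_rows=1000 gives a step of roughly 8 hours 47 minutes, which is the spacing visible in the sample table later in this README.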

The transaction_payment_value field is sampled from a normal distribution whose mean is the transaction_mean_value parameter and whose standard deviation is the transaction_std_value parameter:

list(np.random.normal(transaction_mean_value, transaction_std_value, n_rows))
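
The three steps above can be combined into a rough sketch of the whole generation process. This is not the package's actual implementation: the function name sketch_transactions is hypothetical, the way consumer IDs are assigned to rows (uniform sampling with replacement) is an assumption, and a seeded NumPy Generator is swapped in for np.random.normal so the result is reproducible.

```python
import numpy as np
import pandas as pd


def sketch_transactions(n_rows, n_consumers, mean_value, std_value,
                        first_date, last_date, seed=0):
    """Hypothetical re-creation of the steps described in this README."""
    rng = np.random.default_rng(seed)  # seeded for reproducibility (assumption)
    consumer_ids = np.arange(1, n_consumers + 1)  # same values as range(1, n_consumers + 1)
    return pd.DataFrame(
        {
            # Assumption: each row gets a consumer drawn uniformly with replacement.
            "consumer_id": rng.choice(consumer_ids, size=n_rows),
            # Evenly spaced timestamps between the two dates.
            "transaction_created_at": pd.date_range(
                start=first_date, end=last_date, periods=n_rows
            ),
            # Normally distributed payment values.
            "transaction_payment_value": rng.normal(mean_value, std_value, n_rows),
        }
    )


df_sketch = sketch_transactions(1000, 100, 100, 10, "2020-01-01", "2021-01-01")
print(df_sketch.head())
```

Note that a normal distribution has no lower bound, so a small mean combined with a large standard deviation can produce negative payment values.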

How to use

You can get started with RandomDataGen using this example:

from random_data_gen.data_generator import TransactionalDataGenerator

TRGenerator = TransactionalDataGenerator(
    n_rows=1000,
    n_consumers=100,
    transaction_mean_value=100,
    transaction_std_value=10,
    first_transaction_date="2020-01-01",
    last_transaction_date="2021-01-01",
)

df = TRGenerator()

This snippet defines a dataframe with 1000 rows, 100 unique consumers, a mean transaction value of 100 monetary units, a standard deviation of 10 monetary units, a first transaction date of 2020-01-01 and a last transaction date of 2021-01-01.

The returned dataframe has the following form (the values shown are illustrative, not the output of the snippet above):

| consumer_id |     transaction_created_at    | transaction_payment_value |
|:-----------:|:-----------------------------:|:-------------------------:|
|     234     | 2020-01-01 00:00:00.000000000 |           120.10          |
|      43     | 2020-01-01 08:47:34.054054054 |           87.10           |
|     321     | 2021-10-23 10:27:12.092356134 |           12.98           |
|     3123    | 2020-12-30 21:37:17.837837840 |           12.84           |

The number of rows in this dataframe is set by the n_rows parameter.
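
Since the package is aimed at studying methods like RFM, here is a minimal sketch of computing recency, frequency and monetary value per consumer with Pandas. To stay self-contained it uses a small hand-written dataframe with the same three columns rather than the generator's output; the values, the snapshot_date choice and the output column names are assumptions made for illustration.

```python
import pandas as pd

# Hand-written stand-in with the same columns as the generated dataframe.
df_example = pd.DataFrame(
    {
        "consumer_id": [1, 1, 2, 3, 3, 3],
        "transaction_created_at": pd.to_datetime(
            ["2020-01-10", "2020-03-01", "2020-02-15",
             "2020-01-05", "2020-02-20", "2020-03-11"]
        ),
        "transaction_payment_value": [100.0, 50.0, 80.0, 20.0, 30.0, 40.0],
    }
)

# Reference date for recency: here, the most recent transaction in the data.
snapshot_date = df_example["transaction_created_at"].max()

rfm = df_example.groupby("consumer_id").agg(
    last_purchase=("transaction_created_at", "max"),
    frequency=("transaction_created_at", "count"),
    monetary=("transaction_payment_value", "sum"),
)
# Recency: days between the snapshot date and each consumer's last purchase.
rfm["recency_days"] = (snapshot_date - rfm["last_purchase"]).dt.days
print(rfm)
```

The same groupby works unchanged on the dataframe returned by the generator, because it only relies on the three column names listed above.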

Contribute

To contribute you need to install Poetry.

After installing, you need to clone this repo and run the following command:

poetry install -n

Before pushing your code to the repo, run the following command to apply the project style to the new code:

make format

After that, run:

make check

This command will check your code with flake8 and pytest.
