Synthetic dataset creator/augmentor for machine learning applications.

Project description

SyntheticGen v0.1.1

This is a Python package, available on PyPI, that creates synthetic datasets for several dataset types.

SyntheticGen currently supports:

  • Linear Synthetic Datasets

  • Linear Augmentations

Installation

pip install syntheticgen

Usage

In v0.1.1, only a linear augmentor is available; image and experimental augmentation techniques will be available soon.


# Import a dataset as a numpy array (in this example it is named dataset).

# Create an augmentor object and initialize it with the dataset.

aug = linearAugmentor(dataset)

# addMatrix() and setPowerRange() configure the augmentor with custom specifications.

aug.addMatrix(.25)

aug.setPowerRange(2, 2)

# performOperations() runs the augmentation operations with the set configuration.

aug.performOperations()

# getCombinedSet() returns the original dataset combined with the augmented data.

dataset = aug.getCombinedSet()

Object and Method Details

linearAugmentor(dataset):

This is the augmentor object for pseudo-linear datasets.

Dataset Specifications For Proper Function:

  • Must be a numpy array whose sub-lists hold the inputs and outputs in the same order on every line.

  • All sub-lists in the dataset must have the same length.
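The requirements above can be checked before the data is handed to the augmentor. The helper below is a minimal sketch in plain Python; `check_rows` is a hypothetical name and is not part of SyntheticGen. It only verifies that every row has the same length, after which the rows can be converted to a numpy array.

```python
def check_rows(rows):
    """Return the common row length, or raise ValueError if rows differ.

    Hypothetical helper for illustration; not part of SyntheticGen.
    """
    if not rows:
        raise ValueError("dataset is empty")
    lengths = {len(row) for row in rows}
    if len(lengths) != 1:
        raise ValueError(f"rows have differing lengths: {sorted(lengths)}")
    return lengths.pop()


# Each row lists its values in the same order: two inputs, then one output.
rows = [
    [1.0, 2.0, 5.0],
    [2.0, 3.0, 8.0],
    [3.0, 4.0, 11.0],
]
row_length = check_rows(rows)
```

Running the check first gives a clear error message for ragged data instead of a failure inside the augmentor.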

linearAugmentor.addMatrix(percentAdded):

This method randomly selects a custom percentage of the input dataset and adds it to a matrix, which is then augmented.

Parameters:

  • percentAdded

    • This is a float value that sets the fraction of the input dataset that will be augmented (for example, .25 for 25%).

    • Minimum to function: the float that equates to 2 lines of the dataset.

    • Maximum: 1.0, the entire dataset.
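To make the bounds on percentAdded concrete, here is a small sketch of the arithmetic in plain Python. The function names are hypothetical, and the rounding rule and the use of `random.sample` are assumptions made for illustration; the package may select rows differently.

```python
import random


def min_fraction(n_rows):
    """Smallest fraction that still covers 2 rows of an n_rows dataset."""
    if n_rows < 2:
        raise ValueError("dataset needs at least 2 rows")
    return 2 / n_rows


def sample_rows(rows, percent_added, seed=None):
    """Randomly pick a fraction of the rows (illustration only).

    Assumption: the row count is rounded to the nearest integer.
    """
    if not 0 < percent_added <= 1.0:
        raise ValueError("percent_added must be in (0, 1.0]")
    count = int(round(percent_added * len(rows)))
    return random.Random(seed).sample(rows, count)


rows = [[i, 2 * i] for i in range(100)]
smallest = min_fraction(len(rows))        # 2 rows out of 100
subset = sample_rows(rows, 0.25, seed=0)  # a quarter of the rows
```

For a 100-row dataset, the smallest usable value is 0.02, and .25 selects 25 rows under the rounding assumption above.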

linearAugmentor.setIntRange(lowerBound, upperBound):

This method will set an integer range for the number of operations to be performed on the matrix.

Parameters:

  • lowerBound

    • This is the fewest possible operations to be performed on the inputted dataset's matrix.

    • Minimum: 0

    • No Maximum

  • upperBound

    • This is the maximum number of possible operations to be performed on the inputted dataset's matrix.

    • Minimum: 0

    • No Maximum
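One way to read an integer range like this is as the interval from which the number of operations is drawn. The sketch below assumes a uniform draw with both bounds included; that is an assumption, not documented behavior, and `pick_operation_count` is a hypothetical name.

```python
import random


def pick_operation_count(lower_bound, upper_bound, seed=None):
    """Draw an operation count from [lower_bound, upper_bound].

    Illustration only: the uniform, inclusive draw is an assumption.
    """
    if lower_bound < 0 or upper_bound < 0:
        raise ValueError("bounds must be at least 0")
    if lower_bound > upper_bound:
        raise ValueError("lower_bound must not exceed upper_bound")
    return random.Random(seed).randint(lower_bound, upper_bound)


count = pick_operation_count(10, 50, seed=0)
fixed = pick_operation_count(20, 20)  # equal bounds leave a single choice
```

Setting both bounds to the same value removes the randomness and fixes the number of operations.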

linearAugmentor.setPowerRange(lowerBound, upperBound):

This method sets a range for the number of operations to be performed on the matrix. Each bound is an exponent: the length of the matrix raised to that power gives the corresponding number of operations.

Parameters:

  • lowerBound

    • This is the power to which the length of the matrix is raised to give the fewest possible operations to be performed.

    • Minimum: 0

    • No Maximum

  • upperBound

    • This is the power to which the length of the matrix is raised to give the maximum number of possible operations to be performed.

    • Minimum: 0

    • No Maximum
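The relationship between the power bounds and the resulting operation counts can be sketched as follows. `power_range_to_counts` is a hypothetical helper, and truncating the result to an integer is an assumption; the package may round differently.

```python
def power_range_to_counts(matrix_length, lower_bound, upper_bound):
    """Convert power bounds into (fewest, most) operation counts.

    Illustration only: counts are matrix_length ** bound, truncated to int.
    """
    if lower_bound < 0 or upper_bound < 0:
        raise ValueError("bounds must be at least 0")
    fewest = int(matrix_length ** lower_bound)
    most = int(matrix_length ** upper_bound)
    return fewest, most


# A 25-row matrix with both bounds set to 2, as in setPowerRange(2, 2).
counts = power_range_to_counts(25, 2, 2)
# A bound of 0 gives a single operation; a bound of 1 gives one per row.
wide = power_range_to_counts(25, 0, 1)
```

With a 25-row matrix, bounds of (2, 2) give exactly 625 operations, while (0, 1) allows anywhere from 1 to 25. Because the count grows as a power of the matrix length, large bounds on large matrices become expensive quickly.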

linearAugmentor.performOperations():

This is the method that will perform the operations on the matrix given the inputted specifications.

linearAugmentor.getCombinedSet():

This method returns the original dataset randomly combined with the newly created augmented data.
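Conceptually, "randomly combined" can be pictured as concatenating the two sets of rows and shuffling the result. The sketch below uses plain Python lists and a hypothetical `combine_randomly` function; it illustrates the idea and is not the package's implementation.

```python
import random


def combine_randomly(original_rows, synthetic_rows, seed=None):
    """Concatenate two row lists and shuffle the result (illustration only)."""
    combined = list(original_rows) + list(synthetic_rows)
    random.Random(seed).shuffle(combined)
    return combined


original = [[1, 2], [2, 4], [3, 6]]
synthetic = [[4, 8], [5, 10]]
combined = combine_randomly(original, synthetic, seed=0)
```

The combined set contains every original and every synthetic row exactly once; only the order changes.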

linearAugmentor.getSyntheticData():

This method returns only the augmented data, giving a fully synthetic dataset.

linearAugmentor.getInitialDataset():

This method returns the originally inputted, unchanged dataset.

Roadmap

Linear Augmentor

  • Add more customization options for the type of operations performed.

  • Add more advanced augmentation techniques utilizing eigenvectors.

Image Augmentor

  • Add an image augmentor that performs multiple augmentation techniques.

Differential Augmentor

  • An experimental augmentor, currently in production, that utilizes differential calculus.

Recent Changes

v0.1.1

  • README changes.

v0.1.0

  • Created package and added the linear augmentor.
