Synthetic dataset creator/augmentor for machine learning applications.

SyntheticGen v0.1.1

SyntheticGen is a Python package, available on PyPI, that creates synthetic datasets for various dataset types.

SyntheticGen currently supports:

  • Linear Synthetic Datasets

  • Linear Augmentations

Installation

pip install syntheticgen

Usage

In v0.1.1, only a linear augmentor is available; image and experimental augmentation techniques will be available soon.


# Import a dataset as a NumPy array (in this example it is named dataset)

# Create an augmentor object and initialize it with the dataset
aug = linearAugmentor(dataset)

# addMatrix() and setPowerRange() configure the augmentor with custom specifications
aug.addMatrix(.25)
aug.setPowerRange(2, 2)

# performOperations() runs the augmentation operations with the set configuration
aug.performOperations()

# getCombinedSet() returns the original dataset combined with the augmented data
dataset = aug.getCombinedSet()

Object and Method Details

linearAugmentor(dataset):

This is the augmentor object for pseudo-linear datasets.

Dataset Specifications For Proper Function:

  • Must be a NumPy array whose sub-lists (rows) hold the inputs and outputs in the same order in every row.

  • Every sub-list in the dataset must have the same length.
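The two requirements above can be checked before a dataset is handed to the augmentor. The following is a minimal sketch, not part of SyntheticGen: `check_dataset_shape` is a hypothetical helper name, and it relies only on `len()` and iteration, so it should accept either a 2-D NumPy array or a plain list of lists.

```python
def check_dataset_shape(dataset):
    """Return the shared row length, or raise ValueError if rows differ.

    Hypothetical helper, not part of SyntheticGen. It only uses len() and
    iteration, so it accepts a 2-D NumPy array or a plain list of lists.
    """
    row_lengths = {len(row) for row in dataset}
    if len(row_lengths) == 0:
        raise ValueError("dataset is empty")
    if len(row_lengths) > 1:
        raise ValueError(f"rows have differing lengths: {sorted(row_lengths)}")
    return row_lengths.pop()


# Each row holds two inputs followed by one output, in the same order.
rows = [
    [1.0, 2.0, 5.0],
    [2.0, 3.0, 8.0],
    [3.0, 4.0, 11.0],
]
row_length = check_dataset_shape(rows)
```

A ragged dataset such as `[[1, 2, 3], [4, 5]]` makes the helper raise `ValueError`, so the problem surfaces before any augmentation is attempted.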

linearAugmentor.addMatrix(percentAdded):

This method randomly selects a custom percentage of the input dataset and adds it to a matrix, which is then augmented.

Parameters:

  • percentAdded

    • A float that sets the fraction of the input dataset that will be augmented.

    • Minimum to function: the float that equates to 2 lines of the dataset (for example, 0.02 for a 100-line dataset).

    • Maximum: 1.0, the entire dataset.
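The minimum for percentAdded depends on the size of the dataset, so it can help to compute it. This is a small sketch of that arithmetic only; `min_percent_added` is a hypothetical name, and how SyntheticGen itself rounds a fraction to a whole number of lines is not documented here, so treat the result as a lower bound rather than a guarantee.

```python
def min_percent_added(num_rows):
    """Smallest fraction that still covers 2 rows of a dataset.

    Hypothetical helper, not part of SyntheticGen. The documented maximum
    is 1.0, so a dataset needs at least 2 rows for any valid value to exist.
    """
    if num_rows < 2:
        raise ValueError("the dataset needs at least 2 rows")
    return 2 / num_rows


# For a 100-row dataset, 2 rows correspond to a fraction of 0.02.
smallest_fraction = min_percent_added(100)
```

By the same arithmetic, the value .25 used in the usage example corresponds to 2 lines of an 8-line dataset.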

linearAugmentor.setIntRange(lowerBound, upperBound):

This method sets an integer range for the number of operations to be performed on the matrix.

Parameters:

  • lowerBound

    • The fewest possible operations to be performed on the input dataset's matrix.

    • Minimum: 0

    • No Maximum

  • upperBound

    • The most possible operations to be performed on the input dataset's matrix.

    • Minimum: 0

    • No Maximum

linearAugmentor.setPowerRange(lowerBound, upperBound):

This method sets a range for the number of operations to be performed on the matrix. Each end of the range is the length of the matrix raised to a power (the corresponding bound).

Parameters:

  • lowerBound

    • The power to which the length of the matrix is raised to give the fewest possible operations to be performed.

    • Minimum: 0

    • No Maximum

  • upperBound

    • The power to which the length of the matrix is raised to give the most possible operations to be performed.

    • Minimum: 0

    • No Maximum
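To get a feel for how quickly the operation count grows with the power bounds, the arithmetic described above can be sketched as follows. This assumes the count is exactly the matrix length raised to each bound, as the description suggests; `power_range_to_operation_counts` is a hypothetical name and is not part of SyntheticGen.

```python
def power_range_to_operation_counts(matrix_length, lower_bound, upper_bound):
    """Return (fewest, most) operations for the given power bounds.

    Hypothetical helper, not part of SyntheticGen. Assumes the operation
    count is matrix_length raised to each bound, per the description.
    """
    if lower_bound < 0 or upper_bound < 0:
        raise ValueError("both bounds have a documented minimum of 0")
    return matrix_length ** lower_bound, matrix_length ** upper_bound


# A 10-row matrix with both bounds set to 2, as in the usage example.
fewest, most = power_range_to_operation_counts(10, 2, 2)
```

Under this assumption, a bound of 0 gives a single operation regardless of matrix length, while larger bounds grow the count rapidly, so it may be worth estimating the count before choosing high powers on a large matrix.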

linearAugmentor.performOperations():

This method performs the operations on the matrix according to the configured specifications.

linearAugmentor.getCombinedSet():

This method returns the original dataset randomly combined with the newly created augmented data.
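As an illustration of what "randomly combined" can mean, the sketch below appends synthetic rows to the original rows and shuffles the result. This is not SyntheticGen's implementation, which is not shown here; `combine_and_shuffle` and its `seed` parameter are hypothetical, and the example uses only the standard library.

```python
import random


def combine_and_shuffle(original_rows, synthetic_rows, seed=None):
    """Return a new list holding both sets of rows in random order.

    Hypothetical helper, not part of SyntheticGen. The inputs are left
    unmodified; pass a seed to make the ordering repeatable.
    """
    combined = list(original_rows) + list(synthetic_rows)
    random.Random(seed).shuffle(combined)
    return combined


original = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
synthetic = [[1.5, 3.0], [2.5, 5.0]]
mixed = combine_and_shuffle(original, synthetic, seed=0)
```

Every row from both inputs appears exactly once in `mixed`; only the order changes.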

linearAugmentor.getSyntheticData():

This method returns only the augmented data, creating a fully synthetic dataset.

linearAugmentor.getInitialDataset():

This method returns the original, unchanged input dataset.

Roadmap

Linear Augmentor

  • Add additional customization options pertaining to the type of operations performed.

  • Add more advanced augmentation techniques utilizing eigenvectors.

Image Augmentor

  • Add an image augmentor that performs multiple augmentation techniques.

Differential Augmentor

  • Experimental augmentor utilizing differential calculus that's currently in production.

Recent Changes

v0.1.1

  • README changes.

v0.1.0

  • Created package and added the linear augmentor.
