A Python library for generating errors in machine learning datasets

Project description

pucktrick

Pucktrick is a Python library that provides utility functions for introducing errors into a dataframe. The library is named after Puck, the elf in William Shakespeare's “A Midsummer Night’s Dream”, who is famous for causing trouble and playing tricks on mortals and fairies alike.

Features

Pucktrick is organized in modules, one per error type. Each module includes a function with the same name as the module. Every function takes as parameters the dataset to modify, the strategy, and, if mode="extended", the original dataset. Each function returns two values: an error code (0 or 1) and the generated dataset.
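The calling convention described above can be sketched as follows. This is a minimal illustration and not pucktrick's actual code: the function body, the strategy keys ("column", "percentage", "seed"), and the reading of 0 as success and 1 as failure are all assumptions. A list of dicts stands in for a dataframe so the sketch runs without extra dependencies.

```python
import copy
import random


def missing(dataset, strategy, original=None):
    """Hypothetical stand-in for an error-injection function.

    Returns (error, new_dataset). In this sketch, error is assumed to be
    0 on success and 1 on failure. `original` mirrors the extra argument
    used when mode="extended"; this simplified sketch does not use it.
    """
    column = strategy.get("column")            # assumed key name
    percentage = strategy.get("percentage", 0.0)  # assumed key name
    # Reject an empty dataset, an unknown column, or an invalid percentage.
    if not dataset or column not in dataset[0] or not 0.0 <= percentage <= 1.0:
        return 1, dataset
    rng = random.Random(strategy.get("seed"))
    corrupted = copy.deepcopy(dataset)  # leave the caller's data untouched
    n_rows = round(len(corrupted) * percentage)
    # Blank out the chosen column in a random sample of rows.
    for i in rng.sample(range(len(corrupted)), n_rows):
        corrupted[i][column] = None
    return 0, corrupted


rows = [{"age": a, "label": a % 2} for a in range(20, 30)]
err, noisy = missing(rows, {"column": "age", "percentage": 0.3, "seed": 0})
bad_err, unchanged = missing(rows, {"column": "height", "percentage": 0.3})
```

Returning the error code alongside the dataset lets a caller check the result before using the corrupted data, which matches the two-value return described above.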

Version

version 0.5

  • added the strategy, a JSON file in which you can define an error model by specifying the affected features (one or more), a selection criterion (a boolean predicate that identifies the subset of rows to corrupt), the mode, the percentage, and the distribution function used to inject errors.
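To make the strategy description more concrete, here is a rough sketch of how such a JSON document might be parsed and applied to choose rows. Every key name, the predicate encoding, and both helper functions are hypothetical; the real pucktrick schema may differ. The sketch ignores the mode and supports only a uniform distribution.

```python
import json
import operator
import random

# Hypothetical strategy document; every key name here is an assumption.
STRATEGY_JSON = """
{
  "affected_features": ["income"],
  "selection_criteria": {"feature": "age", "op": ">=", "value": 40},
  "mode": "extended",
  "percentage": 0.4,
  "distribution": "uniform"
}
"""

OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt,
       "<=": operator.le, "==": operator.eq, "!=": operator.ne}


def select_rows(rows, criteria):
    """Return the indices of rows that satisfy the boolean predicate."""
    test = OPS[criteria["op"]]
    return [i for i, row in enumerate(rows)
            if test(row[criteria["feature"]], criteria["value"])]


def rows_to_corrupt(rows, strategy, seed=None):
    """Pick the share of eligible rows given by the strategy's percentage."""
    if strategy["distribution"] != "uniform":
        raise ValueError("only the uniform distribution is sketched here")
    eligible = select_rows(rows, strategy["selection_criteria"])
    count = round(len(eligible) * strategy["percentage"])
    # Uniform: every eligible row is equally likely to be chosen.
    return sorted(random.Random(seed).sample(eligible, count))


strategy = json.loads(STRATEGY_JSON)
rows = [{"age": age, "income": 1000 * age} for age in range(30, 50, 2)]
targets = rows_to_corrupt(rows, strategy, seed=1)
```

In this sketch the percentage applies to the rows that pass the predicate, not to the whole dataset; that is one possible reading of the description and is worth checking against the library's behaviour.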

version 0.4

  • error type added: missing values

version 0.3

  • error type added: duplicated

version 0.2

  • error type inserted: outliers

version 0.1

  • error types inserted: noisy errors and inconsistent labels

Installation

You can install pucktrick using pip:

pip install pucktrick
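After installing, one way to confirm that the package is visible to your interpreter is to query its metadata with the standard library. The helper name `installed_version` is just an illustration; only `importlib.metadata` is used, and nothing is assumed about pucktrick's own API.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed version, or None if pucktrick is not installed.
print(installed_version("pucktrick"))
```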

Contributing

We welcome contributions from the community. To contribute:

  • Fork the repository
  • Create a new branch (git checkout -b feature/your-feature)
  • Commit your changes (git commit -am 'Add new feature')
  • Push to the branch (git push origin feature/your-feature)
  • Create a new Pull Request

Please ensure your code adheres to our coding standards and includes appropriate tests.

License

This project is licensed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license. See the LICENSE file for details.

Acknowledgements

Thanks to the contributors and the open-source community for their support.

Download files

Source Distribution

pucktrick-0.5.1.2.tar.gz (17.8 kB)

Uploaded Source

Built Distribution

pucktrick-0.5.1.2-py3-none-any.whl (18.9 kB)

Uploaded Python 3

File details

Details for the file pucktrick-0.5.1.2.tar.gz.

File metadata

  • Download URL: pucktrick-0.5.1.2.tar.gz
  • Upload date:
  • Size: 17.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.5

File hashes

Hashes for pucktrick-0.5.1.2.tar.gz
Algorithm Hash digest
SHA256 72cfb8ed4b85bc33fa9cf90cb67f895fd6dc961bd234c941de709b9d78b85dca
MD5 04e77613002aecb24417301a5cba3e64
BLAKE2b-256 aa955c99b205bd9e6a1f964ae1a8b4d2d2da4072c7181c70d4c7aadf4ca8aeea

File details

Details for the file pucktrick-0.5.1.2-py3-none-any.whl.

File metadata

  • Download URL: pucktrick-0.5.1.2-py3-none-any.whl
  • Upload date:
  • Size: 18.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.5

File hashes

Hashes for pucktrick-0.5.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 b69766aca0e7b2e6904be9e2dc779c7e530bdfcf4a917947b1bcb00facc98eae
MD5 293b0fb958e8dd4a88a1ba8c6530e228
BLAKE2b-256 c14c8712b94e959383284dd2cbc12547677097e63eeacce3bf19f4048f47eaaf
