A Python library for generating errors in machine learning datasets

pucktrick

Pucktrick is a Python library that provides utility functions for introducing errors into a dataframe. The library is named after Puck, the elf in William Shakespeare's “A Midsummer Night’s Dream”, who is famous for causing trouble and playing tricks on mortals and fairies alike.

Features

Pucktrick is organized into modules, one per error type. Each module includes a function named after the module. Every function receives as parameters the dataset to modify, the strategy, and, if mode="extended", the original dataset. Each function returns two values: an error code (0 or 1) and the generated dataset.
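The calling convention described above can be illustrated with a minimal, dependency-free sketch. This is not pucktrick's API: the function name inject_missing_sketch, the strategy keys "column" and "percentage", the seed parameter, and the use of a list of dicts in place of a dataframe are all hypothetical. It also assumes that 0 means success and 1 means failure, and that mode="extended" means "top up the errors already present relative to the original dataset"; neither is confirmed by the text above.

```python
import random
from typing import Any, Optional

Row = dict[str, Any]


def inject_missing_sketch(
    rows: list[Row],
    strategy: dict[str, Any],
    original: Optional[list[Row]] = None,
    seed: int = 0,
) -> tuple[int, list[Row]]:
    """Hypothetical stand-in, NOT pucktrick's API.

    Returns (error, rows); error is assumed to be 0 on success, 1 on failure.
    """
    column = strategy.get("column")          # hypothetical key
    percentage = strategy.get("percentage")  # hypothetical key, fraction in [0, 1]
    mode = strategy.get("mode", "new")

    # Reject unusable input and hand the dataset back unchanged.
    if not rows or column not in rows[0]:
        return 1, rows
    if not isinstance(percentage, (int, float)) or not 0 <= percentage <= 1:
        return 1, rows
    if mode == "extended" and (original is None or len(original) != len(rows)):
        return 1, rows

    out = [dict(row) for row in rows]  # copy so the input is not mutated
    target = round(percentage * len(out))

    if mode == "extended":
        # Assumed meaning: cells that already differ from the original
        # count toward the target, so only the shortfall is injected.
        changed = {
            i for i, (o, r) in enumerate(zip(original, out))
            if o.get(column) != r[column]
        }
        candidates = [i for i in range(len(out)) if i not in changed]
        target = max(target - len(changed), 0)
    else:
        candidates = list(range(len(out)))

    rng = random.Random(seed)
    for i in rng.sample(candidates, min(target, len(candidates))):
        out[i][column] = None  # None plays the role of a missing value
    return 0, out


clean = [{"age": 20 + i, "label": i % 2} for i in range(10)]
error, dirty = inject_missing_sketch(
    clean, {"column": "age", "percentage": 0.3, "mode": "new"}
)
```

Returning the error code alongside the dataset lets the caller check the first value before using the second, which matches the two-value return described above.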

Version

version 0.5

  • strategy added: a JSON file in which you can define an error model by specifying the affected features (one or more), a selection criterion (a boolean predicate that specifies the subset of rows to corrupt), the mode, the percentage, and the distribution function used to inject the errors.
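As a rough illustration of what such a strategy could contain, the sketch below builds one as a Python dict and round-trips it through JSON. Every key name and value format here is an assumption made for illustration (the actual schema is defined by pucktrick and is not shown in this description); only the list of fields comes from the release note above.

```python
import json

# Hypothetical strategy layout; key names and value formats are assumptions.
strategy = {
    "affected_features": ["age", "income"],  # one or more columns
    "selection_criteria": "age > 30",        # boolean predicate over rows
    "mode": "new",                           # or "extended"
    "percentage": 0.2,                       # assumed to be a fraction
    "distribution": "uniform",               # how errors are spread
}

REQUIRED_KEYS = {
    "affected_features",
    "selection_criteria",
    "mode",
    "percentage",
    "distribution",
}


def load_strategy(text: str) -> dict:
    """Parse a JSON strategy and check that the assumed fields are present."""
    data = json.loads(text)
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"strategy is missing fields: {sorted(missing)}")
    return data


strategy_json = json.dumps(strategy, indent=2)
loaded = load_strategy(strategy_json)
```

Validating the fields up front keeps a malformed strategy from reaching the injection step.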

version 0.4

  • error type added: missing values

version 0.3

  • error type added: duplicates

version 0.2

  • error type inserted: outliers

version 0.1

  • error type inserted: noisy error and inconsistency labels

Installation

You can install pucktrick using pip:

pip install pucktrick

Contributing

We welcome contributions from the community. To contribute:

  • Fork the repository
  • Create a new branch (git checkout -b feature/your-feature)
  • Commit your changes (git commit -am 'Add new feature')
  • Push to the branch (git push origin feature/your-feature)
  • Create a new Pull Request

Please ensure your code adheres to our coding standards and includes appropriate tests.

License

This project is licensed under the Creative Commons Attribution-NonCommercial 4.0 International license (CC BY-NC 4.0). See the LICENSE file for details.

Acknowledgements

Thanks to the contributors and open-source community for their support.
