A Python library for generating errors in machine learning datasets

Project description

pucktrick

Pucktrick is a Python library that provides utility functions for introducing errors into a dataframe. The library is named after Puck, the elf in William Shakespeare's “A Midsummer Night’s Dream”, who is famous for causing trouble and playing tricks on mortals and fairies alike.

Features

Pucktrick is organized into modules, one per error type. Each module includes a function named after the module. Every function takes as parameters the dataset to modify, the strategy, and, if mode="extended", the original dataset. Each function returns two values: an error code (0 or 1) and the generated dataset.
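The calling convention described above can be illustrated with a minimal sketch. It does not use pucktrick itself: the function name inject_missing, the strategy keys column, percentage, and seed, and the reading of the error code as 0 for success and 1 for failure are all assumptions made for illustration. The mode="extended" case, which also takes the original dataset, is not covered.

```python
import pandas as pd


def inject_missing(df, strategy):
    """Hypothetical stand-in, NOT pucktrick's API: blank out a share of rows.

    Returns (error, dataframe). The error code is assumed to be 0 on
    success and 1 on failure.
    """
    column = strategy.get("column")
    percentage = strategy.get("percentage", 0.0)
    if column not in df.columns or not 0.0 <= percentage <= 1.0:
        return 1, df  # invalid strategy: hand back the input unchanged

    corrupted = df.copy()  # leave the caller's dataframe untouched
    n_rows = round(len(corrupted) * percentage)
    # Pick the rows to corrupt reproducibly, then overwrite them with NaN.
    target = corrupted.sample(n=n_rows, random_state=strategy.get("seed", 0)).index
    corrupted.loc[target, column] = float("nan")
    return 0, corrupted


clean = pd.DataFrame({"income": [23.0, 35.5, 41.0, 52.3, 29.9,
                                 60.1, 18.4, 44.0, 37.2, 55.8]})
err, noisy = inject_missing(clean, {"column": "income", "percentage": 0.3})
```

Returning the error code next to the dataframe lets the caller check the outcome before using the result, for example with `if err == 0:`.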

Version

version 0.5

  • added the strategy, a JSON file that defines an error model by specifying the affected features (one or more), a selection criterion, a Boolean predicate that specifies the subset of rows to corrupt, the mode, the percentage, and the distribution function used to inject errors.

version 0.4

  • error type added: missing values

version 0.3

  • error type added: duplicates

version 0.2

  • error type added: outliers

version 0.1

  • error types added: noisy errors and inconsistent labels
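To make the strategy file introduced in version 0.5 more concrete, here is a hedged sketch of what such a JSON document might look like and how it could be loaded and checked. Every key name, the example values, and the assumption that the percentage is a fraction between 0 and 1 are hypothetical; the actual schema used by pucktrick may differ.

```python
import json

# Hypothetical strategy document. All key names and values are illustrative
# assumptions, not pucktrick's actual schema.
STRATEGY_JSON = """
{
  "affected_features": ["age", "income"],
  "selection_criteria": "country == 'IT'",
  "mode": "extended",
  "percentage": 0.2,
  "distribution": "uniform"
}
"""

REQUIRED_KEYS = {
    "affected_features",
    "selection_criteria",
    "mode",
    "percentage",
    "distribution",
}


def load_strategy(text):
    """Parse a strategy document and reject incomplete or invalid ones."""
    strategy = json.loads(text)
    missing = REQUIRED_KEYS - set(strategy)
    if missing:
        raise ValueError(f"strategy is missing keys: {sorted(missing)}")
    if not 0.0 <= strategy["percentage"] <= 1.0:
        raise ValueError("percentage must be between 0 and 1")
    return strategy


strategy = load_strategy(STRATEGY_JSON)
```

Validating the document up front means a malformed strategy fails early, before any row of the dataset is touched.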

Installation

You can install pucktrick using pip:

pip install pucktrick
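After installing, one way to confirm that the package is visible to the current interpreter is to query its metadata with the standard library. The helper name installed_version is an assumption introduced for this sketch; importlib.metadata is part of Python 3.8 and later.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if pucktrick is installed, otherwise None.
print(installed_version("pucktrick"))
```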

Contributing

We welcome contributions from the community. To contribute:

  • Fork the repository
  • Create a new branch (git checkout -b feature/your-feature)
  • Commit your changes (git commit -am 'Add new feature')
  • Push to the branch (git push origin feature/your-feature)
  • Create a new Pull Request

Please ensure your code adheres to our coding standards and includes appropriate tests.

License

This project is licensed under the Creative Commons Attribution-NonCommercial 4.0 International license (CC BY-NC 4.0). See the LICENSE file for details.

Acknowledgements

Thanks to the contributors and the open-source community for their support.
