Fully-controllable error generation for tabular data.

Project description

tab_err

tab_err is an implementation of a tabular data error model that disentangles error mechanism and error type. It generalizes the formalization of missing values, so missing values are only one of the many error types implemented here. tab_err gives the user full control over the error generation process and makes it possible to model realistic errors with complex dependency structures.

The building blocks are ErrorMechanisms, ErrorTypes, and ErrorModels. An ErrorMechanism defines where the incorrect cells are and models realistic dependency structures, while an ErrorType describes in which way a value is incorrect. Together they form an ErrorModel that can be used to perturb existing data with realistic errors.
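The separation of "where" from "how" can be illustrated with a small, library-independent sketch. It does not use tab_err's API: every name below (sample_cells_completely_at_random, missing_value, wrong_unit, apply_error_model) is hypothetical and chosen only for illustration, and the mechanism shown is a plain uniform sample that ignores the data.

```python
import random
from typing import Callable, List, Optional, Set, Tuple

Column = List[Optional[float]]


def sample_cells_completely_at_random(
    n_rows: int, error_rate: float, rng: random.Random
) -> Set[int]:
    """Hypothetical mechanism: pick row indices uniformly, ignoring the data."""
    n_errors = int(n_rows * error_rate)
    return set(rng.sample(range(n_rows), n_errors))


def missing_value(value: Optional[float]) -> Optional[float]:
    """Hypothetical error type: the value is lost entirely."""
    return None


def wrong_unit(value: Optional[float]) -> Optional[float]:
    """Hypothetical error type: the value was recorded in a 1000x smaller unit."""
    return None if value is None else value * 1000.0


def apply_error_model(
    column: Column,
    mechanism: Callable[[int, float, random.Random], Set[int]],
    error_type: Callable[[Optional[float]], Optional[float]],
    error_rate: float,
    seed: int = 0,
) -> Tuple[Column, List[bool]]:
    """Combine a mechanism (where) with an error type (how).

    Returns the perturbed column and a boolean mask marking the changed cells.
    """
    rng = random.Random(seed)
    targets = mechanism(len(column), error_rate, rng)
    perturbed = [error_type(v) if i in targets else v for i, v in enumerate(column)]
    mask = [i in targets for i in range(len(column))]
    return perturbed, mask


if __name__ == "__main__":
    heights = [1.62, 1.75, 1.80, 1.68, 1.91, 1.55, 1.70, 1.83, 1.77, 1.66]
    dirty, mask = apply_error_model(
        heights, sample_cells_completely_at_random, wrong_unit, error_rate=0.5
    )
    print(dirty)
    print(mask)
```

Swapping wrong_unit for missing_value changes the kind of error without touching the selection of cells, which is the sense in which missing values are just one error type among many.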

This repository offers (soon) three APIs: low-level, mid-level, and high-level.

Examples

For details and examples please check out our Getting Started Notebook.

Where to get it

The source code is currently hosted on GitHub at: https://github.com/calgo-lab/tab_err

Binary installers for the latest released version are available at the Python Package Index (PyPI).

pip install tab-err
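To confirm that the installation succeeded, one option is to query the installed distribution metadata from Python. This is a minimal sketch that assumes the distribution name is tab-err, as in the pip command above; the helper name installed_version is hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumption: the distribution is registered under the name used with pip.
    version = installed_version("tab-err")
    print(version if version is not None else "tab-err is not installed")
```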

Contributing

To develop tab_err, install the uv package manager. Run tests with uv run pytest. Develop features on feature branches and open pull requests once you're ready to contribute. Make sure that your code is tested, documented, and well described in the pull request.

Download files

Source Distribution

tab_err-0.2.1.tar.gz (26.0 kB)

Uploaded Source

Built Distribution

tab_err-0.2.1-py3-none-any.whl (39.4 kB)

Uploaded Python 3

File details

Details for the file tab_err-0.2.1.tar.gz.

File metadata

  • Download URL: tab_err-0.2.1.tar.gz
  • Upload date:
  • Size: 26.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for tab_err-0.2.1.tar.gz
Algorithm Hash digest
SHA256 fb18b731fa75a8d869c04e7f2b3a8ec20a674a5f466ee76ebc7f1379300b9753
MD5 545ec02b9f88964140b17e3a21a73177
BLAKE2b-256 7c538331f8beb6e065879341db98c0720b63086d6929fbdce2789fcefa55917e
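A downloaded file can be checked against the SHA256 digest listed above using Python's standard hashlib module. This is a minimal sketch: the helper names sha256_of_file and matches_expected are hypothetical, and the local file path is an assumption about where the download was saved.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for tab_err-0.2.1.tar.gz.
EXPECTED_SDIST_SHA256 = (
    "fb18b731fa75a8d869c04e7f2b3a8ec20a674a5f466ee76ebc7f1379300b9753"
)


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Compute the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_SDIST_SHA256) -> bool:
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumption: the sdist was saved to the current working directory.
    candidate = "tab_err-0.2.1.tar.gz"
    if Path(candidate).exists():
        print("digest matches:", matches_expected(candidate))
    else:
        print("file not found:", candidate)
```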

Provenance

The following attestation bundles were made for tab_err-0.2.1.tar.gz:

Publisher: publish.yaml on calgo-lab/tab_err

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file tab_err-0.2.1-py3-none-any.whl.

File metadata

  • Download URL: tab_err-0.2.1-py3-none-any.whl
  • Upload date:
  • Size: 39.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for tab_err-0.2.1-py3-none-any.whl
Algorithm Hash digest
SHA256 1b00aed6cce04054a7bd15ec456db37c11128a4e22334ebb54a15f8932b8485e
MD5 50f2ee92ecddba71361cdfc059e419e5
BLAKE2b-256 a695b466512817f4f1cd835db81535917a5631a1793d3d9acfc2b2b595b867b4

Provenance

The following attestation bundles were made for tab_err-0.2.1-py3-none-any.whl:

Publisher: publish.yaml on calgo-lab/tab_err
