
Python functions to facilitate the pre-processing of data for ML tasks in a clinical context.


# CleanDat

Python functions to facilitate the pre-processing of data to prepare them for ML tasks, especially suitable for data in a clinical context.

Major functionalities include heuristic-based data cleaning and feature engineering, such as:

- Automatic detection of encoding strings (e.g. 1=m) and application of the corresponding encoding to un-encoded data in the same column
- Automatic detection of date strings in different formats (e.g. 2019-01-01, 01/01/2019, January 2022) and conversion to a unified format
- Encoding of date strings into decomposed date features (e.g. year, month, day, weekday, etc.)
- Heuristics for unifying different number formats, e.g. 1,000.00 vs. 1.000,00, or exponential notations like 1e3 vs. 10x10^2
- Detection and replacement of inconsistent data values
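To make the date and number heuristics listed above more concrete, here is a minimal standard-library sketch of the general idea. It does not use or reproduce CleanDat's API: the names `parse_date`, `date_features`, and `parse_number` are hypothetical, the list of date formats is an assumption, and `01/01/2019` is assumed to mean day/month/year. Month names are parsed with `%B`, which depends on the active locale (English is assumed here).

```python
import re
from datetime import datetime

# Hypothetical helpers for illustration only; NOT part of CleanDat's API.

# Assumed candidate formats, tried in order (day/month/year is an assumption).
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%B %Y")


def parse_date(text):
    """Return a datetime for the first matching format, or None."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    return None


def date_features(dt):
    """Decompose a datetime into simple numeric features (Monday = 0)."""
    return {"year": dt.year, "month": dt.month, "day": dt.day,
            "weekday": dt.weekday()}


def parse_number(text):
    """Convert differently formatted number strings to float."""
    s = text.strip().replace(" ", "")
    # Exponential notation written as "10x10^2" (mantissa x base ^ exponent).
    m = re.fullmatch(r"([+-]?\d+(?:\.\d+)?)[x*](\d+)\^([+-]?\d+)", s)
    if m:
        return float(m.group(1)) * float(m.group(2)) ** int(m.group(3))
    if "," in s and "." in s:
        # Heuristic: whichever separator appears last is the decimal mark.
        if s.rfind(",") > s.rfind("."):
            s = s.replace(".", "").replace(",", ".")
        else:
            s = s.replace(",", "")
    elif "," in s:
        # Only commas: groups of three digits are read as thousands
        # separators, anything else as a decimal comma.
        if re.fullmatch(r"[+-]?\d{1,3}(,\d{3})+", s):
            s = s.replace(",", "")
        else:
            s = s.replace(",", ".")
    return float(s)  # also handles plain and "1e3"-style input


for raw in ("1,000.00", "1.000,00", "1e3", "10x10^2"):
    print(raw, "->", parse_number(raw))

for raw in ("2019-01-01", "01/01/2019", "January 2022"):
    print(raw, "->", parse_date(raw).date().isoformat())
```

Some inputs are inherently ambiguous: in this sketch `1,000` is read as one thousand and `1.000` as one, and a real cleaning step would likely need column-level context to decide such cases.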

# Setup

Install via pip:

pip install cleandat
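As a quick check that the installation succeeded, the installed version can be looked up with the standard library (Python 3.8+). This is only a sketch; the helper name `installed_version` is hypothetical and not provided by the package.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a distribution, or None."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if cleandat is installed, otherwise None.
print(installed_version("cleandat"))
```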


