Visualize data quality

Project description

vizdataquality

vizdataquality is a Python package for visualizing data quality. It supports the following six-step workflow:

  1. Look at your data (is anything obviously wrong?)
  2. Watch out for special values
  3. Is any data missing?
  4. Check each variable
  5. Check combinations of variables
  6. Profile the cleaned data
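
As a rough illustration of steps 2 and 3, the sketch below counts missing and special values per variable using only the Python standard library. It does not use the vizdataquality API (see the documentation for that). The sample records, the `SPECIAL_VALUES` set, and the `summarize_columns` helper are all hypothetical names chosen for this example.

```python
from collections import Counter

# Hypothetical sample records; None marks a missing value.
rows = [
    {"age": 34, "city": "Leeds"},
    {"age": -999, "city": None},
    {"age": None, "city": "York"},
    {"age": 51, "city": "unknown"},
]

# Placeholder values treated as "special" here (an assumption for this sketch;
# real datasets need their own list).
SPECIAL_VALUES = {-999, "unknown", ""}


def summarize_columns(records, special_values=SPECIAL_VALUES):
    """Count missing (None) and special values in each column."""
    missing = Counter()
    special = Counter()
    columns = []  # preserves first-seen column order
    for record in records:
        for column, value in record.items():
            if column not in columns:
                columns.append(column)
            if value is None:
                missing[column] += 1
            elif value in special_values:
                special[column] += 1
    return {c: {"missing": missing[c], "special": special[c]} for c in columns}


summary = summarize_columns(rows)
for column, counts in summary.items():
    print(column, counts)
```

Counts like these are the kind of per-variable summary that the workflow then turns into plots, so that problem variables stand out at a glance.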

Documentation

The vizdataquality documentation is hosted on Read the Docs.

Installation

We recommend installing vizdataquality in a Python virtual environment or a Conda environment.

To install vizdataquality, most users should run:

pip install 'vizdataquality'

Tutorials

The package includes tutorial notebooks.

To follow these tutorials interactively after installing vizdataquality, clone or download this repository, then start Jupyter from within it:

python -m jupyter notebook notebooks

Notice

The vizdataquality software is released under the Apache License, Version 2.0. See LICENCE for details.

Acknowledgements

The development of the vizdataquality software was supported by funding from the Engineering and Physical Sciences Research Council (EP/N013980/1; EP/R511717/1) and the Alan Turing Institute.

Download files

Source Distribution

vizdataquality-0.1.0.tar.gz (29.6 kB)

Built Distribution

vizdataquality-0.1.0-py3-none-any.whl (32.2 kB, Python 3)
