Test Driven Data Analysis

What is it?

The TDDA Python module provides command-line and Python API support for the overall process of data analysis, through the following tools:

  • Reference Testing: extensions to unittest and pytest for managing testing of data analysis pipelines, where the results are typically much larger, and more complex, than single numerical values.

  • Constraints: tools (and API) for discovery of constraints from data, for validation of constraints on new data, and for anomaly detection.

  • Finding Regular Expressions: tools (and API) for automatically inferring regular expressions from text data.

Documentation

http://tdda.readthedocs.io

Installation

The simplest way to install all of the TDDA Python modules is using pip:

pip install tdda

The full set of sources, including all examples, can be downloaded from PyPI with:

pip download --no-binary :all: tdda

The sources are also publicly available from GitHub:

git clone git@github.com:tdda/tdda.git

Documentation is available at http://tdda.readthedocs.io.

If you clone the GitHub repo, use

python setup.py install

afterwards to install the command-line tools (tdda and rexpy).

Reference Tests

The tdda.referencetest library is used to support the creation of reference tests, based on either unittest or pytest.

These are like other tests except:

  1. They have special support for comparing strings to files and files to files.

  2. That support includes the ability to provide exclusion patterns (for things like dates and versions that might be in the output).

  3. When a string/file assertion fails, they print the command you need to diff the output.

  4. If there were exclusion patterns, they also write modified versions of both the actual and expected output, and print the diff command needed to compare those.

  5. They have special support for handling CSV files.

  6. They support flags (-w and -W) to rewrite the reference (expected) results once you have confirmed that the new actuals are correct.

For more details from a source distribution or checkout, see the README.md file and examples in the referencetest subdirectory.
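
The idea behind points 1 and 2 above can be illustrated with a minimal sketch that uses only the Python standard library. This is not the tdda.referencetest API: the helper names strip_excluded and compare_to_reference, the ignore_patterns parameter and the report contents are all hypothetical, chosen only to show how exclusion patterns let a test ignore lines (such as dates) that legitimately vary between runs.

```python
import difflib
import os
import re
import tempfile
import unittest


def strip_excluded(text, patterns):
    """Split text into lines, dropping any line that matches a pattern."""
    compiled = [re.compile(p) for p in patterns]
    return [line for line in text.splitlines()
            if not any(c.search(line) for c in compiled)]


def compare_to_reference(actual, ref_path, ignore_patterns=()):
    """Return a unified diff as a list of lines; an empty list is a match."""
    with open(ref_path) as f:
        expected = f.read()
    return list(difflib.unified_diff(
        strip_excluded(expected, ignore_patterns),
        strip_excluded(actual, ignore_patterns),
        fromfile=ref_path, tofile='actual', lineterm=''))


class ReportTest(unittest.TestCase):
    def test_report_matches_reference(self):
        with tempfile.TemporaryDirectory() as d:
            # Hypothetical reference (expected) output stored on disk.
            ref_path = os.path.join(d, 'report.txt')
            with open(ref_path, 'w') as f:
                f.write('Generated: 2017-01-01\nTotal: 42\n')
            # Actual output differs only in the date line, which is excluded.
            actual = 'Generated: 2018-06-30\nTotal: 42\n'
            diff = compare_to_reference(actual, ref_path,
                                        ignore_patterns=[r'^Generated: '])
            self.assertEqual(diff, [], '\n'.join(diff))
```

Saved in a test module, this can be run with python -m unittest. The library itself goes further than this sketch, for example by printing ready-made diff commands and rewriting reference files on request, as listed above.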

Constraints

The tdda.constraints library is used to ‘discover’ constraints from a (Pandas) DataFrame, to write them out as JSON, and to verify that datasets meet the constraints in the constraints file.

For more details from a source distribution or checkout, see the README.md file and examples in the constraints subdirectory.
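
The discover, write and verify cycle can be sketched in plain Python as follows. This is a deliberately simplified illustration and not the tdda.constraints API or its JSON file format: the functions discover and verify, the constraint names (type, min, max, max_nulls) and the sample records are assumptions made for this example, and lists of dictionaries stand in for a DataFrame.

```python
import json


def discover(rows):
    """Infer simple per-field constraints from example records."""
    constraints = {}
    for field in sorted({key for row in rows for key in row}):
        values = [row.get(field) for row in rows]
        present = [v for v in values if v is not None]
        # Only forbid nulls if none were seen in the example data.
        c = {'max_nulls': 0} if len(present) == len(values) else {}
        types = {type(v).__name__ for v in present}
        if len(types) == 1:
            c['type'] = types.pop()
            if c['type'] in ('int', 'float'):
                c['min'], c['max'] = min(present), max(present)
        constraints[field] = c
    return constraints


def verify(rows, constraints):
    """Return a list of (field, constraint) pairs that the data violates."""
    failures = []
    for field, c in constraints.items():
        values = [row.get(field) for row in rows]
        present = [v for v in values if v is not None]
        if 'max_nulls' in c and len(values) - len(present) > c['max_nulls']:
            failures.append((field, 'max_nulls'))
        if 'type' in c and any(type(v).__name__ != c['type'] for v in present):
            failures.append((field, 'type'))
            continue  # range checks are meaningless for wrongly typed values
        if 'min' in c and any(v < c['min'] for v in present):
            failures.append((field, 'min'))
        if 'max' in c and any(v > c['max'] for v in present):
            failures.append((field, 'max'))
    return failures


# Hypothetical example data: discover on one dataset, verify another.
training = [
    {'id': 1, 'price': 9.5, 'code': 'A'},
    {'id': 2, 'price': 12.0, 'code': None},
]
constraints_json = json.dumps(discover(training), indent=2)
new_data = [
    {'id': 3, 'price': -1.0, 'code': 'B'},
    {'id': None, 'price': 10.0, 'code': 'C'},
]
print(verify(new_data, json.loads(constraints_json)))
```

Serialising the constraints to JSON between the two steps mirrors the workflow described above: constraints discovered once can be stored and later used to validate new data.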

Finding Regular Expressions

The tdda repository also includes rexpy, a tool for automatically inferring regular expressions from a single field of data examples.
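
To give a feel for what inferring a regular expression from examples involves, here is a toy sketch using only the standard library. It is not rexpy's algorithm or API, and it is far less capable: the functions char_class, fragments and infer_regex are hypothetical names for this example, and the approach only handles examples that share the same sequence of character classes.

```python
import itertools
import re


def char_class(ch):
    """Map a character to a regex fragment describing its class."""
    if '0' <= ch <= '9':
        return '[0-9]'
    if 'A' <= ch <= 'Z':
        return '[A-Z]'
    if 'a' <= ch <= 'z':
        return '[a-z]'
    return re.escape(ch)  # anything else is treated as a literal


def fragments(s):
    """Run-length encode a string as (char_class, run_length) pairs."""
    return [(cls, len(list(run)))
            for cls, run in itertools.groupby(s, key=char_class)]


def infer_regex(examples):
    """Return one regex matching every example, or None if shapes differ."""
    encoded = [fragments(s) for s in examples]
    shapes = {tuple(cls for cls, _ in frags) for frags in encoded}
    if len(shapes) != 1:
        return None
    parts = []
    for position in zip(*encoded):
        cls = position[0][0]
        lengths = [n for _, n in position]
        lo, hi = min(lengths), max(lengths)
        if lo == hi == 1:
            quantifier = ''
        elif lo == hi:
            quantifier = '{%d}' % lo
        else:
            quantifier = '{%d,%d}' % (lo, hi)
        parts.append(cls + quantifier)
    return '^' + ''.join(parts) + '$'


# Hypothetical field values sharing one shape: digits, hyphen, digits.
examples = ['123-456', '12-3456']
pattern = infer_regex(examples)
print(pattern)
```

When the examples do not share a single shape, this sketch simply gives up and returns None; handling such mixed inputs is part of what a real tool needs to do.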

Resources

Resources on these topics include:

All examples, tests and code run under Python 2.7, Python 3.5 and Python 3.6.

