A command line tool to allow the testing of datasets

The JSON file (`test_definitions.json`) contains the configuration of the connections, the data definitions, and the tests to be executed.

There are 2 main types of connections:
* Database connections
* File connections (this will be subdivided into local and S3)

The data definition defines one of 3 things:
* A database table
* A file (csv or parquet)
* A database query

The test definitions specify which tests are executed against the data definitions. Currently there are 2 types of tests implemented:
* Uniqueness - check that the values of a field are unique
* Foreign Key constraint - check for key values that do not exist in the referenced data set
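
The two test types can be illustrated with a minimal sketch. This is not the tool's own implementation: it uses Python's built-in `sqlite3` module instead of Spark or Postgres, and the function names (`count_duplicates`, `count_orphans`) and the demo tables are hypothetical, chosen only for illustration.

```python
import sqlite3


def count_duplicates(conn, table, field):
    """Return how many distinct values of `field` occur more than once."""
    # Table and field names are interpolated directly; only use trusted names.
    sql = (
        f"SELECT COUNT(*) FROM ("
        f"SELECT {field} FROM {table} GROUP BY {field} HAVING COUNT(*) > 1)"
    )
    return conn.execute(sql).fetchone()[0]


def count_orphans(conn, child, child_key, parent, parent_key):
    """Return how many child rows have a key with no match in the parent."""
    sql = (
        f"SELECT COUNT(*) FROM {child} c "
        f"LEFT JOIN {parent} p ON c.{child_key} = p.{parent_key} "
        f"WHERE c.{child_key} IS NOT NULL AND p.{parent_key} IS NULL"
    )
    return conn.execute(sql).fetchone()[0]


def build_demo():
    """Create an in-memory database with hypothetical demo tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE cinema (cinema_id INTEGER, name TEXT);
        CREATE TABLE screening (screening_id INTEGER, cinema_id INTEGER);
        INSERT INTO cinema VALUES (1, 'Odeon'), (2, 'Rex'), (2, 'Rex copy');
        INSERT INTO screening VALUES (10, 1), (11, 2), (12, 99);
        """
    )
    return conn


if __name__ == "__main__":
    conn = build_demo()
    # cinema_id 2 appears twice, so the uniqueness test would fail.
    print("duplicated cinema_id values:", count_duplicates(conn, "cinema", "cinema_id"))
    # screening 12 points at cinema 99, which does not exist.
    print("orphaned screenings:", count_orphans(conn, "screening", "cinema_id", "cinema", "cinema_id"))
```

A test passes when the returned count is zero. The same two queries could be run through Spark SQL or against Postgres with little change.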


## Requirements for the current test file

* A local installation of Spark
* A Postgres database with the test tables set up


## Execution

* Adjust this part of the `tests_processor.py` script to match the location of your Dtest installation:
```
sys.path.append('../../../dtest')
from dtest.dtest import Dtest
```

* Run the script, passing the path to the test definitions file: `python tests_processor.py ./tests/test_definitions.json`


## TODO

- [x] add timing calculation to the execution of the test
- [ ] complete Dtest integration to the suite (sending the message)
- [ ] add code tests
- [ ] remove username and password from test file
- [ ] add a score function test against two variables from two data sets
- [ ] filter : a number is out of range (e.g. mileage < 0)
- [ ] count of yesterday's record > today + 10%
- [x] count of null fields > amount
- [ ] clean up code
- [ ] create generic sql test
- [ ] cross environment test execution (e.g. a table in a database and a file in parquet)

Example of a possible generic SQL test definition (not implemented yet):
```
"raw-query-test-example" : {
    "description" : "NOT IMPLEMENTED!! example of a raw sql test",
    "test_type" : "custom_sql",
    "table" : "cinema-file",
    "sql_code" : "select count(1) error_cells from cinema where cinema_id < 1000",
    "validation" : "df['error_cells'] < 100"
}
```
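
Since the `custom_sql` test type is not implemented, the following is only a sketch of how such a definition could be evaluated. It assumes the query returns a single row, uses `sqlite3` in place of Spark, and exposes that row to the `validation` expression as a plain dict named `df`. The function name `run_custom_sql_test` and the demo table are hypothetical.

```python
import sqlite3


def run_custom_sql_test(conn, test):
    """Run test["sql_code"] and evaluate test["validation"] on the first row."""
    cursor = conn.execute(test["sql_code"])
    columns = [col[0] for col in cursor.description]
    row = cursor.fetchone()
    # Expose the row as `df` so that df['error_cells'] resolves by column name.
    df = dict(zip(columns, row))
    # eval() is used for illustration only; builtins are disabled, but this
    # is still not safe for test definitions from untrusted sources.
    return bool(eval(test["validation"], {"__builtins__": {}}, {"df": df}))


def build_cinema_db():
    """Create an in-memory database with a hypothetical `cinema` table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE cinema (cinema_id INTEGER, name TEXT);
        INSERT INTO cinema VALUES (1, 'Odeon'), (2, 'Rex'), (5000, 'Lux');
        """
    )
    return conn


if __name__ == "__main__":
    test = {
        "description": "example of a raw sql test",
        "test_type": "custom_sql",
        "sql_code": "select count(1) error_cells from cinema where cinema_id < 1000",
        "validation": "df['error_cells'] < 100",
    }
    print("test passed:", run_custom_sql_test(build_cinema_db(), test))
```

Keeping the validation as an expression over named result columns keeps the test definition declarative. A stricter implementation would likely replace `eval` with a small set of supported comparisons.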

## Download files

| Filename | Size | File type | Python version |
|---|---|---|---|
| testaton-0.1.1-py3-none-any.whl | 7.6 kB | Wheel | py3 |
| testaton-0.1.1.tar.gz | 6.0 kB | Source | None |
