APF Engine: Distributed Python Workflow configuration and execution platform
# Informative Comparison of DataFrame Differences
This tool compares and tests pandas DataFrames. It has three parts.
### Core function
At its core is a function that compares several aspects of two DataFrames:
- Number of rows
- Column names
- dtypes
- Index
- Integers
- Floats
- Datetimes
- Objects (typically strings)
- NaNs
- Booleans
For each dimension in which differences are found, a key is included in the diffs dictionary that is returned. The value for that key (e.g. ‘nrows’) is a string description (e.g. ‘observed contains 5 rows. expected 10.’).
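To make the shape of the returned dictionary concrete, here is a minimal sketch covering three of the dimensions listed above. It is not the package's implementation: the function name `compare_frames` and the keys `columns` and `dtypes` are assumptions made for illustration. Only the `nrows` key and its message format come from the description above.

```python
import pandas as pd


def compare_frames(observed: pd.DataFrame, expected: pd.DataFrame) -> dict:
    """Hypothetical sketch: one dictionary entry per dimension that differs."""
    diffs = {}

    # Number of rows
    if len(observed) != len(expected):
        diffs["nrows"] = (
            f"observed contains {len(observed)} rows. expected {len(expected)}."
        )

    # Column names (order-sensitive). The key name is an assumption.
    if list(observed.columns) != list(expected.columns):
        diffs["columns"] = (
            f"observed columns {list(observed.columns)}. "
            f"expected {list(expected.columns)}."
        )

    # dtypes, checked only for columns present in both frames.
    # The key name is an assumption.
    shared = [c for c in expected.columns if c in observed.columns]
    mismatched = {
        c: (str(observed[c].dtype), str(expected[c].dtype))
        for c in shared
        if observed[c].dtype != expected[c].dtype
    }
    if mismatched:
        diffs["dtypes"] = f"dtype mismatches (observed, expected): {mismatched}"

    return diffs


observed = pd.DataFrame({"a": range(5), "b": [0.5] * 5})
expected = pd.DataFrame({"a": range(10), "b": [1] * 10})
diffs = compare_frames(observed, expected)
print(diffs["nrows"])  # observed contains 5 rows. expected 10.
```

Two identical frames produce an empty dictionary, so a caller can treat a non-empty result as a failure and print the messages it contains.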
### PyTest Integration
If one is testing functionality that produces a dataframe, and one can create one to compare it to…
`pytest_kit.py` automatically injects N separate tests into a test module (`test_<name>.py`) once you do the following two things:
1. Create two pytest fixtures: `df_observed` and `df_expected`.
2. Add the import statement `from df_compare.pytest_kit import *`.

That’s it. When you run `$ pytest test_<name>.py`, pytest will discover and run N separate tests!
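A test module that follows the two steps might look like the sketch below. The file name, the helper functions `build_observed` and `build_expected`, and the sample data are hypothetical. The `try`/`except` guard around the import is only there so the sketch loads where `df_compare` is not installed; a real test module would use the bare import.

```python
# test_example.py (hypothetical file name)
import pandas as pd
import pytest

try:
    # Step 2: the star import pulls the packaged tests into this module's
    # namespace, which is where pytest looks for them during collection.
    from df_compare.pytest_kit import *  # noqa: F401,F403
except ImportError:
    # Illustration only: lets the sketch load without df_compare installed.
    pass


def build_expected() -> pd.DataFrame:
    """Hand-written frame that the result should match."""
    return pd.DataFrame({"id": [1, 2, 3], "price": [9.5, 20.0, 3.25]})


def build_observed() -> pd.DataFrame:
    """Stand-in for the code under test, which would normally compute this."""
    return pd.DataFrame({"id": [1, 2, 3], "price": [9.5, 20.0, 3.25]})


# Step 1: the two fixtures, using the names given in the list above.
@pytest.fixture
def df_observed():
    return build_observed()


@pytest.fixture
def df_expected():
    return build_expected()
```

Keeping the frame construction in plain functions is a design choice for this sketch: the fixtures stay one-liners, and the same builders can be called directly outside pytest.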
### Shell script
The comparison can also be run on two files, or even on a directory of files containing tables.
###### Details to be added
[![Build Status](https://travis-ci.org/caseyclements/pennies.svg?branch=master)](https://travis-ci.org/caseyclements/pennies)