Utility for comparing results between data sources
Comparator
Comparator is a utility for comparing the results of queries run against two databases. Future development will include support for APIs, static files, and more.
Installation
pip install comparator
Usage
Overview
from comparator import Comparator
from comparator.config import DbConfig
from comparator.db import PostgresDb
conf = DbConfig()
l = PostgresDb(**conf.default)
r = PostgresDb(**conf.other_db)
query = 'SELECT * FROM my_table ORDER BY 1'
c = Comparator(l, r, query)
c.run_comparisons()
[('first_eq_comp', True)]
Included Comparisons
A few basic comparisons are included. Import their constants and pass them through the comparisons argument.
from comparator.comps import BASIC_COMP, LEN_COMP
c = Comparator(l, r, query, comparisons=[BASIC_COMP, LEN_COMP])
c.run_comparisons()
[('basic_comp', True), ('len_comp', True)]
Queries and Exceptions
You can run a different query against each database by passing left_query and right_query. To fail loudly, iterate over compare() and raise an exception when a check does not succeed.
lq = 'SELECT * FROM my_table ORDER BY 1'
rq = 'SELECT id, uuid, name FROM reporting.my_table ORDER BY 1'
comparisons = [BASIC_COMP, LEN_COMP]
c = Comparator(l, r, left_query=lq, right_query=rq, comparisons=comparisons)
for name, success in c.compare():
    if not success:
        raise Exception('{} check failed!'.format(name))
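The loop above stops at the first failed check. As a variation, the sketch below collects every failure before raising, so a single error reports all of them. It assumes only that the results are (name, success) pairs, as in the loop above. The helpers failed_checks and raise_if_any_failed are hypothetical names and not part of comparator, and the sample list stands in for the output of c.compare().

```python
def failed_checks(results):
    """Return the names of all checks whose result is falsy."""
    return [name for name, success in results if not success]


def raise_if_any_failed(results):
    """Raise a single AssertionError that lists every failed check."""
    failed = failed_checks(results)
    if failed:
        raise AssertionError('checks failed: {}'.format(', '.join(failed)))


# Hypothetical stand-in for list(c.compare()).
sample_results = [('basic_comp', True), ('len_comp', False)]
print(failed_checks(sample_results))  # ['len_comp']
```

With a real Comparator instance, the call would presumably look like raise_if_any_failed(c.compare()).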
Custom Comparisons
Finally, you'll probably want to define your own comparison checks. To do so, define functions that accept left and right args and return a boolean. If the results come from one of the included database classes, each arg will be a list of tuples representing your query result. Perform whatever magic you like in between.
def left_is_longer(left, right):
    # Return True if left contains more rows than right
    return len(left) > len(right)

def totals_are_equal(left, right):
    # Return True if the second column sums to the same total on both sides
    sl = sr = 0
    for row in left:
        sl += int(row[1])
    for row in right:
        sr += int(row[1])
    return sl == sr
c = Comparator(l, r, query, comparisons=[left_is_longer, totals_are_equal])
c.run_comparisons()
[('left_is_longer', False), ('totals_are_equal', True)]
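As one more illustration of the same pattern, here is a minimal sketch of a check that ignores row order. This can help when the two queries cannot share an ORDER BY. It assumes each result is a list of tuples whose values are hashable. The function name same_rows_any_order is made up for this example, and the sample rows are invented data.

```python
from collections import Counter


def same_rows_any_order(left, right):
    # Return True if both results hold the same rows the same number of
    # times, regardless of the order in which the rows appear.
    return Counter(left) == Counter(right)


# Invented sample data shaped like a query result (a list of tuples).
left = [(1, 'a'), (2, 'b'), (2, 'b')]
right = [(2, 'b'), (1, 'a'), (2, 'b')]
print(same_rows_any_order(left, right))  # True
```

Like the functions above, it takes left and right and returns a boolean, so it could be passed in the comparisons list in the same way.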
Changelog
0.1.1 (2018-09-18)
- add query_df methods for returning pandas DataFrames
- add output kwarg to Comparator to allow calling the query_df method
0.1.0 (2018-09-12)
- initial release