A tool to find the differences between two tables.

Project description

polars_compare

A tool to compare and find the differences between two Polars DataFrames.

To do:

  • Linting (Ruff)
  • Make into python package
  • Add makefile for easy linting and tests
  • Statistics should indicate which statistics are referencing columns
  • Add all statistics frame to tests
  • Add schema differences to schema summary
  • Make row examples alternate between base only and compare only so that it is more readable.
  • Add limit value to the examples.
  • Update the value differences summary so that the Statistic column is meaningful.
  • Publish package to pypi
  • Add difference criterion.
  • Add license
  • [] Make package easy to use (i.e. so you only have to import pl_compare and then you can use pl_compare)
  • [] Raise error and print examples if duplicates are present.
  • [] Add a count of the number of rows that have any differences to the value differences summary.
  • [] Add total number of value differences to the value differences summary.
  • [] Add parameter to hide column differences with 0 differences.
  • [] Update report so that non differences are not displayed.
  • [] Add table name labels that can replace 'base' and 'compare'.
  • [] Change id_columns to be named 'join_on' and add a test that checks that arbitrary join conditions work.
  • [] Update code to use a config dataclass that can be passed between the class and functions.
  • [] Test for large amounts of data
  • [] Benchmark for different sizes of data.
  • [] Write up docstrings
  • [] Write up readme (with code examples)
  • [] strict MyPy type checking
  • [] Github actions for testing
  • [] Github actions for linting
  • [] Github actions for publishing

Download files

Source Distribution

pl_compare-0.1.3.tar.gz (7.5 kB)

Uploaded Source

Built Distribution

pl_compare-0.1.3-py3-none-any.whl (8.5 kB)

Uploaded Python 3
