Simple command line tool to reconcile datasets

recon-cli: Simple command line tool to reconcile datasets

What is it

recon-cli is a Python package and CLI tool for reconciling two datasets against each other using a common field. It aims to provide a simple interface for reliable reconciliations, removing the logic errors commonly made when reconciling by hand.
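
To make "reconcile on a common field" concrete, here is a minimal pure-Python sketch of the idea. It is an illustration only, not recon-cli's implementation: the `reconcile_keys` helper and the sample records are hypothetical, and the result names merely mirror the properties described under Usage.

```python
# Illustrative only: key-based reconciliation with plain Python data.
# `reconcile_keys` and the sample records are hypothetical.

def reconcile_keys(left, right, left_on, right_on):
    """Split records into left-only, right-only and matched groups by key."""
    left_keys = {row[left_on] for row in left}
    right_keys = {row[right_on] for row in right}
    return {
        "left_only": [r for r in left if r[left_on] not in right_keys],
        "right_only": [r for r in right if r[right_on] not in left_keys],
        "left_both": [r for r in left if r[left_on] in right_keys],
        "right_both": [r for r in right if r[right_on] in left_keys],
    }


sales = [{"Document #": "SO-1"}, {"Document #": "SO-2"}]
deliveries = [{"Sales Order #": "SO-2"}, {"Sales Order #": "SO-3"}]

result = reconcile_keys(sales, deliveries, "Document #", "Sales Order #")
print(result["left_only"])   # orders with no delivery
print(result["right_only"])  # deliveries with no order
```

Every record lands in exactly one of the "only" or "both" groups for its side, which is the bookkeeping that is easy to get wrong in an ad hoc reconciliation.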

Where to get it

The source code is currently hosted on GitHub at: https://github.com/mynhardtburger/recon-cli

Binary installers for the latest released version are available at the Python Package Index (PyPI).

For command-line use, install recon-cli via pipx:

# For command line usage
pipx install recon-cli
recon --help

To use it within your own project, install it via pip:

# PyPI install
pip install recon-cli

Usage

CLI

 Usage: recon [OPTIONS] LEFT RIGHT LEFT_ON RIGHT_ON

╭─ Arguments ──────────────────────────────────────────────────────────────────────────────────────╮
│ *    left          FILE  Path to the left dataset. [required]                                    │
│ *    right         FILE  Path to the right dataset. [required]                                   │
│ *    left_on       TEXT  Reconcile using this field from the left dataset. [required]            │
│ *    right_on      TEXT  Reconcile using this field from the right dataset. [required]           │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────╮
│ --help          Show this message and exit.                                                      │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Input options ──────────────────────────────────────────────────────────────────────────────────╮
│ --left-suffix     TEXT  Suffix to append to the left dataset's column names. [default: _left]    │
│ --right-suffix    TEXT  Suffix to append to the right dataset's column names. [default: _right]  │
│ --left-sheet      TEXT  Sheet to read from left if left is a spreadsheet. [default: Sheet1]      │
│ --right-sheet     TEXT  Sheet to read from right if right is a spreadsheet. [default: Sheet1]    │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Output options ─────────────────────────────────────────────────────────────────────────────────╮
│ --output-file                  TEXT  Path to save results (in xlsx format) to.                   │
│ --std-out        --no-std-out          Print results to stdout. [default: no-std-out]            │
│ --info-only      --no-info-only        Print summary results only. [default: no-info-only]       │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
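
As a sketch of how the arguments and options above fit together, the following builds one invocation and prints it as a shell-safe command line. The file names, column names and sheet name are assumptions borrowed from the Python example in this README; only the positional order and the `--left-sheet` and `--info-only` options come from the usage text.

```python
import shlex

# Hypothetical inputs; the argument order follows
# `recon [OPTIONS] LEFT RIGHT LEFT_ON RIGHT_ON` from the usage text.
command = [
    "recon",
    "sales_orders.xlsx",      # LEFT
    "deliveries.csv",         # RIGHT
    "Document #",             # LEFT_ON
    "Sales Order #",          # RIGHT_ON
    "--left-sheet", "Sales",  # read the "Sales" sheet of the left workbook
    "--info-only",            # print summary results only
]

# shlex.join quotes the column names, which contain spaces and '#'.
print(shlex.join(command))

# To actually run it (requires recon-cli to be installed):
# import subprocess
# subprocess.run(command, check=True)
```

Quoting matters here because an unquoted `#` starts a comment in most shells and the space would split the column name into two arguments.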

Python

# Recon import
from recon import Reconcile

# Read from files
recon = Reconcile.read_files(
    left_file="sales_orders.xlsx",
    right_file="deliveries.csv",
    left_on="Document #",
    right_on="Sales Order #",
    left_kwargs={"sheet_name": "Sales"},
)

# Or read from pandas dataframes
recon = Reconcile.read_df(
    left_df=sales_df,
    right_df=deliveries_df,
    left_on="Document #",
    right_on="Sales Order #",
)

# Properties:
# Components of the recon are lazily evaluated and cached as you access the relevant properties.
# All properties return a pandas DataFrame.
# The original indexes are preserved, except for the .both property where the original indexes are columns.
recon.left  # Original left dataset
recon.right  # Original right dataset

recon.left_only  # Records from the left dataset that are not found in the right dataset.
recon.right_only  # Records from the right dataset that are not found in the left dataset.

recon.left_duplicate  # Duplicate records in the left dataset. The first record is not listed.
recon.right_duplicate  # Duplicate records in the right dataset. The first record is not listed.

recon.left_both  # Records from the left dataset that are also found in the right dataset.
recon.right_both  # Records from the right dataset that are also found in the left dataset.

recon.both  # Common records. A merger of both datasets
recon.all_data # All records classified. A merger of both datasets

recon.is_left_unique  # bool. True if the `left_on` field contains no duplicate values.
recon.is_right_unique  # bool. True if the `right_on` field contains no duplicate values.
recon.relationship  # 1:1, 1:m, m:1 or m:m relationship between datasets

# Output methods:
# `recon_components` parameter is an ordered list of any of the DataFrame property names.
# "all" is a shorthand for most properties.
recon.info()  # Prints a summary of recon results
recon.to_stdout(recon_components=["all"]) # Prints all recon results to console
recon.to_xlsx(path="recon_results.xlsx", recon_components=["all"]) # Saves all recon results to xlsx
recon.to_object() # returns a ReconciledReport object
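
The duplicate and relationship properties above can be read as follows. This is a pure-Python sketch of one plausible interpretation, not recon-cli's code: `later_duplicates` and `relationship` are hypothetical helpers, and it is an assumption that the package derives the relationship from key uniqueness on each side in this way.

```python
# Hypothetical helpers illustrating the semantics described in the comments above.

def later_duplicates(keys):
    """Return positions of repeated keys, skipping each key's first occurrence."""
    seen = set()
    positions = []
    for i, key in enumerate(keys):
        if key in seen:
            positions.append(i)
        seen.add(key)
    return positions


def relationship(left_keys, right_keys):
    """Classify as 1:1, 1:m, m:1 or m:m from key uniqueness on each side."""
    left_unique = len(set(left_keys)) == len(left_keys)
    right_unique = len(set(right_keys)) == len(right_keys)
    return f"{'1' if left_unique else 'm'}:{'1' if right_unique else 'm'}"


left_keys = ["SO-1", "SO-2", "SO-3"]
right_keys = ["SO-2", "SO-2", "SO-3", "SO-2"]

print(later_duplicates(right_keys))         # positions of the repeated "SO-2" rows
print(relationship(left_keys, right_keys))  # left unique, right not unique
```

Skipping the first occurrence matches the note that "the first record is not listed": a key that appears three times contributes two rows to the duplicate listing.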

Dependencies

License

MIT
