Tools to diff JSON-format data.

Project description

diffjson

Utilities to diff json data. https://nfwprod.github.io/diffjson/

Features

  • Search data with XPath-like path strings.
  • Diff multiple JSON data sets at once.

Branch Class for Search

Branch Class

Convert dict/list JSON data to the Branch class format with 'generate_branch'. The Branch classes form a hierarchical tree.

  • RootBranch
    • Root for all child branches.
    • Provide search and other methods for users.
  • DictBranch
    • Child branch for dict format child.
  • ListBranch
    • Child branch for list format child.
  • Leaf
    • Terminal node for str, bool, int, and float values.
  • Branch
    • Common base class that the other branch classes inherit from.
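
The hierarchy above can be pictured with a small stand-alone sketch. This is not diffjson's implementation: the names SketchLeaf, SketchDictBranch, SketchListBranch and build_tree are hypothetical and only mirror the roles described in the list.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Union


@dataclass
class SketchLeaf:
    """Terminal node holding a scalar (str, bool, int, float, or None)."""
    value: Any


@dataclass
class SketchDictBranch:
    """Node whose children are addressed by dict key."""
    children: Dict[str, "SketchNode"] = field(default_factory=dict)


@dataclass
class SketchListBranch:
    """Node whose children are addressed by list position."""
    children: List["SketchNode"] = field(default_factory=list)


SketchNode = Union[SketchLeaf, SketchDictBranch, SketchListBranch]


def build_tree(data: Any) -> SketchNode:
    """Recursively wrap dict/list/scalar data in the matching node type."""
    if isinstance(data, dict):
        return SketchDictBranch({k: build_tree(v) for k, v in data.items()})
    if isinstance(data, list):
        return SketchListBranch([build_tree(v) for v in data])
    return SketchLeaf(data)


tree = build_tree({"branch01": {"b01-01": "string", "b01-02": 1}, "items": [2.0, True]})
print(type(tree.children["branch01"]).__name__)  # SketchDictBranch
print(type(tree.children["items"]).__name__)     # SketchListBranch
print(tree.children["items"].children[0].value)  # 2.0
```

Keeping one node type per JSON container makes it easy to attach path-aware methods to the root while the children stay simple.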

Search

The Branch class accepts searches written as XPath-like strings. (Strongly inspired by jsonpath-ng!)

# Example Data, sample.yaml
branch01:
  b01-01: string
  b01-02: 1
  b01-03: 2.0
  b01-04: True
branch02:
  b02-01:
    - name: n02-01-i01
      value: v02-01-i01
    - name: n02-01-i02
      value:
        b02-01-i02-01:
          - name: n02-01-i02-01-i01
            value: v02-01-i02-01-i01
          - name: n02-01-i02-01-i02
            value: v02-01-i02-01-i02
          - name: n02-01-i02-01-i03
            value: v02-01-i02-01-i03
        b02-01-i02-02:
          name: n02-01-i02-02
          value: v02-01-i02-02
    - name: n02-01-i03
      value:
        b02-01-i03-01: v02-01-i03-01
branch03:
  b03-01: null

import diffjson
import yaml

# Load the example data shown above
with open('sample.yaml', 'r') as f:
    sampledata = yaml.safe_load(f)

# Get the value stored at /branch01/b01-01
b = diffjson.generate_branch(sampledata)
result = b.search('/branch01/b01-01')

print(result)
> ['string']

Search always returns its matches as a list, even when only one node matches.
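
To illustrate the list-style return value without depending on the library, here is a minimal sketch of a path lookup over plain nested dicts. The function sketch_search is a hypothetical helper, not diffjson's parser, and it only understands absolute '/key/key' paths.

```python
from typing import Any, List


def sketch_search(data: Any, path: str) -> List[Any]:
    """Follow an absolute '/key/key' path through nested dicts.

    Returns a list of matches: one element on success, empty on a miss.
    """
    current = [data]
    for step in [s for s in path.split("/") if s]:
        # Keep only the dict nodes that contain this step's key.
        current = [node[step] for node in current
                   if isinstance(node, dict) and step in node]
    return current


sample = {
    "branch01": {"b01-01": "string", "b01-02": 1},
    "branch03": {"b03-01": None},
}

print(sketch_search(sample, "/branch01/b01-01"))   # ['string']
print(sketch_search(sample, "/branch01/missing"))  # []
```

Returning a list in every case lets callers tell "no match" (an empty list) apart from a match whose value is null.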

DiffBranch Class for Diffing JSON Data

DiffBranch Class

Diff multiple JSON data sets as Branch instances with 'diff_branches'.

  • DiffRootBranch
    • Root for all child diff branches.
  • DiffCommonBranch
    • Child branch for all data format.
    • Diffs for dict, list, and leaf data are all held in DiffCommonBranch.Branch.
  • DiffBranch
    • Common base class that the other diff branch classes inherit from.

Diff

diffbranch = diffjson.diff_branch([data01, data02, data03])

# Export diff in csv format
diffbranch.export_csv('./diff.csv')
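
The idea of a multi-way diff exported as CSV can be sketched with the standard library alone. The column layout below is an assumption made for illustration and is not claimed to match the file written by export_csv; flatten and sketch_diff_csv are hypothetical helpers.

```python
import csv
import io
from typing import Any, Dict, List


def flatten(data: Any, prefix: str = "") -> Dict[str, Any]:
    """Map every scalar in nested dicts/lists to a '/'-separated path."""
    if isinstance(data, dict):
        items = data.items()
    elif isinstance(data, list):
        items = ((str(i), v) for i, v in enumerate(data))
    else:
        return {prefix or "/": data}
    flat: Dict[str, Any] = {}
    for key, value in items:
        flat.update(flatten(value, f"{prefix}/{key}"))
    return flat


def sketch_diff_csv(datasets: List[Any]) -> str:
    """Return CSV text with one row per path whose values differ."""
    flats = [flatten(d) for d in datasets]
    paths = sorted(set().union(*flats))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["path"] + [f"data{i + 1:02d}" for i in range(len(flats))])
    for path in paths:
        # A path absent from one data set counts as a difference.
        values = [f.get(path, "<missing>") for f in flats]
        if any(v != values[0] for v in values):
            writer.writerow([path] + values)
    return out.getvalue()


data01 = {"a": 1, "b": {"c": "x"}}
data02 = {"a": 1, "b": {"c": "y"}}
data03 = {"a": 2, "b": {"c": "x"}, "d": True}
print(sketch_diff_csv([data01, data02, data03]))
```

One column per input keeps the comparison readable when more than two data sets are diffed.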

NodenameMasks Options

Sometimes data is stored in a list and the order of its items changes between versions. For example:

Data Before.

- name: id01
  value: data01
- name: id02
  value: data02
- name: id03
  value: data03

Data After.

- name: id01
  value: data01
- name: id03
  value: data03
- name: id02
  value: changed

We want to compare the "name: id02" item with the "name: id02" item. We do not want to compare the items that happen to share the second position, "name: id02" and "name: id03".

For such cases, use nodename masks.

masks = {
    '/': lambda x: x['name'],
    '/branch01': lambda x: x['id'],
}

diffbranch = diffjson.diff_branch(
    [data01, data02, data03],
    nodenamemasks=masks)

A nodename mask converts the list at the given path into a dict, using the key returned by the lambda for each item, as follows.

Data Before

id01:
  name: id01
  value: data01
id02:
  name: id02
  value: data02
id03:
  name: id03
  value: data03

Data After

id01:
  name: id01
  value: data01
id02:
  name: id02
  value: changed
id03:
  name: id03
  value: data03
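
The conversion shown above can be reproduced in a few lines. This is a minimal sketch of the idea, not the library's code; sketch_apply_mask is a hypothetical helper, and it assumes the key function returns a unique key for each item (duplicate keys would overwrite one another).

```python
from typing import Any, Callable, Dict, List


def sketch_apply_mask(items: List[Any], key_func: Callable[[Any], str]) -> Dict[str, Any]:
    """Re-key a list as a dict using key_func, so order no longer matters."""
    return {key_func(item): item for item in items}


before = [
    {"name": "id01", "value": "data01"},
    {"name": "id02", "value": "data02"},
    {"name": "id03", "value": "data03"},
]
after = [
    {"name": "id01", "value": "data01"},
    {"name": "id03", "value": "data03"},
    {"name": "id02", "value": "changed"},
]

mask = lambda x: x["name"]
keyed_before = sketch_apply_mask(before, mask)
keyed_after = sketch_apply_mask(after, mask)

# Compare items by key instead of by list position.
changed = [k for k in keyed_before if keyed_before[k] != keyed_after.get(k)]
print(changed)  # ['id02']
```

A positional comparison of the same lists would report both the second and third items as changed. Keying by name reports only id02.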

Diff Search

The DiffBranch class also accepts searches.

Arguments are as follows.

  • locationpath_string (str): XPath-format search string.
  • details (bool, optional): Return each matched path together with its value. Default: False (values only).
  • dump_mode (str, optional): DiffTwoBranch only.
    • 'all': Dump all branches.
    • 'added': Dump added branches, including each added root and its children.
    • 'bulk_added': Dump only the root of each added branch, ignoring its children.
    • 'removed': Dump removed branches, including each removed root and its children.
    • 'bulk_removed': Dump only the root of each removed branch, ignoring its children.
    • 'changed': Dump changed branches only.

diffbranch = diffjson.diff_branch([data01, data02, data03])

# Search all paths and return added branches with their paths
diffbranch.search('//', dump_mode='added', details=True)
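
The difference between the dump modes, in particular 'added' versus 'bulk_added', can be illustrated with a two-way comparison of plain dicts. This is a sketch under stated assumptions rather than the library's logic: walk and sketch_classify are hypothetical helpers, and lists are treated as opaque values.

```python
from typing import Any, Dict, List


def walk(data: Any, prefix: str) -> List[str]:
    """List prefix plus the paths of every descendant of data."""
    paths = [prefix]
    if isinstance(data, dict):
        for key, value in data.items():
            paths.extend(walk(value, f"{prefix}/{key}"))
    return paths


def sketch_classify(old: Dict[str, Any], new: Dict[str, Any],
                    prefix: str = "") -> Dict[str, List[str]]:
    """Sort the paths of two nested dicts into dump-mode style buckets."""
    result: Dict[str, List[str]] = {
        "added": [], "bulk_added": [],
        "removed": [], "bulk_removed": [], "changed": [],
    }
    for key in sorted(set(old) | set(new)):
        path = f"{prefix}/{key}"
        if key not in old:
            result["bulk_added"].append(path)             # root only
            result["added"].extend(walk(new[key], path))  # root and children
        elif key not in new:
            result["bulk_removed"].append(path)
            result["removed"].extend(walk(old[key], path))
        elif isinstance(old[key], dict) and isinstance(new[key], dict):
            for mode, paths in sketch_classify(old[key], new[key], path).items():
                result[mode].extend(paths)
        elif old[key] != new[key]:
            result["changed"].append(path)
    return result


old = {"a": 1, "b": {"c": 2}, "gone": {"x": 0}}
new = {"a": 1, "b": {"c": 3}, "extra": {"y": {"z": 1}}}
modes = sketch_classify(old, new)
print(modes["added"])       # ['/extra', '/extra/y', '/extra/y/z']
print(modes["bulk_added"])  # ['/extra']
print(modes["changed"])     # ['/b/c']
```

The bulk modes are convenient when a whole subtree appears or disappears and listing every child would only add noise.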

Download files

Source Distribution

diffjson-0.1.2.tar.gz (16.4 kB)

Uploaded Source

Built Distribution

diffjson-0.1.2-py3-none-any.whl (17.7 kB)

Uploaded Python 3