Tools to diff json format data.

diffjson

Utilities to diff json data. https://nfwprod.github.io/diffjson/

Features

  • XPath-like search.
  • Diff across multiple JSON data sets.

Branch Class for Search

Branch Class

Convert dict/list JSON data to the Branch class format with 'generate_branch'. The Branch classes form a hierarchical tree.

  • RootBranch
    • Root for all child branches.
    • Provide search and other methods for users.
  • DictBranch
    • Child branch for dict format child.
  • ListBranch
    • Child branch for list format child.
  • Leaf
    • Edge branch for str, bool, int, and float.
  • Branch
    • Common base class that the other branch classes inherit from.
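
The hierarchy above can be pictured with a small stand-alone sketch. This is not diffjson's implementation: the names SketchLeaf, SketchDictBranch, SketchListBranch, and build_tree are hypothetical and only illustrate how dict/list data could be wrapped in one node type per data shape.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Union


@dataclass
class SketchLeaf:
    """Edge node holding a scalar value (hypothetical, not diffjson's Leaf)."""
    value: Any


@dataclass
class SketchDictBranch:
    """Node whose children are addressed by dict key."""
    children: Dict[str, "SketchNode"] = field(default_factory=dict)


@dataclass
class SketchListBranch:
    """Node whose children are addressed by list position."""
    children: List["SketchNode"] = field(default_factory=list)


SketchNode = Union[SketchLeaf, SketchDictBranch, SketchListBranch]


def build_tree(data: Any) -> SketchNode:
    """Recursively wrap dict, list, or scalar data in the matching node type."""
    if isinstance(data, dict):
        return SketchDictBranch({k: build_tree(v) for k, v in data.items()})
    if isinstance(data, list):
        return SketchListBranch([build_tree(v) for v in data])
    return SketchLeaf(data)


tree = build_tree({"branch01": {"b01-01": "string"}, "items": [1, 2.0]})
print(type(tree).__name__)                                  # SketchDictBranch
print(type(tree.children["items"]).__name__)                # SketchListBranch
print(tree.children["branch01"].children["b01-01"].value)   # string
```

A single recursive function is enough here because each input shape (dict, list, scalar) maps to exactly one node type.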

Search

Branch classes accept XPath-like search strings. (Strongly inspired by jsonpath-ng!)

# Example Data, sample.yaml
branch01:
  b01-01: string
  b01-02: 1
  b01-03: 2.0
  b01-04: True
branch02:
  b02-01:
    - name: n02-01-i01
      value: v02-01-i01
    - name: n02-01-i02
      value:
        b02-01-i02-01:
          - name: n02-01-i02-01-i01
            value: v02-01-i02-01-i01
          - name: n02-01-i02-01-i02
            value: v02-01-i02-01-i02
          - name: n02-01-i02-01-i03
            value: v02-01-i02-01-i03
        b02-01-i02-02:
          name: n02-01-i02-02
          value: v02-01-i02-02
    - name: n02-01-i03
      value:
        b02-01-i03-01: v02-01-i03-01
branch03:
  b03-01: null
# Example Code
import diffjson
import yaml

with open('sample.yaml', 'r') as f:
    sampledata = yaml.safe_load(f)

# Get the value stored at /branch01/b01-01
b = diffjson.generate_branch(sampledata)
result = b.search('/branch01/b01-01')

print(result)
# > ['string']

Search returns all matched values as a list.
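
To make the "path in, list of matches out" behaviour concrete, here is a minimal sketch in plain Python. It does not use diffjson and does not reproduce its search syntax: the function search_path and the '*' step for "every list item" are assumptions made for illustration only.

```python
from typing import Any, List


def search_path(data: Any, path: str) -> List[Any]:
    """Follow '/'-separated steps and return every value that is reached.

    Dict steps match a key by name. List steps accept an integer index,
    or '*' (an assumption of this sketch) to fan out over every item.
    """
    current = [data]
    for step in [s for s in path.split("/") if s]:
        matched = []
        for node in current:
            if isinstance(node, dict) and step in node:
                matched.append(node[step])
            elif isinstance(node, list):
                if step == "*":
                    matched.extend(node)
                elif step.isdigit() and int(step) < len(node):
                    matched.append(node[int(step)])
        current = matched
    return current


sample = {
    "branch01": {"b01-01": "string", "b01-02": 1},
    "branch02": {"b02-01": [{"name": "n02-01-i01"}, {"name": "n02-01-i02"}]},
}

print(search_path(sample, "/branch01/b01-01"))         # ['string']
print(search_path(sample, "/branch02/b02-01/*/name"))  # ['n02-01-i01', 'n02-01-i02']
print(search_path(sample, "/missing"))                 # []
```

Returning a list even for a single match keeps the caller's code the same whether a path hits zero, one, or many nodes.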

DiffBranch Class for Diff JSON Data

DiffBranch Class

Diff multiple JSON data sets with 'diff_branches'; the result is returned as a Branch-style tree built from the following classes.

  • DiffRootBranch
    • Root for all child diff branches.
  • DiffCommonBranch
    • Child branch for every data format.
    • Diffs for dict, list, and leaf data are all held in DiffCommonBranch.Branch.
  • DiffBranch
    • Common base class that the other diff branch classes inherit from.

Diff

diffbranch = diffjson.diff_branch([data01, data02, data03])

# Export diff in csv format
diffbranch.export_csv('./diff.csv')
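
As a rough picture of what a multi-way diff exported as CSV can look like, the following sketch flattens each data set to path/value pairs and keeps the paths whose values disagree. It is independent of diffjson: flatten, diff_rows, the "<missing>" marker, and the column layout are all assumptions, and the library's actual CSV format may differ.

```python
import csv
import io
from typing import Any, Dict, List


def flatten(data: Any, prefix: str = "") -> Dict[str, Any]:
    """Map every leaf value to its '/'-joined path."""
    if isinstance(data, dict):
        items = data.items()
    elif isinstance(data, list):
        items = enumerate(data)
    else:
        return {prefix or "/": data}
    flat: Dict[str, Any] = {}
    for key, value in items:
        flat.update(flatten(value, f"{prefix}/{key}"))
    return flat


def diff_rows(datasets: List[Any]) -> List[List[Any]]:
    """One row per path whose value differs, or is absent, in any data set."""
    flats = [flatten(d) for d in datasets]
    missing = "<missing>"  # placeholder chosen for this sketch
    rows = []
    for path in sorted(set().union(*flats)):
        values = [f.get(path, missing) for f in flats]
        if any(v != values[0] for v in values[1:]):
            rows.append([path, *values])
    return rows


data01 = {"a": 1, "b": {"c": "x"}}
data02 = {"a": 1, "b": {"c": "y"}}
data03 = {"a": 2, "b": {"c": "x"}, "d": True}

rows = diff_rows([data01, data02, data03])

# Write to an in-memory buffer; pass a real file object to save to disk.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["path", "data01", "data02", "data03"])
writer.writerows(rows)
print(buffer.getvalue())
```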

NodenameMasks Options

Sometimes data is stored in a list and the item order changes arbitrarily between data sets. For example:

Data Before.

- name: id01
  value: data01
- name: id02
  value: data02
- name: id03
  value: data03

Data After.

- name: id01
  value: data01
- name: id03
  value: data03
- name: id02
  value: changed

We want to compare "name: id02" with "name: id02". We do not want to compare the second items by position, which would pair "name: id02" with "name: id03".

For such cases, use the nodename mask function.

masks = {'/': lambda x: x['name'], '/branch01': lambda x: x['id']}

diffbranch = diffjson.diff_branch(
                [data01, data02, data03],
                nodenamemasks=masks)

Nodename masks convert the list into a dict, using the key generated by the lambda, as follows.

Data Before

id01:
  name: id01
  value: data01
id02:
  name: id02
  value: data02
id03:
  name: id03
  value: data03

Data After

id01:
  name: id01
  value: data01
id02:
  name: id02
  value: changed
id03:
  name: id03
  value: data03
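
The list-to-dict conversion shown above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not diffjson's code: key_list_items is a hypothetical helper, and raising on duplicate keys is a design choice of this sketch.

```python
from typing import Any, Callable, Dict, List


def key_list_items(
    items: List[Dict[str, Any]],
    make_key: Callable[[Dict[str, Any]], str],
) -> Dict[str, Dict[str, Any]]:
    """Turn a list into a dict keyed by make_key(item), so order no longer matters."""
    keyed: Dict[str, Dict[str, Any]] = {}
    for item in items:
        key = make_key(item)
        if key in keyed:
            raise ValueError(f"duplicate key from mask: {key!r}")
        keyed[key] = item
    return keyed


before = [
    {"name": "id01", "value": "data01"},
    {"name": "id02", "value": "data02"},
    {"name": "id03", "value": "data03"},
]
after = [
    {"name": "id01", "value": "data01"},
    {"name": "id03", "value": "data03"},
    {"name": "id02", "value": "changed"},
]

mask = lambda x: x["name"]  # same shape as the lambdas in the masks dict
keyed_before = key_list_items(before, mask)
keyed_after = key_list_items(after, mask)

# Compare items that share a key, regardless of their list position.
changed = {
    key: (keyed_before[key]["value"], keyed_after[key]["value"])
    for key in keyed_before
    if key in keyed_after and keyed_before[key] != keyed_after[key]
}
print(changed)  # {'id02': ('data02', 'changed')}
```

Once both lists are keyed by name, the reordering of id02 and id03 disappears and only the real value change remains.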

Diff Search

The DiffBranch class also accepts search.

Arguments are as follows.

  • locationpath_string (str): XPath-format search string.
  • details (bool, optional): Return the matched path together with the value. Default: False (value only).
  • dump_mode (str, optional): DiffTwoBranch only.
    • 'all': Dump all branches.
    • 'added': Dump added branches, including each added root and its children.
    • 'bulk_added': Dump only the root of each added branch, ignoring its children.
    • 'removed': Dump removed branches, including each removed root and its children.
    • 'bulk_removed': Dump only the root of each removed branch, ignoring its children.
    • 'changed': Dump changed branches only.

diffbranch = diffjson.diff_branch([data01, data02, data03])

# Search
diffbranch.search('//', dump_mode='added', details=True)
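
The categories behind dump_mode can be illustrated with a small two-way comparison. The sketch below is not diffjson's implementation: classify is a hypothetical function that reports only the root of each added or removed subtree (roughly the 'bulk_added' and 'bulk_removed' idea) plus the paths of changed leaves.

```python
from typing import Any, Dict, List


def classify(old: Any, new: Any, path: str = "") -> Dict[str, List[str]]:
    """Collect paths that were added, removed, or changed between two dicts.

    For an added or removed subtree only its root path is recorded;
    its children are not visited.
    """
    result: Dict[str, List[str]] = {"added": [], "removed": [], "changed": []}
    if isinstance(old, dict) and isinstance(new, dict):
        for key in sorted(old.keys() | new.keys()):
            child = f"{path}/{key}"
            if key not in old:
                result["added"].append(child)
            elif key not in new:
                result["removed"].append(child)
            else:
                sub = classify(old[key], new[key], child)
                for mode, paths in sub.items():
                    result[mode].extend(paths)
    elif old != new:
        result["changed"].append(path or "/")
    return result


old = {"a": 1, "b": {"c": "x", "d": "y"}, "gone": {"k": 1}}
new = {"a": 2, "b": {"c": "x", "d": "z"}, "extra": {"k": 1}}

report = classify(old, new)
print(report["added"])    # ['/extra']
print(report["removed"])  # ['/gone']
print(report["changed"])  # ['/a', '/b/d']
```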
