diffgetr
A Python library for comparing nested data structures with detailed diff reporting and interactive navigation.
Features
- Compare deeply nested dictionaries and lists with customizable precision
- Side-by-side tabular comparison with percentage changes for numeric values
- Summarize differences with key frequency counts and pattern recognition
- Navigate interactively through diff results using dictionary-like syntax
- Support for array indexing and complex nested paths
- Multiple output formats: summary, detailed, and tabular side-by-side
- UUID and CSV pattern recognition for cleaner diff summaries
- Configurable DeepDiff parameters for fine-tuned comparisons
- Option to ignore added items for focused change analysis
- Command-line tool for JSON file comparison with path navigation
Installation
pip install .
Usage
As a Library
Basic Usage
from diffgetr.diff_get import diff_get
# Basic comparison
diff = diff_get(obj1, obj2)
print(diff) # Prints a summary of differences
# Navigate to specific parts
sub_diff = diff['key1']['nested_key']
print(sub_diff)
Advanced Configuration
# Custom DeepDiff parameters
diff = diff_get(
    obj1, obj2,
    deep_diff_kw={'significant_digits': 5, 'ignore_string_case': True},
    ignore_added=True  # Focus only on changes and removals
)
# Different output formats
diff.diff_summary() # Print summary to stdout
diff.diff_all(indent=4) # Print full diff details
diff.diff_sidebyside() # Tabular side-by-side comparison with % changes
raw_diff = diff.diff_obj # Access underlying DeepDiff object
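To make the effect of ignore_added concrete, the sketch below filters a DeepDiff-style report dictionary and drops the categories that describe additions. The helper name `drop_added` and the sample report are hypothetical. The category names follow DeepDiff's report keys, and diffgetr's own filtering may differ in detail.

```python
# Hypothetical helper: an illustration of what "ignore added items" means,
# not diffgetr's actual implementation.
ADDED_CATEGORIES = {
    "dictionary_item_added",
    "iterable_item_added",
    "attribute_added",
    "set_item_added",
}


def drop_added(report: dict) -> dict:
    """Return a copy of a DeepDiff-style report without the 'added' categories."""
    return {k: v for k, v in report.items() if k not in ADDED_CATEGORIES}


# Sample report shaped like DeepDiff output (values are made up for the example).
report = {
    "values_changed": {"root['timeout']": {"old_value": 30, "new_value": 60}},
    "dictionary_item_added": ["root['retries']"],
    "dictionary_item_removed": ["root['legacy_mode']"],
}

filtered = drop_added(report)
print(sorted(filtered))
```

Changes and removals survive the filter, which matches the "focus only on changes and removals" behaviour described above.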
Interactive Navigation
# Navigate through nested structures
diff = diff_get(data1, data2)
# Use tab completion to see available keys
dir(diff) # Shows common keys between both datasets
# Navigate with array indices
item_diff = diff['items'][0]['properties']
# Check current location
print(item_diff.location) # Shows path like 'root.items[0].properties'
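The navigation model can be pictured with a small stand-in class. This is a minimal sketch under stated assumptions, not diffgetr's implementation: `PairView` and `common_keys` are hypothetical names, and the location format simply mirrors the 'root.items[0].properties' style shown above.

```python
class PairView:
    """Hypothetical stand-in for a navigable comparison of two nested objects."""

    def __init__(self, s0, s1, location="root"):
        self.s0, self.s1, self.location = s0, s1, location

    def __getitem__(self, key):
        # List indices become 'items[0]'-style segments; other keys become '.key'.
        if isinstance(key, int):
            child = f"{self.location}[{key}]"
        else:
            child = f"{self.location}.{key}"
        try:
            return PairView(self.s0[key], self.s1[key], child)
        except (KeyError, IndexError, TypeError) as exc:
            raise KeyError(f"{child} is not present on both sides") from exc

    def common_keys(self):
        # Keys present on both sides, i.e. the candidates for further navigation.
        if isinstance(self.s0, dict) and isinstance(self.s1, dict):
            return sorted(set(self.s0) & set(self.s1), key=str)
        return []


old = {"items": [{"properties": {"color": "red"}}], "only_old": 1}
new = {"items": [{"properties": {"color": "blue"}}], "only_new": 2}

view = PairView(old, new)
sub = view["items"][0]["properties"]
print(sub.location)        # root.items[0].properties
print(view.common_keys())  # ['items']
```

Each lookup returns a new view over the matching sub-objects, so the location string grows one segment per step.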
Command Line
diffgetr file1.json file2.json path.to.key
Parameters:
- file1.json, file2.json: JSON files to compare
- path.to.key: Dot-separated path to navigate in the structure
Path Examples:
- users.0.profile - Navigate to first user's profile
- data.items[5].name - Navigate to name of 6th item
- config.database - Navigate to database configuration
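One way such paths could be turned into keys and indices is sketched below. The functions `parse_path` and `walk` are hypothetical and only illustrate the two index styles from the examples above. How the actual command-line tool parses its path argument is not documented here.

```python
import re

# Matches either a bracketed list index such as [5], or any run of
# characters other than '.', '[' and ']'.
_TOKEN = re.compile(r"\[(\d+)\]|([^.\[\]]+)")


def parse_path(path: str) -> list:
    """Split 'data.items[5].name' into ['data', 'items', 5, 'name']."""
    keys = []
    for index, name in _TOKEN.findall(path):
        if index:
            keys.append(int(index))
        elif name.isdigit():
            keys.append(int(name))  # 'users.0.profile' style index
        else:
            keys.append(name)
    return keys


def walk(data, path: str):
    """Follow the parsed keys through nested dicts and lists."""
    for key in parse_path(path):
        data = data[key]
    return data


data = {
    "users": [{"profile": {"name": "Ada"}}],
    "config": {"database": {"host": "localhost"}},
}
print(parse_path("data.items[5].name"))
print(walk(data, "users.0.profile"))
```

A limitation of this sketch is that an all-digit segment is always treated as a list index, so dictionary keys such as "0" would not be reachable.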
API Reference
Constructor Parameters
diff_get(s0, s1, loc=None, path=None, deep_diff_kw=None, ignore_added=False)
Parameters:
- s0, s1: Objects to compare
- loc: Internal location tracking (used recursively)
- path: Path component to append to location
- deep_diff_kw: Dictionary of parameters passed to DeepDiff (default: {'ignore_numeric_type_changes': True, 'significant_digits': 3})
- ignore_added: If True, ignore items that were added in s1 but not in s0
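The default significant_digits of 3 decides when two numbers count as different. DeepDiff documents this setting as comparing digits after the decimal point in its default fixed-point notation. The sketch below approximates that rule with string formatting. `numbers_match` is a hypothetical helper, and DeepDiff itself handles more cases than this.

```python
def numbers_match(a: float, b: float, significant_digits: int = 3) -> bool:
    """Approximate fixed-point comparison: equal if both format identically."""
    fmt = f"{{:.{significant_digits}f}}"
    return fmt.format(a) == fmt.format(b)


print(numbers_match(1.0001, 1.0002))                      # differ only in the 4th decimal
print(numbers_match(1.001, 1.002))                        # differ in the 3rd decimal
print(numbers_match(1.001, 1.002, significant_digits=2))  # coarser threshold
print(numbers_match(3, 3.0))                              # int vs float
```

With the default setting, a difference in the fourth decimal place is not reported, while a difference in the third is.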
Methods
diff_summary(file=None, top=50, bytes=None)
Generate a summary of differences with pattern recognition and frequency counts.
Parameters:
- file: Output file object (default: stdout)
- top: Maximum number of diff patterns to show per category
- bytes: Whether to write bytes (auto-detected if None)
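The role of top can be illustrated with a frequency count over changed paths. This is a sketch, not the library's code: `summarize` is a hypothetical function, and collapsing list indices to `[*]` is an assumed normalization chosen only to show how similar paths end up in one pattern.

```python
import re
from collections import Counter


def summarize(paths, top=50):
    """Count how often each path pattern occurs, most frequent first."""
    # Assumption: list indices are collapsed so that items[0] and items[7]
    # count as the same pattern.
    patterns = (re.sub(r"\[\d+\]", "[*]", p) for p in paths)
    return Counter(patterns).most_common(top)


changed = [
    "root.items[0].price",
    "root.items[1].price",
    "root.items[2].price",
    "root.items[1].name",
    "root.version",
]

for pattern, count in summarize(changed, top=2):
    print(f"{count:>3}  {pattern}")
```

Lowering top trims the report to the most frequent patterns, which keeps summaries of large diffs readable.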
diff_all(indent=2, file=None)
Print complete diff details with full data structures.
Parameters:
- indent: Indentation level for pretty printing
- file: Output file object (default: stdout)
diff_sidebyside()
Display differences in a tabular side-by-side format with percentage changes for numeric values.
Features:
- Flattens nested structures into dot-notation keys
- Groups missing/added keys by parent for compact display
- Groups differences by common parent keys
- Shows percentage differences for numeric values
- Filters changes based on significant digits threshold
- Displays missing keys as <MISSING>
- Sorts by frequency of changes within each group
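Two of the steps listed above, flattening and percentage calculation, can be sketched in a few lines. The functions `flatten` and `percent_diff` are hypothetical. The percentage formula, (s1 - s0) / s0 * 100, is inferred from the sample output in the Examples section and is an assumption about the library's behaviour.

```python
def flatten(obj, prefix="root"):
    """Flatten nested dicts/lists into {'root.a.b[0]': value} form."""
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            flat.update(flatten(value, f"{prefix}.{key}"))
    elif isinstance(obj, list):
        for i, value in enumerate(obj):
            flat.update(flatten(value, f"{prefix}[{i}]"))
    else:
        flat[prefix] = obj
    return flat


def percent_diff(old, new):
    """Relative change from old to new in percent, or None if undefined."""
    numeric = (int, float)
    if isinstance(old, bool) or isinstance(new, bool):
        return None
    if not isinstance(old, numeric) or not isinstance(new, numeric) or old == 0:
        return None
    return (new - old) / old * 100


MISSING = "<MISSING>"
s0 = {"q1": {"revenue": 1250000.0, "expenses": 980000.0}, "tags": ["a"]}
s1 = {"q1": {"revenue": 1340000.0, "expenses": 1020000.0}, "tags": ["a", "b"]}

f0, f1 = flatten(s0), flatten(s1)
for key in sorted(set(f0) | set(f1)):
    old, new = f0.get(key, MISSING), f1.get(key, MISSING)
    if old == new:
        continue  # unchanged values are not listed
    pct = percent_diff(old, new)
    suffix = f" | {pct:.3f}%" if pct is not None else ""
    print(f"{key} | {old} | {new}{suffix}")
```

Grouping by parent key and sorting by change frequency are left out here to keep the sketch short.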
Properties
- location: Current path in dot notation (e.g., 'root.data.items[0]')
- diff_obj: Underlying DeepDiff object for advanced operations
Pattern Recognition
The tool automatically recognizes and abstracts common patterns:
- UUIDs: Replaced with <UUID> for cleaner summaries
- CSV-like numbers: Numeric sequences replaced with <CSV>
- Path normalization: Consistent path formatting across different access patterns
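The placeholder substitution can be approximated with two regular expressions. The patterns below are assumptions made for illustration (canonical 8-4-4-4-12 hexadecimal UUIDs, and runs of three or more comma-separated numbers), and `normalize` is a hypothetical name. The rules diffgetr actually applies may be stricter or looser.

```python
import re

# Assumed patterns, not taken from the library's source.
UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)
CSV_RE = re.compile(r"-?\d+(?:\.\d+)?(?:\s*,\s*-?\d+(?:\.\d+)?){2,}")


def normalize(text: str) -> str:
    """Replace UUIDs and comma-separated numeric runs with placeholders."""
    text = UUID_RE.sub("<UUID>", text)
    return CSV_RE.sub("<CSV>", text)


print(normalize("root.nodes.123e4567-e89b-12d3-a456-426614174000.weights"))
print(normalize("values: 1.5, 2.0, 3.25"))
```

After normalization, paths that differ only by an identifier collapse into a single pattern, which is what makes frequency counts in a summary meaningful.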
Error Handling
When navigating to non-existent keys, the tool will:
- Display a diff summary showing available keys
- Raise a KeyError with location information
- Continue execution for batch operations
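On the caller's side, a batch of lookups can survive individual failures by catching the KeyError per path. The sketch below uses a hypothetical `lookup` helper over plain dictionaries in place of a diffgetr object. Only the try/except pattern is the point.

```python
def lookup(data, keys, location="root"):
    """Walk nested containers, raising a KeyError that names the failing location."""
    for key in keys:
        if isinstance(key, int):
            location = f"{location}[{key}]"
        else:
            location = f"{location}.{key}"
        try:
            data = data[key]
        except (KeyError, IndexError, TypeError):
            raise KeyError(f"no such path: {location}") from None
    return data


config = {"database": {"host": "localhost"}, "users": [{"name": "Ada"}]}
batch = [["database", "host"], ["database", "port"], ["users", 0, "name"]]

results, failures = {}, []
for keys in batch:
    try:
        results[tuple(keys)] = lookup(config, keys)
    except KeyError as exc:
        failures.append(exc.args[0])  # record the failure and move on

print(results)
print(failures)
```

Because the error message carries the location, the failure list shows which path was missing without re-running the batch.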
Examples
Comparing Configuration Files
import json
from diffgetr.diff_get import diff_get
with open('config_v1.json') as f1, open('config_v2.json') as f2:
    config1 = json.load(f1)
    config2 = json.load(f2)
diff = diff_get(config1, config2, ignore_added=True)
print(f"Changes found at: {diff.location}")
diff.diff_summary(top=20)
Analyzing API Response Changes
# Compare two API responses with high precision
diff = diff_get(
    response1, response2,
    deep_diff_kw={'significant_digits': 6, 'ignore_order': True}
)
# Navigate to specific sections
user_diff = diff['users'][0]['profile']
if user_diff:
    user_diff.diff_all()
Side-by-Side Comparison
# For detailed tabular comparison with percentage changes
diff = diff_get(financial_data_old, financial_data_new)
diff.diff_sidebyside()
# Output example:
# KEY | s0 | s1 | % DIFF
# -------------------------------------------------------------------------------------------------------------
#
# GROUP: root.quarterly_results
# - .q1.revenue | 1250000.0 | 1340000.0 | 7.200%
# - .q1.expenses | 980000.0 | 1020000.0 | 4.082%
# - .q2.revenue | 1180000.0 | 1290000.0 | 9.322%
#
# GROUP: root.metadata
# - .last_updated | "2024-12-01" | "2025-01-15"
# - .version | "1.2.3" | "1.3.0"
Testing
Run the comprehensive test suite to verify functionality:
python -m unittest discover tests -v
The test suite covers:
- Core diff functionality and navigation through nested structures
- Multiple output formats (summary, detailed, side-by-side)
- Pattern recognition for UUIDs and CSV-like data
- Error handling and edge cases
- IPython integration and tab completion
- Command-line interface functionality
Contributing
This tool is part of the SMART_X project ecosystem. When contributing:
- Maintain backward compatibility with existing APIs
- Add tests for new pattern recognition features
- Update documentation for any new navigation capabilities
- Consider performance impact for large nested structures
Version History
- 0.1.0: Initial release with basic diff comparison
- Current: Enhanced with interactive navigation, pattern recognition, and configurable output formats
License
MIT