Lightweight Python package for parsing Python difflib's diff results
difflib-parser
Parser for Python's difflib output.
Built on top of https://github.com/yebrahim/difflibparser/blob/master/difflibparser.py
Key changes from the above library:
- Uses the generator pattern instead of the iterator pattern when iterating over diffs
- Uses @dataclass instead of generic dictionaries to enforce strict typing
- Uses type annotations for strict typing
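To illustrate the first point, the sketch below contrasts a hand-written iterator class with an equivalent generator function. The names LineIterator and iter_lines are hypothetical and are not taken from either library; they only show the difference in shape between the two patterns.

```python
from typing import Iterator, List


class LineIterator:
    """Iterator pattern: position is tracked by hand (hypothetical example)."""

    def __init__(self, lines: List[str]) -> None:
        self._lines = lines
        self._index = 0

    def __iter__(self) -> "LineIterator":
        return self

    def __next__(self) -> str:
        if self._index >= len(self._lines):
            raise StopIteration
        line = self._lines[self._index]
        self._index += 1
        return line.rstrip("\n")


def iter_lines(lines: List[str]) -> Iterator[str]:
    """Generator pattern: same traversal, position is kept by the interpreter."""
    for line in lines:
        yield line.rstrip("\n")


# Both produce the same sequence of values.
print(list(LineIterator(["a\n", "b"])))
print(list(iter_lines(["a\n", "b"])))
```

The generator version needs no explicit index or StopIteration handling, which is the usual motivation for preferring it.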
Getting started
pip install difflib-parser
from difflib_parser import difflib_parser
parser = difflib_parser.DiffParser(["hello world"], ["hello world!"])
for diff in parser.iter_diffs():
    print(diff)
Diff structure
from dataclasses import dataclass
from enum import Enum
from typing import List

class DiffCode(Enum):
    SAME = 0
    RIGHT_ONLY = 1
    LEFT_ONLY = 2
    CHANGED = 3

@dataclass
class Diff:
    code: DiffCode
    line: str
    left_changes: List[int] | None = None
    right_changes: List[int] | None = None
    newline: str | None = None
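As a rough sketch of how this structure might be consumed, the snippet below counts diffs by their code. It redefines local copies of DiffCode and Diff (mirroring the structure above) so that it runs without the package installed. The helper count_by_code and the sample values are hypothetical and are not part of the package.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, List, Optional


class DiffCode(Enum):
    SAME = 0
    RIGHT_ONLY = 1
    LEFT_ONLY = 2
    CHANGED = 3


@dataclass
class Diff:
    # Local copy of the structure documented above, for illustration only.
    code: DiffCode
    line: str
    left_changes: Optional[List[int]] = None
    right_changes: Optional[List[int]] = None
    newline: Optional[str] = None


def count_by_code(diffs: Iterable[Diff]) -> Dict[DiffCode, int]:
    """Count how many diffs carry each DiffCode (hypothetical helper)."""
    return dict(Counter(diff.code for diff in diffs))


# Hand-built sample diffs; the values are made up for the example.
samples = [
    Diff(DiffCode.SAME, "unchanged line"),
    Diff(DiffCode.SAME, "another unchanged line"),
    Diff(DiffCode.LEFT_ONLY, "removed line"),
]
for code, count in count_by_code(samples).items():
    print(code.name, count)
```

With the package installed, the objects yielded by iter_diffs() would presumably take the place of the hand-built samples.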
What is difflib?
A difflib output might look something like this:
>>> import difflib
>>> print("\n".join(list(difflib.ndiff(["hello world"], ["hola world"]))))
- hello world
?  ^ ^^
+ hola world
?  ^ ^
The specifics of diff interpretation can be found in the difflib documentation.
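As a quick orientation, each line produced by difflib.ndiff starts with a two-character prefix. The sketch below labels every output line by that prefix. The helper label_ndiff and the wording of the labels are assumptions made for this example; only difflib.ndiff itself comes from the standard library.

```python
import difflib
from typing import List, Tuple

# Two-character prefixes used by difflib.ndiff output lines.
PREFIX_MEANING = {
    "- ": "only in the left sequence",
    "+ ": "only in the right sequence",
    "  ": "common to both sequences",
    "? ": "guide line (in neither sequence)",
}


def label_ndiff(left: List[str], right: List[str]) -> List[Tuple[str, str]]:
    """Pair each ndiff line's content with the meaning of its prefix."""
    labelled = []
    for line in difflib.ndiff(left, right):
        prefix, text = line[:2], line[2:].rstrip("\n")
        labelled.append((PREFIX_MEANING[prefix], text))
    return labelled


for meaning, text in label_ndiff(["hello world"], ["hola world"]):
    print(f"{meaning:<34}| {text}")
```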
Parsing difflib
Concretely, there are four types of changes we are interested in:
- No change
- A new line is added
- An existing line is removed
- An existing line is edited
Because the last two cases operate on existing lines, both begin with a line prefixed by "- ". Only the lines that follow tell them apart, so we need to handle them carefully.
If an existing line is removed, it will not have any follow-up lines.
If an existing line is edited, it will have several follow-up lines that provide details on the values that have been changed.
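The follow-up "? " lines mark the changed columns with ^, + and - characters. A minimal sketch of reading those columns back out is shown below. The helper marked_positions is a hypothetical name, and this is not necessarily how the package fills left_changes and right_changes.

```python
import difflib
from typing import List


def marked_positions(guide_line: str) -> List[int]:
    """Return the indices, relative to the line content, flagged by a '? ' line."""
    tags = guide_line[2:].rstrip("\n")  # drop the "? " prefix and trailing newline
    return [index for index, tag in enumerate(tags) if tag in "^+-"]


lines = list(difflib.ndiff(["hello world"], ["hello world!"]))
# The last line is the guide for the "+ hello world!" line above it.
positions = marked_positions(lines[-1])
print(positions)  # [11]
print(["hello world!"[i] for i in positions])  # ['!']
```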
From these follow-up lines, we can further classify the changes made to a line:
- Only additions made (e.g. "Hello world" -> "Hello world!")
- Only removals made (e.g. "Hello world" -> "Hllo world")
- Both additions and removals made (e.g. "Hello world" -> "Hola world!")
Each of them has its own pattern of follow-up lines:
-,+,?
>>> print("\n".join(list(difflib.ndiff(["hello world"], ["hello world!"]))))
- hello world
+ hello world!
?            +
-,?,+
>>> print("\n".join(list(difflib.ndiff(["hello world"], ["hllo world"]))))
- hello world
?  -
+ hllo world
-,?,+,?
>>> print("\n".join(list(difflib.ndiff(["hello world"], ["helo world!"]))))
- hello world
?    -
+ helo world!
?           +
As such, we have included them as separate patterns to process.
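The patterns above can be recognised by looking at the first character of the next few lines. The following is a minimal sketch of that idea built directly on difflib.ndiff. The function group_ndiff and its "kind" labels are hypothetical, and the package's own implementation may differ.

```python
import difflib
from typing import Iterator, List, Tuple


def group_ndiff(lines: List[str]) -> Iterator[Tuple[str, List[str]]]:
    """Yield (kind, group), where group holds the ndiff lines of one change."""
    index = 0
    while index < len(lines):
        # First characters of up to four upcoming lines, e.g. "-?+?".
        shape = "".join(line[0] for line in lines[index:index + 4])
        if shape.startswith("-?+?"):
            kind, size = "changed", 4  # both additions and removals
        elif shape.startswith("-?+"):
            kind, size = "changed", 3  # removals only
        elif shape.startswith("-+?"):
            kind, size = "changed", 3  # additions only
        elif shape.startswith("-"):
            kind, size = "removed", 1  # no follow-up lines
        elif shape.startswith("+"):
            kind, size = "added", 1
        else:
            kind, size = "same", 1
        yield kind, lines[index:index + size]
        index += size


for right in ("hello world!", "hllo world", "helo world!"):
    diff_lines = list(difflib.ndiff(["hello world"], [right]))
    for kind, group in group_ndiff(diff_lines):
        print(right, kind, len(group))
```

The four-line pattern is checked before the three-line ones so that the longer match wins when both would fit.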