Lightweight Python package for parsing Python difflib's diff results

difflib-parser

Parser for Python's difflib output.

Built on top of https://github.com/yebrahim/difflibparser/blob/master/difflibparser.py

Key changes from the above library:

  1. Uses a generator instead of an iterator class when iterating over diffs
  2. Uses @dataclass instead of generic dictionaries so that each diff has explicit, typed fields
  3. Uses type annotations throughout
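
As a rough illustration of the first point, the sketch below contrasts an iterator class with an equivalent generator function. It is not code from either library: LineIterator and iter_lines are hypothetical names used only for this comparison.

```python
from typing import Iterator, List


class LineIterator:
    """Iterator pattern: position is tracked by hand via __iter__/__next__."""

    def __init__(self, lines: List[str]) -> None:
        self._lines = lines
        self._pos = 0

    def __iter__(self) -> "LineIterator":
        return self

    def __next__(self) -> str:
        if self._pos >= len(self._lines):
            raise StopIteration
        line = self._lines[self._pos]
        self._pos += 1
        return line


def iter_lines(lines: List[str]) -> Iterator[str]:
    """Generator pattern: the same traversal, with state kept by Python."""
    for line in lines:
        yield line


sample = ["hello world", "hello world!"]
from_iterator = list(LineIterator(sample))
from_generator = list(iter_lines(sample))
```

Both produce the same sequence; the generator version simply needs less bookkeeping.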

Getting started

pip install difflib-parser

from difflib_parser import difflib_parser

parser = difflib_parser.DiffParser(["hello world"], ["hello world!"])
for diff in parser.iter_diffs():
    print(diff)

Diff structure

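from dataclasses import dataclass
from enum import Enum
from typing import List


class DiffCode(Enum):
    SAME = 0        # line is identical on both sides
    RIGHT_ONLY = 1  # line exists only on the right (added)
    LEFT_ONLY = 2   # line exists only on the left (removed)
    CHANGED = 3     # line exists on both sides but was edited


@dataclass
class Diff:
    code: DiffCode
    line: str
    left_changes: List[int] | None = None
    right_changes: List[int] | None = None
    newline: str | None = None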
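
To show how a consumer might dispatch on the code field, here is a minimal sketch. It redefines the two classes locally so that it runs without the package installed, and it assumes that line holds the original text and newline the edited text of a CHANGED diff; that reading is an assumption, not something stated above. The render function is a hypothetical helper.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class DiffCode(Enum):
    SAME = 0
    RIGHT_ONLY = 1
    LEFT_ONLY = 2
    CHANGED = 3


@dataclass
class Diff:
    # Local stand-in mirroring the structure shown above.
    code: DiffCode
    line: str
    left_changes: Optional[List[int]] = None
    right_changes: Optional[List[int]] = None
    newline: Optional[str] = None


def render(diff: Diff) -> str:
    """Hypothetical helper: format one Diff as a single display line."""
    if diff.code is DiffCode.SAME:
        return "  " + diff.line
    if diff.code is DiffCode.RIGHT_ONLY:
        return "+ " + diff.line
    if diff.code is DiffCode.LEFT_ONLY:
        return "- " + diff.line
    # CHANGED: assumes newline carries the edited text.
    return f"~ {diff.line} -> {diff.newline}"


changed = Diff(DiffCode.CHANGED, "hello world", newline="hello world!")
print(render(changed))
```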

What is difflib?

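difflib is a module in Python's standard library for comparing sequences, such as lists of lines. The output of difflib.ndiff looks something like this: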

>>> import difflib
>>> print("\n".join(list(difflib.ndiff(["hello world"], ["hola world"]))))
- hello world
?  ^ ^^

+ hola world
?  ^ ^

The specifics of how to interpret this output can be found in the difflib documentation.
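
As a quick orientation, every line produced by difflib.ndiff starts with a two-character code. The sketch below splits each line into that code and its text; the MEANING table paraphrases the difflib documentation, and the sample inputs are made up for illustration.

```python
import difflib

left = ["hello world", "same line"]
right = ["hola world", "same line", "new line"]

# Two-character codes used by ndiff, paraphrased from the difflib docs.
MEANING = {
    "  ": "common to both sequences",
    "- ": "only in the left sequence",
    "+ ": "only in the right sequence",
    "? ": "guide line, present in neither sequence",
}

prefixes = []
for line in difflib.ndiff(left, right):
    # Guide lines end with a newline; the other lines do not here.
    prefix, text = line[:2], line[2:].rstrip("\n")
    prefixes.append(prefix)
    print(f"{prefix!r:6} {MEANING[prefix]:40} {text}")
```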

Parsing difflib

There are four types of changes we are interested in:

  1. No change
  2. A new line is added
  3. An existing line is removed
  4. An existing line is edited

Because the last two cases both operate on an existing line, their output always starts with a line prefixed by "- ". The parser therefore has to inspect the lines that follow to tell them apart.

If an existing line is removed, it will not have any follow-up lines.

If an existing line is edited, it will have several follow-up lines that provide details on the values that have been changed.
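
To make the role of those follow-up lines concrete, the sketch below reads the marker characters of a "? " guide line and returns the columns they point at. changed_columns is a hypothetical helper written for this illustration, not part of the package.

```python
import difflib
from typing import List


def changed_columns(guide: str) -> List[int]:
    """Return the 0-based columns flagged by a "? " guide line."""
    # Drop the two-character prefix and the trailing newline.
    marks = guide[2:].rstrip("\n")
    # ndiff marks additions with "+", removals with "-", replacements with "^".
    return [i for i, ch in enumerate(marks) if ch in "+-^"]


lines = list(difflib.ndiff(["hello world"], ["hola world"]))
left_guide, right_guide = lines[1], lines[3]
print(changed_columns(left_guide))   # [1, 3, 4]
print(changed_columns(right_guide))  # [1, 3]
```

The columns index into the text of the line that the guide follows, so [1, 3, 4] refers to the characters "e", "l" and "o" in "hello world".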

From these follow-up lines, we can further classify the changes made to a line:

  1. Only additions made (e.g. "Hello world" -> "Hello world!")
  2. Only removals made (e.g. "Hello world" -> "Hllo world")
  3. Both additions and removals made (e.g. "Hello world" -> "Hola world!")

Each case has its own sequence of follow-up lines:

  1. -, +, ?
>>> print("\n".join(list(difflib.ndiff(["hello world"], ["hello world!"]))))
- hello world
+ hello world!
?            +

  2. -, ?, +
>>> print("\n".join(list(difflib.ndiff(["hello world"], ["hllo world"]))))
- hello world
?  -

+ hllo world

  3. -, ?, +, ?
>>> print("\n".join(list(difflib.ndiff(["hello world"], ["helo world!"]))))
- hello world
?    -

+ helo world!
?           +

The parser therefore treats each of these sequences as a separate pattern.
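
The pattern matching described above can be sketched with a small look-ahead loop over the ndiff output. This is an illustrative reimplementation under the rules stated in this section, not the package's actual code; classify and its string labels are hypothetical.

```python
import difflib
from typing import Iterator, List, Optional, Tuple

Change = Tuple[str, str, Optional[str]]  # (kind, left/only text, new text)


def classify(lines: List[str]) -> Iterator[Change]:
    """Group ndiff lines into same / added / removed / edited changes."""

    def prefix_at(index: int) -> str:
        # Two-character code of the line at index, or "" past the end.
        return lines[index][:2] if index < len(lines) else ""

    i = 0
    while i < len(lines):
        prefix, text = lines[i][:2], lines[i][2:]
        if prefix == "  ":
            yield ("same", text, None)
            i += 1
        elif prefix == "+ ":
            yield ("added", text, None)
            i += 1
        elif prefix == "- ":
            if prefix_at(i + 1) == "? " and prefix_at(i + 2) == "+ ":
                # Patterns "-, ?, +" and "-, ?, +, ?"
                yield ("edited", text, lines[i + 2][2:])
                i += 4 if prefix_at(i + 3) == "? " else 3
            elif prefix_at(i + 1) == "+ " and prefix_at(i + 2) == "? ":
                # Pattern "-, +, ?"
                yield ("edited", text, lines[i + 1][2:])
                i += 3
            else:
                # No guide line follows, so the line was simply removed.
                yield ("removed", text, None)
                i += 1
        else:
            # Stray guide line; skip it.
            i += 1


for new in ("hello world!", "hllo world", "helo world!"):
    print(list(classify(list(difflib.ndiff(["hello world"], [new])))))
```

Because classify is a generator, callers can stop early or stream results without building the full list first.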
