Skip to main content

Unified diff parsing/metadata extraction library.

Project description

Simple Python library to parse and interact with unified diff data.

The original version seems not to be actively maintained. This repository intend to offer an actively maintained version of unidiff.

Installing unidiff

$ pip install unidiff2

Quick start

>>> import urllib.request
>>> from unidiff import PatchSet
>>> diff = urllib.request.urlopen('https://github.com/matiasb/python-unidiff/pull/3.diff')
>>> encoding = diff.headers.get_charsets()[0]
>>> patch = PatchSet(diff, encoding=encoding)
>>> patch
<PatchSet: [<PatchedFile: .gitignore>, <PatchedFile: unidiff/patch.py>, <PatchedFile: unidiff/utils.py>]>
>>> patch[0]
<PatchedFile: .gitignore>
>>> patch[0].is_added_file
True
>>> patch[0].added
6
>>> patch[1]
<PatchedFile: unidiff/patch.py>
>>> patch[1].added, patch[1].removed
(20, 11)
>>> len(patch[1])
6
>>> patch[1][2]
<Hunk: @@ 109,14 110,21 @@ def __repr__(self):>
>>> patch[2]
<PatchedFile: unidiff/utils.py>
>>> print(patch[2])
diff --git a/unidiff/utils.py b/unidiff/utils.py
index eae63e6..29c896a 100644
--- a/unidiff/utils.py
+++ b/unidiff/utils.py
@@ -37,4 +37,3 @@
# - deleted line
# \ No newline case (ignore)
RE_HUNK_BODY_LINE = re.compile(r'^([- \+\\])')
-

Load unified diff data by instantiating PatchSet with a file-like object as argument, or using PatchSet.from_filename class method to read diff from file.

A PatchSet is a list of files updated by the given patch. For each PatchedFile you can get stats (if it is a new, removed or modified file; the source/target lines; etc), besides having access to each hunk (also like a list) and its respective info.

At any point you can get the string representation of the current object, and that will return the unified diff data of it.

As a quick example of what can be done, check bin/unidiff file.

Also, once installed, unidiff provides a command-line program that displays information from diff data (a file, or stdin). For example:

$ git diff | unidiff
Summary
-------
README.md: +6 additions, -0 deletions

1 modified file(s), 0 added file(s), 0 removed file(s)
Total: 6 addition(s), 0 deletion(s)

Load a local diff file

To instantiate PatchSet from a local file, you can use:

>>> from unidiff import PatchSet
>>> patch = PatchSet.from_filename('tests/samples/bzr.diff', encoding='utf-8')
>>> patch
<PatchSet: [<PatchedFile: added_file>, <PatchedFile: modified_file>, <PatchedFile: removed_file>]>

Notice the (optional) encoding parameter. If not specified, unicode input will be expected. Or alternatively:

>>> import codecs
>>> from unidiff import PatchSet
>>> with codecs.open('tests/samples/bzr.diff', 'r', encoding='utf-8') as diff:
...     patch = PatchSet(diff)
...
>>> patch
<PatchSet: [<PatchedFile: added_file>, <PatchedFile: modified_file>, <PatchedFile: removed_file>]>

Finally, you can also instantiate PatchSet passing any iterable (and encoding, if needed):

>>> from unidiff import PatchSet
>>> with open('tests/samples/bzr.diff', 'r') as diff:
...     data = diff.readlines()
...
>>> patch = PatchSet(data)
>>> patch
<PatchSet: [<PatchedFile: added_file>, <PatchedFile: modified_file>, <PatchedFile: removed_file>]>

If you don’t need to be able to rebuild the original unified diff input, you can pass metadata_only=True (defaults to False), which should help making the parsing more efficient:

>>> from unidiff import PatchSet
>>> patch = PatchSet.from_filename('tests/samples/bzr.diff', encoding='utf-8', metadata_only=True)

References

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

unidiff2-0.7.8.tar.gz (22.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

unidiff2-0.7.8-py3-none-any.whl (13.9 kB view details)

Uploaded Python 3

File details

Details for the file unidiff2-0.7.8.tar.gz.

File metadata

  • Download URL: unidiff2-0.7.8.tar.gz
  • Upload date:
  • Size: 22.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for unidiff2-0.7.8.tar.gz
Algorithm Hash digest
SHA256 40865c1037f4d27475d9149ebf2dc2ef1e11d62bf159ab127186ed26d3d61c44
MD5 cb0d17952270baed330483a469354929
BLAKE2b-256 49d2d26bbf825d8d9e0386c0c980cc10435861b5a31ec4f3683d060db0ad85b0

See more details on using hashes here.

File details

Details for the file unidiff2-0.7.8-py3-none-any.whl.

File metadata

  • Download URL: unidiff2-0.7.8-py3-none-any.whl
  • Upload date:
  • Size: 13.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for unidiff2-0.7.8-py3-none-any.whl
Algorithm Hash digest
SHA256 79e982c799a6604df6c0da2d75bf3693f250bed0d8ed2170f2ab8a7d49835aad
MD5 4b3747a87f1063d36b59010b3425818c
BLAKE2b-256 6e45a6234f5d515eacb2d8d37225264565992a8a916bbe4d127e8b41279c53d4

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page