Unified diff parsing/metadata extraction library.

Project Description

Simple Python library to parse and interact with unified diff data.

Installing unidiff

$ pip install unidiff

Quick start

>>> from urllib.request import urlopen
>>> from unidiff import PatchSet
>>> diff = urlopen('')
>>> encoding = diff.headers.get_content_charset()
>>> patch = PatchSet(diff, encoding=encoding)
>>> patch
<PatchSet: [<PatchedFile: .gitignore>, <PatchedFile: unidiff/>, <PatchedFile: unidiff/>]>
>>> patch[0]
<PatchedFile: .gitignore>
>>> patch[0].is_added_file
>>> patch[0].added
>>> patch[1]
<PatchedFile: unidiff/>
>>> patch[1].added, patch[1].removed
(20, 11)
>>> len(patch[1])
>>> patch[1][2]
<Hunk: @@ 109,14 110,21 @@ def __repr__(self):>
>>> patch[2]
<PatchedFile: unidiff/>
>>> print(patch[2])
--- a/unidiff/
+++ b/unidiff/
@@ -37,4 +37,3 @@
 # - deleted line
 # \ No newline case (ignore)
 RE_HUNK_BODY_LINE = re.compile(r'^([- \+\\])')

Load unified diff data by instantiating PatchSet with a file-like object as its argument, or use the PatchSet.from_filename class method to read a diff from a file.

A PatchSet is a list of the files updated by the given patch. For each PatchedFile you can get stats (whether it is a new, removed, or modified file; the source and target lines; etc.), and you can access each of its hunks (a PatchedFile also behaves like a list) along with their respective info.

At any point you can get the string representation of the current object, which returns its unified diff data.
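
The nesting described above (PatchSet, then PatchedFile, then Hunk, then line) can be walked with plain loops. The following is a minimal sketch, not part of the original documentation: the diff text and the helper names describe and added_lines are hypothetical, and the import is guarded only so the sketch can be read and run where unidiff is not installed.

```python
try:
    from unidiff import PatchSet  # third-party: pip install unidiff
except ImportError:
    PatchSet = None  # keeps the sketch runnable without the package

# Hypothetical one-file diff, used only for illustration.
DIFF_TEXT = """\
--- a/hello.txt
+++ b/hello.txt
@@ -1,3 +1,4 @@
 one
-two
+2
+two and a half
 three
"""


def describe(patch):
    """Return one (path, kind, added, removed, hunk_count) tuple per file."""
    rows = []
    for patched_file in patch:  # a PatchSet behaves like a list of files
        if patched_file.is_added_file:
            kind = 'added'
        elif patched_file.is_removed_file:
            kind = 'removed'
        else:
            kind = 'modified'
        rows.append((patched_file.path, kind, patched_file.added,
                     patched_file.removed, len(patched_file)))
    return rows


def added_lines(patch):
    """Collect the text of every added line, file by file and hunk by hunk."""
    return [line.value
            for patched_file in patch   # each file is a list of hunks
            for hunk in patched_file    # each hunk is a list of lines
            for line in hunk
            if line.is_added]


if PatchSet is not None:
    patch = PatchSet(DIFF_TEXT.splitlines(True))  # an iterable of lines
    print(describe(patch))
    print(added_lines(patch))
    print(patch)  # string representation: the unified diff data again
```

The helpers only rely on iteration and on the attributes the quick start already mentions (is_added_file, added, removed), plus path, is_removed_file, and the per-line is_added and value attributes.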

For a quick example of what can be done, see the bin/unidiff file.

Also, once installed, unidiff provides a command-line program that displays information from diff data (a file, or stdin). For example:

$ git diff | unidiff
------- +6 additions, -0 deletions

1 modified file(s), 0 added file(s), 0 removed file(s)
Total: 6 addition(s), 0 deletion(s)
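
To make the totals above less of a black box, here is a standard-library-only sketch of how additions and deletions can be counted from unified diff lines. It is an illustration under stated assumptions, not unidiff's implementation: the function name count_changes and the sample diff are hypothetical. It tracks the line counts announced in each hunk header, so a removed line whose text begins with "-- " (and therefore appears as "--- ...") is not mistaken for a file header.

```python
import re

# Hunk header, e.g. "@@ -37,4 +37,3 @@"; an omitted length means 1.
HUNK_RE = re.compile(r'^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@')


def count_changes(lines):
    """Return {target_path: [additions, deletions]} for unified diff lines."""
    stats = {}
    current = None
    src_left = tgt_left = 0  # lines still expected in the current hunk
    for line in lines:
        if src_left > 0 or tgt_left > 0:
            # Inside a hunk body: classify by the first character.
            if line.startswith('+'):
                stats[current][0] += 1
                tgt_left -= 1
            elif line.startswith('-'):
                stats[current][1] += 1
                src_left -= 1
            elif line.startswith('\\'):
                pass  # "\ No newline at end of file" marker
            else:
                # Context line: present in both source and target.
                src_left -= 1
                tgt_left -= 1
            continue
        if line.startswith('+++ '):
            # Target file header; drop an optional tab-separated timestamp.
            current = line[4:].split('\t')[0].strip()
            stats[current] = [0, 0]
            continue
        match = HUNK_RE.match(line)
        if match and current is not None:
            src_left = int(match.group(1) or 1)
            tgt_left = int(match.group(2) or 1)
    return stats


# Hypothetical sample: the removed line's text starts with "-- ".
SAMPLE = """\
--- a/notes.txt
+++ b/notes.txt
@@ -1,3 +1,3 @@
 intro
--- old separator
+== new separator
 outro
"""

print(count_changes(SAMPLE.splitlines(True)))  # {'b/notes.txt': [1, 1]}
```

Note that the keys keep the "b/" prefix exactly as it appears in the diff header; stripping it is left out to keep the sketch short.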

Load a local diff file

To instantiate PatchSet from a local file, you can use:

>>> from unidiff import PatchSet
>>> patch = PatchSet.from_filename('tests/samples/bzr.diff', encoding='utf-8')
>>> patch
<PatchSet: [<PatchedFile: added_file>, <PatchedFile: modified_file>, <PatchedFile: removed_file>]>

Note the optional encoding parameter. If it is not specified, unicode (already decoded) input is expected. Alternatively, open the file with the right encoding yourself:

>>> import codecs
>>> from unidiff import PatchSet
>>> with codecs.open('tests/samples/bzr.diff', 'r', encoding='utf-8') as diff:
...     patch = PatchSet(diff)
>>> patch
<PatchSet: [<PatchedFile: added_file>, <PatchedFile: modified_file>, <PatchedFile: removed_file>]>

Finally, you can also instantiate PatchSet by passing any iterable of lines (plus an encoding, if the lines are bytes that still need decoding):

>>> from unidiff import PatchSet
>>> with open('tests/samples/bzr.diff', 'rb') as diff:
...     data = diff.readlines()
>>> patch = PatchSet(data, encoding='utf-8')
>>> patch
<PatchSet: [<PatchedFile: added_file>, <PatchedFile: modified_file>, <PatchedFile: removed_file>]>
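
Since any iterable of lines is accepted, diff data does not have to come from a file at all. As a hedged sketch, the snippet below builds a diff in memory with the standard library's difflib.unified_diff and hands the resulting lines to PatchSet. The file names and contents are hypothetical, and the import is guarded only so the snippet also runs where unidiff is not installed.

```python
import difflib

try:
    from unidiff import PatchSet  # third-party: pip install unidiff
except ImportError:
    PatchSet = None  # keeps the sketch runnable without the package

# Hypothetical before/after contents of a small text file.
old = ['alpha\n', 'beta\n', 'gamma\n']
new = ['alpha\n', 'BETA\n', 'gamma\n', 'delta\n']

# unified_diff yields text lines, so no encoding argument is needed later.
diff_lines = list(difflib.unified_diff(old, new,
                                       fromfile='a/greek.txt',
                                       tofile='b/greek.txt'))

if PatchSet is not None:
    patch = PatchSet(diff_lines)
    print(patch[0].added, patch[0].removed)
```

The lines are collected into a list here only so they can be inspected afterwards; the generator returned by unified_diff is itself an iterable of lines.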

Download Files

File Name                                      Version  File Type  Upload Date
unidiff-0.5.5-py2.py3-none-any.whl (14.8 kB)   py2.py3  Wheel      Jan 3, 2018
unidiff-0.5.5.tar.gz (12.4 kB)                 -        Source     Jan 3, 2018