Call git-annex commands from Python

This package lets you interact with git-annex from within Python. It runs the necessary git-annex commands with subprocess, using their batch versions whenever possible.

I’m developing this as needed, so feel free to ask if there’s any functionality you want me to implement.


Requirements:

  • Python 3
  • git-annex 6.20170101 (or later)
  • pygit2 0.24 (or later)

To create a git-annex repository from scratch:

>>> from pygit2 import init_repository
>>> from git_annex_adapter import init_annex

>>> init_repository('/path/to/repo')

>>> init_annex('/path/to/repo')

To wrap an existing git-annex repository:

>>> from git_annex_adapter.repo import GitAnnexRepo
>>> repo = GitAnnexRepo('/tmp/repo')

GitAnnexRepo is a subclass of pygit2.Repository. Git-annex-specific functionality is accessed through its annex property, which is a mapping from git-annex keys to AnnexedFile objects:

>>> for key in repo.annex:
...     print(key)

>>> key = 'SHA256E-s3--2c26...'
>>> repo.annex[key]
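
The keys used above are plain strings. As a rough sketch that is not part of this package, the following shows how such a key could be split into its parts, assuming the usual git-annex layout of a backend name, optional fields such as the size, and a name after a double dash. ParsedKey and parse_key are hypothetical names, and the key is the truncated placeholder from the example above.

```python
from typing import NamedTuple, Optional


class ParsedKey(NamedTuple):
    """Hypothetical container for the parts of a git-annex key."""
    backend: str
    size: Optional[int]
    name: str


def parse_key(key: str) -> ParsedKey:
    # The backend and optional fields are separated from the name
    # by the first double dash.
    head, sep, name = key.partition('--')
    if not sep or not name:
        raise ValueError('not a git-annex key: {!r}'.format(key))
    backend, *fields = head.split('-')
    size = None
    for field in fields:
        # A lowercase "s" field carries the content size in bytes;
        # any other optional fields are ignored in this sketch.
        if field.startswith('s') and field[1:].isdigit():
            size = int(field[1:])
    return ParsedKey(backend, size, name)


parsed = parse_key('SHA256E-s3--2c26...')
print(parsed.backend, parsed.size)
```

Returning a named tuple keeps the result immutable and lets callers unpack it or access the parts by name.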

You can also get a tree representation of any git tree-ish object, with annexed file entries replaced by AnnexedFile objects:

>>> tree = repo.annex.get_file_tree() # treeish='HEAD'
>>> tree

>>> set(tree)
{'foo', 'bar', 'baz', 'README', 'directory'}

>>> tree['foo']

>>> tree['directory']

>>> tree['directory/file'] # or tree['directory']['file']
<pygit2.Blob object at 0x...>
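
The path-style lookup shown above, where tree['directory/file'] is equivalent to tree['directory']['file'], can be illustrated with a small standalone mapping. This is only a sketch of the idea using the standard library; PathTree is a hypothetical class and says nothing about how the package implements its own tree objects.

```python
from collections.abc import Mapping


class PathTree(Mapping):
    """Hypothetical read-only tree whose keys may be slash-separated paths."""

    def __init__(self, entries):
        # Nested dicts become subtrees; anything else is kept as a leaf.
        self._entries = {
            name: PathTree(value) if isinstance(value, dict) else value
            for name, value in entries.items()
        }

    def __getitem__(self, path):
        first, _, rest = path.partition('/')
        entry = self._entries[first]
        if not rest:
            return entry
        if not isinstance(entry, PathTree):
            raise KeyError(path)
        # Descend one level and resolve the remainder of the path there.
        return entry[rest]

    def __iter__(self):
        return iter(self._entries)

    def __len__(self):
        return len(self._entries)


tree = PathTree({'README': 'blob', 'directory': {'file': 'blob in directory'}})
print(tree['directory/file'] == tree['directory']['file'])
```

Iteration only yields the top-level names, which matches the set(tree) example above.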

The AnnexedFile objects can be used to access and manipulate information about a file.

The metadata property of the AnnexedFile is a mutable mapping object from fields to sets of values:

>>> foo = tree['foo']
>>> for field, values in foo.metadata.items():
...     print('{}: {}'.format(field, values))
author: {'me'}
numbers: {'1', '2', '3'}

>>> foo.metadata['numbers'] |= {'0'}
>>> foo.metadata['numbers'] -= {'3'}
>>> foo.metadata['numbers']
{'0', '1', '2'}

>>> del foo.metadata['author']
>>> 'author' in foo.metadata
False

>>> foo.metadata['lastchanged']
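
To make the in-place updates above less mysterious, here is a minimal standalone sketch of a mutable mapping from fields to sets of values. FieldSets is a hypothetical stand-in that keeps everything in memory; the real metadata object talks to git-annex and may behave differently in its details.

```python
from collections.abc import MutableMapping


class FieldSets(MutableMapping):
    """Hypothetical stand-in for a metadata mapping: field -> set of values."""

    def __init__(self):
        self._fields = {}

    def __getitem__(self, field):
        # Return a copy so the store only changes through assignment.
        return set(self._fields[field])

    def __setitem__(self, field, values):
        values = set(values)
        if values:
            self._fields[field] = values
        else:
            # In this sketch, a field with no values is treated as absent.
            self._fields.pop(field, None)

    def __delitem__(self, field):
        del self._fields[field]

    def __iter__(self):
        return iter(self._fields)

    def __len__(self):
        return len(self._fields)


metadata = FieldSets()
metadata['numbers'] = {'1', '2', '3'}
metadata['numbers'] |= {'0'}  # __getitem__, set union, then __setitem__
metadata['numbers'] -= {'3'}
print(sorted(metadata['numbers']))  # ['0', '1', '2']
```

An augmented assignment on a subscript reads the current set, updates it, and writes it back, which is why operators such as |= and -= work without any special support in the mapping.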

Calling Processes

If you need low-level access to the git-annex processes, you can use the classes in the process module:

>>> from git_annex_adapter.process import ...

Subclasses of GitAnnexBatchProcess return the relevant output (usually one line or a dict object) each time they are called with a line of input. For example, to wrap git-annex metadata --batch --json:

>>> proc = GitAnnexMetadataBatchJsonProcess('/path/to/repo')
>>> proc(file='foo')
{..., 'key': 'SHA256E-s3--2c26...', 'fields': ...}

>>> proc(file='foo', fields={'numbers': ['1', '2', '3']})
{..., 'key': ..., 'fields': {'numbers': ['1', '2', '3'], ...}}
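
The batch pattern itself, one line of input in and one line of output back from a long-running child process, can be sketched with the standard library alone. In the sketch below a tiny Python script stands in for the git-annex batch command so that it runs without git-annex installed. JsonBatchProcess and CHILD are hypothetical names, and this is not the package's implementation.

```python
import json
import subprocess
import sys

# Stand-in child program: echoes each JSON line back with an extra field.
# A real batch command such as `git-annex metadata --batch --json` would
# take its place; it is not used here so the sketch runs anywhere.
CHILD = (
    "import json, sys\n"
    "for line in sys.stdin:\n"
    "    request = json.loads(line)\n"
    "    request['seen'] = True\n"
    "    print(json.dumps(request), flush=True)\n"
)


class JsonBatchProcess:
    """Hypothetical wrapper: one JSON line in, one JSON line out."""

    def __init__(self, args):
        self._proc = subprocess.Popen(
            args,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line buffered on the parent side
        )

    def __call__(self, **request):
        # Send one request line, then block until one reply line arrives.
        self._proc.stdin.write(json.dumps(request) + '\n')
        self._proc.stdin.flush()
        return json.loads(self._proc.stdout.readline())

    def close(self):
        # Closing stdin signals end of input, which lets the child exit.
        self._proc.stdin.close()
        self._proc.wait()
        self._proc.stdout.close()


proc = JsonBatchProcess([sys.executable, '-c', CHILD])
first = proc(file='foo')
second = proc(file='bar')
proc.close()
print(first, second)
```

Keeping one process alive across calls avoids paying the start-up cost of the command for every file, which is the main reason to prefer batch modes.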

Subclasses of GitAnnexRunner call a single program with different arguments. They return a subprocess.CompletedProcess when called, which captures stdout and stderr. For example, to run git-annex version:

>>> runner = GitAnnexVersionRunner('/path/to/repo')
>>> runner(raw=True)
CompletedProcess(..., stdout='6.20170101', stderr='')

>>> print(runner().stdout)
git-annex version: 6.20170101
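
For comparison, running a command once and capturing its output needs only subprocess.run. The sketch below uses the Python interpreter as a stand-in for git-annex so that it runs anywhere; run_captured is a hypothetical helper, not a function from this package.

```python
import subprocess
import sys


def run_captured(*args):
    """Hypothetical helper: run a command once and capture its output."""
    return subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )


# The Python interpreter stands in for git-annex here.
result = run_captured(sys.executable, '-c', 'print("ok")')
print(result.returncode, result.stdout.strip())
```

The returned subprocess.CompletedProcess carries args, returncode, stdout and stderr, the same kind of object the runner examples above return.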

Download files

Source Distribution

git-annex-adapter-0.2.2.tar.gz (13.7 kB)

Uploaded source

Built Distribution

git_annex_adapter-0.2.2-py3-none-any.whl (26.3 kB)

Uploaded py3
