Git-like tool backed by Neo4j

neogit

A Git-like tool for filesystems, backed by a Neo4j graph and pluggable object storage.

Neogit takes content-addressed Merkle-tree snapshots of a directory tree and stores them in two places:

  • Neo4j — the graph: commits, branches, trees, blobs, and their relationships
  • Object storage — the bytes: file contents addressed by their SHA-1 (local filesystem, MinIO, or S3 via Apache Libcloud)

This split makes filesystem state queryable as a graph (run Cypher over commits, diff trees, or walk history) while keeping file contents in cheap blob storage.
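To make the Merkle-tree idea concrete, here is a minimal sketch of hashing a directory bottom-up. It is an illustration only: the entry encoding (`kind name hash`), the choice to skip symlinks, and the function names `hash_blob` and `hash_tree` are assumptions, not neogit's actual on-disk or graph format. The only detail taken from the description above is that file contents are addressed by SHA-1.

```python
import hashlib
import os
from pathlib import Path


def hash_blob(path: Path) -> str:
    """Return the SHA-1 hex digest of a file's bytes, read in chunks."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root: Path) -> str:
    """Return a digest built from the sorted entries of a directory.

    Each entry contributes its kind, name, and child digest, so a change
    anywhere below `root` changes the digest of every ancestor tree.
    """
    digest = hashlib.sha1()
    for entry in sorted(root.iterdir(), key=lambda p: p.name):
        if entry.is_symlink():
            continue  # assumption: symlinks are ignored in this sketch
        if entry.is_dir():
            kind, child = b"tree", hash_tree(entry)
        elif entry.is_file():
            kind, child = b"blob", hash_blob(entry)
        else:
            continue  # sockets, devices, etc. are ignored as well
        name = os.fsencode(entry.name)
        digest.update(kind + b" " + name + b" " + child.encode("ascii") + b"\n")
    return digest.hexdigest()
```

Two directories with identical contents produce the same root digest, and a one-byte change in any file propagates up to the root. That property is what lets unchanged subtrees be shared between snapshots.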

Demo

Snapshotting two real Debian container filesystems (bullseye → bookworm) — hashing and uploading ~5,700 files with live progress, then a full file-level diff of the upgrade:

[Demo recording: neogit commit --gui snapshotting two Debian container filesystems and diffing the upgrade]

…and the resulting Merkle graph in the Neo4j Browser:

[Screenshot: Neo4j Browser showing a neogit Merkle tree with Branch, Commit, Tree, and Blob nodes]

Where it's used

  • CLI tool: capture and diff filesystem snapshots from the command line
  • Python library: neogit captures the filesystem, and your pipeline enriches the graph. Embed it to attach your own content-addressed sub-Merkle-trees (anything you can hash) to a Blob, so your analysis results deduplicate and diff the same way the file bytes do. OSWatcher, for example, attaches extracted symbols, parsed structs, and Windows registry hives to neogit's Commit graph
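The deduplication mentioned above falls out of content addressing: identical bytes always map to the same key, so storing them twice is a no-op. The sketch below shows that property with a hypothetical `LocalObjectStore` class. The class, its `put`/`get` methods, and the two-character directory fan-out are assumptions made for illustration and are not neogit's storage API.

```python
import hashlib
from pathlib import Path


class LocalObjectStore:
    """Hypothetical content-addressed store: one file per unique digest."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def _path_for(self, key: str) -> Path:
        # Fan out by the first two hex characters to keep directories small.
        return self.root / key[:2] / key[2:]

    def put(self, data: bytes) -> str:
        """Store `data` under its SHA-1 digest and return the digest."""
        key = hashlib.sha1(data).hexdigest()
        target = self._path_for(key)
        if not target.exists():  # identical content is written only once
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(data)
        return key

    def get(self, key: str) -> bytes:
        """Return the bytes previously stored under `key`."""
        return self._path_for(key).read_bytes()
```

Any derived artifact that can be serialized to bytes (a symbol table, a parsed struct) can be keyed the same way, which is why results attached to the graph can deduplicate alongside the files they were extracted from.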

Quickstart

Requirements: Python 3.10+, Docker, and Git. Neogit uses local object storage by default, so the minimal setup only needs Neo4j.

pipx install neogit
# or: python -m pip install neogit

# Start a local Neo4j database for the demo. Disabling auth is for local testing only.
docker run --rm --name neogit-neo4j \
  -p 7474:7474 -p 7687:7687 \
  -e NEO4J_AUTH=none \
  neo4j:5.26

In another terminal, snapshot a real project checkout:

git clone --depth 1 https://github.com/psf/requests.git neogit-demo-root

neogit init
neogit commit first-snapshot -r ./neogit-demo-root

Open the Neo4j Browser at http://localhost:7474 and run:

MATCH (c:Commit)-[r]->(t:Tree) RETURN c, r, t LIMIT 25

CLI overview

neogit init                                    # initialize database constraints
neogit commit <name> -r <path>                 # snapshot a directory on the default branch
neogit diff <old_hash> <new_hash>              # compare two filesystem snapshots
neogit branch <name> <commit_hash>             # create a branch pointing at a commit hash

See docs/reference/cli.md for the full reference.
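Conceptually, a file-level diff between two snapshots compares path-to-hash mappings. The following sketch shows that comparison on plain dictionaries. The function name `diff_snapshots` and the flat `{path: hash}` input are assumptions for illustration, and the output format of `neogit diff` itself is not shown here.

```python
def diff_snapshots(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Classify paths as added, removed, or modified between two snapshots.

    Both arguments map a file path to its content hash.
    """
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    modified = sorted(
        path for path in old.keys() & new.keys() if old[path] != new[path]
    )
    return {"added": added, "removed": removed, "modified": modified}


# Example with made-up paths and placeholder hashes.
before = {"etc/os-release": "a1", "usr/bin/foo": "b2", "usr/lib/old.so": "c3"}
after = {"etc/os-release": "a9", "usr/bin/foo": "b2", "usr/lib/new.so": "d4"}
changes = diff_snapshots(before, after)
# changes["added"] == ["usr/lib/new.so"]
# changes["removed"] == ["usr/lib/old.so"]
# changes["modified"] == ["etc/os-release"]
```

A Merkle-tree diff can do better than this flat comparison: when two Tree nodes have the same hash, everything beneath them is identical and the whole subtree can be skipped.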

Use as a library

from pathlib import Path
from neogit.service import Neogit

git = Neogit()
git.init()
commit_hash = git.commit("snapshot-1", Path("/path/to/capture"))

The graph model (Commit, Branch, Tree, Blob, PluginRun) is exposed under neogit.model for downstream tools that want to attach their own nodes — see docs/reference/data-model.md.

Documentation

Full documentation lives under docs/ and follows the Divio framework:

  • Tutorial — your first snapshot in 5 minutes
  • How-to guides — recipes for specific tasks (MinIO, S3, diffs, embedding the library)
  • Reference — CLI flags, config keys, data model
  • Explanation — design rationale, Merkle layout, why Neo4j

To preview the docs locally:

poetry install --with docs       # one-time, installs mkdocs into the venv
poetry run poe docs_serve        # equivalent to: poetry run mkdocs serve

Development

poetry install
poetry run poe ccode      # fmt + lint + type-check
poetry run poe test       # full test suite

See docs/how-to/contributing.md for the full dev workflow.

License

Licensed under the Apache License 2.0. You're free to use, modify, and distribute neogit, including commercially, provided you preserve the copyright and license notices (see NOTICE).

