Git Parent multirepo management utility
Installation
To install via pip, run:

    pip install gitparent

To install from the repo, run:

    git clone https://github.com/gitparent/gitparent.git
    cd gitparent
    pip install .

After installation, you can run the utility with the `gitp` command.
Python 3.9+, pyyaml, and filelock are required.
About
gitparent is largely based on meta and gitman, both of which are lightweight layers on top of git that facilitate and manage projects consisting of nested git repos. Rather than adding complexity at the git level (e.g. git subtree, git submodule) or adding a heavyweight tooling layer with its own paradigms (e.g. git-repo), gitparent opts for the meta/gitman approach: provide a thin multi-repo management layer and let git shine.
Why gitparent rather than meta or gitman? It boils down to preference, but here are some of the key differentiators:
- Simple hierarchical status querying via `gitp status` (absent in both meta and gitman)
- Simple manifest format to minimize git conflict resolution (lacking in gitman)
- General purpose utility operations for child repos (lacking in gitman)
- Rev control for child repos (lacking in meta)
- Built-in external linking mechanic (absent/lacking in meta and gitman)
- Link "overlaying" to override shared repo dependencies (absent/lacking in meta and gitman)
- Favors Python projects because it is written in Python (i.e. one less dependency; meta is written in Node.js)
Note that gitparent is intended for use with trusted repositories/secure environments only due to the ability for arbitrary code to be executed.
Purpose
- Support all work modes described in the table in the Philosophy section.
- Help user manage changeset distribution.
- Track changes to determine when particular changesets (possibly across multiple repos) are made.
Philosophy
The following table represents the progression of multi-repo/multi-dependency projects in order of project maturity and describes the optimal form the dependencies take at each stage in a development environment.
Stage | Type of repo/project | Ideal Source of Dependencies |
---|---|---|
1 | Immature/unstable projects | unversioned dependency repos cloned onsite (git multirepo) |
2 | Semi-mature/stable projects | package managers (local modifications only or read-only) -or- versioned (tag, branch) clones of dependency repos onsite (git multirepo) |
3a | Mature/stable projects | package managers (local modifications only or read-only) |
3b | Large, packaged IP | (sym)links, fileshares (read-only, no local copies made) |
In stage 1, we intend to break our project into multiple repositories, but changes are being made so quickly that maintaining versions for each dependency across repos is not worth the cost. Stage 1 lends itself to a sort of "virtual monorepo" workflow wherein the project is technically composed of multiple repositories but functions as a single repository. As soon as continuous integration becomes sufficiently complex and the number of collaborators and/or the level of autonomy of each individual child repository increases, the project would be best served by moving to either stage 2 or 3.
In stage 2, the project is somewhat mature and each child repository has some level of autonomy (folks are contributing to and operating at the child repo level rather than always at the top level, development of individual child repos is driven by different timelines/external factors). Versioned git repositories may be favorable in the case wherein occasional local development across multiple repos is required. For repo relationships that do not have this requirement or require some pre-generated collateral to be present at the time of consumption (e.g. any generation process that cannot/should not be reproduced by consumers of a dependency), package managers may be a better fit to allow for local copies of pre-packaged dependencies to be downloaded in an ephemeral store locally in the developer's workspace/environment.
In stage 3, the project has reached a level of maturity that warrants a strict release and integration process between all dependent repos in the project. This can either be achieved via the aforementioned package manager model (3a, downloading a local copy of pre-packaged dependency content), or for dependencies that take up significant disk space, via logical links to a static path within a shared compute environment (3b).
gitparent attempts to provide a full solution to stages 1, 2, and 3b in the table above, and seeks to enable integration of package managers for stages 2 and 3a.
Linking
gitparent provides ways to describe child repos as links to support the following use cases:
- A common dependency exists across multiple child repos which should all be the same version. Linking them all to one source lets developers make changes to that common dependency in one place for the whole project. Link overlaying would be used in this case.
- In a shared compute environment, dependencies can be linked to static read-only paths. This is helpful if a project contains dependencies that are very large or are installed statically in a compute environment. Normal links or link overlaying may apply in this case.
The difference between a normal link and an overlay link is that normal links are stored as state at the parent level of the target of that link whereas overlays are stored as state only at the top level repo. Link overlays are ignored if the repo in question is not the top-most repo. Take the following example:

    repo A
    |
    |- repo child_of_A
       |- repo grandchild_of_A

If we were to create an ordinary link for `grandchild_of_A` to some static path in our system from repo `A`, that link information would be stored within the manifest of `child_of_A` (the parent of `grandchild_of_A`). This means that if we committed that change and then cloned `child_of_A` independently, we would see `grandchild_of_A` as the link we created.
If we were to create an overlay link for `grandchild_of_A` to some static path in our system (or to some other child repo that falls under repo `A`), that link information would be stored within the manifest of `A`. If we were to commit that change and clone a fresh copy of `child_of_A`, we would not see a link created for `grandchild_of_A`. We would only see that link if we cloned `A`. Furthermore, if `A` is itself a child of some higher-order repo that does not apply its own link overlay to `grandchild_of_A`, and we cloned that higher-order repo, we again would not see a link for `grandchild_of_A`, since overlays are only evaluated at the level at which they were created (i.e. `A`).
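The difference between the two link kinds can be illustrated with a small toy model. This is only a sketch of the rules described above, not gitparent's implementation: the `Repo` class, its `links` and `overlays` fields, and `resolve_links` are hypothetical names, and repos are matched by name purely for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Repo:
    """Toy stand-in for a repo and the link state held in its manifest."""
    name: str
    children: List["Repo"] = field(default_factory=list)
    links: Dict[str, str] = field(default_factory=dict)     # ordinary links to direct children
    overlays: Dict[str, str] = field(default_factory=dict)  # overlay links to any descendant


def resolve_links(top: Repo) -> Dict[str, str]:
    """Return {repo name: link target} as seen when `top` is the top-most repo."""
    resolved: Dict[str, str] = {}

    def walk(repo: Repo) -> None:
        # Ordinary links are stored in the parent's manifest, so they are
        # visible no matter which ancestor was cloned.
        resolved.update(repo.links)
        for child in repo.children:
            walk(child)

    walk(top)
    # Overlays are read only from the top-most repo's manifest; overlays
    # stored in nested repos are ignored. Letting an overlay win over an
    # ordinary link is an assumption of this sketch.
    resolved.update(top.overlays)
    return resolved


grandchild = Repo("grandchild_of_A")
child = Repo("child_of_A", children=[grandchild])
a = Repo("A", children=[child], overlays={"grandchild_of_A": "/static/path"})
higher = Repo("higher", children=[a])

print(resolve_links(a))       # overlay is visible when A is the top-most repo
print(resolve_links(child))   # no link: the overlay lives in A's manifest
print(resolve_links(higher))  # no link: A is no longer the top-most repo
```

Cloning `child_of_A` or the higher-order repo yields no link in this model, which mirrors the overlay behavior described in the example.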
Schema
The format of the `.gitp_manifest` file, which stores gitparent state information, is as follows:

    variables:
      SOME_VARIABLE: variable_default_value
      #(more variables)
    repos:
      <path to instance of child repo>:
        url: <repo URL>
        username: <optional username>
        password: <optional password>
        branch: <branch or tag to track>
        commit: <commit SHA to track -- takes precedence over branch if both are specified>
        type: <repo|overlay>
      #(more repo entries)
    post_clone:
      - <first system command to execute upon doing a `gitp clone` on this repo>
      - <second system command "">
      #(more commands)
    post_pull:
      - <first system command to execute upon doing a `gitp pull` on this repo>
      - <second system command "">
      #(more commands)

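The schema notes that `commit` takes precedence over `branch` when both are given. A minimal sketch of that rule, assuming a repo entry has already been parsed into a plain dictionary (the `pick_revision` helper is hypothetical and not part of gitparent):

```python
from typing import Mapping, Optional, Tuple


def pick_revision(entry: Mapping[str, str]) -> Optional[Tuple[str, str]]:
    """Return ("commit", sha) or ("branch", name) for one parsed repo entry."""
    commit = entry.get("commit")
    branch = entry.get("branch")
    if commit:
        # A pinned commit wins even when a branch is also specified.
        return ("commit", commit)
    if branch:
        return ("branch", branch)
    # Neither field is set: the caller decides what the default should be.
    return None


print(pick_revision({"branch": "main", "commit": "abc123"}))
print(pick_revision({"branch": "v1.2"}))
```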
The `variables` section specifies the default values of interpolated environment variables in the manifest. If the variable is not set, all mentions of said variable within the manifest file will use the supplied default value. If no default is supplied, the variable is interpreted as a literal string. Entries for this section and interpolation of said variables must be added by manually editing the manifest file.
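The lookup order described above (environment value first, then the manifest default, otherwise the text is left alone) might look like the following sketch. The placeholder syntax gitparent actually uses is not stated here; the `${NAME}` form handled by Python's `string.Template` is an assumption, and `interpolate` is a hypothetical helper.

```python
import string
from typing import Mapping


def interpolate(text: str, defaults: Mapping[str, str], environ: Mapping[str, str]) -> str:
    """Expand ${NAME} placeholders (assumed syntax) in a manifest value."""
    # Values from the environment override the defaults from `variables`.
    values = {**defaults, **environ}
    # safe_substitute leaves unknown placeholders untouched, which stands in
    # for "interpreted as a literal string" when no default is supplied.
    return string.Template(text).safe_substitute(values)


defaults = {"TEAM": "core"}
print(interpolate("https://example.com/${TEAM}/lib.git", defaults, {}))
print(interpolate("https://example.com/${TEAM}/lib.git", defaults, {"TEAM": "infra"}))
print(interpolate("https://example.com/${UNSET}/lib.git", defaults, {}))
```

In real use, `os.environ` would be passed as `environ`; a plain mapping is used here so the behavior is easy to check.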
The `username` and `password` fields for each child repo entry are optional. If credentials are also embedded in the `url` of the repo, non-empty `username` and `password` fields take precedence over them; empty fields are ignored. Note that these fields are populated if a username/password is specified in the URL passed to the `--from` option of `gitp new`.
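One way to picture the precedence rule is a helper that merges the explicit fields into the URL. This is a sketch under stated assumptions rather than gitparent's code: `effective_url` is a hypothetical name, precedence is applied per field, and only scheme-based URLs (e.g. `https://`) are handled, not scp-style `git@host:path` addresses.

```python
from typing import Optional
from urllib.parse import quote, unquote, urlsplit, urlunsplit


def effective_url(url: str, username: Optional[str] = None, password: Optional[str] = None) -> str:
    """Combine URL-embedded credentials with explicit manifest fields."""
    parts = urlsplit(url)
    # Non-empty manifest fields win; empty or missing ones fall back to the URL.
    user = username or (unquote(parts.username) if parts.username else None)
    secret = password or (unquote(parts.password) if parts.password else None)

    # Rebuild the network location (note: urlsplit lowercases the hostname).
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    if user:
        credentials = quote(user, safe="")
        if secret:
            credentials += ":" + quote(secret, safe="")
        host = f"{credentials}@{host}"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


print(effective_url("https://alice:old@example.com/org/repo.git", "bob", "s3cret"))
print(effective_url("https://alice:old@example.com/org/repo.git", "", ""))
```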
The commands listed under `post_clone` and `post_pull` are run in the order specified and in the root repo directory. As the names suggest, `post_clone` is triggered by `gitp clone` once the associated repo has been cloned (but before its children are cloned). Similarly, `post_pull` is triggered after all children of a given repo have been pulled via `gitp pull` (but before overlays are applied). Entries to these sections must be added by manually editing the manifest file.
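Running hook commands in order from the repo root could be sketched as below. The `run_hooks` helper is hypothetical, and stopping at the first failing command is an assumption of this sketch; the text above does not say how gitparent handles failures.

```python
import subprocess
from pathlib import Path
from typing import Iterable, List


def run_hooks(commands: Iterable[str], repo_root: Path) -> List[int]:
    """Run each hook command in order with repo_root as the working directory."""
    exit_codes: List[int] = []
    for command in commands:
        # shell=True runs the string as a system command, which is why
        # manifests should only come from trusted repositories.
        result = subprocess.run(command, shell=True, cwd=repo_root)
        exit_codes.append(result.returncode)
        if result.returncode != 0:
            break  # assumption: later hooks are skipped after a failure
    return exit_codes
```

A caller would pass the list found under `post_clone` or `post_pull` in the parsed manifest, together with the path of the repo that owns that manifest.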
The `GITP_PARENT_REPO` environment variable is set during `gitp pull` and `gitp clone` operations to communicate to any processes invoked via `post_clone` or `post_pull` whether the current repo is being consumed as a parent repo or as a child repo. This is useful if you wish to run certain commands or processes depending on how the repo is being consumed.
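A hook script could branch on this variable along the following lines. The values gitparent assigns to `GITP_PARENT_REPO` are not documented in this section, so the `parent_values` default below is purely an assumption to adjust after checking the real behavior; `consumed_as_parent` is a hypothetical helper.

```python
import os
from typing import Mapping, Sequence


def consumed_as_parent(
    environ: Mapping[str, str],
    parent_values: Sequence[str] = ("1", "true"),  # assumed values, not documented
) -> bool:
    """Guess from GITP_PARENT_REPO whether this repo is the parent repo."""
    value = environ.get("GITP_PARENT_REPO", "")
    return value.strip().lower() in parent_values


if __name__ == "__main__":
    if consumed_as_parent(os.environ):
        print("consumed as a parent repo: run the full setup")
    else:
        print("consumed as a child repo (or variable unset): run the minimal setup")
```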