
Git collection utilities


gitcoll

Git collections/tools to handle several git repos. Special functionality for gitlab servers.

gcln2

gcln2 is a rewrite of gcln. It focuses on gitlab integration and has a cleaner, more streamlined design.

setup

gcln2 needs a configuration file, gcln2.yaml, in the root directory of your repos.

Example:

general:
    server:
        git_url: ssh://git@<my gitlab server>
        api_url: https://<my gitlab server>
        api_key: <my generated API key from the web interface>

    retries: 0              # if os.system fails, retry this many times.
    worker_threads: 10      # speed up gitlab communication with several threads. Set to 0 if threading gives issues

workspace:
    only_main: false        # if true: only checkout the main branch, do not create _branches/ subdirs
    valid_paths:            # list of regexps that are valid workspace paths. Can also replace paths.
        - .*                # all paths are valid for now
    config:                 # list of local config variables to be set in git
        user.name: <your git commit name>
        user.email: <your email address>
    add_remote_uid: []                      # git add remote the uid to external barebone servers
    home_dir_as_login_name: true
    home_dir_prefix: home
    replace:
        home/jesper/: jr/
        home/jesper/private: private

rules:
    valid_branches: main|master|[fid]-.*    # which branches we should care about
    blacklist_branches: []                  # but disregard these
    blacklist_repo_uids: []                 # and these repos
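
The branch rules above can be read as a filter. The following is a minimal sketch of that reading, not gcln2's actual implementation: it assumes valid_branches must match the whole branch name and that blacklist_branches holds exact names. The function name wanted_branches is hypothetical.

```python
import re


def wanted_branches(branches, valid_branches, blacklist_branches=()):
    """Return branches that match valid_branches and are not blacklisted.

    Assumption: the regexp has to match the complete branch name.
    """
    valid = re.compile(valid_branches)
    return [
        name for name in branches
        if valid.fullmatch(name) and name not in blacklist_branches
    ]


print(wanted_branches(
    ["main", "f-login", "i-42", "wip", "d-docs"],
    valid_branches="main|master|[fid]-.*",
    blacklist_branches=["d-docs"],
))
# prints ['main', 'f-login', 'i-42']
```

With the example pattern, a branch such as wip is ignored because it is neither main, master, nor prefixed with f-, i- or d-.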

Note that a special !include directive can be used to load part of the configuration from another file, for example to keep secrets out of gcln2.yaml.

gcln2.yaml

general:
    server: !include secrets.yaml
    retries: 0              # if os.system fails, retry this many times.

secrets.yaml

git_url: ssh://git@<my gitlab server>
api_url: https://<my gitlab server>
api_key: <my generated API key from the web interface>

Once a proper gcln2.yaml file has been created, just run gcln2 update to pull all repos from the server.
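
To illustrate what the retries and worker_threads settings might mean in practice, here is a rough sketch. It is an assumption about the behaviour, not code taken from gcln2: a task is any callable that returns an exit status (0 for success, as os.system does), and the helper names run_with_retries and run_all are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_retries(task, retries=0):
    """Call task() until it returns 0, at most 1 + retries times."""
    status = task()
    for _ in range(retries):
        if status == 0:
            break
        status = task()
    return status


def run_all(tasks, retries=0, worker_threads=10):
    """Run every task; use a thread pool unless worker_threads is 0."""
    if worker_threads <= 0:
        # Threading disabled: run the tasks one after another.
        return [run_with_retries(task, retries) for task in tasks]
    with ThreadPoolExecutor(max_workers=worker_threads) as pool:
        return list(pool.map(lambda task: run_with_retries(task, retries), tasks))


# Demo with a fake task that fails on its first call only.
attempts = []


def flaky():
    attempts.append(1)
    return 0 if len(attempts) > 1 else 1


print(run_all([flaky], retries=1, worker_threads=0))
# prints [0]
```

A real task could be something like lambda: os.system("git fetch"); the fake task is used here only so the sketch runs without a repository.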

OLD (the sections below document the original gcln tool)

Features

  • checkout/update/check status for all accessible repos/branches on a gitlab server (do not use this for official gitlab.com without restricting which repos to use)
  • handle a special _config branch which contains some metadata for the repository
  • sync two git servers
  • submodule uid handling/replace uid reference to correct url (requires patched git command)

Install

Clone the project, then run pip install . in the project directory. Note that a recent pip is needed. Make sure your Python scripts directory is on your PATH.

Tutorial: setup gitlab connection

  1. Create a gitlab API key in the gitlab web interface. Further down, the key is called XXXXXXXXX
  2. Create a root directory for your workspace. Example: ~/dev (the references below assume this root dir)
  3. Create a yaml file ~/dev/secrets.yaml with the following content (replace the obvious parts):
git_me_apikey: XXXXXXXXX
user_name: My Name
user_email: me@mymail.com
  4. Create a control yaml file ~/dev/gcln.yaml with the following content (replace the obvious parts):
main_server:
    git_url: ssh://git@my_gitlab_srv.com:22
    api_url: https://my_gitlab_srv.com
    type: gitlab
    private_token: !secrets secrets/git_me_apikey

# all under workspace is optional
workspace:
    sub_tree: "my_sub_tree_to_checkout"     # only checkout repos under this tree. default "", ie all.
    branches: false                         # default true - checkout branches under "_branches" sub-dir for every repo
    home_prefix: "home_dirs"                # where to put home directories. Default "home"
    
    # init repos with these credentials
    config: # define local config or similar that gcln will try to set in each workspace:
        "user.name": !secrets secrets/user_name
        "user.email": !secrets secrets/user_email
  5. Run gcln --all update. This queries all repos from the gitlab server, then clones them locally. All branches are put into the sub-dir _branches.
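
The !secrets references in the control file can be read as "file name, then key". The sketch below shows that reading under stated assumptions: secrets/git_me_apikey is taken to mean the key git_me_apikey in secrets.yaml next to the control file, and only flat key: value lines are parsed. The function resolve_secret is hypothetical and is not gcln's own loader.

```python
import tempfile
from pathlib import Path


def resolve_secret(ref, root):
    """Resolve a reference such as 'secrets/git_me_apikey'.

    Assumption: '<name>/<key>' means key <key> in <root>/<name>.yaml,
    and the file contains only flat 'key: value' lines.
    """
    name, key = ref.split("/", 1)
    text = (Path(root) / f"{name}.yaml").read_text()
    for line in text.splitlines():
        found_key, sep, value = line.partition(":")
        if sep and found_key.strip() == key:
            return value.strip()
    raise KeyError(ref)


# Demo against a temporary secrets.yaml.
with tempfile.TemporaryDirectory() as root:
    (Path(root) / "secrets.yaml").write_text("git_me_apikey: XXXXXXXXX\n")
    print(resolve_secret("secrets/git_me_apikey", root))
# prints XXXXXXXXX
```

Keeping the key in a separate file means gcln.yaml can be shared or committed without exposing the API key.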

Tutorial: special uid-handling for submodules

background

As submodule paths are part of the history, it is a problem if a submodule is moved or renamed on the server. With this tool, the recommended setup is:

  • have all submodules on the same server as the containing repository
  • always refer to submodules by relative addressing. Absolute URLs will always have issues with renames, different access protocols etc.
  • create a special _config branch in the submodules which contains a unique identifier
  • use the submodule-uid patches for git, so you can set up the callback hook to look up submodule URLs.

steps

  1. Install git with uid-patches (https://gitlab.com/jesrib/git-uid)
  2. Make sure gcln is installed.
  3. Configure the uid hook. On Windows that would be git config --global submodule.uidHook "gcln uidHook"
  4. When creating a submodule, use gcln cfg --set_rnd_uid to create the special uid branch. Warning: this also pushes to master.
  5. When adding a submodule, either proceed as usual or use git submodule add uid:xyz
  6. To change the .gitmodules file to refer to uids, use gcln clean_gitmodules. This rewrites .gitmodules so that the relative URLs match the current positions on the server, and also adds uids. It does not commit or push the changes.
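
As an illustration of relative addressing, the relative URL between two repositories on the same server can be derived from their server paths. This is only a sketch of the idea, not what gcln clean_gitmodules does internally; the function name and the example paths are made up.

```python
import posixpath


def relative_submodule_url(parent_repo, submodule_repo):
    """Relative URL from the containing repo to a submodule on the same server.

    Both arguments are repo paths below the server root, e.g. 'group/app'.
    """
    # Anchor both paths at '/' so the result does not depend on the cwd.
    return posixpath.relpath("/" + submodule_repo, start="/" + parent_repo)


print(relative_submodule_url("group/app", "group/libs/parser"))
# prints ../libs/parser
print(relative_submodule_url("group/app", "tools/build"))
# prints ../../tools/build
```

Git resolves such a relative URL against the URL of the containing repository, which is why it keeps working across servers and access protocols but breaks when one of the two repositories is moved.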

Tutorial: sync two servers

Assuming you have one gitlab server and one raw git server, the following will sync them.

  1. Create a sync-dir. This will contain control files and two copies of all repos. Referred to as "SYNC_DIR" below
  2. Create yaml file SYNC_DIR/sync.yaml, with the following:
servers:
    my_bare_srv:
        name: BareSrv
        repo_list: bare_repos.yaml      # see below for definition
    my_gitlab_srv:
        name: GL
        cfg_file: gl_cfg.yaml           # same as workspace cfg file, but only the "main_server:" section is needed
  3. Create SYNC_DIR/gl_cfg.yaml:
main_server:
    type: gitlab
    git_url: ssh://git@my_gitlab_srv:my_port_nr
    api_url: https://my_gitlab_srv
    private_token: !secrets secrets/git_me_apikey
  4. Create SYNC_DIR/secrets.yaml:
git_me_apikey: <my-api-key>
  5. Create SYNC_DIR/bare_repos.yaml:
server:
    git_url: ssh://git-user@hostname.org/~/path_to_repos_dir
repos:
    # a dict with repo uid as keys, and git-repo.path under git_url as value
    00112233445566778899: 00112233445566778899.git      # ie, map repo uid 00112233445566778899 to path ssh://git-user@hostname.org/~/path_to_repos_dir/00112233445566778899.git
    next_uid: next_path
  6. Sync: gcln sync --pull SYNC_DIR

This pulls all repos from both servers, stores them locally under SYNC_DIR, syncs them locally, and finally pushes the changes back. For the bare server, all repos are simply pulled every time, as there is no way to optimize that. The gitlab connection uses the API to find the repos that have changed.
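
The bare_repos.yaml mapping can be pictured as follows. This is a small sketch of how the git_url and the repos dict combine into full URLs, based on the comment in the example file; the function name bare_repo_urls is hypothetical and the data is passed in as plain Python values rather than read from YAML.

```python
def bare_repo_urls(git_url, repos):
    """Map each repo uid to its full URL on the bare server."""
    base = git_url.rstrip("/")
    return {uid: f"{base}/{path}" for uid, path in repos.items()}


urls = bare_repo_urls(
    "ssh://git-user@hostname.org/~/path_to_repos_dir",
    {"00112233445566778899": "00112233445566778899.git"},
)
print(urls["00112233445566778899"])
# prints ssh://git-user@hostname.org/~/path_to_repos_dir/00112233445566778899.git
```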

Tutorial: Create local cache of UIDs

If you don't want to check out all workspaces, but still need the database that maps UIDs to repo URLs, this can be done with gcln update_cache. This is useful, for instance, when using the git patches to redirect submodule URLs.

  1. Create a working dir "WORK_DIR".
  2. Create WORK_DIR/gcln.yaml:
main_server:
    type: gitlab
    api_url: https://my_gitlab_srv
    private_token: !secrets secrets/git_me_apikey
  3. Create WORK_DIR/secrets.yaml:
git_me_apikey: <my-api-key>
  4. In WORK_DIR:
gcln update_cache

This creates the .cache.yaml and .cache.pickle files, which are the locally cached versions of the UIDs.

OLD

Tutorial to setup a proxy to replace reference to submodules

gcln.py can be set up to listen for ssh connections and replace references that end with one of a known list of IDs. These IDs are taken from the _config branch in the repositories. The use case is submodules: to be independent of server location, relative paths for submodules are a must. However, relative paths break down if the owning repository is moved to another group on the gitlab server. Hence this feature.

  1. On the server running gitlab, check which user-id the docker git user has with sudo docker exec -t gitlab grep git: /etc/passwd. Following assumes 998
  2. If this user-id is already used on the server, change it. Also change applicable files, but be careful to not change owner of the files inside the docker container: usermod -u 1234 foo
  3. Create a new user git with the same uid as the gitlab container user git: adduser --uid 998 git and passwd -d git
  4. Make sure the new git user is a member of the group docker (check /etc/group)
  5. Make sure that you can read the authorized_keys file as user git: git@my_server:~$ more /srv/gitlab/data/.ssh/authorized_keys
  6. Make sure the ssh daemon is up and running, and you can ssh to the server. Normally port 22, but any port is OK as long as the ssh daemon works.
  7. In /home/git, checkout gitcoll, so you have the script in /home/git/gitcoll/gcln.py. Verify that you can run it as user git (~/gitcoll/gcln.py).
  8. If it doesn't exist, create directory /home/git/.ssh. Pay attention to the correct rights (700).
  9. Also create a directory called /home/git/git_srv_info
  10. ~/git_srv_info$ /home/git/gitcoll/gcln.py conv_ssh_auth /srv/gitlab/data/.ssh/authorized_keys > ~/.ssh/authorized_keys
  11. Test ssh to the server with one of the keys registered in gitlab. It should respond with something similar to:
PTY allocation request failed on channel 0
Bad cmd:
Connection to ribbe.se closed.

If not, there is a problem with the ssh setup.

  12. Try to checkout/clone/push git repos in the normal way, but through the gcln gateway.
  13. Check the log file /home/git/git_srv_info/ssh_proxy.log
  14. Create /home/git/git_srv_info/gcln.yaml:
proxy_cfg:
    root: /home/git/git_srv_info
    input_auth: /srv/gitlab/data/.ssh/authorized_keys
    output_auth: /home/git/.ssh/authorized_keys
  15. Test from a computer with a valid user key: ssh <url> update_keys. This asks the script to update the keys from gitlab, so it needs to be done whenever a key is changed on the gitlab server.

  16. For all submodules, make sure they have a unique ID assigned in the _config branch: go to a checked-out workspace and issue gcln.py cfg. If that fails, create a uid, either with gcln.py cfg --set_rnd_uid to get a random uid, or with gcln.py cfg --set_uid <name>. Don't forget git push --all afterwards.

  17. Do ssh <url> update_map to update the server-local redirection database. This should create, on the server, a file called /home/git/git_srv_info/.uid_repo_map.pickle. It can take quite a while.

  18. Test cloning via the uid: git clone ssh://git@<url>/ThisDoesntMatter/<uid set above>.git
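
The redirection described in this tutorial can be sketched as a lookup on the last path component. This is an illustration of the idea only, assuming the uid map is a plain dict from uid to real repo path; the function name and the example repo path are hypothetical and the real gcln.py proxy may work differently.

```python
import posixpath


def redirect_repo_path(requested, uid_map):
    """Return the real repo path if the request ends with a known uid.

    Everything before the last path component is ignored, which is why
    the group part of the URL does not matter when cloning by uid.
    """
    last = posixpath.basename(requested.rstrip("/"))
    uid = last[:-4] if last.endswith(".git") else last
    return uid_map.get(uid, requested)


uid_map = {"00112233445566778899": "group/libs/parser.git"}
print(redirect_repo_path("ThisDoesntMatter/00112233445566778899.git", uid_map))
# prints group/libs/parser.git
print(redirect_repo_path("group/app.git", uid_map))
# prints group/app.git
```

Requests that do not end with a known uid pass through unchanged, so normal clones keep working through the gateway.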

References:

Add a new repo

  • create it as usual in gitlab interface
  • go to your workspace root and run gcln update_all to get the new repo into the workspace
  • go to the new workspace:
  • it is good to first commit something into the master branch, as creating a _config branch first might give some issues.
  • to create a random uid for the project, run gcln.py cfg --nofetch --set_uid, then git push origin _config
  • optionally, run \utv\gitcoll\gcln.py cfg --set_id_name, then git push origin _config
  • to update the cache again, go to your workspace root and run gcln update_all

Internal design

Main concepts:

  • Git Server/ServerConnector
  • Workspace/checkout
  • Git Repository/Repo/Gitlab project
  • Repo UID
  • Branch
  • MemberGroup - if a group is a member of a Gitlab project, it is a potential extra checkout path
  • config file = gcln.yaml
  • ws collection: all workspaces
  • ws_root - path to start of all workspaces (ws collection).
