Python functions and CLI to mirror git repositories available on HTTP(S) to S3

mirror-git-to-s3

Python functions and CLI to mirror git repositories available on HTTP(S) to S3. Essentially converts smart protocol git repositories to the so-called dumb protocol. Does not use temporary disk space, and uses streaming under the hood. This should allow the mirroring to be run on systems that don't have much disk space or available memory, even on large repositories. However, at the time of writing, large repositories can be slow to mirror.
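To make the smart-to-dumb conversion more concrete: a dumb protocol client only fetches static files, such as objects stored under `objects/`. The sketch below shows how git's general loose-object format maps an object to a path and to file contents, using only the standard library. It illustrates the git format in general and is not taken from this project's internals; whether the mirror writes loose objects, packfiles, or both is not stated here, and `loose_object` is a hypothetical helper name.

```python
import hashlib
import zlib


def loose_object(object_type, content):
    """Return (relative_path, file_bytes) for a git loose object.

    Hypothetical helper for illustration only.
    """
    # Git hashes "<type> <size>\0<content>" with SHA-1 ...
    header = f'{object_type} {len(content)}'.encode('ascii') + b'\x00'
    store = header + content
    sha = hashlib.sha1(store).hexdigest()
    # ... and stores the zlib-compressed bytes under objects/<2 hex>/<38 hex>.
    path = f'objects/{sha[:2]}/{sha[2:]}'
    return path, zlib.compress(store)


# The empty blob has a well-known object ID.
path, data = loose_object('blob', b'')
print(path)
```

A static file host such as S3 can serve files laid out like this without running any git-specific server code, which is what makes the dumb protocol a fit for a bucket.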

Installation

pip install mirror-git-to-s3

Usage

To mirror one or more repositories from Python, use the mirror_repos function, passing it an iterable of (source, target) mappings.

from mirror_git_to_s3 import mirror_repos

mirror_repos((
    ('https://example.test/my-first-repo', 's3://my-bucket/my-first-repo'),
    ('https://example.test/my-second-repo', 's3://my-bucket/my-second-repo'),
))

In the previous example, the iterable is a tuple. However, any iterable is supported, for example a generator. Under the hood, repositories are processed in parallel, and transfers can start before the entire list is known.

from mirror_git_to_s3 import mirror_repos

def mappings():
    yield ('https://example.test/my-first-repo', 's3://my-bucket/my-first-repo')
    yield ('https://example.test/my-second-repo', 's3://my-bucket/my-second-repo')

mirror_repos(mappings())
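Because any iterable is accepted, the mappings can also be produced lazily from another source. The following is a minimal sketch, assuming a text format of one `<source> <target>` pair per line; that format and the `mappings_from_lines` helper are assumptions made for illustration and are not part of this package.

```python
import io


def mappings_from_lines(lines):
    """Lazily yield (source, target) tuples from '<source> <target>' lines.

    Hypothetical helper; the line format is an assumption.
    """
    for line in lines:
        line = line.strip()
        # Skip blank lines and comments
        if not line or line.startswith('#'):
            continue
        source, target = line.split()
        yield (source, target)


# An in-memory stand-in for a file opened with open(...)
example = io.StringIO(
    '# source target\n'
    'https://example.test/my-first-repo s3://my-bucket/my-first-repo\n'
    '\n'
    'https://example.test/my-second-repo s3://my-bucket/my-second-repo\n'
)
pairs = list(mappings_from_lines(example))
```

In real use the generator would be passed straight through, as in `mirror_repos(mappings_from_lines(f))` for an open file `f`, so that pairs are read only as they are needed.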

Under the hood, boto3 is used to communicate with S3. The boto3 client is constructed automatically, but you can override the default by using the get_s3_client argument.

import boto3
from mirror_git_to_s3 import mirror_repos

mirror_repos(mappings(), get_s3_client=lambda: boto3.client('s3'))

This can be used to mirror to S3-compatible storage.

import boto3
from mirror_git_to_s3 import mirror_repos

mirror_repos(mappings(), get_s3_client=lambda: boto3.client('s3', endpoint_url='http://my-host.com/'))
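S3-compatible services often need explicit credentials and a region in addition to the endpoint. The sketch below wraps those settings in a small factory that returns a zero-argument callable, matching the shape of the lambdas above. The `make_get_s3_client` name, the placeholder credentials, and the default region are assumptions for illustration; the keyword arguments passed to `boto3.client` are standard boto3 options.

```python
def make_get_s3_client(endpoint_url, access_key_id, secret_access_key,
                       region_name='us-east-1'):
    """Return a zero-argument callable that builds a boto3 S3 client.

    Hypothetical helper; all values passed in are supplied by the caller.
    """
    def get_s3_client():
        # Imported here so the client is only built when it is requested
        import boto3
        return boto3.client(
            's3',
            endpoint_url=endpoint_url,
            aws_access_key_id=access_key_id,
            aws_secret_access_key=secret_access_key,
            region_name=region_name,
        )
    return get_s3_client


# Placeholder values only; real credentials should not be hard-coded.
get_s3_client = make_get_s3_client(
    'http://my-host.com/', 'my-access-key', 'my-secret-key',
)
```

The result can then be passed as `mirror_repos(mappings(), get_s3_client=get_s3_client)`.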

To mirror repositories from the command line, pass pairs of --source and --target options to mirror-git-to-s3.

mirror-git-to-s3 \
    --source 'https://example.test/my-first-repo' --target 's3://my-bucket/my-first-repo' \
    --source 'https://example.test/my-second-repo' --target 's3://my-bucket/my-second-repo'
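When the list of repositories is long, the command line can be assembled programmatically. This is a minimal sketch that builds the argument list from (source, target) pairs using only the `--source` and `--target` options shown above; `build_command` is a hypothetical helper, and the sketch assumes the `mirror-git-to-s3` executable is on the `PATH` when the command is eventually run.

```python
import shlex


def build_command(mappings):
    """Build the mirror-git-to-s3 argument list from (source, target) pairs.

    Hypothetical helper for illustration.
    """
    argv = ['mirror-git-to-s3']
    for source, target in mappings:
        argv += ['--source', source, '--target', target]
    return argv


argv = build_command([
    ('https://example.test/my-first-repo', 's3://my-bucket/my-first-repo'),
    ('https://example.test/my-second-repo', 's3://my-bucket/my-second-repo'),
])

# Print a shell-quoted form of the command for inspection
print(shlex.join(argv))
```

To execute the command rather than print it, the list can be handed to `subprocess.run(argv, check=True)`.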

At the time of writing, there is no known standard way of discovering a set of associated git repositories. Hence, to remain general, this project must be told the source and target addresses of each repository explicitly.
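If the repositories share a common source prefix and a common bucket prefix, the explicit pairs can be derived from a list of names. The sketch below assumes exactly that naming convention; `mappings_for` and the convention itself are assumptions for illustration, not something this project provides or requires.

```python
def mappings_for(names, source_base, target_base):
    """Yield (source, target) pairs for repositories that share base URLs.

    Hypothetical helper; assumes each repository lives at <base>/<name>.
    """
    source_base = source_base.rstrip('/')
    target_base = target_base.rstrip('/')
    for name in names:
        yield (f'{source_base}/{name}', f'{target_base}/{name}')


pairs = list(mappings_for(
    ['my-first-repo', 'my-second-repo'],
    'https://example.test/',
    's3://my-bucket',
))
```

The generator returned by `mappings_for(...)` can be passed directly to `mirror_repos`.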
