Software Heritage git loader

The Software Heritage Git Loader is a tool and a library that walks a local Git repository and injects into the Software Heritage (SWH) dataset all contained files that were not known before.

The main entry points are:

  • swh.loader.git.loader.GitLoader, the main loader, which can ingest the contents of either a local or a remote git repository. This is the main implementation deployed in production.

  • swh.loader.git.from_disk.GitLoaderFromDisk, which ingests only a local git clone.

  • swh.loader.git.loader.GitLoaderFromArchive which ingests a git repository wrapped in an archive.

  • swh.loader.git.directory.GitCheckoutLoader which ingests a git tree at a specific commit, branch or tag.

License

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

See the top-level LICENSE file for the full text of the GNU General Public License.

Dependencies

### Runtime

  • python3

  • python3-dulwich

  • python3-retrying

  • python3-swh.core

  • python3-swh.model

  • python3-swh.storage

  • python3-swh.scheduler

### Test

  • python3-nose

Requirements

  • implementation language: Python 3

  • coding guidelines: conform to PEP8

  • Git access: via dulwich

CLI Run

You can run the loader from a remote origin (loader) or from an origin on disk (from_disk) directly by calling:

swh loader -C <config-file> run git <git-repository-url>

To load from an origin on disk, use “git_disk” in place of “git”.
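
When the loader is driven from a script, it can help to assemble the command line programmatically. The sketch below only builds the argument list for the invocation shown above; it does not run anything. The helper name build_loader_command and the KNOWN_LOADERS tuple are illustrative assumptions, and whether “git_disk” needs further arguments is not stated here, so none are added.

```python
import shlex

# Loader names taken from the text above: "git" for a remote origin,
# "git_disk" for an origin on disk.
KNOWN_LOADERS = ("git", "git_disk")


def build_loader_command(config_file, origin_url, loader="git"):
    """Return the argv list for `swh loader -C <config-file> run <loader> <url>`.

    Hypothetical helper: it mirrors the documented command line and adds
    no options beyond the ones shown in this README.
    """
    if loader not in KNOWN_LOADERS:
        raise ValueError(
            f"unknown loader {loader!r}, expected one of {KNOWN_LOADERS}"
        )
    return ["swh", "loader", "-C", config_file, "run", loader, origin_url]


if __name__ == "__main__":
    # The URL below is a placeholder, not a real origin.
    argv = build_loader_command("/tmp/git.yml", "https://example.org/repo.git")
    # Print a copy-pasteable command line; to execute it, pass argv to
    # subprocess.run(argv, check=True) on a machine where `swh` is installed.
    print(shlex.join(argv))
```

Building a list rather than a single string avoids shell quoting problems when the repository URL or the configuration path contains spaces.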

## Configuration sample

/tmp/git.yml:

storage:
  cls: remote
  args:
    url: http://localhost:5002/
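
For testing against different storage endpoints, the same file can be generated instead of edited by hand. The following is a minimal sketch using only the standard library; the names CONFIG_TEMPLATE, render_config and write_config are hypothetical, and the keys are copied verbatim from the sample above rather than checked against the loader's configuration schema.

```python
import tempfile
from pathlib import Path

# Template reproducing the sample configuration; only the URL varies.
CONFIG_TEMPLATE = """\
storage:
  cls: remote
  args:
    url: {url}
"""


def render_config(storage_url="http://localhost:5002/"):
    """Return the YAML text of the sample configuration for a storage URL."""
    return CONFIG_TEMPLATE.format(url=storage_url)


def write_config(path, storage_url="http://localhost:5002/"):
    """Write the rendered configuration to `path` and return it as a Path."""
    target = Path(path)
    target.write_text(render_config(storage_url), encoding="utf-8")
    return target


if __name__ == "__main__":
    # Write to a temporary directory so the example leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        config_path = write_config(Path(tmp) / "git.yml")
        print(config_path.read_text(encoding="utf-8"))
```

The resulting path is what would be passed to the -C option of the swh loader command.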
