Software Heritage git loader

Project description

The Software Heritage Git Loader is a tool and a library that walks a local Git repository and injects into the Software Heritage (SWH) dataset every contained file that was not already known.

The main entry points are:

  • swh.loader.git.loader.GitLoader, the main loader, which can ingest the contents of either a local or a remote git repository. This is the main implementation deployed in production.

  • swh.loader.git.from_disk.GitLoaderFromDisk, which ingests only a local git clone.

  • swh.loader.git.loader.GitLoaderFromArchive which ingests a git repository wrapped in an archive.

  • swh.loader.git.directory.GitCheckoutLoader which ingests a git tree at a specific commit, branch or tag.

License

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

See top-level LICENSE file for the full text of the GNU General Public License along with this program.

Dependencies

### Runtime

  • python3

  • python3-dulwich

  • python3-retrying

  • python3-swh.core

  • python3-swh.model

  • python3-swh.storage

  • python3-swh.scheduler

### Test

  • python3-nose

Requirements

  • implementation language: Python 3

  • coding guidelines: conform to PEP8

  • Git access: via dulwich

CLI Run

You can run the loader directly, either against a remote origin (the “git” loader) or against an origin on disk (the “git_disk” loader), by calling:

swh loader -C <config-file> run git <git-repository-url>

To load from an origin on disk, use “git_disk” in place of “git”.
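When the loader is driven from a script, the same invocation can be assembled as an argument list. The following is a minimal sketch using only the Python standard library. The helper name `build_loader_command` and the example URL are assumptions made for illustration and are not part of swh.loader.git. The argument layout is copied from the command above, and whether “git_disk” needs further options is not covered here.

```python
import shlex

# Loader names as they appear in the CLI description above.
KNOWN_LOADERS = ("git", "git_disk")


def build_loader_command(config_file, origin, loader="git"):
    """Return the argv list for `swh loader -C <config-file> run <loader> <origin>`.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    if loader not in KNOWN_LOADERS:
        raise ValueError(f"unknown loader type: {loader!r}")
    return ["swh", "loader", "-C", config_file, "run", loader, origin]


if __name__ == "__main__":
    # The URL is a placeholder, not a real origin.
    argv = build_loader_command("/tmp/git.yml", "https://example.org/repo.git")
    print(shlex.join(argv))
```

Once the swh command-line tool is installed, the returned list can be handed to `subprocess.run(argv, check=True)`. Passing a list rather than a single string avoids shell quoting problems with unusual repository URLs.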

## Configuration sample

/tmp/git.yml:

storage:
  cls: remote
  args:
    url: http://localhost:5002/
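
For scripted setups, the sample configuration can be written to disk before the loader is called. This is a minimal sketch under stated assumptions: the function name `write_sample_config` is hypothetical, the file is produced with plain string formatting rather than a YAML library, and the default storage URL is the one from the sample above.

```python
from pathlib import Path
import tempfile

# Template mirroring the configuration sample; only the storage URL varies.
SAMPLE_CONFIG = """\
storage:
  cls: remote
  args:
    url: {url}
"""


def write_sample_config(path, storage_url="http://localhost:5002/"):
    """Write the sample loader configuration to `path` and return it as a Path."""
    path = Path(path)
    path.write_text(SAMPLE_CONFIG.format(url=storage_url), encoding="utf-8")
    return path


if __name__ == "__main__":
    target = Path(tempfile.gettempdir()) / "git.yml"
    print(write_sample_config(target).read_text(encoding="utf-8"))
```

The resulting file path is what the `-C <config-file>` option of the CLI expects.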
