Software Heritage Tarball Loader

SWH Tarball Loader

The Software Heritage Tarball Loader ingests the directory representation of a tarball into the Software Heritage archive.

Sample configuration

The loader reads its configuration from the default configuration file, ~/.config/swh/loader/tar.yml. To use a different path, set the SWH_CONFIG_FILENAME environment variable.

This file holds information for the loader to work, including celery configuration:

working_dir: /home/storage/tmp/
storage:
  cls: remote
  args:
    url: http://localhost:5002/
celery:
  task_modules:
    - swh.loader.tar.tasks
  task_queues:
    - swh.loader.tar.tasks.LoadTarRepository
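
The lookup order described above (default file, overridden by SWH_CONFIG_FILENAME) can be sketched with the standard library alone. This is only an illustration of the rule, not the loader's own code: resolve_config_path is a hypothetical helper name, and the assumption that an empty variable falls back to the default is ours.

```python
import os
from pathlib import Path

# Default location named in this README.
DEFAULT_CONFIG = Path("~/.config/swh/loader/tar.yml")


def resolve_config_path(environ=None):
    """Return the configuration path, honouring SWH_CONFIG_FILENAME.

    Hypothetical helper for illustration; not part of swh.loader.tar.
    """
    environ = os.environ if environ is None else environ
    override = environ.get("SWH_CONFIG_FILENAME")
    # Assumption: an unset or empty variable falls back to the default file.
    chosen = Path(override) if override else DEFAULT_CONFIG
    return chosen.expanduser()


print(resolve_config_path())
```

Passing a dictionary instead of the real environment makes the rule easy to check without touching the shell.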

Local

Load a local tarball directly from code or from the python3 toplevel:

import logging
logging.basicConfig(level=logging.DEBUG)

from swh.loader.tar.tasks import load_tar

# Fill in these values for your own tarball
repo = '8sync.tar.gz'
tarpath = '/home/storage/tar/%s' % repo
# The origin URL points at the tarball's absolute path on disk
origin = {'url': 'file://%s' % tarpath, 'type': 'tar'}
visit_date = 'Wed, 3 May 2017 17:16:32 +0200'
last_modified = 'Tue, 10 May 2016 16:16:32 +0200'

load_tar(origin=origin, visit_date=visit_date,
         last_modified=last_modified)
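
A file:// URL normally carries an absolute path, so it can help to build the origin from the tarball path in one place. The following is a minimal sketch under that assumption; tarball_origin is a hypothetical helper, not something the loader provides, and the exact URL form the loader expects is not stated in this README.

```python
import posixpath
from urllib.parse import quote


def tarball_origin(tarpath):
    """Build an origin dict for a local tarball (hypothetical helper)."""
    if not posixpath.isabs(tarpath):
        # A file:// URL needs an absolute path to be unambiguous.
        raise ValueError("tarball path must be absolute: %s" % tarpath)
    # quote() percent-encodes spaces and other unsafe characters, keeping '/'.
    return {"url": "file://" + quote(tarpath), "type": "tar"}


origin = tarball_origin("/home/storage/tar/8sync.tar.gz")
print(origin["url"])  # file:///home/storage/tar/8sync.tar.gz
```

The resulting dictionary has the same two keys as the origin in the sample above.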

Remote

Loading a remote tarball uses the same sample, with the origin URL pointing at the remote archive:

import logging
logging.basicConfig(level=logging.DEBUG)

from swh.loader.tar.tasks import load_tar

# Fill in these values for your own tarball
url = 'https://ftp.gnu.org/gnu/8sync/8sync-0.1.0.tar.gz'
origin = {'url': url, 'type': 'tar'}
visit_date = 'Wed, 3 May 2017 17:16:32 +0200'
last_modified = '2016-04-22 16:35'

load_tar(origin=origin, visit_date=visit_date,
         last_modified=last_modified)
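
The samples pass dates in two styles: an RFC 2822 string with a UTC offset, and a plain 'YYYY-MM-DD HH:MM' string. How load_tar parses them is not stated here, so the following is only a standard-library sketch for sanity-checking such values before handing them over; parse_sample_date is a hypothetical helper name.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime


def parse_sample_date(value):
    """Parse either date style used in the samples (hypothetical helper)."""
    try:
        # RFC 2822 style, e.g. 'Tue, 10 May 2016 16:16:32 +0200'.
        # The leading day name is skipped by this parser, not validated.
        return parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Plain 'YYYY-MM-DD HH:MM' style; the result carries no timezone.
        return datetime.strptime(value, "%Y-%m-%d %H:%M")


print(parse_sample_date('Tue, 10 May 2016 16:16:32 +0200').isoformat())
print(parse_sample_date('2016-04-22 16:35').isoformat())
```

Note that only the first style yields a timezone-aware value; the second is naive, so any comparison between the two needs an explicit timezone choice.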

Files for swh.loader.tar, version 0.0.41

Filename                                 Size     File type  Python version
swh.loader.tar-0.0.41-py3-none-any.whl   27.9 kB  Wheel      py3
swh.loader.tar-0.0.41.tar.gz             11.4 kB  Source     None
