Skip to main content

Software Heritage Tarball Loader

Project description

SWH Tarball Loader

The Software Heritage Tarball Loader is in charge of ingesting the directory representation of the tarball into the Software Heritage archive.

Sample configuration

The loader's configuration will be taken from the default configuration file: ~/.config/swh/loader/tar.yml (you can choose a different path by setting the SWH_CONFIG_FILENAME environment variable).

This file holds information for the loader to work, including celery configuration:

working_dir: /home/storage/tmp/
storage:
  cls: remote
  args:
    url: http://localhost:5002/
celery:
task_modules:
    - swh.loader.tar.tasks
task_queues:
    - swh.loader.tar.tasks.LoadTarRepository

Local

Load local tarball directly from code or python3's toplevel:

# Fill in those
repo = '8sync.tar.gz'
tarpath = '/home/storage/tar/%s' % repo
origin = {'url': 'file://%s' % repo, 'type': 'tar'}
visit_date = 'Tue, 3 May 2017 17:16:32 +0200'
last_modified = 'Tue, 10 May 2016 16:16:32 +0200'
import logging
logging.basicConfig(level=logging.DEBUG)

from swh.loader.tar.tasks import load_tar
load_tar(origin=origin, visit_date=visit_date,
         last_modified=last_modified)

Remote

Load remote tarball is the same sample:

url = 'https://ftp.gnu.org/gnu/8sync/8sync-0.1.0.tar.gz'
origin = {'url': url, 'type': 'tar'}
visit_date = 'Tue, 3 May 2017 17:16:32 +0200'
last_modified = '2016-04-22 16:35'
import logging
logging.basicConfig(level=logging.DEBUG)

from swh.loader.tar.tasks import load_tar
load_tar(origin=origin, visit_date=visit_date,
         last_modified=last_modified)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

swh.loader.tar-0.0.41.tar.gz (11.4 kB view hashes)

Uploaded Source

Built Distribution

swh.loader.tar-0.0.41-py3-none-any.whl (27.9 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page