Software Heritage Directory Loader
SWH-loader-dir
The Software Heritage Directory Loader is a tool and a library.
Its sole purpose is to walk a local directory and inject into the SWH dataset every file in that directory tree that the dataset does not already contain.
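As a rough illustration of that walk-and-filter idea, the sketch below hashes every regular file under a directory and keeps the ones whose hash is not in a set of known hashes. It is not the loader's implementation: the function names `iter_file_hashes` and `unknown_files`, the use of a plain SHA-1 over the file bytes, and the in-memory `known` set are all assumptions made for this example.

```python
import hashlib
import os


def iter_file_hashes(root):
    """Yield (path, sha1 hex digest) for every regular file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # this sketch skips symlinks and special files
            sha1 = hashlib.sha1()
            with open(path, 'rb') as f:
                # Read in chunks so large files are not loaded whole.
                for chunk in iter(lambda: f.read(65536), b''):
                    sha1.update(chunk)
            yield path, sha1.hexdigest()


def unknown_files(root, known):
    """Return the paths under root whose digest is not in `known`."""
    return [path for path, digest in iter_file_hashes(root)
            if digest not in known]
```

Calling `unknown_files('/home/storage/dir/', known)` with a set of already-archived digests would return only the paths that still need to be sent, which is the filtering step the description above refers to.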
Configuration
The loader needs a configuration file at {/etc/softwareheritage | ~/.config/swh | ~/.swh}/loader/dir.yml.
This file should be similar to this (adapt according to your needs):
storage:
  cls: remote
  args:
    url: http://localhost:5002/
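To check which of the three locations is actually in use on a machine, a small helper like the following may be handy. It is only a sketch: `CONFIG_DIRS`, `candidate_config_paths` and `find_config` are hypothetical names, and the search order (system-wide first, then the two per-user directories) is an assumption, not something this README specifies.

```python
from pathlib import Path

# Directories named in this README; the search order is an assumption.
CONFIG_DIRS = ['/etc/softwareheritage', '~/.config/swh', '~/.swh']


def candidate_config_paths(relative='loader/dir.yml'):
    """Return every location where the configuration file may live."""
    return [Path(d).expanduser() / relative for d in CONFIG_DIRS]


def find_config(relative='loader/dir.yml'):
    """Return the first existing candidate, or None if there is none."""
    for path in candidate_config_paths(relative):
        if path.is_file():
            return path
    return None
```

`find_config()` returns a `Path` to the first `loader/dir.yml` it finds, or `None` when the file still has to be created.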
Run
To run the loader, you can use either:
- python3's toplevel
- celery
Toplevel
Load a directory directly from code or from the Python toplevel:
import logging

from swh.loader.dir.tasks import LoadDirRepository

logging.basicConfig(level=logging.DEBUG)

# Fill these in
dir_path = '/home/storage/dir/'
origin = {'url': 'some-origin', 'type': 'dir'}
visit_date = 'Wed, 3 May 2017 17:16:32 +0200'
revision = {
    'author': {'name': 'some', 'fullname': 'one', 'email': 'something'},
    'committer': {'name': 'some', 'fullname': 'one', 'email': 'something'},
    'message': '1.0 Released',
    'date': None,
    'committer_date': None,
    'type': 'tar',
    'metadata': {}
}

loader = LoadDirRepository()
loader.run_task(dir_path=dir_path, origin=origin, visit_date=visit_date,
                revision=revision, release=None, branch_name='master')
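The sample `visit_date` uses an RFC 2822 style layout (weekday, day, month, year, time, UTC offset). Rather than typing such a string by hand, it can be generated from a timezone-aware `datetime` with the standard library. This is a sketch only: `make_visit_date` is a hypothetical helper, and whether the loader accepts other date formats is not covered here.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def make_visit_date(moment):
    """Format a timezone-aware datetime as an RFC 2822 style string."""
    if moment.tzinfo is None:
        raise ValueError('moment must be timezone-aware')
    return format_datetime(moment)


# A fixed +02:00 offset, matching the offset used in the sample above.
plus_two = timezone(timedelta(hours=2))
visit_date = make_visit_date(
    datetime(2017, 5, 3, 17, 16, 32, tzinfo=plus_two))
# visit_date == 'Wed, 03 May 2017 17:16:32 +0200'
```

For a visit happening now, `make_visit_date(datetime.now(timezone.utc))` gives the current time with a `+0000` offset.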
Celery
To use Celery, add the following entries to the {/etc/softwareheritage | ~/.config/swh | ~/.swh}/worker.yml file:
task_modules:
  - swh.loader.dir.tasks
task_queues:
  - swh_loader_dir
See swh-core's documentation for more details.
You can then send the following message to the task queue:
from swh.loader.dir.tasks import LoadDirRepository

# Fill these in
origin = {'url': 'some-origin', 'type': 'dir'}
visit_date = 'Wed, 3 May 2017 17:16:32 +0200'
release = None
revision = {}
occurrence = {}

# Send message to the task queue
LoadDirRepository().run(('/path/to/dir', origin, visit_date,
                         revision, release, [occurrence]))
Hashes for swh.loader.dir-0.0.33-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 8b51abcb8a71c03267bd9b5c02aa9ed6f09a831a94945da0625ec1fed9271640
MD5 | 171fcdf7770017e684d9341ee7d67fd4
BLAKE2b-256 | df1921c3153740e098152aece8a34d2ed278ec4e57b4458ab66e3091b4d4cdac