
Software Heritage storage manager



Abstraction layer over the archive, providing access to all stored source code artifacts as well as their metadata.

See the documentation for more details.


Some of the Python tests for this module cannot be run without a local PostgreSQL database. You do not have to run those tests, though:

  • make test: will run all tests
  • make test-nodb: will run only tests that do not need a local DB
  • make test-db: will run only tests that do need a local DB

If you do want to run DB-related tests, you should ensure you have access with sufficient privileges to a PostgreSQL database.

Using your system database

You need to ensure that your user is authorized to create and drop DBs, and in particular DBs named "softwareheritage-test" and "softwareheritage-dev".

Note: the testdata repository (swh-storage-testdata) is not required any more.

Using pifpaf

pifpaf is a suite of fixtures and a command-line tool that lets you start and stop daemons for quick throw-away use.

It can be used to run tests that need a PostgreSQL database without any other configuration and without special access to a running database:

$ pifpaf run postgresql make test-db
Ran 124 tests in 56.203s


Note that pifpaf is not yet available as a Debian package, so you may have to install it in a venv.


A test server can be run locally for testing.

Sample configuration

In either /etc/softwareheritage/storage/storage.yml, ~/.config/swh/storage.yml or ~/.swh/storage.yml:

  cls: local
    db: "dbname=softwareheritage-dev user=<user>"
      cls: pathslicing
        root: /home/storage/swh-storage/
        slicing: 0:2/2:4/4:6

This configuration uses:

  • a local storage instance whose database connection points to the local softwareheritage-dev database

  • an objstorage backed by a local objstorage instance whose:

    • root path is /home/storage/swh-storage

    • slicing scheme is 0:2/2:4/4:6. The identifier of the content (sha1) determines where the content is stored on disk: the first directory level is named after the first 2 hex characters, the second level after the next 2 hex characters and the third level after the next 2 hex characters. The file holding the raw content is named after the complete hash. For example, 00062f8bd330715c4f819373653d97b3cd34394c will be stored at 00/06/2f/00062f8bd330715c4f819373653d97b3cd34394c.

Note that the 'root' path should exist on disk.
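
The slicing scheme can be illustrated with a minimal Python sketch. It is not the objstorage implementation; the function names parse_slicing and sliced_path are hypothetical and only mirror the rule described above, assuming each "start:end" pair selects a range of hex characters from the identifier.

```python
import posixpath


def parse_slicing(slicing):
    """Turn a spec such as "0:2/2:4/4:6" into a list of (start, end) bounds."""
    bounds = []
    for part in slicing.split("/"):
        start, end = part.split(":")
        bounds.append((int(start), int(end)))
    return bounds


def sliced_path(root, slicing, hex_id):
    """Return the path of hex_id under root, following the slicing spec.

    Hypothetical helper: one directory level per slice, then a file
    named after the complete hash.
    """
    dirs = [hex_id[start:end] for start, end in parse_slicing(slicing)]
    return posixpath.join(root, *dirs, hex_id)


print(sliced_path(
    "/home/storage/swh-storage/",
    "0:2/2:4/4:6",
    "00062f8bd330715c4f819373653d97b3cd34394c",
))
```

With the sample configuration, this prints /home/storage/swh-storage/00/06/2f/00062f8bd330715c4f819373653d97b3cd34394c.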

Run server


python3 -m ~/.config/swh/storage.yml

This runs a local swh-storage API on port 5002.

And then what?

In your upper layer (loader-git, loader-svn, etc.), you can define a remote storage with this YAML configuration snippet:

  cls: remote
    url: http://localhost:5002/

Alternatively, you can define a local storage directly with the following snippet:

  cls: local
    db: service=swh-dev
      cls: pathslicing
        root: /home/storage/swh-storage/
        slicing: 0:2/2:4/4:6
