Scrape metadata from CVMFS Stratum servers.
Project description
CVMFS server scraper and Prometheus exporter
This tool scrapes the public metadata sources from a set of stratum0 and stratum1 servers. It grabs:
- cvmfs/info/v1/repositories.json
And then for every repo it finds (that it's not told to ignore), it grabs:
- cvmfs/<repo>/.cvmfs_status.json
- cvmfs/<repo>/.cvmfspublished
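The list of endpoints above can be sketched as a small URL builder. This is an illustration only, not the scraper's own code: the function name `metadata_urls`, the `http` default scheme, and the assumption that every repository lives under `/cvmfs/<repo>` are all hypothetical.

```python
def metadata_urls(server: str, repos: list[str], scheme: str = "http") -> dict[str, list[str]]:
    """Build the metadata URLs for one server (hypothetical helper).

    Assumes each repository is served under /cvmfs/<repo>; a real server
    may expose a repository under a path that differs from its name.
    """
    base = f"{scheme}://{server}/cvmfs"
    # Server-wide listing of repositories.
    urls = {"server": [f"{base}/info/v1/repositories.json"]}
    # Per-repository status file and manifest.
    for repo in repos:
        urls[repo] = [
            f"{base}/{repo}/.cvmfs_status.json",
            f"{base}/{repo}/.cvmfspublished",
        ]
    return urls
```

Calling `metadata_urls("stratum1.example.org", ["repo.example.org"])` yields one entry for the server-wide listing and one entry per repository.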
Usage
#!/usr/bin/env python3
from cvmfsscraper.main import scrape, scrape_server

# To scrape a single server instead:
# server = scrape_server("aws-eu-west1.stratum1.cvmfs.eessi-infra.org")
servers = scrape(
    servers=[
        "aws-eu-west1.stratum1.cvmfs.eessi-infra.org",
        "bgo-no.stratum1.cvmfs.eessi-infra.org",
    ],
    ignore_repos=[
        "ci.eessi-hpc.org",
    ],
)

print(servers[0])
for repo in servers[0].repositories:
    # str() guards against non-string attribute values when concatenating.
    print("Repo: " + repo.name)
    print("Root size: " + str(repo.root_size))
    print("Revision: " + str(repo.revision))
    print("Revision timestamp: " + str(repo.revision_timestamp))
    print("Last snapshot: " + str(repo.last_snapshot))
Data structure
Server
A server object, representing a specific server that has been scraped.
servers = scrape(...)
server_one = servers[0]
Name
Type: Attribute
server.name
Returns
The name of the server, usually its fully qualified domain name.
GeoApi status
Type: Attribute
server.geoapi_status
Returns
An integer value within [0, 1, 2, 9], with the following meaning:
- 0 : OK
- 1 : GeoApi gives wrong location
- 2 : No response
- 9 : The server has no repository available so the GeoApi cannot be tested
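When consuming `server.geoapi_status`, the codes above can be given names. The following is a minimal sketch; `GeoApiStatus` and `describe_geoapi_status` are hypothetical names introduced here and are not part of the package.

```python
from enum import IntEnum


class GeoApiStatus(IntEnum):
    """Named versions of the documented GeoApi status codes (hypothetical)."""

    OK = 0
    WRONG_LOCATION = 1
    NO_RESPONSE = 2
    NOT_TESTABLE = 9  # no repository available, so GeoApi cannot be tested


def describe_geoapi_status(code: int) -> str:
    """Return the documented meaning of a status code.

    Raises ValueError for any code outside [0, 1, 2, 9].
    """
    descriptions = {
        GeoApiStatus.OK: "OK",
        GeoApiStatus.WRONG_LOCATION: "GeoApi gives wrong location",
        GeoApiStatus.NO_RESPONSE: "No response",
        GeoApiStatus.NOT_TESTABLE: "No repository available, GeoApi cannot be tested",
    }
    return descriptions[GeoApiStatus(code)]
```

For example, `describe_geoapi_status(server.geoapi_status)` would give a readable message for logging or alerting.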
Repositories
Type: Attribute
server.repositories
Returns
A list of repository objects, empty if no repositories were scraped on the server.
Ignored repositories
Type: Attribute
server.ignored_repositories
Returns
A list of repository names that the scraper ignores.
Forced repositories
Type: Attribute
server.forced_repositories
Returns
A list of repository names that the server is forced to scrape. If a repo name exists in both ignored_repositories and forced_repositories, it will be scraped.
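The precedence rule described above (forced wins over ignored) can be sketched as a simple predicate. This is an illustration of the stated rule only; `should_scrape` and `select_repositories` are hypothetical names, not the scraper's internals.

```python
def should_scrape(name: str, ignored: list[str], forced: list[str]) -> bool:
    """Return True if a repository should be scraped.

    A forced repository is always scraped, even if it is also ignored.
    """
    return name in forced or name not in ignored


def select_repositories(found: list[str], ignored: list[str], forced: list[str]) -> list[str]:
    """Filter the repositories found on a server, preserving their order."""
    return [name for name in found if should_scrape(name, ignored, forced)]
```

With `ignored=["ci.example.org"]` and `forced=["ci.example.org"]`, the repository `ci.example.org` is still selected.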
Repository
A repository object, representing a single repository on a scraped server.
servers = scrape(...)
repo_one = servers[0].repositories[0]
Name
Type: Attribute
repo_one.name
Returns
The fully qualified name of the repository.
Server
Type: Attribute
repo_one.server
Returns
The server object to which the repository belongs.
Path
Type: Attribute
repo_one.path
Returns
The path for the repository on the server. May differ from the name. To get a complete URL, one can do:
url = "http://" + repo_one.server.name + repo_one.path
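Plain string concatenation works when the path starts with a slash. A slightly more defensive variant is sketched below; `repository_url` is a hypothetical helper, and the `http` default scheme is an assumption carried over from the line above.

```python
def repository_url(server_name: str, repo_path: str, scheme: str = "http") -> str:
    """Join a server name and a repository path into a URL (hypothetical helper).

    Tolerates a missing or duplicated slash between the two parts.
    """
    return f"{scheme}://{server_name.rstrip('/')}/{repo_path.lstrip('/')}"
```

It would be called as `repository_url(repo_one.server.name, repo_one.path)`.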
Status attributes:
These attributes are populated from .cvmfs_status.json:
Attribute | Value |
---|---|
last_gc | Timestamp of last garbage collection |
last_snapshot | Timestamp of the last snapshot |
Information from .cvmfspublished is also provided. For explanations of these keys, please see CVMFS' official documentation. The Field column in the table is the field key from .cvmfspublished.
Attribute | Field |
---|---|
alternative_name | A |
full_name | N |
is_garbage_collectable | G |
metadata_cryptographic_hash | M |
micro_cataogues | L |
reflog_checksum_cryptographic_hash | Y |
revision_timestamp | T |
root_catalogue_ttl | D |
root_cryptographic_hash | C |
root_size | B |
root_path_hash | R |
signature | The end signature blob |
signing_certificate_cryptographic_hash | X |
tag_history_cryptographic_hash | H |
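The mapping in the table can be illustrated with a small parser. This is a sketch under stated assumptions, not the package's implementation: it assumes the manifest is a series of lines whose first character is the field key, terminated by a `--` line that precedes the signature. The names `FIELD_NAMES` and `parse_manifest` are hypothetical, only a subset of the keys is mapped, and all values are kept as strings.

```python
# Subset of the field-key-to-attribute mapping from the table above.
FIELD_NAMES = {
    "N": "full_name",
    "T": "revision_timestamp",
    "B": "root_size",
    "C": "root_cryptographic_hash",
    "D": "root_catalogue_ttl",
    "G": "is_garbage_collectable",
}


def parse_manifest(raw: bytes) -> dict[str, str]:
    """Parse the key/value part of a manifest into named fields.

    Everything after the assumed "--" separator line (the signature)
    is ignored. Unknown field keys are skipped.
    """
    header, _, _signature = raw.partition(b"\n--\n")
    fields = {}
    for line in header.decode("utf-8", errors="replace").splitlines():
        if not line:
            continue
        # The first character is the field key, the rest is the value.
        key, value = line[0], line[1:]
        name = FIELD_NAMES.get(key)
        if name is not None:
            fields[name] = value
    return fields
```

Callers that need numbers or booleans would convert the string values themselves, for example `int(fields["root_size"])`.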