Scrape metadata from CVMFS Stratum servers.
CVMFS server scraper and Prometheus exporter
This tool scrapes the public metadata sources from a set of stratum0 and stratum1 servers. It grabs:
- cvmfs/info/v1/repositories.json
Then, for every repo it finds that it has not been told to ignore, it grabs:
- cvmfs/<repo>/.cvmfs_status.json
- cvmfs/<repo>/.cvmfspublished
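As a rough illustration of the list above, the sketch below builds the URLs that would be fetched for one server. The helper `metadata_urls`, the host `stratum1.example.org`, and the repository `software.example.org` are hypothetical and not part of this package. It also assumes plain HTTP and that the repository path equals its name, which may not hold on every server.

```python
from __future__ import annotations


def metadata_urls(server: str, repos: list[str]) -> list[str]:
    """Return the metadata URLs for one server, in the order listed above."""
    # Assumption: the server exposes its metadata over plain HTTP under /cvmfs.
    base = f"http://{server}/cvmfs"
    urls = [f"{base}/info/v1/repositories.json"]
    for repo in repos:
        # Two files per repository: the status file and the published manifest.
        urls.append(f"{base}/{repo}/.cvmfs_status.json")
        urls.append(f"{base}/{repo}/.cvmfspublished")
    return urls


# Hypothetical host and repository names, used only for illustration.
urls = metadata_urls("stratum1.example.org", ["software.example.org"])
for url in urls:
    print(url)
```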
Usage
#!/usr/bin/env python3
from cvmfsscraper import scrape, scrape_server

# To scrape a single server instead:
# server = scrape_server("aws-eu-west1.stratum1.cvmfs.eessi-infra.org")
servers = scrape(
    servers=[
        "aws-eu-west1.stratum1.cvmfs.eessi-infra.org",
        "bgo-no.stratum1.cvmfs.eessi-infra.org",
    ],
    ignore_repos=[
        "ci.eessi-hpc.org",
    ],
)

# Note that the order of servers is undefined.
print(servers[0])
for repo in servers[0].repositories:
    # str() keeps the concatenation valid for non-string values.
    print("Repo: " + str(repo.name))
    print("Root size: " + str(repo.root_size))
    print("Revision: " + str(repo.revision))
    print("Revision timestamp: " + str(repo.revision_timestamp))
    print("Last snapshot: " + str(repo.last_snapshot))
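Because the order of the returned servers is undefined, indexing with `servers[0]` does not reliably select a particular host. One way around this is to look servers up by their `name` attribute. The sketch below uses `SimpleNamespace` stand-ins in place of real scrape results so that it runs without network access; with the real library, the list would come from `scrape(...)`.

```python
from types import SimpleNamespace

# Stand-ins for the objects returned by scrape(); only .name is used here.
servers = [
    SimpleNamespace(name="bgo-no.stratum1.cvmfs.eessi-infra.org"),
    SimpleNamespace(name="aws-eu-west1.stratum1.cvmfs.eessi-infra.org"),
]

# Index the servers by name so lookups do not depend on list order.
servers_by_name = {server.name: server for server in servers}

server = servers_by_name["aws-eu-west1.stratum1.cvmfs.eessi-infra.org"]
print(server.name)
```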
Data structure
Server
A server object, representing a specific server that has been scraped.
servers = scrape(...)
server_one = servers[0]
Name
Type: Attribute
server.name
Returns
The name of the server, usually its fully qualified domain name.
GeoApi status
Type: Attribute
server.geoapi_status
Returns
A GeoAPIstatus enum object, defined in constants.py. The possible values are:
- OK (0: OK)
- LOCATION_ERROR (1: GeoApi gives wrong location)
- NO_RESPONSE (2: No response)
- NOT_FOUND (9: The server has no repository available so the GeoApi cannot be tested)
- NOT_YET_TESTED (99: The server has not yet been tested)
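The values above can be mirrored in a small stand-alone enum, for example to experiment with status handling without importing the package. `GeoAPIStatusSketch` and `is_healthy` are hypothetical names; whether the library's own enum is an `IntEnum` is an assumption, so real code should compare against the members defined in constants.py.

```python
from enum import IntEnum


class GeoAPIStatusSketch(IntEnum):
    """Illustrative copy of the documented values, not the library's class."""

    OK = 0
    LOCATION_ERROR = 1
    NO_RESPONSE = 2
    NOT_FOUND = 9
    NOT_YET_TESTED = 99


def is_healthy(status: GeoAPIStatusSketch) -> bool:
    # Only OK counts as healthy; every other value signals a problem
    # or a test that has not run (yet).
    return status == GeoAPIStatusSketch.OK


for status in GeoAPIStatusSketch:
    print(f"{status.value:>2} {status.name} healthy={is_healthy(status)}")
```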
Repositories
Type: Attribute
server.repositories
Returns
A list of repository objects, sorted by name. Empty if no repositories are scraped on the server.
Ignored repositories
Type: Attribute
server.ignored_repositories
Returns
A list of repository names that the scraper ignores.
Forced repositories
Type: Attribute
server.forced_repositories
Returns
A list of repository names that the server is forced to scrape. If a repo name exists in both ignored_repositories and forced_repositories, forced_repositories takes precedence and the repo is scraped.
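The interaction between ignored and forced repositories can be sketched as a small set operation. `repos_to_scrape` is a hypothetical helper, not the scraper's actual code. It assumes that a forced repository is scraped even when the server does not list it, which the description above does not confirm.

```python
from __future__ import annotations


def repos_to_scrape(
    available: list[str], ignored: list[str], forced: list[str]
) -> list[str]:
    """Return the repository names to scrape, sorted by name."""
    # Drop ignored names first, then add forced names back in,
    # so a name present in both lists ends up being scraped.
    selected = (set(available) - set(ignored)) | set(forced)
    return sorted(selected)


# Hypothetical repository names, used only for illustration.
result = repos_to_scrape(
    available=["a.example.org", "b.example.org"],
    ignored=["b.example.org", "c.example.org"],
    forced=["c.example.org"],
)
print(result)
```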
Repository
A repository object, representing a single repository on a scraped server.
servers = scrape(...)
repo_one = servers[0].repositories[0]
Name
Type: Attribute
repo_one.name
Returns
The fully qualified name of the repository.
Server
Type: Attribute
repo_one.server
Returns
The server object to which the repository belongs.
Path
Type: Attribute
repo_one.path
Returns
The path of the repository on the server, which may differ from its name. To build a complete URL:
url = "http://" + repo_one.server.name + repo_one.path
Status attributes
These attributes are populated from .cvmfs_status.json:
Attribute | Value |
---|---|
last_gc | Timestamp of last garbage collection |
last_snapshot | Timestamp of the last snapshot |
Information from .cvmfspublished is also provided. For explanations of these keys, please see CVMFS' official documentation. The Field column in the table gives the field key from .cvmfspublished.
Attribute | Field |
---|---|
alternative_name | A |
full_name | N |
is_garbage_collectable | G |
metadata_cryptographic_hash | M |
micro_cataogues | L |
reflog_checksum_cryptographic_hash | Y |
revision_timestamp | T |
root_catalogue_ttl | D |
root_cryptographic_hash | C |
root_size | B |
root_path_hash | R |
signature | The end signature blob |
signing_certificate_cryptographic_hash | X |
tag_history_cryptographic_hash | H |
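To make the mapping in the table concrete, here is a minimal parsing sketch. It assumes that each header line of .cvmfspublished starts with a one-letter field key followed directly by the value, and that a line containing only `--` ends the header before the signature part. `parse_published` is a hypothetical helper covering only a few of the keys above, and the sample text is made up for illustration. It works on already-decoded header text and does not handle the binary signature.

```python
from __future__ import annotations

# Subset of the table above: field key -> attribute name.
FIELD_TO_ATTRIBUTE = {
    "N": "full_name",
    "T": "revision_timestamp",
    "D": "root_catalogue_ttl",
    "C": "root_cryptographic_hash",
    "B": "root_size",
}


def parse_published(text: str) -> dict[str, str]:
    """Map known field keys in the header to attribute names."""
    parsed: dict[str, str] = {}
    for line in text.splitlines():
        if line == "--":
            # Assumed header terminator; the signature part follows.
            break
        if not line:
            continue
        key, value = line[0], line[1:]
        attribute = FIELD_TO_ATTRIBUTE.get(key)
        if attribute is not None:
            parsed[attribute] = value
    return parsed


# Made-up sample content, not taken from a real repository.
sample = "\n".join(
    [
        "C" + "0" * 40,
        "B1024",
        "D240",
        "T1600000000",
        "Nsoftware.example.org",
        "--",
        "signature part, ignored by this sketch",
    ]
)
parsed = parse_published(sample)
print(parsed)
```

The values are kept as strings here; converting `root_size` or `revision_timestamp` to integers would be a natural next step.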
Hashes for cvmfs_server_scraper-0.0.2.tar.gz
Algorithm | Hash digest |
---|---|
SHA256 | 3ff63e2878083c22874f648ce8fb974a7fd2ace843365fb7ae744c073d7a349f |
MD5 | 25e4671c3369e8c2f1153f5fe7efc98d |
BLAKE2b-256 | 4186e11b3ea5d5446f36793148071f203ec8c04ae0e8bdce8615a2d855b5ac14 |
Hashes for cvmfs_server_scraper-0.0.2-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 45626592ab494e53f386029f17efbb2df7e24a52624bd315a6c808ffdfe6118e |
MD5 | f586dfc08d8020262d2fcf74fe03ab52 |
BLAKE2b-256 | 3037393749cb22404eea949063ccc8fa7920739d95277fdb29bdd02f7336ec83 |