python-grid5000 is a Python package wrapping the Grid’5000 REST API. You can use it as a library in your Python project, or you can explore the Grid’5000 resources interactively using the embedded shell.
1 API compatibility
Client version | API version |
---|---|
1.x | 3.x (stable) |
2 Thanks
The core code is borrowed from python-gitlab with small adaptations to conform with the Grid5000 API models (with an ’s’!)
3 Contributing
To contribute, you can drop me an email or open an issue for a bug report or feature request.
There are many areas where this can be improved. Some of them are listed here:
The coverage of the API isn’t complete (yet), but this should be fairly easy to reach. Most of the logic goes in `grid5000.objects <https://gitlab.inria.fr/msimonin/python-grid5000/blob/master/grid5000/objects.py>`_. To be honest, I only implemented the features that I needed the most.
Returned status codes aren’t handled well yet.
4 Comparison with …
RESTfully: It consumes REST APIs following the HATEOAS principles, which allows the client to fully discover the available resources and actions. Most of the G5K API follows these principles but some parts, for instance the Storage API, don’t. Thus RESTfully isn’t compatible with all the features offered by the Grid’5000 API. It’s a Ruby library. python-grid5000 borrows its friendly syntax for resource browsing, but in Python.
Execo: Written in Python. Its api module gathers many utility functions leveraging the Grid’5000 API. Resources aren’t exposed in a syntax-friendly manner; instead, functions for some classical operations are exposed (mainly getters). It has a convenient way of caching the reference API. python-grid5000 is a resource-oriented wrapper around the Grid’5000 API that seeks 100% coverage.
Raw requests: requests is the reference HTTP library in Python. It is good for prototyping but low-level. python-grid5000 encapsulates this library.
5 Installation and examples
Please refer to https://api.grid5000.fr/doc/4.0/reference/spec.html for the complete specification.
All the examples are exported in the examples subdirectory so you can easily test and adapt them.
The configuration is read from a file located in the home directory (it should be compatible with the restfully one).
When accessing the API from outside of Grid’5000 (e.g. your local workstation), you need to provide your credentials in the configuration file, which can be created with the following:
echo '
username: MYLOGIN
password: MYPASSWORD
' > ~/.python-grid5000.yaml
When accessing the API from a Grid’5000 frontend, providing the username and password is optional. Authentication should work out of the box; if it fails, try updating python-grid5000, or specify the path to the certificate to use:
echo '
verify_ssl: /etc/ssl/certs/ca-certificates.crt
' > ~/.python-grid5000.yaml
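The credentials file can also be written from Python. The following is a minimal sketch, not part of python-grid5000: `write_config` is a hypothetical helper, and it assumes that the loader accepts double-quoted YAML strings. It creates the file with owner-only permissions, since it holds a password.

```python
import json
import os
import tempfile
from pathlib import Path


def write_config(path, username, password):
    """Write a minimal YAML config file that only its owner can read."""
    # json.dumps produces double-quoted strings, which are also valid YAML
    # scalars, so characters such as ':' or '#' in a password stay safe.
    content = "username: {}\npassword: {}\n".format(
        json.dumps(username), json.dumps(password))
    path = Path(path)
    # Create the file with mode 0600 from the start.
    fd = os.open(str(path), os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(content)
    return path


# Demonstration in a temporary directory; for real use, target
# Path.home() / ".python-grid5000.yaml" instead.
demo_dir = tempfile.mkdtemp()
demo_file = write_config(Path(demo_dir) / ".python-grid5000.yaml",
                         "MYLOGIN", "MYPASSWORD")
print(demo_file.read_text())
```

Note that the mode passed to `os.open` only applies when the file is created; an existing file keeps its current permissions.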
Using a virtualenv is recommended (python 3.5+ is required)
virtualenv -p python3 venv
source venv/bin/activate
pip install python-grid5000
5.1 Grid’5000 shell
If you call grid5000 on the command line, you should land in an IPython shell. Before starting, the file $HOME/.python-grid5000.yaml will be loaded.
$) grid5000
Python 3.6.5 (default, Jun 17 2018, 21:32:15)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.3.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: gk.sites.list()
Out[1]:
[<Site uid:grenoble>,
 <Site uid:lille>,
 <Site uid:luxembourg>,
 <Site uid:lyon>,
 <Site uid:nancy>,
 <Site uid:nantes>,
 <Site uid:rennes>,
 <Site uid:sophia>]
In [2]: # gk is your entry point
5.2 Reference API
5.2.1 Get node information
import logging
import os
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
node_info = gk.sites["nancy"].clusters["grisou"].nodes["grisou-1"]
print("grisou-1 has {threads} threads and has {ram} bytes of RAM".format(
threads=node_info.architecture["nb_threads"],
ram=node_info.main_memory["ram_size"]))
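The examples in this document build the configuration path with `os.environ.get("HOME")`, which returns `None` when `HOME` is unset and then makes `os.path.join` raise a `TypeError`. A possible alternative, shown here as a small sketch with a hypothetical helper name, is to rely on `pathlib`:

```python
from pathlib import Path


def default_conf_file():
    """Return the default configuration path as a string."""
    # Path.home() resolves the home directory even if HOME is not exported.
    return str(Path.home() / ".python-grid5000.yaml")


conf_file = default_conf_file()
print(conf_file)
```

The resulting string can be passed to `Grid5000.from_yaml` exactly like the `conf_file` variable used in the examples.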
5.2.2 Get Versions of resources
import logging
import os
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
root_versions = gk.root.get().versions.list()
print(root_versions)
rennes = gk.sites["rennes"]
site_versions = rennes.versions.list()
print(site_versions)
cluster = rennes.clusters["paravance"]
cluster_versions = cluster.versions.list()
print(cluster_versions)
node_versions = cluster.nodes["paravance-1"]
print(node_versions)
5.2.3 Browse the reference API offline
Note that only GET-like requests are accepted on the reference API.
import logging
import json
from pathlib import Path
import os
from grid5000 import Grid5000, Grid5000Offline
logging.basicConfig(level=logging.DEBUG)
# First get a copy of the reference api
# This is a one time and out-of-band process,
# here we get it by issuing a regular HTTP request
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
data = gk.dump_ref_api()
Path("ref.yaml").write_text(json.dumps(data))
# you can dump the data to a file
# and reuse it offline using the dedicated client
# here we reuse directly the data we got (no more HTTP requests will be issued)
ref = Grid5000Offline(json.loads(Path("ref.yaml").read_text()))
print(ref.sites["rennes"].clusters["paravance"].nodes["paravance-1"])
5.3 Monitoring API
5.3.1 Get Statuses of resources
import logging
import os
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
rennes = gk.sites["rennes"]
site_statuses = rennes.status.list()
print(site_statuses)
cluster = rennes.clusters["paravance"]
cluster_statuses = cluster.status.list()
print(cluster_statuses)
5.4 Job API
5.4.1 Job filtering
import logging
import os
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
# state=running will be placed in the query params
running_jobs = gk.sites["rennes"].jobs.list(state="running")
print(running_jobs)
# get a specific job by its uid
job = gk.sites["rennes"].jobs.get("424242")
print(job)
# or using the bracket notation
job = gk.sites["rennes"].jobs["424242"]
print(job)
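To illustrate what "placed in the query params" means, the sketch below builds the kind of URL such a filtered listing corresponds to. It is only an illustration: `jobs_url` is a hypothetical helper, the path layout is an assumption, and python-grid5000 builds its requests internally.

```python
from urllib.parse import urlencode


def jobs_url(base, site, **filters):
    """Build a jobs listing URL with `filters` as the query string."""
    url = "{}/sites/{}/jobs".format(base.rstrip("/"), site)
    if filters:
        url += "?" + urlencode(filters)
    return url


# Keyword arguments end up after the '?' in the URL.
print(jobs_url("https://api-ext.grid5000.fr/stable/", "rennes", state="running"))
```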
5.4.2 Submit a job
import logging
import os
import time
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
# This is equivalent to gk.sites.get("rennes")
site = gk.sites["rennes"]
job = site.jobs.create({"name": "pyg5k",
                        "command": "sleep 3600"})
while job.state != "running":
    job.refresh()
    print("Waiting for the job [%s] to be running" % job.uid)
    time.sleep(10)
print(job)
print("Assigned nodes : %s" % job.assigned_nodes)
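The polling loop above waits forever if the job never reaches the running state. The sketch below shows one way to bound the wait. `wait_for_state` is a hypothetical helper, not part of python-grid5000; it only assumes an object exposing `refresh()` and a state attribute, and the default failure states are an assumption to adapt to your needs. A fake job is used so that the example runs offline.

```python
import time


def wait_for_state(resource, target="running", failed=("error", "terminated"),
                   attr="state", timeout=600, interval=10, sleep=time.sleep):
    """Refresh `resource` until its `attr` attribute equals `target`.

    Raises RuntimeError if a state listed in `failed` is reached first,
    and TimeoutError once `timeout` seconds of polling have been spent.
    """
    waited = 0
    while getattr(resource, attr) != target:
        current = getattr(resource, attr)
        if current in failed:
            raise RuntimeError("stopped in state %r" % current)
        if waited >= timeout:
            raise TimeoutError("still %r after %s seconds" % (current, waited))
        sleep(interval)
        waited += interval
        resource.refresh()
    return resource


class FakeJob:
    """Stand-in for a job object, used only to exercise the helper offline."""

    def __init__(self, states):
        self._states = list(states)
        self.state = self._states.pop(0)

    def refresh(self):
        # Move to the next scripted state, if any is left.
        if self._states:
            self.state = self._states.pop(0)


job = wait_for_state(FakeJob(["waiting", "launching", "running"]),
                     sleep=lambda seconds: None)
print(job.state)
```

For an object that exposes its progress under another attribute name, the `attr`, `target` and `failed` parameters can be adjusted accordingly.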
5.5 Deployment API
5.5.1 Deploy an environment
import logging
import os
import time
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
# This is equivalent to gk.sites.get("rennes")
site = gk.sites["rennes"]
job = site.jobs.create({"name": "pyg5k",
                        "command": "sleep 3600",
                        "types": ["deploy"]})
while job.state != "running":
    job.refresh()
    print("Waiting for the job [%s] to be running" % job.uid)
    time.sleep(10)
print("Assigned nodes : %s" % job.assigned_nodes)
deployment = site.deployments.create({"nodes": job.assigned_nodes,
                                      "environment": "debian9-x64-min"})
# To get SSH access to your nodes you can pass your public key
#
# from pathlib import Path
#
# key_path = Path.home().joinpath(".ssh", "id_rsa.pub")
#
# deployment = site.deployments.create({"nodes": job.assigned_nodes,
#                                       "environment": "debian9-x64-min",
#                                       "key": key_path.read_text()})
while deployment.status != "terminated":
    deployment.refresh()
    print("Waiting for the deployment [%s] to be finished" % deployment.uid)
    time.sleep(10)
print(deployment.result)
5.6 Storage API
5.6.1 Get Storage accesses
import logging
import os
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
print(gk.sites["rennes"].storage["home"].access["msimonin"].rules.list())
5.6.2 Set storage accesses (e.g for vms)
from netaddr import IPNetwork
import logging
import os
import time
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
site = gk.sites["rennes"]
job = site.jobs.create({"name": "pyg5k",
                        "command": "sleep 3600",
                        "resources": "slash_22=1+nodes=1"})
while job.state != "running":
    job.refresh()
    print("Waiting for the job [%s] to be running" % job.uid)
    time.sleep(5)
subnet = job.resources_by_type['subnets'][0]
ip_network = [str(ip) for ip in IPNetwork(subnet)]
# create access for all ips in the subnet
access = site.storage["home"].access["msimonin"].rules.create({"ipv4": ip_network,
                                                               "termination": {
                                                                   "job": job.uid,
                                                                   "site": site.uid}})
# listing the accesses
print(gk.sites["rennes"].storage["home"].access["msimonin"].rules.list())
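If you would rather not depend on netaddr, the subnet expansion can probably be done with the standard `ipaddress` module. This is a sketch under that assumption; `expand_subnet` is a hypothetical helper and the subnet below is a made-up value standing in for `job.resources_by_type['subnets'][0]`.

```python
import ipaddress


def expand_subnet(subnet):
    """Return every address of `subnet` (CIDR notation) as a list of strings."""
    # strict=False tolerates host bits being set in the given address.
    network = ipaddress.ip_network(subnet, strict=False)
    # Iterating a network yields all its addresses, including the
    # network and broadcast addresses.
    return [str(ip) for ip in network]


ips = expand_subnet("10.158.0.0/22")
print(len(ips), ips[0], ips[-1])
```

A /22 block contains 1024 addresses, so the resulting list is long; whether the API expects every address individually is taken from the example above and not verified here.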
5.7 Vlan API
5.7.1 Get vlan(s)
import logging
import os
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
site = gk.sites["rennes"]
# Get all vlans
vlans = site.vlans.list()
print(vlans)
# Get a specific one
vlan = site.vlans.get("4")
print(vlan)
vlan = site.vlans["4"]
print(vlan)
# Get vlan of some nodes
print(site.vlansnodes.submit(["paravance-1.rennes.grid5000.fr", "paravance-2.rennes.grid5000.fr"]))
# Get nodes in vlan
print(site.vlans["4"].nodes.list())
5.7.2 Set nodes in vlan
Putting primary interface in a vlan
import logging
import os
import time
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
site = gk.sites["rennes"]
job = site.jobs.create({"name": "pyg5k",
                        "command": "sleep 3600",
                        "resources": "{type='kavlan'}/vlan=1+nodes=1",
                        "types": ["deploy"]})
while job.state != "running":
    job.refresh()
    print("Waiting for the job [%s] to be running" % job.uid)
    time.sleep(5)
deployment = site.deployments.create({"nodes": job.assigned_nodes,
                                      "environment": "debian9-x64-min",
                                      "vlan": job.resources_by_type["vlans"][0]})
while deployment.status != "terminated":
    deployment.refresh()
    print("Waiting for the deployment [%s] to be finished" % deployment.uid)
    time.sleep(10)
print(deployment.result)
Putting the secondary interface in a vlan
import logging
import os
import time
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)

def _to_network_address(host, interface):
    """Translate a host to a network address
    e.g:
    paranoia-20.rennes.grid5000.fr -> paranoia-20-eth2.rennes.grid5000.fr
    """
    splitted = host.split('.')
    splitted[0] = splitted[0] + "-" + interface
    return ".".join(splitted)

conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
site = gk.sites["rennes"]
job = site.jobs.create({"name": "pyg5k",
                        "command": "sleep 3600",
                        "resources": "{type='kavlan'}/vlan=1+{cluster='paranoia'}nodes=1",
                        "types": ["deploy"]})
while job.state != "running":
    job.refresh()
    print("Waiting for the job [%s] to be running" % job.uid)
    time.sleep(5)
vlanid = job.resources_by_type["vlans"][0]
# we hard code the interface but this can be discovered in the node info
# TODO: write the code here to discover
nodes = [_to_network_address(n, "eth2") for n in job.assigned_nodes]
print(nodes)
# set in vlan
site.vlans[vlanid].nodes.submit(nodes)
5.7.3 Finding vlans of a user, users of a vlan
The Vlan API lets you check which vlans are being used by a user through the vlansusers manager. Additionally, for each individual vlan, it is possible to check whether a user is authorized or not.
import os
from grid5000 import Grid5000
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
site = gk.sites["rennes"]
# Get list of users using vlans
users = site.vlansusers.list()
# Get list of vlans a user is using.
user = site.vlansusers['msimonin']
print(user.vlans)
# Get list of users using a specific vlan
users = site.vlans['4'].users.list()
# Check if a user has access to a specific vlan.
user = site.vlans['4'].users['msimonin']
print(user.status)
5.7.4 Vlan Stitching
The stitching API allows users to connect a Grid’5000 global vlan to external vlans connected to other testbeds for experiments involving wide area layer2 networks.
import os
from grid5000 import Grid5000
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
# List current stitchings
gk.stitcher.list()
# Stitching global vlan 16 to external vlan 1290
gk.stitcher.create({"id":"16", "sdx_vlan_id":"1290"})
# Get a stitching's information by Grid'5000 global vlan id.
stitching = gk.stitcher.get('16')
# Or
stitching = gk.stitcher['16']
# End stitching
stitching = gk.stitcher.get('16')
stitching.delete()
# Or
gk.stitcher.delete('16')
5.8 Metrics API
5.8.1 Get some timeseries (and plot them)
For this example you’ll need matplotlib, seaborn and pandas.
import logging
import os
from grid5000 import Grid5000
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import time
logging.basicConfig(level=logging.INFO)
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
metrics = gk.sites["lyon"].clusters["nova"].metrics
print("--- available metrics")
print(metrics)
print("----- a time series")
now = time.time()
# NOTE that you can pass a job_id here
kwargs = {
"nodes": "nova-1,nova-2,nova-3",
"metrics": "wattmetre_power_watt",
"start_time": int(now - 600),
}
metrics = gk.sites["lyon"].metrics.list(**kwargs)
# let's visualize this
df = pd.DataFrame()
for metric in metrics:
    timestamp = metric.timestamp
    value = metric.value
    device_id = metric.device_id
    df = pd.concat([df, pd.DataFrame({
        "timestamp": [timestamp],
        "value": [value],
        "device_id": [device_id]
    })])
sns.relplot(data=df,
            x="timestamp",
            y="value",
            hue="device_id",
            kind="line")
plt.show()
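When plotting is not needed, the samples can be summarized with the standard library alone. The sketch below averages the values per device. `mean_by_device` is a hypothetical helper and the samples are made up; they only mimic the `timestamp`, `value` and `device_id` attributes read in the loop above.

```python
from collections import defaultdict
from statistics import mean
from types import SimpleNamespace


def mean_by_device(metrics):
    """Average the `value` of the samples, grouped by `device_id`."""
    grouped = defaultdict(list)
    for metric in metrics:
        grouped[metric.device_id].append(metric.value)
    return {device: mean(values) for device, values in grouped.items()}


# Made-up samples, not real API output.
samples = [
    SimpleNamespace(timestamp="t0", value=100.0, device_id="nova-1"),
    SimpleNamespace(timestamp="t1", value=110.0, device_id="nova-1"),
    SimpleNamespace(timestamp="t0", value=90.0, device_id="nova-2"),
]
print(mean_by_device(samples))
```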
5.9 More snippets
5.9.1 Site of a cluster
import logging
import os
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)
clusters = ["dahu", "parasilo", "chetemi"]
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
sites = gk.sites.list()
matches = []
for site in sites:
    candidates = site.clusters.list()
    matching = [c.uid for c in candidates if c.uid in clusters]
    if len(matching) == 1:
        matches.append((site, matching[0]))
        clusters.remove(matching[0])
print("We found the following matches %s" % matches)
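Note that the loop above only records a site when exactly one of the wanted clusters is hosted there. A variant that also handles sites hosting several wanted clusters could look like the sketch below. `sites_of_clusters` is a hypothetical helper working on plain data, and the inventory is made up for illustration.

```python
def sites_of_clusters(clusters_by_site, wanted):
    """Map each wanted cluster to the site hosting it.

    `clusters_by_site` maps a site uid to the uids of its clusters.
    Clusters that are found nowhere are left out of the result.
    """
    wanted = set(wanted)
    found = {}
    for site, clusters in clusters_by_site.items():
        for cluster in clusters:
            if cluster in wanted:
                found[cluster] = site
    return found


# Made-up inventory, not real API output.
inventory = {
    "site-a": ["alpha", "beta"],
    "site-b": ["gamma"],
}
print(sites_of_clusters(inventory, ["alpha", "beta", "gamma", "delta"]))
```

With the client, such an inventory could be built from the calls used in the snippet above, for example `{site.uid: [c.uid for c in site.clusters.list()] for site in gk.sites.list()}`.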
5.9.2 Get all jobs with a given name on all the sites
import logging
import os
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)
NAME = "pyg5k"
conf_file = os.path.join(os.environ.get("HOME"), ".python-grid5000.yaml")
gk = Grid5000.from_yaml(conf_file)
# To query every site, use: sites = gk.sites.list()
# Here we restrict the example to a few sites
sites = [gk.sites["rennes"], gk.sites["nancy"], gk.sites["grenoble"]]
# create some jobs
jobs = []
for site in sites:
    job = site.jobs.create({"name": NAME,
                            "command": "sleep 3600"})
    jobs.append(job)
_jobs = []
for site in sites:
    _jobs.append((site.uid, site.jobs.list(name=NAME,
                                           state="waiting,launching,running")))
print("We found %s" % _jobs)
# deleting the jobs
for job in jobs:
    job.delete()
5.9.3 Caching API responses
The Grid’5000 reference API is static, so requests against it can be sped up by relying heavily on caching. Currently python-grid5000 doesn’t do caching out of the box but defers that to the consuming application. There are many ways to implement a cache. Among them, the LRU cache (https://docs.python.org/3/library/functools.html#functools.lru_cache) provides an in-memory caching facility but gives you little control over the cache. The ring library (https://ring-cache.readthedocs.io/en/stable/) is great as it implements different backends for your cache (esp. cross-process caches) and gives you control over the cached objects. Enough talking:
import logging
import threading
import os
import diskcache
from grid5000 import Grid5000
import ring
_api_lock = threading.Lock()
# Keep track of the api client
_api_client = None
storage = diskcache.Cache('cachedir')
def get_api_client():
    """Gets the reference to the API client (singleton)."""
    with _api_lock:
        global _api_client
        if not _api_client:
            conf_file = os.path.join(os.environ.get("HOME"),
                                     ".python-grid5000.yaml")
            _api_client = Grid5000.from_yaml(conf_file)
        return _api_client

@ring.disk(storage)
def get_sites_obj():
    """Get all the sites."""
    gk = get_api_client()
    return gk.sites.list()

@ring.disk(storage)
def get_all_clusters_obj():
    """Get all the clusters."""
    sites = get_sites_obj()
    clusters = []
    for site in sites:
        # should we cache the list as well?
        clusters.extend(site.clusters.list())
    return clusters

if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    clusters = get_all_clusters_obj()
    print(clusters)
    print("Known key in the cache")
    print(get_all_clusters_obj.get())
    print("Calling again the function is now faster")
    clusters = get_all_clusters_obj()
    print(clusters)
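For comparison, the in-memory `functools.lru_cache` option mentioned in this section can be sketched as follows. `fetch_sites` is a placeholder standing in for an expensive call such as `gk.sites.list()`; it returns made-up data so that the example runs offline.

```python
from functools import lru_cache

calls = {"count": 0}


def fetch_sites():
    """Placeholder for an expensive API call; counts its invocations."""
    calls["count"] += 1
    return ("grenoble", "lille", "lyon")


@lru_cache(maxsize=None)
def get_sites():
    """Return the sites, calling `fetch_sites` only on a cache miss."""
    return fetch_sites()


get_sites()
get_sites()
# The second call was served from the cache.
print(calls["count"])
print(get_sites.cache_info())
# Control over the cache is limited to inspecting and emptying it.
get_sites.cache_clear()
```

The cache lives in the memory of the current process only, which is the main difference with the disk-backed approach shown above.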
5.9.4 Using Grid’5000 client certificates
python-grid5000 can also be used as a trusted client with a Grid’5000 internal certificate. In this mode, users can pass the g5k_user argument to most calls to specify which user the API call should be made as. When g5k_user is not specified, API calls are made as the anonymous user, whose access is limited to the Grid’5000 reference API. In this mode python-grid5000 does not store any login information, so g5k_user must be provided explicitly on every call that requires one.
import logging
from grid5000 import Grid5000
logging.basicConfig(level=logging.DEBUG)
gk = Grid5000(
    uri="https://api-ext.grid5000.fr/stable/",
    sslcert="/path/to/ssl/certfile.cert",
    sslkey="/path/to/ssl/keyfile.key"
)
gk.sites.list()
# Pick a site to submit to (rennes is only an example)
site = gk.sites["rennes"]
job = site.jobs.create({"name": "pyg5k",
                        "command": "sleep 3600"},
                       g5k_user="auser1")
# Since the 'anonymous' user cannot inspect jobs, the following call will raise an exception
# python-grid5000.exceptions.Grid5000AuthenticationError: 401 Unauthorized
job.refresh()
# Both of the following calls work since any user can request info on any job.
job.refresh(g5k_user='auser1')
job.refresh(g5k_user='auser2')
# Some operations can only be performed by the job's creator.
# The following call will raise an exception
# pyg5k.exceptions.Grid5000DeleteError: 403 Unauthorized
job.delete(g5k_user='auser2')
# This call works as expected
job.delete(g5k_user='auser1')