
FABRIC Ceph Manager Service

A lightweight REST service that orchestrates Ceph Dashboard APIs across one or more Ceph clusters. It standardizes common workflows:

  • CephX users: list, create/update (upsert), delete, export keyrings, and synchronize the same keyring across clusters.

  • CephFS subvolumes: create subvolume groups, create/resize subvolumes, get paths, existence checks, and delete — applied consistently across clusters.

  • Multi-cluster logic:

    • “First-success” discovery APIs (e.g., list users, export keyrings, get subvolume info).
    • “Ensure/sync” APIs (create/update on a source cluster, then fan out imports/operations to the other clusters).

Built with Flask/Connexion (OpenAPI-first), requests for Dashboard calls, and paramiko to run ceph auth import over SSH when replicating CephX secrets.


Features

  • 🔐 CephX user lifecycle via Dashboard REST:

    • GET /cluster/user — list
    • POST /cluster/user — create
    • PUT /cluster/user — update (with upsert in helper)
    • DELETE /cluster/user/{entity} — delete
    • POST /cluster/user/export — export keyrings (JSON or plaintext)
  • 📦 Keyring fan-out: export the keyring from a “source” cluster and SSH import it to all other clusters (ceph auth import) so the same secret is valid everywhere.

  • 🗂 CephFS subvolumes:

    • PUT /cephfs/subvolume/{vol_name} — create or resize subvolume
    • GET /cephfs/subvolume/{vol_name}/info — info/getpath
    • GET /cephfs/subvolume/{vol_name}/exists — existence probe
    • DELETE /cephfs/subvolume/{vol_name} — delete
  • 🧠 Cross-cluster helpers (in cluster_helper.py):

    • ensure_user_across_clusters, update_user_across_clusters, delete_user_across_clusters
    • ensure_subvolume_across_clusters, delete_subvolume_across_clusters
    • list_users_first_success, export_users_first_success
  • 🧾 Clear OpenAPI 3.0 contract (openapi.yml) and Connexion handlers.
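The POST/PUT pair above supports an upsert pattern. A hypothetical client-side sketch of the same logic follows; the endpoint paths match the list above, but the shape of the list response (an "entity" key per user) is an assumption, and the service actually performs the upsert server-side:

```python
import requests

def upsert_user(session: requests.Session, base: str, entity: str,
                capabilities: list[dict]) -> None:
    """Create a CephX user if missing, otherwise update its caps.

    Hypothetical sketch of the upsert behavior behind POST/PUT
    /cluster/user; the "entity" key in the list response is assumed.
    """
    body = {"user_entity": entity, "capabilities": capabilities}
    users = session.get(f"{base}/cluster/user").json()
    if any(u.get("entity") == entity for u in users):
        # User exists: update capabilities in place
        session.put(f"{base}/cluster/user", json=body).raise_for_status()
    else:
        # User missing: create it
        session.post(f"{base}/cluster/user", json=body).raise_for_status()
```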


Architecture

               +-------------------+            +---------------------+
  Client  ---> |  Flask + Connexion|  REST ---> |  DashClient (REST)  |
               |  (OpenAPI-first)  |            |  /api/auth + calls  |
               +-------------------+            +----------+----------+
                         |                                |
                         |                                | Keyring text
                         |                            export_keyring()
                         v                                |
               +-------------------+                      |
               |  cluster_helper   | <-- SSH (paramiko) --+
               |  (multi-cluster   |     ceph auth import
               |   orchestration)  |
               +-------------------+
  • DashClient logs into each cluster’s Dashboard (/api/auth) to obtain a JWT, then calls the documented endpoints.
  • For keyring replication, the service uses SSH to run ceph auth import on the target clusters so the secret stays identical.
  • Config-driven multi-cluster: names, endpoints, creds, and SSH parameters per cluster.
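The login step can be sketched as follows. The versioned Accept header and token field follow the Ceph Dashboard REST API, but treat the details as illustrative rather than the service's exact DashClient code:

```python
import requests

def dashboard_login(endpoint: str, user: str, password: str,
                    verify_tls: bool = True) -> requests.Session:
    """Log into a Ceph Dashboard and return a session carrying the JWT.

    Minimal sketch of the DashClient login flow; the real client may
    differ in API versioning and error handling.
    """
    session = requests.Session()
    session.verify = verify_tls
    resp = session.post(
        f"{endpoint}/api/auth",
        json={"username": user, "password": password},
        # Dashboard API responses are versioned via the Accept header
        headers={"Accept": "application/vnd.ceph.api.v1.0+json"},
    )
    resp.raise_for_status()
    # Subsequent calls reuse the JWT returned by /api/auth
    session.headers["Authorization"] = f"Bearer {resp.json()['token']}"
    return session
```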

OpenAPI

  • Spec: fabric_ceph/openapi.yml

  • Served with Connexion; controller modules:

    • fabric_ceph.openapi_server.controllers.cluster_user_controller
    • fabric_ceph.openapi_server.controllers.ceph_fs_controller
    • fabric_ceph.openapi_server.controllers.version_controller

Notable behaviors

  • X-Cluster header (optional): influences which clusters are tried and in what order, e.g. X-Cluster: europe,us-west
  • application/json everywhere.
  • Errors are normalized using status_4xx / status_5xx schemas.

Code generation

The OpenAPI spec for this service is maintained on SwaggerHub:
https://app.swaggerhub.com/apis/RENCI3/ceph/1.0.0

Use the commands below to validate the spec and generate fresh Flask server stubs.
The helper script will archive the previous scaffold to openapi_server_archive/ and place the newly generated code in openapi_server/ — you can then review and merge changes as needed.

Prereqs: openapi-generator installed (e.g., brew install openapi-generator) and Java available on your PATH.

cd fabric_ceph

# Validate the spec
openapi-generator validate -i openapi.yml

# Generate a Python Flask server scaffold (into a temp folder)
openapi-generator generate -i openapi.yml -g python-flask -o python-flask-server-generated

# Archive the old server code and install the new one
./swagger_code_gen.sh

Configuration

Example config.yml:

cluster:
  europe:
    ceph_cli: ceph
    default_fs: CEPH-FS-01
    dashboard:
      endpoints: [ "https://10.145.126.2:8443" ]
      user: admin
      password: abcd1234
      ssh_user: rocky          # used for importing keyrings
      ssh_key: ~/.ssh/id_rsa_ceph
    rgw_admin:
      endpoints: [ "http://10.145.124.2:8080" ]
      admin_access_key: admin_ak
      admin_secret_key: admin_sk
      ssh_user: rocky
      ssh_key: ~/.ssh/id_rsa_ceph

runtime:
  service_project:
logging:
  log-directory: /var/log/actor
  log-file: actor.log
  metrics-log-file: metrics.log
  log-level: INFO
  log-retain: 5
  log-size: 5000000
  logger: ceph-mgr

oauth:
  jwks-url: https://cm.fabric-testbed.net/credmgr/certs
  key-refresh: 00:10:00
  verify-exp: true

core_api:
  enable: true
  host: https://uis.fabric-testbed.net
  token:

Environment overrides (per cluster, optional):

EUROPE_SSH_HOST, EUROPE_SSH_PORT, EUROPE_SSH_USER, EUROPE_SSH_KEY, EUROPE_SSH_PASSWORD

Set APP_CONFIG_PATH to point the service at your YAML.
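A sketch of how such per-cluster overrides could be merged over the YAML values; the helper name and config shape are assumptions, while the EUROPE_SSH_* variable naming follows the list above:

```python
import os

def ssh_params(cluster: str, cfg: dict) -> dict:
    """Merge a cluster's SSH settings with per-cluster env overrides.

    Hypothetical helper: for cluster "europe", EUROPE_SSH_USER (and so on)
    overrides the corresponding value loaded from config.yml.
    """
    merged = dict(cfg)
    for key in ("host", "port", "user", "key", "password"):
        value = os.environ.get(f"{cluster.upper()}_SSH_{key.upper()}")
        if value is not None:
            merged[key] = value
    return merged
```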


Running locally

python -m venv .venv
. .venv/bin/activate
pip install --upgrade pip

# Install service
pip install -r requirements.txt
pip install -e .

# Run
export APP_CONFIG_PATH=/path/to/config.yml
python -m fabric_ceph
# or just: fabric_ceph

By default, the server binds to the host/port set by your app config or entrypoint (port 3500 is commonly exposed in Docker; adjust to taste).


Docker

Dockerfile is included. Make sure your build context includes README.md, pyproject.toml, and fabric_ceph/.

Build & run:

# From repo root
docker build -t fabric-ceph .

docker run --rm -p 3500:3500 \
  -e APP_CONFIG_PATH=/etc/fabric/ceph/config/config.yml \
  -v /local/config.yml:/etc/fabric/ceph/config/config.yml:ro \
  fabric-ceph

The container starts cron (optional) and runs the module with the system Python (3.13). Ensure your requirements include connexion, requests, and paramiko.


API usage (curl)

Headers

# (If using bearer auth via Dashboard login)
-H "Authorization: Bearer <jwt>"
# Optional: influence cluster order
-H "X-Cluster: europe,lab"

CephX users

List users:

curl -s http://localhost:3500/cluster/user

Create or update (upsert handled server-side):

curl -s -X POST http://localhost:3500/cluster/user \
  -H "Content-Type: application/json" \
  -d '{
        "user_entity": "client.demo",
        "capabilities": [
          {"entity": "mon", "cap": "allow r"},
          {"entity": "mds", "cap": "allow rw fsname=CEPH-FS-01 path=/volumes/_nogroup/demo"},
          {"entity": "osd", "cap": "allow rw tag cephfs data=CEPH-FS-01"},
          {"entity": "osd", "cap": "allow rw tag cephfs metadata=CEPH-FS-01"}
        ]
      }'

Update caps only:

curl -s -X PUT http://localhost:3500/cluster/user \
  -H "Content-Type: application/json" \
  -d '{"user_entity":"client.demo","capabilities":[{"entity":"mon","cap":"allow r"}]}'

Delete:

curl -s -X DELETE http://localhost:3500/cluster/user/client.demo

Export keyring(s):

curl -s -X POST http://localhost:3500/cluster/user/export \
  -H "Content-Type: application/json" \
  -d '{"entities":["client.demo","client.alice"]}'

Internally, cross-cluster sync helpers can export the keyring from a source and SSH-import it to others, ensuring identical secrets everywhere.

CephFS subvolumes

Create or resize subvolume:

# create with mode and quota
curl -s -X PUT http://localhost:3500/cephfs/subvolume/CEPH-FS-01 \
  -H "Content-Type: application/json" \
  -d '{"subvol_name":"alice","group_name":"fabric_staff","size":10737418240,"mode":"0777"}'

# resize quota
curl -s -X PUT http://localhost:3500/cephfs/subvolume/CEPH-FS-01 \
  -H "Content-Type: application/json" \
  -d '{"subvol_name":"alice","group_name":"fabric_staff","size":536870912}'

Get path/info:

curl -s "http://localhost:3500/cephfs/subvolume/CEPH-FS-01/info?subvol_name=alice&group_name=fabric_staff"

Exists:

curl -s "http://localhost:3500/cephfs/subvolume/CEPH-FS-01/exists?subvol_name=alice&group_name=fabric_staff"

Delete:

curl -s -X DELETE "http://localhost:3500/cephfs/subvolume/CEPH-FS-01?subvol_name=alice&group_name=fabric_staff"

Multi-cluster behavior

  • First-success queries:

    • list_users_first_success returns the first cluster that responds successfully.
    • export_users_first_success exports keyrings from the first cluster that can.
    • get_subvolume_info/subvolume_exists try clusters in X-Cluster order (or config order) and return on first success.
  • Ensure/sync mutations:

    • ensure_user_across_clusters / update_user_across_clusters: pick a source (existing user or preferred cluster), apply caps (create if missing), export keyring, SSH import everywhere else.
    • ensure_subvolume_across_clusters: ensure group, create/resize subvolume on a source, then apply the same to all clusters; returns per-cluster paths.
    • delete_*_across_clusters: best-effort deletes with per-cluster results.
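The first-success pattern behind these discovery helpers can be sketched generically (a hypothetical helper; the real ones carry request context and logging):

```python
from typing import Any, Callable, Iterable, Tuple

def first_success(clusters: Iterable[str],
                  call: Callable[[str], Any]) -> Tuple[str, Any]:
    """Try `call(cluster)` across clusters in order.

    Returns the first (cluster, result) that succeeds; if every cluster
    fails, re-raises the last error.
    """
    last_exc: Exception | None = None
    for cluster in clusters:
        try:
            return cluster, call(cluster)
        except Exception as exc:  # a real helper would narrow this
            last_exc = exc
    raise last_exc or RuntimeError("no clusters configured")
```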

Security notes

  • Dashboard credentials grant cluster-admin control; protect config.yml.
  • SSH key used for ceph auth import should be restricted and rotated.
  • Keyring fan-out intentionally makes the same secret valid across clusters — treat the exported keyring as sensitive data.
  • TLS verification: default behavior depends on your DashClient settings. Prefer HTTPS with CA validation in production.

License

MIT © 2025 FABRIC Testbed


Acknowledgements

  • Ceph Dashboard REST API (for authentication and cluster/user/cephfs management).
  • Thanks to contributors for testing multi-cluster synchronization and API shape validation.

