Comprehensive Python framework for managing FABRIC testbed generic clusters and slices
fabric-generic-cluster
A comprehensive, type-safe Python framework for managing FABRIC testbed slices with support for complex network topologies, DPU interfaces, multi-OS configurations, and various hardware components.
Features
Core Capabilities
- Type-Safe Data Models - Pydantic-based topology definitions with automatic validation
- DPU Interface Support - Full support for DPU network interfaces alongside traditional NICs
- Multi-OS Support - Automatic detection and configuration for Rocky Linux, Ubuntu, and Debian
- Hardware Components - Full support for GPUs, FPGAs, DPUs, NVMe, and custom NICs
- Network Management - L2/L3 network configuration with IPv4/IPv6 support
- SSH Automation - Passwordless SSH setup across all nodes
- Visualization - Multiple output formats (text, ASCII, graphs, tables)
- Easy Installation - Available on PyPI via pip install
- Modular Design - Separated concerns for better maintainability
Hardware Support
- GPUs - NVIDIA RTX series, Tesla T4, A30, A40
- FPGAs - Xilinx Alveo U280, U50, U250
- DPUs - ConnectX-7 100G/400G Data Processing Units with network interfaces
- NVMe - Intel P4510, P4610 NVMe storage
- NICs - Basic, ConnectX-5, ConnectX-6, SharedNICs, SmartNICs
- Persistent Storage - Volume management
Table of Contents
- Installation
- Quick Start
- Package Structure
- Usage Examples
- API Reference
- Command-Line Tools
- Development
- Documentation
- Contributing
Installation
From PyPI (Recommended)
pip install fabric-generic-cluster
From Source
git clone https://github.com/mcevik0/fabric-generic-cluster.git
cd fabric-generic-cluster
pip install -e .
Prerequisites
- Python 3.9 or higher
- Access to FABRIC testbed
- fabrictestbed-extensions>=1.4.0 (installed automatically)
Verify Installation
import fabric_generic_cluster
print(fabric_generic_cluster.__version__)
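If the import itself fails, the installed distribution and the interpreter version can be checked without importing the package. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `python_is_supported` are illustrative and are not part of fabric-generic-cluster.

```python
import sys
from importlib import metadata
from typing import Optional, Tuple


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def python_is_supported(minimum: Tuple[int, int] = (3, 9)) -> bool:
    """Compare the running interpreter against the stated Python 3.9 minimum."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("fabric-generic-cluster:", installed_version("fabric-generic-cluster"))
```

A result of `None` for the distribution suggests that pip installed into a different environment than the one running the script.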
Quick Start
Option 1: Python Script
from fabric_generic_cluster import (
    load_topology_from_yaml_file,
    deploy_topology_to_fabric,
    configure_l3_networks,
    configure_node_interfaces,
    setup_passwordless_ssh,
)

# Load topology
topology = load_topology_from_yaml_file("topology.yaml")

# Deploy to FABRIC
slice = deploy_topology_to_fabric(topology, "my-cluster")

# Configure networks (if using L3 networks)
configure_l3_networks(slice, topology)

# Configure interfaces
configure_node_interfaces(slice, topology)

# Setup SSH
setup_passwordless_ssh(slice)

print("Cluster deployed and configured!")
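Because the steps above run in a fixed order, it can help to know which one failed. The sketch below shows one possible way to sequence them; `run_steps` is a hypothetical helper (not part of the package), and the step actions are placeholders standing in for the calls shown above.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None]]


def run_steps(steps: List[Step]) -> List[str]:
    """Run steps in order; on failure, raise an error that names the step."""
    completed = []
    for name, action in steps:
        try:
            action()
        except Exception as exc:
            raise RuntimeError(f"step '{name}' failed: {exc}") from exc
        completed.append(name)
    return completed


if __name__ == "__main__":
    # Placeholder actions; in a real script these would wrap the calls
    # shown above, e.g. lambda: configure_node_interfaces(slice, topology).
    demo = [
        ("configure networks", lambda: None),
        ("configure interfaces", lambda: None),
        ("setup ssh", lambda: None),
    ]
    print(run_steps(demo))
```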
Option 2: Using the Example Script
# Clone the repository for examples
git clone https://github.com/mcevik0/fabric-generic-cluster.git
cd fabric-generic-cluster
# Run the complete deployment example
python examples/complete-deployment-example.py \
    --yaml path/to/topology.yaml \
    --slice-name my-test-slice
Option 3: Jupyter Notebooks
For interactive workflows, check out the fabric-generic-cluster-notebooks repository:
git clone https://github.com/mcevik0/fabric-generic-cluster-notebooks.git
cd fabric-generic-cluster-notebooks
jupyter notebook
Package Structure
fabric-generic-cluster/
├── fabric_generic_cluster/          # Main package
│   ├── __init__.py                  # Package exports
│   ├── models.py                    # Pydantic models for topology
│   ├── deployment.py                # Slice deployment functions
│   ├── network_config.py            # Network configuration
│   ├── ssh_setup.py                 # SSH management
│   ├── topology_viewer.py           # Visualization tools
│   ├── builder_compat.py            # Backward compatibility
│   └── tools/                       # Command-line tools
│       ├── __init__.py
│       └── topology_summary.py      # Topology summary generator
│
├── examples/                        # Usage examples
│   └── complete-deployment-example.py
│
├── tests/                           # Test suite
│   ├── test-dpu-support.py
│   └── test-fpga-support.py
│
├── pyproject.toml                   # Package metadata
├── setup.py                         # Setup configuration
├── MANIFEST.in                      # Package data
├── LICENSE                          # MIT License
└── README.md                        # This file
Usage Examples
Example 1: Load and Explore Topology
from fabric_generic_cluster import (
    load_topology_from_yaml_file,
    print_topology_summary,
    draw_topology_graph,
)
# Load topology
topology = load_topology_from_yaml_file("topology.yaml")
# Print summary
print_topology_summary(topology)
# Create visualization
draw_topology_graph(topology, show_ip=True, save_path="topology.png")
Example 2: Deploy Multi-Site Cluster
from fabric_generic_cluster import (
    load_topology_from_yaml_file,
    deploy_topology_to_fabric,
    configure_node_interfaces,
    verify_node_interfaces,
)
# Load topology with nodes at multiple sites
topology = load_topology_from_yaml_file("multi-site-topology.yaml")
# Deploy
slice = deploy_topology_to_fabric(topology, "multi-site-cluster")
# Configure all nodes
configure_node_interfaces(slice, topology)
# Verify configuration
verify_node_interfaces(slice, topology)
Example 3: Access Type-Safe Data
from fabric_generic_cluster import load_topology_from_yaml_file

topology = load_topology_from_yaml_file("topology.yaml")

# Get specific node
node = topology.get_node_by_hostname("node-1")
print(f"Node: {node.hostname}")
print(f"Site: {node.site}")
print(f"CPU: {node.capacity.cpu} cores")
print(f"RAM: {node.capacity.ram} GB")

# Check hardware components
if node.pci.dpu:
    print(f"DPUs: {len(node.pci.dpu)}")
    for dpu_name, dpu in node.pci.dpu.items():
        print(f" - {dpu_name}: {dpu.model}")
        print(f" Interfaces: {len(dpu.interfaces)}")

if node.pci.fpga:
    print(f"FPGAs: {len(node.pci.fpga)}")
    for fpga_name, fpga in node.pci.fpga.items():
        print(f" - {fpga_name}: {fpga.model}")

# Get all interfaces (NIC + DPU)
all_interfaces = node.get_all_interfaces()
print(f"\nTotal interfaces: {len(all_interfaces)}")
for device_name, iface_name, iface in all_interfaces:
    device_type = "DPU" if device_name.startswith("dpu") else "NIC"
    print(f"{device_type} {device_name}.{iface_name}: {iface.binding}")
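The loop above classifies each interface by its device-name prefix. The same idea can be used to group interfaces by device type. This is a sketch that assumes only the `(device_name, iface_name, iface)` tuple shape shown above; `group_by_device_type` is an illustrative helper, and the sample data is made up.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Tuple

InterfaceEntry = Tuple[str, str, Any]  # (device_name, iface_name, iface)


def group_by_device_type(entries: Iterable[InterfaceEntry]) -> Dict[str, List[str]]:
    """Group 'device.iface' labels under 'DPU' or 'NIC' by device-name prefix."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for device_name, iface_name, _iface in entries:
        device_type = "DPU" if device_name.startswith("dpu") else "NIC"
        groups[device_type].append(f"{device_name}.{iface_name}")
    return dict(groups)


if __name__ == "__main__":
    # Stand-in data shaped like the tuples iterated above; the third element
    # would be the interface object in real use.
    sample = [
        ("nic1", "iface1", None),
        ("dpu1", "iface1", None),
        ("dpu1", "iface2", None),
    ]
    print(group_by_device_type(sample))
```

Note that the prefix check only works if DPU devices are named with a leading "dpu" in the topology file, as in the examples here.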
Example 4: Test Network Connectivity
from fabric_generic_cluster import (
    get_slice,
    load_topology_from_yaml_file,
    ping_network_from_node,
    verify_ssh_access,
)

# Get existing slice
slice = get_slice("my-cluster")
topology = load_topology_from_yaml_file("topology.yaml")

# Test ping connectivity
ping_results = ping_network_from_node(
    slice,
    topology,
    source_hostname="node-1",
    network_name="network1",
    count=3
)
if all(ping_results.values()):
    print("All ping tests passed!")

# Test SSH access
ssh_results = verify_ssh_access(
    slice,
    topology,
    source_hostname="node-1",
    network_name="network1"
)
if all(ssh_results.values()):
    print("All SSH connections successful!")
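When a check does not pass, it is usually more useful to know which targets failed than to know that `all(...)` was false. The sketch below assumes the result objects behave like a dictionary mapping a target to a truthy or falsy value, which is what the `all(results.values())` calls above imply; the exact key type is not documented here, so the sample keys are hypothetical.

```python
from typing import Dict, List


def failed_targets(results: Dict[str, bool]) -> List[str]:
    """Return the keys whose check did not succeed, in sorted order."""
    return sorted(target for target, ok in results.items() if not ok)


if __name__ == "__main__":
    # Hypothetical result mapping; real keys depend on the library.
    sample = {"10.0.1.1": True, "10.0.1.10": False}
    bad = failed_targets(sample)
    print("all passed" if not bad else f"failed: {', '.join(bad)}")
```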
Example 5: Using Module-Style Imports
For compatibility with existing code:
from fabric_generic_cluster import deployment as sd
from fabric_generic_cluster import network_config as snc
from fabric_generic_cluster import ssh_setup as ssh
from fabric_generic_cluster import load_topology_from_yaml_file
# Load topology
topology = load_topology_from_yaml_file("topology.yaml")
# Deploy
slice = sd.deploy_topology_to_fabric(topology, "my-slice")
# Configure
snc.configure_node_interfaces(slice, topology)
ssh.setup_passwordless_ssh(slice)
API Reference
Models and Loaders
from fabric_generic_cluster import (
    SiteTopology,                  # Main topology model
    Node,                          # Node model
    Network,                       # Network model
    load_topology_from_yaml_file,  # Load from YAML file
    load_topology_from_dict,       # Load from dictionary
)
Deployment Functions
from fabric_generic_cluster import (
    deploy_topology_to_fabric,  # Deploy slice to FABRIC
    configure_l3_networks,      # Configure L3 networks
    get_slice,                  # Get existing slice
    delete_slice,               # Delete slice
    check_slices,               # List all slices
)
# Usage
slice = deploy_topology_to_fabric(topology, "slice-name")
configure_l3_networks(slice, topology)
Network Configuration
from fabric_generic_cluster import (
    configure_node_interfaces,   # Configure all interfaces
    verify_node_interfaces,      # Verify configuration
    ping_network_from_node,      # Test connectivity
    update_hosts_file_on_nodes,  # Update /etc/hosts
)
# Usage
configure_node_interfaces(slice, topology)
verify_node_interfaces(slice, topology)
SSH Setup
from fabric_generic_cluster import (
    setup_passwordless_ssh,  # Complete SSH setup
    verify_ssh_access,       # Verify SSH connectivity
)
# Usage
setup_passwordless_ssh(slice)
results = verify_ssh_access(slice, topology, "node-1", "network1")
Visualization
from fabric_generic_cluster import (
    print_topology_summary,  # Detailed summary
    print_compact_summary,   # Brief summary
    draw_topology_graph,     # Visual graph
)
# Usage
print_topology_summary(topology)
draw_topology_graph(topology, show_ip=True, save_path="topology.png")
Command-Line Tools
Topology Summary Generator
The package includes a command-line tool for generating topology summaries:
# Generate summary for a YAML file
fabric-topology-summary input.yaml --output output.yaml
# Just print summary without modifying file
fabric-topology-summary input.yaml --dry-run
# Include ASCII diagram
fabric-topology-summary input.yaml --ascii --output output.yaml
This tool is automatically installed when you install the package.
Development
Setting Up Development Environment
# Clone repository
git clone https://github.com/mcevik0/fabric-generic-cluster.git
cd fabric-generic-cluster
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install in editable mode with dev dependencies
pip install -e ".[dev]"
Running Tests
# Run test suite
pytest tests/
# Run specific test
python tests/test-dpu-support.py
python tests/test-fpga-support.py
Building the Package
# Install build tools
pip install build twine
# Build distribution
python -m build
# Check package
twine check dist/*
# Test upload to TestPyPI
twine upload --repository testpypi dist/*
# Upload to PyPI
twine upload dist/*
Code Style
# Format code
black fabric_generic_cluster/
# Check style
flake8 fabric_generic_cluster/
Documentation
Comprehensive Guides
- Getting Started: See Quick Start above
- Jupyter Notebooks: fabric-generic-cluster-notebooks
- YAML Format: Detailed topology format documentation in notebooks repository
- API Reference: See API Reference above
Example Topologies
Example YAML topology files are available in the notebooks repository:
- Basic 2-node cluster
- Multi-site deployment
- Storage cluster with NVMe
- DPU/SmartNIC configurations
- FPGA-enabled topologies
- OpenStack deployment variants
YAML Topology Format
site_topology:
  nodes:
    node-1:
      hostname: node-1
      site: SITE1
      capacity:
        cpu: 8
        ram: 32
        disk: 100
      os: default_rocky_9
      nics:
        nic1:
          interfaces:
            iface1:
              binding: network1
              ipv4_address: 10.0.1.1
              ipv4_netmask: 255.255.255.0
      pci:
        dpu:
          dpu1:
            model: NIC_ConnectX_7_100
            interfaces:
              iface1:
                binding: network1
                ipv4_address: 10.0.1.10
  networks:
    network1:
      name: network1
      type: L2Bridge
      subnet: 10.0.1.0/24
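A common mistake in a topology file is an interface address that lies outside the subnet of the network it binds to. The sketch below checks for that on a plain dictionary, using only the standard library. It assumes the nesting `site_topology -> nodes -> <node> -> nics / pci.dpu -> <device> -> interfaces`, with `networks` directly under `site_topology`; this is an assumption about the layout, and `addresses_outside_subnet` is an illustrative helper rather than a function of the package.

```python
import ipaddress
from typing import Dict, List


def addresses_outside_subnet(topology: Dict) -> List[str]:
    """List 'node/device/iface' entries whose IPv4 address is outside the bound subnet."""
    site = topology["site_topology"]
    subnets = {
        name: ipaddress.ip_network(net["subnet"])
        for name, net in site.get("networks", {}).items()
    }
    problems = []
    for node_name, node in site.get("nodes", {}).items():
        # Treat NIC and DPU devices the same way for this check.
        devices = dict(node.get("nics", {}))
        devices.update(node.get("pci", {}).get("dpu", {}))
        for dev_name, dev in devices.items():
            for if_name, iface in dev.get("interfaces", {}).items():
                addr = iface.get("ipv4_address")
                subnet = subnets.get(iface.get("binding"))
                if addr and subnet and ipaddress.ip_address(addr) not in subnet:
                    problems.append(f"{node_name}/{dev_name}/{if_name}")
    return problems


if __name__ == "__main__":
    example = {
        "site_topology": {
            "nodes": {
                "node-1": {
                    "nics": {"nic1": {"interfaces": {"iface1": {
                        "binding": "network1", "ipv4_address": "10.0.1.1"}}}},
                    "pci": {"dpu": {"dpu1": {"interfaces": {"iface1": {
                        "binding": "network1", "ipv4_address": "10.0.2.10"}}}}},
                }
            },
            "networks": {"network1": {"subnet": "10.0.1.0/24"}},
        }
    }
    print(addresses_outside_subnet(example))
```

The package's own Pydantic models may already perform this or similar validation; the sketch is only meant for a quick pre-check of a raw dictionary.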
Contributing
Contributions are welcome! Here's how you can help:
- Report bugs: Open an issue on GitHub
- Suggest features: Open an issue with your idea
- Submit PRs: Fork, make changes, and submit a pull request
Contribution Guidelines
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
Development Workflow
- Update code in fabric_generic_cluster/
- Add tests in tests/
- Update documentation
- Run tests: pytest tests/
- Build package: python -m build
- Test locally: pip install dist/*.whl
Performance
- Validation Speed: ~10ms for typical topology (3-10 nodes)
- Deployment Time: Depends on FABRIC (typically 5-10 minutes)
- Network Config: ~30 seconds per node
- SSH Setup: ~1-2 minutes for 3-node cluster
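As a rough worked example of the per-node figure above, the sketch below scales the ~30 seconds of network configuration linearly with the node count. Linear scaling (nodes configured one after another) is an assumption, not a measured property, and `network_config_seconds` is an illustrative helper.

```python
def network_config_seconds(node_count: int, seconds_per_node: float = 30.0) -> float:
    """Estimate network configuration time, assuming nodes are handled sequentially."""
    if node_count < 0:
        raise ValueError("node_count must be non-negative")
    return node_count * seconds_per_node


if __name__ == "__main__":
    for n in (3, 10):
        minutes = network_config_seconds(n) / 60
        print(f"{n} nodes: ~{minutes:.1f} min of network configuration")
```

Under that assumption, a 3-node cluster needs about 1.5 minutes and a 10-node cluster about 5 minutes of network configuration, on top of the FABRIC deployment time.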
Roadmap
- Type-safe Pydantic models (done)
- DPU interface support (done)
- Multi-distro support (Rocky/Ubuntu/Debian) (done)
- L2/L3 network configuration (done)
- Automated SSH setup (done)
- PyPI package distribution (done)
- Web-based topology editor (planned)
- Ansible playbook integration (planned)
- Monitoring and metrics collection (planned)
- REST API endpoint (planned)
Troubleshooting
Import Issues
Problem: ModuleNotFoundError: No module named 'fabric_generic_cluster'
Solution:
pip install fabric-generic-cluster
YAML File Not Found
Problem: FileNotFoundError when loading topology
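Solution: Use an absolute path, or make sure the YAML file is in the current working directory:
from pathlib import Path
from fabric_generic_cluster import load_topology_from_yaml_file

# resolve() turns a relative path into an absolute one
yaml_file = Path("path/to/topology.yaml").resolve()
topology = load_topology_from_yaml_file(str(yaml_file))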
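To get a clearer error before the loader is called, the path can be checked first. This is a minimal sketch using only `pathlib`; `resolve_topology_path` is a hypothetical helper, not part of the package.

```python
from pathlib import Path


def resolve_topology_path(path_str: str) -> Path:
    """Return an absolute path to an existing file, or raise FileNotFoundError with context."""
    path = Path(path_str).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(
            f"topology file not found: {path} (current directory: {Path.cwd()})"
        )
    return path


# Intended use with the loader shown above:
#   topology = load_topology_from_yaml_file(str(resolve_topology_path("topology.yaml")))
```

Including the current working directory in the message makes it easier to spot a script that was started from an unexpected location.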
DPU Interfaces Not Detected
Problem: DPU interfaces not showing up
Solution: Verify DPU configuration in YAML:
node = topology.get_node_by_hostname("node-1")
print(f"DPUs: {node.pci.dpu}")
# Check all interfaces
all_ifaces = node.get_all_interfaces()
print(f"Total interfaces: {len(all_ifaces)}")
Network Configuration Fails
Problem: Interface configuration errors
Solution:
- Check L3 networks are configured first: configure_l3_networks(slice, topology)
- Ensure nodes are active: slice.wait()
- Verify OS detection: Check logs for supported distro
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Built for the FABRIC Testbed
- Uses Pydantic for data validation
- Network visualization with NetworkX and Matplotlib
Support
- Issues: GitHub Issues
- Documentation: fabric-generic-cluster-notebooks
- FABRIC Slack: FABRIC Workspace
- FABRIC Help: FABRIC Learn
Related Repositories
- Jupyter Notebooks: fabric-generic-cluster-notebooks - Example notebooks and topology files
- PyPI Package: fabric-generic-cluster - Install via pip
Links
- GitHub: https://github.com/mcevik0/fabric-generic-cluster
- PyPI: https://pypi.org/project/fabric-generic-cluster/
- Documentation: https://github.com/mcevik0/fabric-generic-cluster-notebooks
- FABRIC Testbed: https://fabric-testbed.net/
Made with ❤️ for the FABRIC Community
Author: Mert Cevik (@mcevik0)
File details
Details for the file fabric_generic_cluster-1.0.15.tar.gz.
File metadata
- Download URL: fabric_generic_cluster-1.0.15.tar.gz
- Upload date:
- Size: 62.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 14170035b6398e622cbb457bf69a3639b1cdb0d5a3175c63f0c88f6ce5a8e95b |
| MD5 | 21b843fd0569e509bc87ff516a8409d8 |
| BLAKE2b-256 | 19d9cb366e143fb8710f1556bfe114ca7f078b49f0e659d762081b1f6fe8285c |
File details
Details for the file fabric_generic_cluster-1.0.15-py3-none-any.whl.
File metadata
- Download URL: fabric_generic_cluster-1.0.15-py3-none-any.whl
- Upload date:
- Size: 60.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1cf7c28ce8e60fd6637ad2dbb063a30ce0d1ce9af8f8fe4e12a79274d64160ff |
| MD5 | 8cd21fb1fcc6d0b04203384676608a04 |
| BLAKE2b-256 | 43c313aaae4b2266b1e030538a4c156af610352b830677aca57707338620584d |