
Swarmchestrate - Cluster Builder

This repository contains the codebase for cluster-builder, which builds K3s clusters for Swarmchestrate using OpenTofu.

Key features:

  • Create: Provisions infrastructure using OpenTofu and installs K3s.
  • Add: Add worker or HA nodes to existing clusters.
  • Remove: Selectively remove nodes from existing clusters.
  • Delete: Destroys the provisioned infrastructure when no longer required.

Prerequisites

Before proceeding, ensure the following prerequisites are installed:

  1. Git: For cloning the repository.
  2. Python: Version 3.9 or higher.
  3. pip: Python package manager.
  4. Make: To run the provided Makefile.
  5. PostgreSQL: OpenTofu stores its state in a Postgres database.
  6. (Optional) Docker: Only needed if you want to run the dev Postgres.
For detailed instructions on edge device requirements, refer to the Edge Device Requirements document.

Getting Started

1. Clone the Repository

To get started, clone this repository:

git clone https://github.com/Swarmchestrate/cluster-builder.git

2. Navigate to the Project Directory

cd cluster-builder

3. Install Dependencies and Tools

Run the Makefile to install all necessary dependencies, including OpenTofu:

 make install

This command will:

  • Install Python dependencies listed in requirements.txt.
  • Download and configure OpenTofu for infrastructure management.
 make db

This command will:

  • Spin up an empty dev Postgres DB (in Docker) for storing state

The Makefile provides default database details; update them or use them as-is: container name pg-db, with -e POSTGRES_USER=admin -e POSTGRES_PASSWORD=adminpass -e POSTGRES_DB=swarmchestrate.

For database setup as a service, refer to the database setup as a service document.
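To illustrate how these database details are typically consumed, the sketch below assembles a libpq-style connection string from environment variables. The variable names and defaults here mirror the dev database above but are assumptions; check .env_example for the keys the application actually reads.

```python
import os

# Assumed variable names matching the dev database defaults above;
# check .env_example for the actual keys.
user = os.environ.get("POSTGRES_USER", "admin")
password = os.environ.get("POSTGRES_PASSWORD", "adminpass")
db = os.environ.get("POSTGRES_DB", "swarmchestrate")
host = os.environ.get("POSTGRES_HOST", "localhost")

# OpenTofu's pg state backend accepts a libpq-style connection string.
conn_str = f"postgres://{user}:{password}@{host}/{db}"
print(conn_str)
```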

4. Populate .env file with access config

The .env file is used to store environment variables required by the application. It contains configuration details for connecting to your cloud providers, the PostgreSQL database, and any other necessary resources.

4.1. Rename or copy the example file to .env

cp .env_example .env

4.2. Open the .env file and add the necessary configuration:

You can see all the available variables in .env_example. Key sections include:

  • PostgreSQL connection
  • AWS credentials
  • OpenStack credentials
  • Edge device settings (if applicable)
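The authoritative list of keys lives in .env_example; as a hedged illustration only (the variable names below are placeholders, not the real schema), the file might look like:

```
# PostgreSQL connection (dev defaults from `make db`)
POSTGRES_USER=admin
POSTGRES_PASSWORD=adminpass
POSTGRES_DB=swarmchestrate

# AWS credentials (illustrative names)
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
```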

Basic Usage

Initialisation

from cluster_builder import Swarmchestrate

# Initialise the orchestrator
orchestrator = Swarmchestrate(
    template_dir="/path/to/templates",
    output_dir="/path/to/output"
)

Adding Nodes (Create a New Cluster or Add to Existing Cluster)

The same add_node method is used both for creating a new cluster (with the master node) and for adding worker or high-availability nodes.

1. Prepare the configuration for your node (AWS, OpenStack, or edge):

You may define it directly as a Python dictionary or load it from a separate file. Refer to the configuration documentation for details.

2. Load the configuration in Python:

config = {
    # your config fields here
}

# Add the node to the cluster (master for new cluster, worker/HA for existing)
cluster_name = orchestrator.add_node(config)
print(f"Created cluster: {cluster_name}")

Notes:

  • For a new cluster, the first node should be the master. The returned cluster outputs (IP, token) should be used when adding subsequent worker or HA nodes.

  • The configuration file defines all required parameters for the node, including cloud provider, K3s role, SSH info, and optional network/security settings.
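The notes above can be sketched as a master-then-worker flow. The field names below ("cloud", "k3s_role", "master_ip", "k3s_token") are illustrative assumptions; the configuration documentation defines the real schema.

```python
# Illustrative configs only: field names such as "cloud" and "k3s_role"
# are assumptions; consult the configuration documentation for the
# actual schema.
master_config = {
    "cloud": "aws",
    "k3s_role": "master",
}

worker_config = {
    "cloud": "openstack",
    "k3s_role": "worker",
    # Outputs from the master node (IP, token) feed subsequent nodes.
    "master_ip": "203.0.113.10",
    "k3s_token": "<token from master outputs>",
}

# cluster_name = orchestrator.add_node(master_config)
# orchestrator.add_node({**worker_config, "cluster_name": cluster_name})
```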

Removing a Specific Node

To remove a specific node from a cluster:

# Remove a node by its resource name
orchestrator.remove_node(
    cluster_name="your-cluster-name",
    resource_name="eloquent_feynman"  # The resource identifier of the node
)

The remove_node method:

  1. Destroys the node's infrastructure resources
  2. Removes the node's configuration from the cluster

Destroying an Entire Cluster

To completely destroy a cluster and all its nodes:

# Destroy the entire cluster
orchestrator.destroy(
    cluster_name="your-cluster-name"
)

The destroy method:

  1. Destroys all infrastructure resources associated with the cluster
  2. Removes the cluster directory and configuration files

Note for Edge Devices: Since the edge device is already provisioned, the destroy method will not remove K3s directly from the edge device. You will need to manually uninstall K3s from your edge device after the cluster is destroyed.

Deploying Manifests

The deploy_manifests method copies Kubernetes manifests to the target cluster node.

orchestrator.deploy_manifests(
    manifest_folder="path/to/manifests",
    master_ip="MASTER_NODE_IP",
    ssh_key_path="path/to/key.pem",
    ssh_user="USERNAME"
)

DEMO

A set of demo scripts is provided to showcase how to deploy a full multi-cloud K3s cluster (AWS master, OpenStack worker, Edge worker), manage manifests, configure registries, remove nodes, and destroy the cluster.

These scripts walk through the entire lifecycle of a cluster and are the recommended starting point for understanding how the system works end-to-end.

For detailed information on how each demo script works, refer to the demo scripts documentation.


Important Configuration Requirements

High Availability Flag (ha):

Set "ha": true when adding an additional server node to an existing master. Do not set it to true for standalone masters or worker nodes.
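A minimal sketch of such a config follows; only the "ha" key is taken from the description above, and the other fields are illustrative assumptions.

```python
# Hypothetical config for joining an additional server node to an
# existing master; field names other than "ha" are illustrative.
ha_server_config = {
    "cluster_name": "production-cluster",
    "k3s_role": "master",   # an additional server node
    "ha": True,             # join the existing master in HA mode
}
```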

Ports:

You can define additional ports via:

"custom_ingress_ports": [...],
"custom_egress_ports": [...]

When a new security group is created, all required K3s and system ports are automatically added, even if you don’t specify them. Your custom rules are added on top of these defaults. Only define ports when you need extra application access.
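For example, a config granting extra application access might include the following. The two keys come from the documentation above; the value shape (plain port numbers) is an assumption, so check the configuration documentation before relying on it.

```python
# The keys are documented above; the value shape (a list of plain
# port numbers) is an assumption.
config_extra_ports = {
    "custom_ingress_ports": [8080, 9090],  # e.g. application dashboards
    "custom_egress_ports": [5432],         # e.g. an external database
}
```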

OpenStack Floating IP:

When provisioning on SZTAKI OpenStack, provide a value for 'floating_ip_pool' from which floating IPs can be allocated for the instance. If it is not specified, OpenTofu will not assign a floating IP.


Advanced Usage

Dry Run Mode

All operations support a dryrun parameter, which validates the configuration without making changes. A node created with dryrun must also be removed with dryrun.

# Validate configuration without deploying
orchestrator.add_node(config, dryrun=True)

# Validate removal without destroying
orchestrator.remove_node(cluster_name, resource_name, dryrun=True)

# Validate destruction without destroying
orchestrator.destroy(cluster_name, dryrun=True)

Custom Cluster Names

By default, cluster names are generated automatically. To specify a custom name:

config = {
    "cluster_name": "production-cluster",
    # ... other configuration ...
}

Template Structure

Templates should be organised as follows:

  • templates/ - Base directory for templates
  • templates/{cloud}/ - Terraform modules for each cloud provider
  • templates/{role}_user_data.sh.tpl - Node initialisation scripts
  • templates/{cloud}_provider.tf - Provider configuration templates

Contact

For any questions or feedback, feel free to reach out:

