
Swarmchestrate - Cluster Builder

This repository contains the codebase for cluster-builder, which builds K3s clusters for Swarmchestrate using OpenTofu.

Key features:

  • Create: Provisions infrastructure using OpenTofu and installs K3s.
  • Add: Add worker or HA nodes to existing clusters.
  • Remove: Selectively remove nodes from existing clusters.
  • Delete: Destroys the provisioned infrastructure when no longer required.

Prerequisites

Before proceeding, ensure the following prerequisites are installed:

  1. Git: For cloning the repository.
  2. Python: Version 3.9 or higher.
  3. pip: Python package manager.
  4. OpenTofu: Version 1.6 or higher for infrastructure provisioning.
  5. Make: To run the provided Makefile.
  6. PostgreSQL: For storing OpenTofu state.
  8. (Optional) Docker: To run a local development PostgreSQL instance.

Getting Started

1. Clone the Repository

To get started, clone this repository:

git clone https://github.com/Swarmchestrate/cluster-builder.git

2. Navigate to the Project Directory

cd cluster-builder

3. Install Dependencies and Tools

Run the Makefile to install all necessary dependencies, including OpenTofu:

 make install

This command will:

  • Install Python dependencies listed in requirements.txt.
  • Download and configure OpenTofu for infrastructure management.

Optional

 make db

This command will:

  • Spin up an empty development PostgreSQL database (in Docker) for storing OpenTofu state.

4. Populate the .env File with Access Configuration

First, copy the example file to .env:

cp .env_example .env

Then populate the PostgreSQL connection details and the required cloud credentials:

## PG Configuration
POSTGRES_USER=postgres
POSTGRES_PASSWORD=secret
POSTGRES_HOST=db.example.com
POSTGRES_DATABASE=terraform_state
POSTGRES_SSLMODE=prefer

## AWS Auth
AWS_REGION=us-west-2
AWS_ACCESS_KEY=AKIAXXXXXXXXXXXXXXXX
AWS_SECRET_KEY=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
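If you need to read these values from Python outside the library, a minimal stdlib-only loader is sketched below (the python-dotenv package is a common alternative; load_env is an illustrative helper, not part of cluster-builder):

```python
import os

def load_env(path=".env"):
    """Load KEY=VALUE pairs from a .env-style file into os.environ."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Skip blanks and comment lines such as "## PG Configuration"
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Don't overwrite values already set in the real environment
            os.environ.setdefault(key.strip(), value.strip())
```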

Basic Usage

Initialisation

from cluster_builder import Swarmchestrate

# Initialise the orchestrator
orchestrator = Swarmchestrate(
    template_dir="/path/to/templates",
    output_dir="/path/to/output"
)

Creating a New Cluster

To create a new K3s cluster, use the add_node method with the master role:

# Configuration for a new cluster
config = {
    "cloud": "aws",
    "k3s_role": "master",
    "ami": "ami-0123456789abcdef",
    "instance_type": "t3.medium",
    "ssh_key_name": "your-ssh-key",
    "k3s_token": "your-k3s-token"
}

# Create the cluster (returns the cluster name)
cluster_name = orchestrator.add_node(config)
print(f"Created cluster: {cluster_name}")

Adding Nodes to an Existing Cluster

To add worker or high-availability nodes to an existing cluster:

# Configuration for adding a worker node
worker_config = {
    "cloud": "aws",
    "k3s_role": "worker",  # can be "worker" or "ha"
    "master_ip": "1.2.3.4",  # IP of the master node
    "cluster_name": "existing-cluster-name",  # specify an existing cluster
    "ami": "ami-0123456789abcdef",
    "instance_type": "t2.medium",
    "ssh_key_name": "your-ssh-key",
    "k3s_token": "k3s-cluster-token" # Token of existing cluster
}

# Add the worker node
cluster_name = orchestrator.add_node(worker_config)
print(f"Added worker node to cluster: {cluster_name}")

Important requirements:

  • For k3s_role="worker" or k3s_role="ha", you must specify a master_ip
  • For k3s_role="master", you must not specify a master_ip
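These rules can be sketched as a small pre-flight check, together with an HA node configuration mirroring the worker example above (validate_node_config is an illustrative helper, not part of the cluster-builder API):

```python
def validate_node_config(config):
    """Check the master_ip rules described above before calling add_node."""
    role = config.get("k3s_role")
    has_master_ip = "master_ip" in config
    if role in ("worker", "ha") and not has_master_ip:
        raise ValueError(f"k3s_role={role!r} requires a master_ip")
    if role == "master" and has_master_ip:
        raise ValueError("k3s_role='master' must not specify a master_ip")

# An HA node config has the same shape as the worker example
ha_config = {
    "cloud": "aws",
    "k3s_role": "ha",
    "master_ip": "1.2.3.4",
    "cluster_name": "existing-cluster-name",
    "ami": "ami-0123456789abcdef",
    "instance_type": "t3.medium",
    "ssh_key_name": "your-ssh-key",
    "k3s_token": "k3s-cluster-token",
}
validate_node_config(ha_config)  # raises nothing: master_ip is present
```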

Removing a Specific Node

To remove a specific node from a cluster:

# Remove a node by its resource name
orchestrator.remove_node(
    cluster_name="your-cluster-name",
    resource_name="aws_eloquent_feynman"  # The resource identifier of the node
)

The remove_node method:

  1. Destroys the node's infrastructure resources
  2. Removes the node's configuration from the cluster

Destroying an Entire Cluster

To completely destroy a cluster and all its nodes:

# Destroy the entire cluster
orchestrator.destroy(
    cluster_name="your-cluster-name"
)

The destroy method:

  1. Destroys all infrastructure resources associated with the cluster
  2. Removes the cluster directory and configuration files

Advanced Usage

Dry Run Mode

All operations support a dryrun parameter, which validates the configuration without making any changes. A node created with dryrun=True should likewise be removed with dryrun=True.

# Validate configuration without deploying
orchestrator.add_node(config, dryrun=True)

# Validate removal without destroying
orchestrator.remove_node(cluster_name, resource_name, dryrun=True)

# Validate destruction without destroying
orchestrator.destroy(cluster_name, dryrun=True)

Custom Cluster Names

By default, cluster names are generated automatically. To specify a custom name:

config = {
    "cloud": "aws",
    "k3s_role": "master",
    "cluster_name": "production-cluster",
    # ... other configuration ...
}

orchestrator.add_node(config)

Template Structure

Templates should be organised as follows:

  • templates/ - Base directory for templates
  • templates/{cloud}/ - Terraform modules for each cloud provider
  • templates/{role}_user_data.sh.tpl - Node initialisation scripts
  • templates/{cloud}_provider.tf.j2 - Provider configuration templates
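Assuming that layout, the files consulted for a given cloud and role would resolve as follows (template_paths is a hypothetical helper shown for illustration only):

```python
from pathlib import Path

def template_paths(template_dir, cloud, role):
    """Map a (cloud, role) pair to the template files described above."""
    base = Path(template_dir)
    return {
        "modules": base / cloud,                         # templates/{cloud}/
        "user_data": base / f"{role}_user_data.sh.tpl",  # node init script
        "provider": base / f"{cloud}_provider.tf.j2",    # provider config template
    }

paths = template_paths("templates", "aws", "master")
```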

Edge Device Requirements

To connect edge devices to your K3s cluster, ensure that the following ports are open on each edge device to enable communication between nodes:

Inbound Rules:

Port Range   Protocol   Purpose
2379-2380    TCP        Embedded etcd server-to-server communication
6443         TCP        K3s API server communication
8472         UDP        Flannel VXLAN (network overlay)
10250        TCP        Kubelet metrics and communication
51820        UDP        WireGuard IPv4 (encrypted networking)
51821        UDP        WireGuard IPv6 (encrypted networking)
5001         TCP        Embedded registry (Spegel)
22           TCP        SSH access for provisioning and management
80           TCP        HTTP communication for web access
443          TCP        HTTPS communication for secure access
53           UDP        DNS (CoreDNS) for internal service discovery
5432         TCP        PostgreSQL database access

Outbound Rule:

Port Range   Protocol   Purpose
all          all        Allow all outbound traffic for the system's operations
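If you manage edge-device firewalls with a tool such as ufw, the inbound rules above can be generated mechanically (the choice of ufw is an assumption; adapt the output to your firewall):

```python
# (port-or-range, protocol) pairs taken from the inbound table above;
# ufw uses a colon for port ranges, e.g. 2379:2380
INBOUND_PORTS = [
    ("2379:2380", "tcp"), ("6443", "tcp"), ("8472", "udp"), ("10250", "tcp"),
    ("51820", "udp"), ("51821", "udp"), ("5001", "tcp"), ("22", "tcp"),
    ("80", "tcp"), ("443", "tcp"), ("53", "udp"), ("5432", "tcp"),
]

def ufw_allow_commands(ports):
    """Render one 'ufw allow' command per inbound rule."""
    return [f"ufw allow {port}/{proto}" for port, proto in ports]

for cmd in ufw_allow_commands(INBOUND_PORTS):
    print(cmd)
```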
