DWE CLI - Data Warehouse Ecosystem Orchestrator

dwe-core

The DWE CLI (dwe) is the orchestration brain of the Data Warehouse Ecosystem. It takes a blank or existing client Git repository and injects a fully working Adapter — infrastructure, application config, CI/CD pipelines, and local dev commands — in a single command.

How it works

dwe create-service test_adapter --git-repo https://github.com/client/repo --envs dev --envs prod

Internally this does:

1. Clone        GitPython clones the client repo to a temp directory
2. Hydrate      Copier renders the adapter template into the clone
3. State        CLI writes dwe-state.json
4. CI/CD        CLI renders per-environment GitHub Actions / GitLab CI files
5. Branch       initial-commit branch is created and committed
6. Env branches dev, prod branches are created from initial-commit
7. Push         All branches are pushed to the remote
8. Secrets      GitHub/GitLab API uploads secrets to the repository settings
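
Steps 4 to 6 amount to deriving branch names and workflow file paths from the --envs list. The following is a minimal sketch of that derivation, not the CLI's actual code: the names RepoPlan and plan_repo_layout are hypothetical, and the workflow path pattern is taken from the deploy-<env>.yaml files shown later in this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepoPlan:
    """Branches and CI files a create-service run would produce (illustrative)."""
    base_branch: str
    env_branches: list[str]
    workflow_files: list[str]


def plan_repo_layout(envs: list[str]) -> RepoPlan:
    # Hypothetical helper: mirrors steps 4-6 on paper, it does not touch Git.
    if not envs:
        raise ValueError("at least one environment is required")
    unique_envs = list(dict.fromkeys(envs))  # drop duplicates, keep order
    return RepoPlan(
        base_branch="initial-commit",
        env_branches=unique_envs,
        workflow_files=[f".github/workflows/deploy-{e}.yaml" for e in unique_envs],
    )


plan = plan_repo_layout(["dev", "prod"])
print(plan.env_branches)
print(plan.workflow_files)
```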

The result is a client repo that already has working infrastructure code, a justfile with just up / just deploy-prod, and CI/CD that deploys to the right environment when you push to its branch.

Installation

pip install poetry        # if not already installed
poetry install            # from dwe-core source (creates venv, installs deps)
# or once published:
pip install dwe-core

Verify:

dwe --help
dwe list-adapters

Commands

`dwe create-service`

dwe create-service <adapter_name> --git-repo <url> [options]

Options:
  --envs <name>            repeatable; default: development, main
  --secrets <json>         e.g. '{"AWS_KEY":"abc"}'
  --tag <version>          adapter git tag, e.g. v1.2.0
  --token <api-token>      or set GITHUB_TOKEN / GITLAB_TOKEN
  --aws-region <region>
  --instance-type <type>
  --clone-dir <path>       default: temp dir
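
Because --secrets takes a JSON object on the command line, quoting mistakes are easy to make. The sketch below shows the kind of check that could run before anything is uploaded. It assumes secrets must form a flat object of string values; parse_secrets is a hypothetical helper, and how dwe itself validates the option is not documented here.

```python
import json


def parse_secrets(raw: str) -> dict[str, str]:
    # Hypothetical validation: a flat JSON object whose values are all strings.
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"--secrets is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("--secrets must be a JSON object")
    for name, value in data.items():
        if not isinstance(value, str):
            raise ValueError(f"secret {name!r} must be a string")
    return data


# Print only the names, never the values.
print(sorted(parse_secrets('{"AWS_KEY": "abc"}')))
```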

Example — full run:

export GITHUB_TOKEN=ghp_xxxx

dwe create-service test_adapter \
  --git-repo https://github.com/acme/data-platform \
  --envs development \
  --envs staging \
  --envs main \
  --secrets '{"PULUMI_ACCESS_TOKEN":"pul-xxx","AWS_ACCESS_KEY_ID":"AKI...","AWS_SECRET_ACCESS_KEY":"..."}' \
  --tag v1.0.0 \
  --aws-region eu-west-1 \
  --instance-type t3.small

After this runs, the data-platform repo has:

.github/workflows/
  deploy-development.yaml
  deploy-staging.yaml
  deploy-main.yaml
blueprint/
  html/index.html
  instance-setup.sh
docker-compose.yml
docker-compose.prod.yml
.env.example
justfile
infrastructure/
  __main__.py          <- project_name, instance_type already substituted
  Pulumi.yaml
  requirements.txt
dwe-state.json
.copier-answers.yml    <- Copier's internal state (enables future updates)

`dwe update-service`

dwe update-service <adapter_name> <local_path> [--tag <version>]

Example:

dwe update-service test_adapter ./data-platform --tag v1.2.0

Internally:

1. Reads dwe-state.json and validates that the adapter name matches
2. Creates an update branch, e.g. dwe-update-20260322-1.2.0
3. Runs copier.run_update(), a smart merge that preserves your customisations
4. Updates dwe-state.json with the new version

Review the diff on the branch, then merge into your environment branches to trigger deployments.
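
The first two internal steps can be sketched as follows. This is an illustration under assumptions, not the CLI's implementation: check_adapter and update_branch_name are hypothetical names, and the branch format (date plus the tag without its leading "v") is inferred from the single example dwe-update-20260322-1.2.0.

```python
import json
import tempfile
from datetime import date
from pathlib import Path


def check_adapter(state_path: Path, adapter_name: str) -> dict:
    """Load dwe-state.json and confirm it was created from adapter_name."""
    state = json.loads(state_path.read_text(encoding="utf-8"))
    recorded = state["adapter"]["name"]
    if recorded != adapter_name:
        raise ValueError(f"repo uses adapter {recorded!r}, not {adapter_name!r}")
    return state


def update_branch_name(tag: str, today: date) -> str:
    # Assumed format, inferred from the example branch name above.
    return f"dwe-update-{today:%Y%m%d}-{tag.removeprefix('v')}"


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "dwe-state.json"
    path.write_text(
        json.dumps({"adapter": {"name": "test_adapter", "version": "v1.0.0"}}),
        encoding="utf-8",
    )
    state = check_adapter(path, "test_adapter")

print(update_branch_name("v1.2.0", date(2026, 3, 22)))  # dwe-update-20260322-1.2.0
```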

`dwe list-adapters`

dwe list-adapters

Shows all adapters registered in adapters.json.

Adapter Registry (`adapters.json`)

{
  "test_adapter": {
    "path": "/absolute/path/to/dwe_test_adapter",
    "type": "local",
    "description": "Test adapter: AWS EC2 instance via Pulumi"
  },
  "superset_adapter": {
    "url": "https://github.com/hipposys/dwe-superset-adapter",
    "type": "git",
    "description": "Apache Superset on ECS"
  }
}
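
Each entry carries either a path (type "local") or a url (type "git"). A minimal sketch of how a lookup could resolve the template location from such a registry follows; resolve_source is a hypothetical helper, and the error handling is an assumption rather than the CLI's documented behaviour.

```python
import json

REGISTRY_JSON = """
{
  "test_adapter": {"path": "/absolute/path/to/dwe_test_adapter", "type": "local"},
  "superset_adapter": {"url": "https://github.com/hipposys/dwe-superset-adapter", "type": "git"}
}
"""


def resolve_source(registry: dict, name: str) -> str:
    """Return the template location for an adapter: a local path or a Git URL."""
    try:
        entry = registry[name]
    except KeyError:
        raise KeyError(f"unknown adapter {name!r}; known: {sorted(registry)}") from None
    # The "type" field decides which key holds the location.
    key = {"local": "path", "git": "url"}.get(entry.get("type"))
    if key is None or key not in entry:
        raise ValueError(f"adapter {name!r} has an invalid registry entry")
    return entry[key]


registry = json.loads(REGISTRY_JSON)
print(resolve_source(registry, "superset_adapter"))
```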

How to Define a New Adapter

An adapter is a real, runnable project that also serves as a Copier template. The guiding principle:

The adapter must work locally as-is. A developer should be able to git clone the adapter, run just up, and have a working service — without running the DWE CLI at all.

Step 1: Create the adapter repository

mkdir my_adapter && cd my_adapter
git init

Step 2: Build a working application first

Build your service as a real project before adding any template variables. For example, if you're building a Superset adapter:

# Make it work locally first
docker compose up    # verify it runs

Only once everything works locally do you introduce {{ variables }}.

Step 3: Directory structure

my_adapter/
├── copier.yml                  # Copier config + question definitions
│
├── docker-compose.yml          # Real, runnable. Uses ${ENV_VAR:-default} for runtime values.
├── docker-compose.prod.yml     # Production overrides (restart policy, logging)
├── .env.example                # Template for secrets — committed; .env is git-ignored
├── .gitignore
│
├── justfile                    # Dev commands (just up, just deploy-prod, just infra-up)
│
├── blueprint/                  # Application-level config files
│   ├── html/                   # or nginx.conf, superset_config.py, etc.
│   └── instance-setup.sh       # EC2 user-data bootstrap script
│
├── infrastructure/             # Pulumi IaC — only files here use .jinja
│   ├── __main__.py.jinja       # <- .jinja because it embeds {{ project_name }}
│   ├── Pulumi.yaml.jinja       # <- .jinja because it embeds {{ project_name }}
│   └── requirements.txt
│
└── ci-templates/               # Jinja2 templates rendered by the CLI (not Copier)
    └── deploy.yaml             # Uses {@ ENV_NAME @}, {@ AWS_REGION @}

Step 4: Write `copier.yml`

copier.yml controls how Copier processes the adapter. Key settings:

_templates_suffix: .jinja    # ONLY files ending in .jinja are treated as templates
                              # Everything else is copied verbatim

_exclude:
  - copier.yml               # Don't copy Copier's own config
  - ci-templates             # CLI handles this separately
  - README.md                # Adapter's README is not for client repos
  - .git
  - .env                     # Never copy actual secrets
  - __pycache__
  - "*.pyc"

_skip_if_exists:
  - .env.example             # Preserve user customisations on updates

# Questions (answered non-interactively by the dwe CLI):
project_name:
  type: str
  help: "Client project name (used for cloud resource naming)"

adapter_name:
  type: str
  default: "my_adapter"
  when: false    # always set programmatically

adapter_version:
  type: str
  default: "v1.0.0"
  when: false    # always set programmatically

environments:
  type: yaml
  default: "[development, main]"

aws_region:
  type: str
  default: "us-east-1"

Step 5: Decide what needs Jinja2

Apply this rule: if the value changes per client, use {{ variable }}. If it changes per deployment environment, use a .env variable.

File	Approach	Reason
`docker-compose.yml`	`.env` interpolation (`${VAR:-default}`)	Works locally without any substitution; runtime config
`infrastructure/__main__.py`	Jinja2 (`.jinja` extension)	Cloud resource names must be unique per client at provision time
`infrastructure/Pulumi.yaml`	Jinja2 (`.jinja` extension)	Stack name must be unique per client
`justfile`	Verbatim copy (no `.jinja`)	Commands are identical across clients
`blueprint/instance-setup.sh`	Verbatim copy	Generic bootstrap, no client-specific values
`.env.example`	Verbatim copy	Users fill in real values after cloning

Jinja2 syntax in .jinja files:

# infrastructure/__main__.py.jinja
instance = aws.ec2.Instance(
    "{{ project_name }}-instance",          # <- substituted by Copier
    instance_type="{{ instance_type }}",
    ...
)

After dwe create-service this becomes:

instance = aws.ec2.Instance(
    "acme-data-platform-instance",
    instance_type="t3.small",
    ...
)

Step 6: Write `ci-templates/deploy.yaml`

This is a Jinja2 file rendered by the dwe CLI (not by Copier) to generate one workflow file per environment. The CLI uses {@ @} as variable delimiters (not {{ }}), so GitHub Actions ${{ secrets.X }} syntax passes through untouched — no escaping needed.

name: Deploy to {@ ENV_NAME @}

on:
  push:
    branches:
      - {@ ENV_NAME @}
  pull_request:
    branches:
      - {@ ENV_NAME @}

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: {@ ENV_NAME @}
    steps:
      - uses: actions/checkout@v4
      - name: Deploy
        run: just deploy-prod
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}    # passes through unchanged
          AWS_REGION: {@ AWS_REGION @}                           # substituted by dwe CLI

Available variables: {@ ENV_NAME @}, {@ AWS_REGION @}.
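
To see why the alternate delimiters avoid escaping, here is a small standard-library stand-in. It uses a regular expression instead of Jinja2, so it only imitates the substitution behaviour described above; render_ci_template is a hypothetical name and the CLI's real rendering code may differ.

```python
import re

# Matches {@ NAME @} only; GitHub's ${{ ... }} expressions never match this pattern.
PLACEHOLDER = re.compile(r"\{@\s*(\w+)\s*@\}")


def render_ci_template(text: str, variables: dict[str, str]) -> str:
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"undefined template variable: {name}")
        return variables[name]

    return PLACEHOLDER.sub(substitute, text)


template = (
    "name: Deploy to {@ ENV_NAME @}\n"
    "env:\n"
    "  AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}\n"
    "  AWS_REGION: {@ AWS_REGION @}\n"
)
rendered = render_ci_template(template, {"ENV_NAME": "staging", "AWS_REGION": "eu-west-1"})
print(rendered)
```

With Jinja2 itself, the same effect comes from the variable_start_string and variable_end_string arguments of jinja2.Environment; whether the CLI configures it exactly that way is an assumption.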

Step 7: Register the adapter

Add an entry to dwe-core/adapters.json:

Local (development):

{
  "my_adapter": {
    "path": "/absolute/path/to/my_adapter",
    "type": "local",
    "description": "My adapter description"
  }
}

Remote Git (production):

{
  "my_adapter": {
    "url": "https://github.com/your-org/my-adapter",
    "type": "git",
    "description": "My adapter description"
  }
}

Step 8: Test the adapter

Test locally first (without DWE CLI):

cd my_adapter
cp .env.example .env
just up                    # docker compose up — must work here

Test Copier rendering in isolation:

pip install copier
copier copy /path/to/my_adapter /tmp/test-output \
  --data project_name=testproject \
  --data aws_region=us-east-1 \
  --defaults --overwrite --trust

# Inspect the output
ls /tmp/test-output
cat /tmp/test-output/infrastructure/Pulumi.yaml    # should have project_name substituted
cat /tmp/test-output/docker-compose.yml            # should be identical to source
cd /tmp/test-output && docker compose up           # should still work

Test via dwe CLI:

dwe create-service my_adapter \
  --git-repo https://github.com/test-org/empty-repo \
  --envs development \
  --envs main

Adapter Versioning and Updates

Tag your adapter repository with semantic version tags. The DWE CLI and Copier use these tags for update-service:

cd my_adapter
git add -A && git commit -m "feat: add postgres service"
git tag v1.1.0
git push origin v1.1.0

When a client wants to update:

dwe update-service my_adapter ./client-repo --tag v1.1.0

Copier reads the source URL from .copier-answers.yml in the client repo, checks out v1.1.0, and runs a 3-way merge. Files the user has customised are preserved where possible; conflicts surface as standard git merge conflicts.
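
When choosing a value for --tag, note that comparing tags as plain strings misorders versions (v1.10.0 would sort before v1.9.0). The sketch below parses vMAJOR.MINOR.PATCH tags and compares them numerically. parse_tag and latest_tag are hypothetical helpers, and pre-release suffixes are deliberately not handled.

```python
import re

TAG = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def parse_tag(tag: str) -> tuple[int, int, int]:
    match = TAG.match(tag)
    if match is None:
        raise ValueError(f"not a vMAJOR.MINOR.PATCH tag: {tag!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def latest_tag(tags: list[str]) -> str:
    # Compare numerically so that v1.10.0 sorts after v1.9.0.
    return max(tags, key=parse_tag)


print(latest_tag(["v1.0.0", "v1.9.0", "v1.10.0"]))  # v1.10.0
```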

What gets updated:

- infrastructure/: Pulumi code (Jinja2 re-rendered with the new template)
- blueprint/: application config files
- justfile: dev commands

What is NOT overwritten with template content:

- .env.example: skipped if it already exists (_skip_if_exists in copier.yml)
- .copier-answers.yml: managed by Copier itself, which rewrites it to record the new template version

State Files

`dwe-state.json` (DWE-managed)

Written by the dwe CLI after copier.run_copy(). Tracks DWE-specific metadata:

{
  "dwe_version": "1.0.0",
  "adapter": {
    "name": "test_adapter",
    "version": "v1.0.0",
    "last_update": "2026-03-22"
  },
  "environments": ["development", "main"],
  "infrastructure": "pulumi"
}
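
A minimal sketch of how this structure could be built at create time and refreshed on update. The field names follow the JSON above; build_state and record_update are hypothetical helpers, and the hard-coded dwe_version is for illustration only.

```python
import json
from datetime import date


def build_state(adapter: str, version: str, envs: list[str], today: date) -> dict:
    # dwe_version is hard-coded here purely for illustration.
    return {
        "dwe_version": "1.0.0",
        "adapter": {"name": adapter, "version": version, "last_update": today.isoformat()},
        "environments": list(envs),
        "infrastructure": "pulumi",
    }


def record_update(state: dict, new_version: str, today: date) -> dict:
    """Return a copy of state with the adapter version and date refreshed."""
    adapter = {**state["adapter"], "version": new_version, "last_update": today.isoformat()}
    return {**state, "adapter": adapter}


state = build_state("test_adapter", "v1.0.0", ["development", "main"], date(2026, 3, 22))
updated = record_update(state, "v1.2.0", date(2026, 4, 1))
print(json.dumps(updated, indent=2))
```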

`.copier-answers.yml` (Copier-managed)

Written by Copier. Tracks the template source, version, and question answers. Do not edit manually. This is what enables copier.run_update() to know where the template came from.

# Changes here will be overwritten by copier
_commit: v1.0.0
_src_path: /path/to/my_adapter
project_name: acme-data-platform
aws_region: eu-west-1
instance_type: t3.small

Both files coexist. dwe-state.json is for DWE tooling; .copier-answers.yml is for Copier's update machinery.

Developer Workflow After `create-service`

Once the client repo is hydrated, the full developer loop is:

1. Local development (laptop):

git clone https://github.com/client/data-platform
cd data-platform
cp .env.example .env      # fill in local values (no real AWS keys needed)
just up                   # docker compose up — app is running at localhost:8080

2. Provision cloud infrastructure (once):

# Fill in real AWS keys in .env
just install-infra         # pip install pulumi pulumi-aws
just infra-preview         # see what Pulumi will create
just infra-up              # provision the EC2 instance

3. Deploy to EC2 (SSH into the instance, then):

git clone https://github.com/client/data-platform /srv/app
cd /srv/app
cp .env.example .env       # fill in production values
just deploy-prod           # docker compose -f ... up -d

4. CI/CD (automatic after push):

Pushing to development or main triggers the corresponding GitHub Actions workflow. See the CI/CD Workflow Design section below for the full two-path logic.

CI/CD Workflow Design

The generated CI/CD workflow (.github/workflows/deploy-{env}.yaml) implements a two-path logic inspired by the Superset production setup. The key insight: infrastructure changes and application changes require completely different responses.

The Two Paths

Push to branch
       │
       ▼
  Detect changes
  (dorny/paths-filter)
       │
       ├─── infrastructure/** changed?
       │         │
       │         ├─ Pull Request → pulumi preview  (validate, no apply)
       │         └─ Push        → pulumi up --yes  (apply infra changes)
       │
       └─── docker-compose / blueprint changed?
                 AND infrastructure NOT changed?
                         │
                         └─ Push → SSM: git pull + just deploy-prod
                                   (redeploy app on the live EC2 instance)

Why skip deploy when infra also changed? The pulumi up step re-provisions the EC2 instance itself, which already pulls the latest code via its user-data script. Running the app deploy on top of that would be redundant and potentially racy.

Job Summary

Job	Trigger	What it does
`pulumi-preview`	PR, `infrastructure/**` changed	Runs `pulumi preview` — shows what would change, no side effects
`pulumi-apply`	Push, `infrastructure/**` changed	Runs `pulumi up --yes` — applies infra changes
`deploy-app`	Push, app files changed, infra NOT changed	AWS SSM command: `git pull && just deploy-prod` on live EC2
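
The table can be read as a small decision function. The sketch below restates it in Python to make the precedence explicit (infrastructure changes win over app changes). select_job is a hypothetical name; the real workflow expresses this with job conditions, not Python.

```python
from typing import Optional


def select_job(event: str, infra_changed: bool, app_changed: bool) -> Optional[str]:
    """Return the job from the table that should run, or None if nothing runs."""
    if infra_changed:
        # Infrastructure changes take precedence; the app deploy is skipped.
        if event == "pull_request":
            return "pulumi-preview"
        if event == "push":
            return "pulumi-apply"
        return None
    if event == "push" and app_changed:
        return "deploy-app"
    return None


for case in [("pull_request", True, False), ("push", True, True), ("push", False, True)]:
    print(case, "->", select_job(*case))
```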

Required Secrets

Set these via dwe create-service --secrets '{...}' or manually in GitHub repository settings:

Secret	Description
`AWS_ACCESS_KEY_ID`	AWS credentials for Pulumi and SSM
`AWS_SECRET_ACCESS_KEY`	AWS credentials
`PULUMI_ACCESS_TOKEN`	Pulumi Cloud token
`PULUMI_CONFIG_PASSPHRASE`	Pulumi stack encryption passphrase
`PULUMI_STACK`	Pulumi stack reference, e.g. `myorg/myproject/development`
`EC2_INSTANCE_ID`	Instance ID from `pulumi stack output instance_id`, e.g. `i-0abc1234`

SSM Prerequisites

The deploy-app job uses AWS Systems Manager (SSM) instead of SSH — no port 22, no SSH key stored as a secret.

To enable SSM on the EC2 instance:

1. IAM instance profile. Attach a role to the EC2 instance that grants these permissions:

{
  "Effect": "Allow",
  "Action": [
    "ssm:UpdateInstanceInformation",
    "ssmmessages:CreateControlChannel",
    "ssmmessages:OpenControlChannel",
    "ec2messages:GetMessages",
    "ec2messages:SendReply"
  ],
  "Resource": "*"
}

Or simply attach the AWS managed policy AmazonSSMManagedInstanceCore.

2. SSM agent — Amazon Linux 2023 ships with it pre-installed. The blueprint/instance-setup.sh bootstrap script ensures it's running:

systemctl enable amazon-ssm-agent
systemctl start amazon-ssm-agent

3. Store the instance ID — after running just infra-up, get the instance ID and store it as a secret:

cd infrastructure && pulumi stack output instance_id
# → i-0abc1234567890def
# Add this to GitHub repository secrets as EC2_INSTANCE_ID
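
With the instance ID in hand, the deploy-app job's SSM call can be pictured as an AWS CLI invocation. The sketch below only builds the argument list and does not run it. The exact command the generated workflow uses is not shown in this README, so treat this as an assumption: ssm_deploy_command is a hypothetical helper, and /srv/app is borrowed from the manual deploy step as the working directory.

```python
import json
import shlex


def ssm_deploy_command(instance_id: str, app_dir: str = "/srv/app") -> list[str]:
    # app_dir is an assumption: /srv/app is where the manual deploy step clones the repo.
    script = f"cd {shlex.quote(app_dir)} && git pull && just deploy-prod"
    return [
        "aws", "ssm", "send-command",
        "--instance-ids", instance_id,
        "--document-name", "AWS-RunShellScript",
        "--parameters", json.dumps({"commands": [script]}),
    ]


argv = ssm_deploy_command("i-0abc1234567890def")
print(shlex.join(argv))
```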

Example: What Happens on a Typical Push

Scenario 1 — you edited blueprint/html/index.html:

Push to development branch
  ↓
detect-changes: infrastructure=false, app=true
  ↓
deploy-app runs:
  aws ssm send-command "git pull && just deploy-prod"
  polls every 10s until success
  prints stdout from EC2 instance
  ↓
New HTML is live ~30 seconds after push

Scenario 2: you changed infrastructure/__main__.py (e.g. a bigger instance type):

Push to development branch
  ↓
detect-changes: infrastructure=true, app=false
  ↓
pulumi-apply runs:
  pulumi up --yes
  Pulumi modifies the EC2 instance type in place (or replaces the instance)
  ↓
Infrastructure updated. If the instance was replaced, the new one pulls the latest code via user-data.

Scenario 3 — you opened a PR with Pulumi changes:

Pull Request to development
  ↓
detect-changes: infrastructure=true
  ↓
pulumi-preview runs:
  pulumi preview
  Output shown in CI logs — no changes applied
  ↓
Reviewer can see exactly what Pulumi will do before merging.

Adapting for Other Platforms

The same two-path logic works for GitLab CI. The Superset setup's .gitlab-ci.yml uses rules like:

# Skip deploy if terraform changed
- if: $CI_COMMIT_BRANCH == "main"
  changes:
    - terraform_scalling/**/*
  when: never
# Only deploy if docker/compose changed
- if: $CI_COMMIT_BRANCH == "main"
  changes:
    - docker/**/*
    - docker-compose.yml

For your adapter's GitLab template, mirror this pattern with pulumi instead of terraform and infrastructure/** instead of terraform_scalling/**.

Adding a New Environment Later

Environments are set up at create-service time. To add one later:

# Create the branch
git checkout initial-commit
git checkout -b staging
git push -u origin staging   # -u sets the upstream so the later plain "git push" works

# Generate the workflow file
cp .github/workflows/deploy-development.yaml .github/workflows/deploy-staging.yaml
# Edit deploy-staging.yaml: change all occurrences of "development" to "staging"
git add .github/workflows/deploy-staging.yaml
git commit -m "chore: add staging environment"
git push
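
The copy-and-edit step can also be scripted. The following sketch performs the same plain text replacement as the manual edit. clone_workflow is a hypothetical helper, not a dwe command, and the result should still be reviewed, since a blanket replacement also touches any unrelated occurrence of the source environment name.

```python
import tempfile
from pathlib import Path


def clone_workflow(workflows_dir: Path, source_env: str, new_env: str) -> Path:
    """Copy deploy-<source_env>.yaml to deploy-<new_env>.yaml with the name swapped."""
    source = workflows_dir / f"deploy-{source_env}.yaml"
    target = workflows_dir / f"deploy-{new_env}.yaml"
    if target.exists():
        raise FileExistsError(target)
    text = source.read_text(encoding="utf-8")
    # Plain text replacement, same as the manual edit described above.
    target.write_text(text.replace(source_env, new_env), encoding="utf-8")
    return target


# Demonstration against a throwaway directory instead of a real repository.
with tempfile.TemporaryDirectory() as tmp:
    workflows = Path(tmp)
    (workflows / "deploy-development.yaml").write_text(
        "name: Deploy to development\n", encoding="utf-8"
    )
    created = clone_workflow(workflows, "development", "staging")
    rendered = created.read_text(encoding="utf-8")

print(rendered)
```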

Releasing to PyPI

Two workflows handle the full release lifecycle:

bump version in pyproject.toml → merge to main
         │
         ▼
  tag-version.yml          triggers on: push to main, pyproject.toml changed
  reads Poetry version      creates git tag vX.Y.Z automatically
         │
         ▼
  (go to GitHub → Releases → Draft a new release → publish it)
         │
         ▼
  pypi-publish.yml          triggers on: release published
  poetry build + publish    pushes to PyPI via PYPI_TOKEN

One-time setup

Add PYPI_TOKEN to the repository secrets:

1. Go to https://pypi.org/manage/account/token/ and create an API token scoped to dwe-core
2. In GitHub: Settings → Secrets and variables → Actions → New repository secret
   - Name: PYPI_TOKEN
   - Value: the token from PyPI (starts with pypi-)

Release flow

Step 1 — bump the version and merge to main:

# Pick ONE of the following:
poetry version patch        # 1.0.0 → 1.0.1
poetry version minor        # 1.0.0 → 1.1.0
poetry version major        # 1.0.0 → 2.0.0
poetry version prerelease   # 1.0.0 → 1.0.1a0
poetry version 1.2.0        # set explicit version

git add pyproject.toml
git commit -m "chore: bump version to $(poetry version -s)"
git push origin main

tag-version.yml fires on the push, reads the version from pyproject.toml, and pushes tag vX.Y.Z. No manual tagging needed, and it only runs on main.

Step 2 — publish the GitHub Release:

Go to github.com/<org>/dwe-core/releases, click Draft a new release, select the tag just created, and click Publish release.

pypi-publish.yml fires on the publish event: runs poetry install, poetry build, then poetry publish -u __token__ -p $PYPI_TOKEN.

Technical Stack

Concern	Library
CLI framework	Typer
Template engine	Copier
Git operations	GitPython
GitHub secrets	PyGithub
GitLab variables	python-gitlab
Runtime templating	Jinja2 (for CI templates)
Infrastructure	Pulumi
Task runner	Just

