GoMask CLI for configuration as code - manage synthetic data generation and masking routines

GoMask CLI

Configuration-as-Code for Synthetic Data Generation and Data Masking

GoMask CLI enables you to manage data generation and masking routines through YAML configuration files, bringing infrastructure-as-code principles to test data management.

Features

  • 📝 YAML-based Configuration: Define routines in version-controllable files
  • 🔐 Secure Authentication: One encrypted API secret carries all the context the CLI needs
  • 🆔 Unique Identifiers: Human-readable IDs for reliable synchronization
  • 🚀 CI/CD Ready: Integrate with GitHub Actions, GitLab CI, Jenkins, etc.
  • ♻️ Bidirectional Sync: Import from and export to YAML format
  • 🔄 Environment Variables: Full support for variable substitution
  • 📊 Rich Output: Beautiful terminal output with progress tracking

Installation

Via pip

pip install gomask-cli

Via Poetry

poetry add gomask-cli

From Source

git clone https://github.com/gomask/gomask-cli.git
cd gomask-cli
pip install -e .

Quick Start

1. Initialize Configuration

Set up the CLI with credentials obtained from the GoMask web UI:

# Initialize CLI with your credentials (get secret from https://app.gomask.ai/settings/api-keys)
gomask init

# This creates a secure gomask.toml file in your current directory

2. Create a Database Connector

Connect to the database where you want to generate or mask data:

# Create a connector to your database
gomask connectors create \
  --name my-db \
  --type postgresql \
  --host localhost \
  --port 5432 \
  --database mydb \
  --username myuser

# The CLI prompts for the password securely

# Or list existing connectors
gomask connectors list

3. Create a Routine (Recommended: Guided Setup)

Use the interactive wizard to automatically scan your database and generate a production-ready routine:

# Interactive wizard - automatically scans your database
gomask setup

# The wizard will:
# 1. Prompt you to select a database connector
# 2. Introspect and display available schemas
# 3. Scan and list all tables in the selected schema
# 4. Detect foreign key relationships automatically
# 5. Let you choose tables (supports ranges like 1-5, or individual selections)
# 6. Ask for routine type (masking or synthetic data generation)
# 7. For masking: AI-powered detection of sensitive columns (PII, emails, etc.)
# 8. For synthetic: Automatically assign appropriate data generators per column
# 9. Generate and export a complete YAML configuration file

# Example output: routine_563_complete.yaml

The setup command creates a fully configured, production-ready routine with:

  • Correct table hierarchy based on foreign keys
  • Appropriate data generation functions for each column type
  • Sensitive data detection for masking routines
  • Valid YAML that's ready to run immediately
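
The table hierarchy mentioned above can be pictured as a dependency ordering: a table with no foreign keys sits at level 0, and a child table sits one level below its deepest parent. The following is a minimal Python sketch of that idea. The table names and the hierarchy_levels helper are hypothetical, and this is not necessarily how gomask setup derives the hierarchy_level field.

```python
from graphlib import TopologicalSorter


def hierarchy_levels(foreign_keys: dict[str, set[str]]) -> dict[str, int]:
    """Map each table to a level: 0 for tables without parents,
    otherwise 1 + the highest level among the tables it references."""
    levels: dict[str, int] = {}
    # static_order() yields referenced tables before the tables that
    # reference them and raises graphlib.CycleError on circular keys.
    for table in TopologicalSorter(foreign_keys).static_order():
        parents = foreign_keys.get(table, set())
        levels[table] = 1 + max((levels[p] for p in parents), default=-1)
    return levels


# Hypothetical schema: orders references customers; order_items
# references orders and products.
fks = {
    "customers": set(),
    "products": set(),
    "orders": {"customers"},
    "order_items": {"orders", "products"},
}
print(hierarchy_levels(fks))
```

Populating tables in ascending level order means every parent row exists before a child row refers to it.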

4. Edit Routine (Optional)

Customize the generated YAML configuration if needed:

# Open in your editor
vim routine_563_complete.yaml

# Common customizations:
# - Adjust record counts
# - Fine-tune generation functions
# - Add runtime parameters
# - Modify masking rules
# - Change data type mappings

# Validate your changes
gomask validate routine_563_complete.yaml

5. Run Your Routine

Execute the routine to generate or mask data:

# Execute the generated routine
gomask run routine_563_complete.yaml

# Run with live progress monitoring
gomask run routine_563_complete.yaml --watch

# Run with custom parameters
gomask run routine_563_complete.yaml --param RECORD_COUNT=5000 --watch

# Preview execution plan without running
gomask run routine_563_complete.yaml --dry-run

Command Reference

Command                       Description

Configuration
gomask init                   Initialize CLI configuration and credentials
gomask setup                  Interactive wizard to create routines (recommended)
gomask version                Display version information

Routines
gomask routines list          List all routines for your team
gomask routines show          Display detailed routine information
gomask example                Create example routine from template
gomask validate               Validate YAML configuration file
gomask import                 Import YAML configuration to database
gomask export                 Export routine from database to YAML
gomask run                    Execute a routine from YAML configuration

Connectors
gomask connectors list        List all database connectors
gomask connectors create      Create a new database connector
gomask connectors show        Display connector details
gomask connectors test        Test connector connection
gomask connectors delete      Delete a connector

Functions
gomask functions list         List available data generation functions
gomask functions show         Show detailed function information
gomask functions categories   List function categories
gomask functions tags         List function tags

Executions
gomask executions list        List routine executions
gomask executions show        Display execution details and progress
gomask executions logs        Show execution logs
gomask executions cancel      Cancel a running execution

Use gomask <command> --help for detailed options and examples.

Authentication

The CLI uses gomask init to securely store your credentials:

# Initialize configuration
gomask init

# This creates a gomask.toml file with your API secret
# Get your secret from: https://app.gomask.ai/settings/api-keys

Configuration Priority:

  1. Command-line flags (--secret, --api-url)
  2. gomask.toml configuration file (created by gomask init)

Important: Add gomask.toml to your .gitignore!
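
The priority order above amounts to a simple override: a value passed as a flag wins, and the value from the file is the fallback. A minimal Python sketch of that rule follows. The setting names secret and api_url are assumptions derived from the flag names, since the actual layout of gomask.toml is not shown here.

```python
from typing import Optional


def resolve_settings(flags: dict[str, Optional[str]], file_config: dict[str, str]) -> dict[str, str]:
    """Merge settings so that command-line flags override gomask.toml values."""
    settings = dict(file_config)      # lowest priority: values from the file
    for name, value in flags.items():
        if value is not None:         # only flags that were actually passed
            settings[name] = value
    return settings


# Hypothetical values for illustration only.
file_config = {"secret": "secret-from-file", "api_url": "https://api.example.com"}
flags = {"secret": None, "api_url": "https://staging.example.com"}
print(resolve_settings(flags, file_config))
```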

Environment Variables in YAML

YAML configuration files support environment variable substitution for sensitive data and runtime configuration:

Using Variables in YAML

connector:
  # Simple substitution
  host: ${DB_HOST}

  # With default value
  port: ${DB_PORT:-5432}

  # Required (errors if not set)
  password: ${DB_PASSWORD:?Password required}

  # Multiple references in one value
  database: ${ENV}_${DB_NAME}
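
The three forms shown above follow shell-style parameter expansion. As a rough illustration of the semantics, here is a Python sketch, not GoMask's actual implementation. The substitute helper is hypothetical, and raising an error for an unset plain ${NAME} reference is an assumption.

```python
import os
import re

# Matches ${NAME}, ${NAME:-default}, or ${NAME:?error message}
_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::([-?])([^}]*))?\}")


def substitute(text: str, env=None) -> str:
    """Expand shell-style variable references in text."""
    env = os.environ if env is None else env

    def replace(match: re.Match) -> str:
        name, op, arg = match.group(1), match.group(2), match.group(3)
        value = env.get(name)
        if value:                       # set and non-empty: use it
            return value
        if op == "-":                   # ${NAME:-default}
            return arg
        if op == "?":                   # ${NAME:?message}
            raise ValueError(f"{name}: {arg or 'variable is required'}")
        if value is None:               # assumption: plain ${NAME} must be set
            raise ValueError(f"{name} is not set")
        return value                    # set but empty

    return _PATTERN.sub(replace, text)


env = {"DB_HOST": "db.internal", "ENV": "staging", "DB_NAME": "shop"}
print(substitute("${DB_HOST}", env))
print(substitute("${DB_PORT:-5432}", env))
print(substitute("${ENV}_${DB_NAME}", env))
```

Calling substitute("${DB_PASSWORD:?Password required}", env) with the same mapping raises ValueError, because DB_PASSWORD is not set.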

Loading from .env Files

# .env file
DB_HOST=production.example.com
DB_PORT=5432
DB_PASSWORD=secure-password

# Use with CLI
gomask run routine.yaml --env-file .env
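
A .env file of this shape is a list of KEY=VALUE lines. The Python sketch below shows one plausible way to read such a file. The parse_env_file helper is hypothetical, and the handling of comments and quotes is an assumption rather than documented --env-file behavior.

```python
def parse_env_file(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected KEY=VALUE, got {line!r}")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


sample = "# .env file\nDB_HOST=production.example.com\nDB_PORT=5432\nDB_PASSWORD=secure-password\n"
print(parse_env_file(sample))
```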

Note: Authentication credentials are stored in gomask.toml (created via gomask init), not in .env files.

YAML Configuration

Synthetic Data Generation

version: "1.0"
kind: SyntheticRoutine

metadata:
  id: "unique-routine-id"
  name: "Display Name"
  description: "Description"
  version: "1.0.0"

connector:
  ref: "connector-name" # Reference an existing connector
  # OR define the connection inline instead of using ref:
  # type: postgresql
  # host: ${DB_HOST}
  # port: 5432
  # database: ${DB_NAME}
  # username: ${DB_USER}
  # password: ${DB_PASSWORD}

runtime_parameters:
  param_name:
    type: integer
    default: 1000
    description: "Parameter description"

tables:
  - name: table_name
    schema: public
    hierarchy_level: 0
    record_count: 1000
    columns:
      - name: column_name
        function: generation_function
        parameters:
          key: value

Data Masking

version: "1.0"
kind: MaskingRoutine

metadata:
  id: "unique-routine-id"
  name: "Display Name"

connector:
  ref: "connector-name"

masking:
  tables:
    - name: table_name
      schema: public
      columns:
        - name: column_name
          function: masking_function
          parameters:
            key: value

CI/CD Integration

GitHub Actions

name: Data Generation

on:
  schedule:
    - cron: "0 0 * * *"

jobs:
  generate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install GoMask CLI
        run: pip install gomask-cli

      - name: Setup GoMask
        run: |
          gomask init --secret ${{ secrets.GOMASK_SECRET }}

      - name: Run Data Generation
        run: |
          gomask validate routines/daily-data.yaml
          gomask run routines/daily-data.yaml --watch

GitLab CI

generate-test-data:
  stage: test
  script:
    - pip install gomask-cli
    - gomask init --secret $GOMASK_SECRET
    - gomask run test-data.yaml --param record_count=100
  # GOMASK_SECRET is defined as a masked CI/CD variable in the project settings

Best Practices

  1. Use gomask setup - Always use the interactive wizard for creating new routines
  2. Version Control - Keep YAML files in git (exclude gomask.toml)
  3. Environment Variables - Use env vars for sensitive data
  4. Validate First - Always run gomask validate before importing or running
  5. Dry Runs - Use --dry-run to preview execution
  6. Test Small - Start with small record counts before scaling up

Troubleshooting

Authentication Issues

# Reinitialize configuration
gomask init

# Test connectivity
gomask version

YAML Validation Errors

# Validate with detailed output
gomask validate routine.yaml --detailed

# Check with environment variables
gomask validate routine.yaml --env-file .env

Execution Problems

# Preview execution plan
gomask run routine.yaml --dry-run

# Test connector
gomask connectors test my-connector

# Monitor execution with live logs
gomask run routine.yaml --watch
gomask executions logs 123 --follow
