Skip to main content

Generic schema and database import tooling

Project description

Schema Import Wizard

Refactored import utility with object-oriented architecture, plugin-style database adapters, and unit tests.

Current status:

  • Production-ready flow for PostgreSQL.
  • Extension points for MySQL and MongoDB adapters are in place.
  • SQL transformation pipeline is modular and testable.

Architecture

The importer follows a layered OOP design:

  • ImportWizard: orchestration (staging, transformations, wrapper SQL, execution).
  • DatabaseAdapter: abstract base class for database-specific execution.
  • PostgresAdapter: concrete implementation using psql.
  • AdapterFactory: creates adapters from TARGET_DB or --target-db.
  • SqlTransformer: include-path rewrite, CSV path patching, and COPY warning checks.

Features

  • Cross-platform path handling (Windows/Linux compatible with POSIX output paths).
  • Deterministic SQL execution order (SQL_ORDER).
  • Isolated staging folder per run (.schema_import_staging//).
  • CSV path normalization for SQL COPY and \copy statements.
  • Stronger validation:
    • Missing SQL files fail fast.
    • Missing or ambiguous CSV references fail fast.
    • Non-zero command failures propagate as process exit codes.
  • Console plus rotating file logging.
  • Dry-run support.

Configuration

Use .env (see .env.example):

PGHOST=localhost
PGPORT=5432
PGDATABASE=example_db
PGUSER=postgres
PGPASSWORD=your_password

SCHEMA_IMPORT_ROOT=./inputs
SQL_ORDER=schema_setup.sql,insert_postgre_queries.sql,pg_vector.sql

LOG_LEVEL=DEBUG
TARGET_DB=postgres

# Optional: explicit path to psql executable (useful on Windows)
# PSQL_BIN=C:/Program Files/PostgreSQL/16/bin/psql.exe

# Optional: continue import when pgvector binaries are not installed on server
# ALLOW_MISSING_PGVECTOR=true

INI profile support:

  • All CLI surfaces now support --config-file and --config-section.
  • Resolution order is: explicit CLI args > INI profile values > environment > built-in defaults.
  • Import can read engine aliases from INI (for example: postgresql, mongo, sql server, sqlite3).
  • Cleanup and check connection are PostgreSQL-only commands; non-PostgreSQL profile engines are rejected with a clear error.
  • Sample multi-database INI file: sample_db_profiles.ini.

One Command Bootstrap (Cross-Platform)

Use one script on all platforms:

python bootstrap.py

What it does:

  1. Creates .venv if missing.
  2. Upgrades pip in .venv.
  3. Installs dependencies from requirements.txt inside .venv.
  4. Runs unit tests using .venv interpreter.

Useful options:

python bootstrap.py --skip-tests
python bootstrap.py --recreate-venv
python bootstrap.py --skip-install
python bootstrap.py --skip-install --coverage
python bootstrap.py --skip-install --coverage --cov-fail-under 100

Coverage mode enforces a hard gate.

  • Default gate is 100%.
  • The command fails if coverage is below the configured threshold.
  • Coverage is scoped to schema_import.py for gate evaluation.

Legacy helper scripts now delegate to bootstrap.py to keep one execution path:

  • Windows: 001_env.bat, 003_setup.bat, 005_run_test.bat, 005_run_code_cov.bat
  • Unix: 001_env.sh, 003_setup.sh, 005_run_test.sh, 005_run_code_cov.sh

Usage

Run importer dry-run:

python schema_import.py --dry-run

Execute import:

python schema_import.py

Select adapter explicitly:

python schema_import.py --target-db postgres

Run import using an INI profile:

python schema_import.py --config-file import_schema.ini --config-section STAGE_POSTGRES

Cleanup local schemas and database (dev box):

Use the cleanup CLI entrypoint that ships with the repository.

The cleanup command is destructive and always requires explicit confirmation. When prompted, type YES to proceed. For safety, protected databases (postgres, template0, template1) are blocked and cannot be dropped.

Cleanup dry-run:

Dry-run mode lets you preview the actions first.

python cleanup_schema.py --dry-run

Override target database or schema list:

Use this when you need to target a different database or schema set.

python cleanup_schema.py --target-db myapp_db --schemas myapp,myapp_vector --dry-run

Run cleanup using an INI profile:

python cleanup_schema.py --config-file import_schema.ini --config-section STAGE_POSTGRES --dry-run

Wrapper scripts:

  • Windows: 006_cleanup_schema.bat
  • Unix: 006_cleanup_schema.sh

Check connectivity and authorization:

Wrapper scripts:

  • Windows: 009_check_connection.bat
  • Unix: 009_check_connection.sh

Check connection for import authorization:

python check_connection.py --mode import --target-db myapp_db --schemas myapp,myapp_vector

Check connection using an INI profile:

python check_connection.py --config-file import_schema.ini --config-section STAGE_POSTGRES --mode import

Run full INI workflow test using .env values (connection + schema details):

Windows:

.\010_test_ini_workflow.bat

Unix:

./010_test_ini_workflow.sh

Note:

  • The sample INI sets cleanup_target_db=myapp_db to avoid destructive operations against protected system databases such as postgres during cleanup dry-runs.

Optional profile override:

.\010_test_ini_workflow.bat POSTGRES_FROM_ENV
./010_test_ini_workflow.sh POSTGRES_FROM_ENV

MCP Server

This project includes an MCP server implementation with exactly these tools:

  • import_schema
  • cleanup_schema
  • check_connection

Start MCP server directly:

python mcp_server.py

Start using installed entrypoint:

schema-mcp-server

Wrapper scripts:

  • Windows: 011_run_mcp_server.bat
  • Unix: 011_run_mcp_server.sh

NLP Examples (Chat/Copilot)

Use .env source examples:

  • import_schema:
    • "Use .env and import schema in dry run mode."
    • "Use .env and run import schema now (not dry run)."
  • cleanup_schema:
    • "Use .env and cleanup myapp_db with schemas myapp,myapp_vector in dry run."
    • "Use .env and cleanup myapp_db for real; confirm cleanup yes."
  • check_connection:
    • "Use .env and check connection for import mode."
    • "Use .env and check connection for cleanup mode with schemas myapp,myapp_vector."

Use .ini source examples:

  • import_schema:
    • "Import schema using sample_db_profiles.ini section [POSTGRES_FROM_ENV] in dry run."
    • "Use sample_db_profiles.ini profile POSTGRES_FROM_ENV and run import schema."
  • cleanup_schema:
    • "Cleanup schema using sample_db_profiles.ini section [POSTGRES_FROM_ENV] in dry run."
    • "Use sample_db_profiles.ini profile POSTGRES_FROM_ENV and cleanup myapp_db."
  • check_connection:
    • "Check connection using sample_db_profiles.ini section [POSTGRES_FROM_ENV] for import mode."
    • "Use sample_db_profiles.ini profile POSTGRES_FROM_ENV and check cleanup authorization."

Notes:

  • If .env or selected INI section has required fields, no follow-up questions are needed.
  • Precedence is explicit values in request > selected INI profile > .env > defaults.
  • For destructive cleanup (dryRun=false), explicit confirmation is required.

Check connection for cleanup authorization:

python check_connection.py --mode cleanup --target-db myapp_db --schemas myapp,myapp_vector

Entry-point command (after package install):

schema-check-connection --mode import --target-db myapp_db --schemas myapp,myapp_vector

Wrapper examples:

.\009_check_connection.bat --mode import --target-db myapp_db --schemas myapp,myapp_vector
./009_check_connection.sh --mode cleanup --target-db myapp_db --schemas myapp,myapp_vector

PostgreSQL Client Resolution

Real PostgreSQL execution requires psql.

  • The importer first checks PSQL_BIN (if set).
  • Otherwise it resolves psql from your system PATH.
  • If not found, it fails with a clear message.
  • If target PGDATABASE does not exist yet, importer retries once against postgres for bootstrap scripts.

pgvector optional fallback:

  • If ALLOW_MISSING_PGVECTOR=true and server reports extension "vector" is not available, importer retries once without pg_vector.sql.

Dry-run behavior:

  • --dry-run does not require psql to be installed.
  • Non-dry runs require a valid psql executable.

Examples:

$env:PSQL_BIN = 'C:\Program Files\PostgreSQL\16\bin\psql.exe'
.\004_run.bat
export PSQL_BIN="/usr/lib/postgresql/16/bin/psql"
./004_run.sh

Test Suite

Run tests manually:

python -m unittest discover -s tests -p "test_*.py"

Local Security Reports (HTML)

Generate local audit/security reports and convert them to HTML using sec_report_kit.

Wrapper scripts:

  • Windows: 007_run_security_reports.bat
  • Unix: 007_run_security_reports.sh

Outputs are written under test_reports/security_reports/:

  • pip_audit.json
  • pip_audit.html
  • trivy_fs.json (when Trivy is installed)
  • trivy_fs.html (when Trivy is installed)

Examples:

.\007_run_security_reports.bat
./007_run_security_reports.sh

Optional flags:

python security_reports.py --skip-trivy
python security_reports.py --output-dir test_reports/security_reports --target schema_importer

Extending to Other Databases

To support MySQL and MongoDB next:

  1. Implement execute(...) in MySqlAdapter and MongoDbAdapter.
  2. Add database-specific SQL transformer or converter stage where syntax differs.
  3. Add adapter-focused unit tests and integration tests.

For now, PostgreSQL is the supported and validated runtime path.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

schema_importer-0.4.0.tar.gz (31.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

schema_importer-0.4.0-py3-none-any.whl (32.6 kB view details)

Uploaded Python 3

File details

Details for the file schema_importer-0.4.0.tar.gz.

File metadata

  • Download URL: schema_importer-0.4.0.tar.gz
  • Upload date:
  • Size: 31.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for schema_importer-0.4.0.tar.gz
Algorithm Hash digest
SHA256 1de1fa1f286d6fbd344ab9441aa59f1d115a55e904773e2519cae76ef9480558
MD5 1eb46d6b079db30064bc360a4e084579
BLAKE2b-256 f7fd4fc1bf7268937c3f99cdb72a5dee520698a0c59a5b31ca1bc30d9d51b1b3

See more details on using hashes here.

Provenance

The following attestation bundles were made for schema_importer-0.4.0.tar.gz:

Publisher: publish-pypi.yml on ShanKonduru/schema_importer

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file schema_importer-0.4.0-py3-none-any.whl.

File metadata

  • Download URL: schema_importer-0.4.0-py3-none-any.whl
  • Upload date:
  • Size: 32.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for schema_importer-0.4.0-py3-none-any.whl
Algorithm Hash digest
SHA256 f4ffd2832aa007199cd81c177e3ca4060b1fff798ff0c4f3594acb077876c1fd
MD5 7238fce03119bd76102198d50e6f85d9
BLAKE2b-256 eb38ea0da525facd061e5e138830ba0a85335c0c02c015c0a594dae2ab7185f1

See more details on using hashes here.

Provenance

The following attestation bundles were made for schema_importer-0.4.0-py3-none-any.whl:

Publisher: publish-pypi.yml on ShanKonduru/schema_importer

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page