Generic schema and database import tooling
Project description
Schema Import Wizard
Refactored import utility with object-oriented architecture, plugin-style database adapters, and unit tests.
Current status:
- Production-ready flow for PostgreSQL.
- Extension points for MySQL and MongoDB adapters are in place.
- SQL transformation pipeline is modular and testable.
Architecture
The importer follows a layered OOP design:
- ImportWizard: orchestration (staging, transformations, wrapper SQL, execution).
- DatabaseAdapter: abstract base class for database-specific execution.
- PostgresAdapter: concrete implementation using psql.
- AdapterFactory: creates adapters from TARGET_DB or --target-db.
- SqlTransformer: include-path rewrite, CSV path patching, and COPY warning checks.
Features
- Cross-platform path handling (Windows/Linux compatible with POSIX output paths).
- Deterministic SQL execution order (SQL_ORDER).
- Isolated staging folder per run (.schema_import_staging//).
- CSV path normalization for SQL COPY and \copy statements.
- Stronger validation:
- Missing SQL files fail fast.
- Missing or ambiguous CSV references fail fast.
- Non-zero command failures propagate as process exit codes.
- Console plus rotating file logging.
- Dry-run support.
Configuration
Use .env (see .env.example):
PGHOST=localhost
PGPORT=5432
PGDATABASE=example_db
PGUSER=postgres
PGPASSWORD=your_password
SCHEMA_IMPORT_ROOT=./inputs
SQL_ORDER=schema_setup.sql,insert_postgre_queries.sql,pg_vector.sql
LOG_LEVEL=DEBUG
TARGET_DB=postgres
# Optional: explicit path to psql executable (useful on Windows)
# PSQL_BIN=C:/Program Files/PostgreSQL/16/bin/psql.exe
# Optional: continue import when pgvector binaries are not installed on server
# ALLOW_MISSING_PGVECTOR=true
INI profile support:
- All CLI surfaces now support
--config-fileand--config-section. - Resolution order is: explicit CLI args > INI profile values > environment > built-in defaults.
- Import can read engine aliases from INI (for example:
postgresql,mongo,sql server,sqlite3). - Cleanup and check connection are PostgreSQL-only commands; non-PostgreSQL profile engines are rejected with a clear error.
- Sample multi-database INI file:
sample_db_profiles.ini.
One Command Bootstrap (Cross-Platform)
Use one script on all platforms:
python bootstrap.py
What it does:
- Creates .venv if missing.
- Upgrades pip in .venv.
- Installs dependencies from requirements.txt inside .venv.
- Runs unit tests using .venv interpreter.
Useful options:
python bootstrap.py --skip-tests
python bootstrap.py --recreate-venv
python bootstrap.py --skip-install
python bootstrap.py --skip-install --coverage
python bootstrap.py --skip-install --coverage --cov-fail-under 100
Coverage mode enforces a hard gate.
- Default gate is
100%. - The command fails if coverage is below the configured threshold.
- Coverage is scoped to
schema_import.pyfor gate evaluation.
Legacy helper scripts now delegate to bootstrap.py to keep one execution path:
- Windows:
001_env.bat,003_setup.bat,005_run_test.bat,005_run_code_cov.bat - Unix:
001_env.sh,003_setup.sh,005_run_test.sh,005_run_code_cov.sh
Usage
Run importer dry-run:
python schema_import.py --dry-run
Execute import:
python schema_import.py
Select adapter explicitly:
python schema_import.py --target-db postgres
Run import using an INI profile:
python schema_import.py --config-file import_schema.ini --config-section STAGE_POSTGRES
Cleanup local schemas and database (dev box):
Use the cleanup CLI entrypoint that ships with the repository.
The cleanup command is destructive and always requires explicit confirmation.
When prompted, type YES to proceed.
For safety, protected databases (postgres, template0, template1) are blocked and cannot be dropped.
Cleanup dry-run:
Dry-run mode lets you preview the actions first.
python cleanup_schema.py --dry-run
Override target database or schema list:
Use this when you need to target a different database or schema set.
python cleanup_schema.py --target-db myapp_db --schemas myapp,myapp_vector --dry-run
Run cleanup using an INI profile:
python cleanup_schema.py --config-file import_schema.ini --config-section STAGE_POSTGRES --dry-run
Wrapper scripts:
- Windows:
006_cleanup_schema.bat - Unix:
006_cleanup_schema.sh
Check connectivity and authorization:
Wrapper scripts:
- Windows:
009_check_connection.bat - Unix:
009_check_connection.sh
Check connection for import authorization:
python check_connection.py --mode import --target-db myapp_db --schemas myapp,myapp_vector
Check connection using an INI profile:
python check_connection.py --config-file import_schema.ini --config-section STAGE_POSTGRES --mode import
Run full INI workflow test using .env values (connection + schema details):
Windows:
.\010_test_ini_workflow.bat
Unix:
./010_test_ini_workflow.sh
Note:
- The sample INI sets
cleanup_target_db=myapp_dbto avoid destructive operations against protected system databases such aspostgresduring cleanup dry-runs.
Optional profile override:
.\010_test_ini_workflow.bat POSTGRES_FROM_ENV
./010_test_ini_workflow.sh POSTGRES_FROM_ENV
MCP Server
This project includes an MCP server implementation with exactly these tools:
import_schemacleanup_schemacheck_connection
Start MCP server directly:
python mcp_server.py
Start using installed entrypoint:
schema-mcp-server
Wrapper scripts:
- Windows:
011_run_mcp_server.bat - Unix:
011_run_mcp_server.sh
NLP Examples (Chat/Copilot)
Use .env source examples:
- import_schema:
- "Use .env and import schema in dry run mode."
- "Use .env and run import schema now (not dry run)."
- cleanup_schema:
- "Use .env and cleanup myapp_db with schemas myapp,myapp_vector in dry run."
- "Use .env and cleanup myapp_db for real; confirm cleanup yes."
- check_connection:
- "Use .env and check connection for import mode."
- "Use .env and check connection for cleanup mode with schemas myapp,myapp_vector."
Use .ini source examples:
- import_schema:
- "Import schema using sample_db_profiles.ini section [POSTGRES_FROM_ENV] in dry run."
- "Use sample_db_profiles.ini profile POSTGRES_FROM_ENV and run import schema."
- cleanup_schema:
- "Cleanup schema using sample_db_profiles.ini section [POSTGRES_FROM_ENV] in dry run."
- "Use sample_db_profiles.ini profile POSTGRES_FROM_ENV and cleanup myapp_db."
- check_connection:
- "Check connection using sample_db_profiles.ini section [POSTGRES_FROM_ENV] for import mode."
- "Use sample_db_profiles.ini profile POSTGRES_FROM_ENV and check cleanup authorization."
Notes:
- If
.envor selected INI section has required fields, no follow-up questions are needed. - Precedence is
explicit values in request > selected INI profile > .env > defaults. - For destructive cleanup (
dryRun=false), explicit confirmation is required.
Check connection for cleanup authorization:
python check_connection.py --mode cleanup --target-db myapp_db --schemas myapp,myapp_vector
Entry-point command (after package install):
schema-check-connection --mode import --target-db myapp_db --schemas myapp,myapp_vector
Wrapper examples:
.\009_check_connection.bat --mode import --target-db myapp_db --schemas myapp,myapp_vector
./009_check_connection.sh --mode cleanup --target-db myapp_db --schemas myapp,myapp_vector
PostgreSQL Client Resolution
Real PostgreSQL execution requires psql.
- The importer first checks
PSQL_BIN(if set). - Otherwise it resolves
psqlfrom your systemPATH. - If not found, it fails with a clear message.
- If target
PGDATABASEdoes not exist yet, importer retries once againstpostgresfor bootstrap scripts.
pgvector optional fallback:
- If
ALLOW_MISSING_PGVECTOR=trueand server reportsextension "vector" is not available, importer retries once withoutpg_vector.sql.
Dry-run behavior:
--dry-rundoes not requirepsqlto be installed.- Non-dry runs require a valid
psqlexecutable.
Examples:
$env:PSQL_BIN = 'C:\Program Files\PostgreSQL\16\bin\psql.exe'
.\004_run.bat
export PSQL_BIN="/usr/lib/postgresql/16/bin/psql"
./004_run.sh
Test Suite
Run tests manually:
python -m unittest discover -s tests -p "test_*.py"
Local Security Reports (HTML)
Generate local audit/security reports and convert them to HTML using sec_report_kit.
Wrapper scripts:
- Windows:
007_run_security_reports.bat - Unix:
007_run_security_reports.sh
Outputs are written under test_reports/security_reports/:
pip_audit.jsonpip_audit.htmltrivy_fs.json(when Trivy is installed)trivy_fs.html(when Trivy is installed)
Examples:
.\007_run_security_reports.bat
./007_run_security_reports.sh
Optional flags:
python security_reports.py --skip-trivy
python security_reports.py --output-dir test_reports/security_reports --target schema_importer
Extending to Other Databases
To support MySQL and MongoDB next:
- Implement execute(...) in MySqlAdapter and MongoDbAdapter.
- Add database-specific SQL transformer or converter stage where syntax differs.
- Add adapter-focused unit tests and integration tests.
For now, PostgreSQL is the supported and validated runtime path.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file schema_importer-0.4.0.tar.gz.
File metadata
- Download URL: schema_importer-0.4.0.tar.gz
- Upload date:
- Size: 31.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
1de1fa1f286d6fbd344ab9441aa59f1d115a55e904773e2519cae76ef9480558
|
|
| MD5 |
1eb46d6b079db30064bc360a4e084579
|
|
| BLAKE2b-256 |
f7fd4fc1bf7268937c3f99cdb72a5dee520698a0c59a5b31ca1bc30d9d51b1b3
|
Provenance
The following attestation bundles were made for schema_importer-0.4.0.tar.gz:
Publisher:
publish-pypi.yml on ShanKonduru/schema_importer
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
schema_importer-0.4.0.tar.gz -
Subject digest:
1de1fa1f286d6fbd344ab9441aa59f1d115a55e904773e2519cae76ef9480558 - Sigstore transparency entry: 1549236407
- Sigstore integration time:
-
Permalink:
ShanKonduru/schema_importer@c12ceab27b8aba73b32354bd389fae0e9b07b2e5 -
Branch / Tag:
refs/tags/v0.4.0 - Owner: https://github.com/ShanKonduru
-
Access:
private
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish-pypi.yml@c12ceab27b8aba73b32354bd389fae0e9b07b2e5 -
Trigger Event:
push
-
Statement type:
File details
Details for the file schema_importer-0.4.0-py3-none-any.whl.
File metadata
- Download URL: schema_importer-0.4.0-py3-none-any.whl
- Upload date:
- Size: 32.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
f4ffd2832aa007199cd81c177e3ca4060b1fff798ff0c4f3594acb077876c1fd
|
|
| MD5 |
7238fce03119bd76102198d50e6f85d9
|
|
| BLAKE2b-256 |
eb38ea0da525facd061e5e138830ba0a85335c0c02c015c0a594dae2ab7185f1
|
Provenance
The following attestation bundles were made for schema_importer-0.4.0-py3-none-any.whl:
Publisher:
publish-pypi.yml on ShanKonduru/schema_importer
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
schema_importer-0.4.0-py3-none-any.whl -
Subject digest:
f4ffd2832aa007199cd81c177e3ca4060b1fff798ff0c4f3594acb077876c1fd - Sigstore transparency entry: 1549236473
- Sigstore integration time:
-
Permalink:
ShanKonduru/schema_importer@c12ceab27b8aba73b32354bd389fae0e9b07b2e5 -
Branch / Tag:
refs/tags/v0.4.0 - Owner: https://github.com/ShanKonduru
-
Access:
private
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish-pypi.yml@c12ceab27b8aba73b32354bd389fae0e9b07b2e5 -
Trigger Event:
push
-
Statement type: