Reusable profiler and importer chassis for tabular migrations
migration-workbench
Reusable Django chassis for tabular workbook → app migrations: connectors pull from spreadsheets (Google Sheets) or Coda; profiling produces deterministic bundles; importers validate and apply with structured summaries; the workbook app turns profiles into schema-contract YAML for product repos to harden into real models.
PyPI: migration-workbench — pip install migration-workbench (import package migration_workbench uses underscores).
Who it is for
- Product teams moving messy spreadsheet truth into a maintainable Django app.
- Single-operator or small teams who want a repeatable pipeline (profile → contract → import) instead of one-off scripts.
- Django-adjacent adopters comfortable wiring INSTALLED_APPS, env vars, and Fly-style SQLite hosting.
Three ways to use it
1. As a library (recommended for product repos)
Add the apps you need to INSTALLED_APPS and wire URLs/commands in your Django project. Set DJANGO_SETTINGS_MODULE to your project’s settings module (not migration_workbench.settings) in production. Depend on a released version, e.g. migration-workbench>=0.1.0,<1.
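A minimal settings sketch for the library route. The project name myproduct is hypothetical, and the bare app names below are an assumption taken from the app list in this README; check each app's README for the exact dotted path before relying on them.

```python
# myproduct/settings.py (hypothetical product settings module)

# Standard Django contrib apps; trim to what the product actually needs.
DJANGO_APPS = [
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    "django.contrib.staticfiles",
]

# Assumption: the chassis apps are importable under these bare names.
# Include only the ones the product uses.
WORKBENCH_APPS = [
    "connectors",
    "profiler",
    "importer",
    "workbook",
    "deployment",
]

INSTALLED_APPS = DJANGO_APPS + WORKBENCH_APPS
```

With a module like this in place, production would point DJANGO_SETTINGS_MODULE at myproduct.settings rather than migration_workbench.settings.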
2. Scaffold a new product repo
From a sibling checkout of this repo:
make new-product PRODUCT=my-product # writes ../my-product
Then cd ../my-product && make install && make migrate && make check. The scaffold includes backend/, Makefile, Dockerfile (installs migration-workbench from PyPI), scripts/entrypoint_product.sh, SQLite/Fly-aligned settings (SQLITE_PATH, /healthz, WAL pragmas), and starter docs. Use --output-dir / --force on scripts/new_product.py for non-default paths.
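For readers unfamiliar with the WAL pragmas mentioned above, here is a small standard-library sketch of switching a SQLite file to write-ahead logging. The function name enable_wal is hypothetical, and the synchronous=NORMAL setting is an assumption (a common companion to WAL), not something confirmed by the scaffold.

```python
import os
import sqlite3
import tempfile


def enable_wal(db_path: str) -> str:
    """Switch the database at db_path to WAL journaling and return the active mode."""
    conn = sqlite3.connect(db_path)
    try:
        # journal_mode is persistent: it is stored in the database file.
        mode = conn.execute("PRAGMA journal_mode=WAL;").fetchone()[0]
        # Assumption: NORMAL is a typical durability/speed trade-off under WAL.
        # This pragma is per-connection and must be reapplied on each connect.
        conn.execute("PRAGMA synchronous=NORMAL;")
        return mode
    finally:
        conn.close()
```

In a real deployment the path would come from SQLITE_PATH; the scaffolded settings remain the source of truth for which pragmas are actually applied.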
3. Develop the chassis (this repo)
Clone, editable install, run the full gate:
python3 -m venv .venv
.venv/bin/pip install -e ".[dev]"
. ./.env.example # or create .env
.venv/bin/python manage.py migrate
make chassis-gate
Quickstart (PyPI)
python3 -m venv .venv
.venv/bin/pip install "migration-workbench[dev]" # omit [dev] if you skip pytest/black
Use wb on your PATH, or import apps (connectors, profiler, importer, workbook, deployment, …). For consumer repos installing the chassis next to your code: pip install -e ../migration-workbench — see profiler/README.md for profiling commands and importer/README.md for import authoring.
Core bundle commands (from a project with manage.py):
python manage.py pull_bundle --config docs/examples/live-config.example.json --output-dir /tmp/bundle
python manage.py snapshot_bundle --config docs/examples/offline-config.example.json --output-dir /tmp/bundle
python manage.py import_reference_example example_data --validate-only
Note: bundled migration_workbench.settings is for development; production hosts use their own settings module.
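When the bundle commands need to run from a script rather than a shell, a thin wrapper can assemble the same command lines. This is a sketch only: bundle_argv and run_bundle are hypothetical helpers, and they use just the --config and --output-dir flags shown above.

```python
import subprocess
import sys
from pathlib import Path

BUNDLE_COMMANDS = {"pull_bundle", "snapshot_bundle"}


def bundle_argv(command: str, config: Path, output_dir: Path) -> list[str]:
    """Build the argv for one of the bundle management commands."""
    if command not in BUNDLE_COMMANDS:
        raise ValueError(f"unsupported bundle command: {command!r}")
    return [
        sys.executable,
        "manage.py",
        command,
        "--config",
        str(config),
        "--output-dir",
        str(output_dir),
    ]


def run_bundle(command: str, config: Path, output_dir: Path) -> None:
    """Run the command from the directory containing manage.py; raise on failure."""
    subprocess.run(bundle_argv(command, config, output_dir), check=True)
```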
Architecture at a glance
Five Django apps:
| App | Role |
|---|---|
| connectors | Provider adapters (Sheets, Coda). |
| profiler | Read-only profiling → normalized bundle artifacts. |
| importer | BaseImportCommand chassis, preflight/apply, summary JSON. |
| workbook | scaffold_workbook_schema → schema-contract YAML. |
| deployment | Manifest validation, wb CLI (manifest lint, deploy dry-run). |
flowchart LR
sourceConfig[SourceConfigJSON] --> pullBundle[PullBundleCommand]
pullBundle --> providerRouter[ProviderRouter]
providerRouter --> adapters[GoogleSheets_or_Coda]
adapters --> rawRows[RawRows]
rawRows --> normalizer[SpreadsheetNormalizer]
normalizer --> bundle[NormalizedBundle]
bundle --> importer[BaseImportCommandSubclass]
importer --> summary[SummaryArtifactJSON]
More detail: docs/architecture.md.
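The flow in the diagram can be illustrated with a toy, self-contained sketch. Every name here (fetch_sheets, fetch_coda, route, normalize, build_bundle, import_bundle) is hypothetical and the adapters return canned rows; the real classes live in the connectors, profiler, and importer apps and may be shaped quite differently.

```python
import json
from typing import Callable, Dict, List

Row = Dict[str, str]


def fetch_sheets(source: dict) -> List[Row]:
    """Stand-in for a Google Sheets adapter; returns canned raw rows."""
    return [{" Name ": " Ada ", "Qty": "3"}]


def fetch_coda(source: dict) -> List[Row]:
    """Stand-in for a Coda adapter; returns canned raw rows."""
    return [{"Name": "Grace", "Qty": "5"}]


ADAPTERS: Dict[str, Callable[[dict], List[Row]]] = {
    "google_sheets": fetch_sheets,
    "coda": fetch_coda,
}


def route(source: dict) -> List[Row]:
    """Provider router: pick an adapter from the source config."""
    provider = source.get("provider")
    if provider not in ADAPTERS:
        raise ValueError(f"unknown provider: {provider!r}")
    return ADAPTERS[provider](source)


def normalize(rows: List[Row]) -> List[Row]:
    """Normalizer: trim whitespace and lowercase header names."""
    return [{k.strip().lower(): v.strip() for k, v in row.items()} for row in rows]


def build_bundle(source: dict) -> dict:
    """Raw rows in, normalized bundle out."""
    rows = normalize(route(source))
    return {"provider": source["provider"], "row_count": len(rows), "rows": rows}


def import_bundle(bundle: dict, validate_only: bool = True) -> dict:
    """Importer: validate every row, apply only if clean, return a summary."""
    errors = [i for i, row in enumerate(bundle["rows"]) if not row.get("qty", "").isdigit()]
    applied = 0 if (validate_only or errors) else bundle["row_count"]
    return {"validated": bundle["row_count"], "errors": errors, "applied": applied}


if __name__ == "__main__":
    bundle = build_bundle({"provider": "google_sheets"})
    # sort_keys keeps the serialized artifact stable across runs.
    print(json.dumps(import_bundle(bundle), sort_keys=True))
```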
The pipeline
- Intake: Source config (Drive folder, sheet IDs, Coda doc URLs).
- Profile: Profiler commands emit JSON/Markdown under build/ or product-owned data/profile_snapshots/.
- Model: scaffold_workbook_schema produces schema-contract YAML for review.
- Harden: Importer tiers validate then apply; summary artifacts record outcomes.
- Deploy: wb manifest lint validates deploy/spaces.yml; wb deploy <space> --env <preview|production> --dry-run plans releases (provider mutation deferred; see docs/deployment.md).
Deployment
Fly.io + SQLite on a persistent volume + Litestream replication to Tigris or any S3-compatible bucket. Operator bootstrap, secrets, CI/CD, rollback, and roadmap for the wb control plane: docs/deployment.md.
CI/CD
| Workflow | File | Trigger | Role |
|---|---|---|---|
| CI | .github/workflows/ci.yml | push, PR | make chassis-gate, wheel smoke |
| Deploy | .github/workflows/deploy.yml | after successful CI (workflow_run) | manifest lint → flyctl deploy → /healthz smoke (main → production, preview/* → preview) |
| Publish PyPI | .github/workflows/publish-pypi.yml | tag v* | Trusted Publishing to PyPI |
GitHub repository secret FLY_API_TOKEN is required for Deploy. Product repos inherit CI patterns via make new-product scaffolding.
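The branch-to-environment rule in the Deploy row (main → production, preview/* → preview) can be written out as a small function. This sketch only mirrors the table; the function name is hypothetical and the authoritative logic is whatever deploy.yml implements.

```python
from typing import Optional


def deploy_env_for_branch(branch: str) -> Optional[str]:
    """Map a Git branch name to a deploy environment, or None for no deploy."""
    if branch == "main":
        return "production"
    if branch.startswith("preview/"):
        return "preview"
    # Any other branch (feature branches, master, ...) is not deployed.
    return None
```

Under this rule a repository whose default branch is master would never reach production.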
Status and roadmap
Stable on 0.x today
- Profiler (Google Sheets / Drive + Coda), importer chassis, workbook scaffolder.
- wb manifest lint, wb deploy --dry-run, PyPI trusted publishing.
- Self-hosted Fly path: Litestream + shared Tigris bucket, fly.toml / fly.preview.toml, entrypoint migrations.
In flight
- Align default Git branch with Deploy workflow (main vs master).
- Production Deploy workflow green end-to-end after secrets and Fly bootstrap.
Next
- Real wb deploy (today: flyctl deploy + manifest lint is the operator path).
- Backup/restore drill documented and exercised for the workbench space.
- Google auth runbook evolution toward WIF (docs/google-auth.md).
- Scaffold-delivered CI/CD templates for client product repos.
Later
- Provider interface extraction after a second space is stable on Fly.
- Postgres mode where concurrent writes demand it.
Semantic versioning applies; 0.x may ship breaking changes — pin ranges in product repos.
Releases
- Bump version in pyproject.toml.
- Tag v + version (must match version = "x.y.z").
- Trusted Publishing on PyPI for this repo (see publish workflow).
Manual upload: python -m build then twine upload dist/*, or make publish with maintainer credentials. Optional extras: [release] for build/twine only.
Documentation map
| Doc | Purpose |
|---|---|
| This README | Orientation, pipeline, roadmap |
| docs/architecture.md | Layered design |
| docs/deployment.md | Fly, secrets, Litestream/Tigris, CI/CD, control-plane roadmap |
| docs/schema-design-loop.md | Contract-first importer workflow |
| docs/google-auth.md | Sheets/Drive profiling auth |
| docs/coda.md | Coda profiling |
| Per-package README.md under connectors/, profiler/, importer/, workbook/, deployment/ | App-local surfaces |
Database modes
- DB_ENGINE=sqlite (default)
- DB_ENGINE=postgres with DB_NAME, DB_USER, DB_PASSWORD, DB_HOST, DB_PORT
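One way a settings module might translate these variables into Django's DATABASES setting is sketched below. The helper name and the db.sqlite3 fallback are assumptions; the engine strings are Django's standard backends, and SQLITE_PATH is the variable the scaffold section mentions. The bundled settings may resolve these differently.

```python
import os
from typing import Mapping, Optional


def database_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build a Django DATABASES dict from DB_ENGINE and related variables."""
    env = os.environ if env is None else env
    engine = env.get("DB_ENGINE", "sqlite")
    if engine == "sqlite":
        return {
            "default": {
                "ENGINE": "django.db.backends.sqlite3",
                # Assumption: fall back to a local file when SQLITE_PATH is unset.
                "NAME": env.get("SQLITE_PATH", "db.sqlite3"),
            }
        }
    if engine == "postgres":
        return {
            "default": {
                "ENGINE": "django.db.backends.postgresql",
                "NAME": env["DB_NAME"],
                "USER": env["DB_USER"],
                "PASSWORD": env["DB_PASSWORD"],
                "HOST": env["DB_HOST"],
                "PORT": env["DB_PORT"],
            }
        }
    raise ValueError(f"unsupported DB_ENGINE: {engine!r}")
```

A settings module would then assign DATABASES = database_settings().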
License
See LICENSE.