Skip to main content

Migration tooling from Google App Engine (webapp2, ndb) to python-cdd supported (FastAPI, SQLalchemy).

Project description

cdd-python-gae

Python version range Python implementation License Linting, testing, coverage, and release Tested OSs, others may work Documentation coverage codecov black Imports: isort PyPi: release

Migration tooling from Google App Engine (webapp2, ndb) to python-cdd supported (FastAPI, SQLalchemy).

Public SDK works with filenames, source code, and even in memory constructs (e.g., as imported into your REPL). CLI available also.

Install package

PyPi

pip install python-cdd-gae

Master

pip install -r https://raw.githubusercontent.com/offscale/cdd-python-gae/master/requirements.txt
pip install https://api.github.com/repos/offscale/cdd-python-gae/zipball#egg=cdd

Goal

Migrate from Google App Engine to cloud-independent runtime (e.g., vanilla CPython 3.11 with SQLite).

Relation to other projects

This was created independent of cdd-python project for two reasons:

  1. Unidirectional;
  2. Relevant to fewer people.

SDK

Approach

Traverse the AST for ndb and webapp2.

Advantages

Disadvantages

Alternatives

Minor other use-cases this facilitates

CLI for this project

$ python -m cdd_gae --help
usage: python -m cdd_gae [-h] [--version]
                         {ndb2sqlalchemy,ndb2sqlalchemy_migrator,webapp2_to_fastapi}
                         ...

Migration tooling from Google App Engine (webapp2, ndb) to python-cdd
supported (FastAPI, SQLalchemy).

positional arguments:
  {ndb2sqlalchemy,ndb2sqlalchemy_migrator,webapp2_to_fastapi}
    ndb2sqlalchemy      Parse NDB emit SQLalchemy
    ndb2sqlalchemy_migrator
                        Create migration scripts from NDB to SQLalchemy
    webapp2_to_fastapi  Parse WebApp2 emit FastAPI

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit

ndb2sqlalchemy (webapp2_to_fastapi takes same args)

$ python -m cdd_gae ndb2sqlalchemy --help

usage: python -m cdd_gae ndb2sqlalchemy [-h] -i INPUT_FILE -o OUTPUT_FILE
                                        [--dry-run]

options:
  -h, --help            show this help message and exit
  -i INPUT_FILE, --input-file INPUT_FILE
                        Python file to parse NDB `class`es out of
  -o OUTPUT_FILE, --output-file OUTPUT_FILE
                        Empty file to generate SQLalchemy classes to
  --dry-run             Show what would be created; don't actually write to
                        the filesystem.

webapp2_to_fastapi (ndb2sqlalchemy takes same args)

$ python -m cdd_gae webapp2_to_fastapi --help

usage: python -m cdd_gae webapp2_to_fastapi [-h] -i INPUT_FILE -o OUTPUT_FILE
                                            [--dry-run]

options:
  -h, --help            show this help message and exit
  -i INPUT_FILE, --input-file INPUT_FILE
                        Python file to parse WebApp2 `class`es out of
  -o OUTPUT_FILE, --output-file OUTPUT_FILE
                        Empty file to generate FastAPI functions to
  --dry-run             Show what would be created; don't actually write to
                        the filesystem.

python -m cdd_gae ndb2sqlalchemy_migrator --help

$ ndb2sqlalchemy_migrator
usage: python -m cdd_gae ndb2sqlalchemy_migrator [-h] --ndb-file NDB_FILE
                                                 --sqlalchemy-file
                                                 SQLALCHEMY_FILE
                                                 --ndb-mod-to-import
                                                 NDB_MOD_TO_IMPORT
                                                 --sqlalchemy-mod-to-import
                                                 SQLALCHEMY_MOD_TO_IMPORT -o
                                                 OUTPUT_FOLDER [--dry-run]

optional arguments:
  -h, --help            show this help message and exit
  --ndb-file NDB_FILE   Python file containing the NDB `class`es
  --sqlalchemy-file SQLALCHEMY_FILE
                        Python file containing the NDB `class`es
  --ndb-mod-to-import NDB_MOD_TO_IMPORT
                        NDB module name that the entity will be imported from
  --sqlalchemy-mod-to-import SQLALCHEMY_MOD_TO_IMPORT
                        SQLalchemy module name that the entity will be
                        imported from
  -o OUTPUT_FOLDER, --output-folder OUTPUT_FOLDER
                        Empty folder to generate scripts that migrate from one
                        NDB class to one SQLalchemy class
  --dry-run             Show what would be created; don't actually write to
                        the filesystem.

Data migration

The most efficient way seems to be:

  1. Backup from NDB to Google Cloud Storage
  2. Import from Google Cloud Storage to Google BigQuery
  3. Export from Google BigQuery to Apache Parquet files in Google Cloud Storage
  4. Download and parse the Parquet files, then insert into SQL

(for the following scripts set GOOGLE_PROJECT_ID, GOOGLE_BUCKET_NAME, NAMESPACE, GOOGLE_LOCATION)

Backup from NDB to Google Cloud Storage

for entity in kind0 kind1; do
  gcloud datastore export 'gs://'"$GOOGLE_BUCKET_NAME" --project "$GOOGLE_PROJECT_ID" --kinds "$entity" --async &
done

Import from Google Cloud Storage to Google BigQuery

printf 'bq mk "%s"\n' "$NAMESPACE" > migrate.bash
gsutil ls 'gs://'"$GOOGLE_BUCKET_NAME"'/**/all_namespaces/kind_*' | python3 -c 'import sys, posixpath, fileinput; f=fileinput.input(encoding="utf-8"); d=dict(map(lambda e: (posixpath.basename(posixpath.dirname(e)), posixpath.dirname(e)), sorted(f))); f.close(); print("\n".join(map(lambda k: "( bq mk \"'"$NAMESPACE"'.{k}\" && bq --location='"$GOOGLE_LOCATION"' load --source_format=DATASTORE_BACKUP \"'"$NAMESPACE"'.{k}\" \"{v}/all_namespaces_{k}.export_metadata\" ) &".format(k=k, v=d[k]), sorted(d.keys()))),sep="");' >> migrate.bash
# Then run `bash migrate.bash`

Export from Google BigQuery to Apache Parquet files in Google Cloud Storage

for entity in kind0 kind1; do
  bq extract --location="$GOOGLE_LOCATION" --destination_format='PARQUET' "$NAMESPACE"'.kind_'"$entity" 'gs://'"$GOOGLE_BUCKET_NAME"'/'"$entity"'/*' &
done

License

Licensed under either of

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

python-cdd-gae-0.0.8.tar.gz (28.9 kB view hashes)

Uploaded Source

Built Distribution

python_cdd_gae-0.0.8-py3-none-any.whl (37.3 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page