It's data based!
Project description
Limber Timber
It's data based!
pip install limber-timber
Overview
I am writing the migration system I always wanted but does not exist (yet).
Docs
https://Wopple.github.io/limber-timber
Roadmap
These are listed in rough priority order.
- ✅ CLI
- ✅ Publish to PyPI
- ✅ Templating
- ➡️ Documentation
- ➡️ Unit Tests
- ➡️ Templating
- ➡️ JSON Schema
- To validate and auto complete migration files in IDEs
- ✅ In-memory Database
- ✅ In-memory Metadata
- ➡️ Big Query Database
- ➡️ Create Snapshot Table
- ➡️ Create Table Clone
- ✅ Big Query Metadata
- ✅ Database Adoption
- ✅ Raise Unsupported Operations
- ✅ Scan Topologically with Foreign Keys
- ✅ Database Specific Validation
- ➡️ Github Actions
- ➡️ Release
- ➡️ Expand Grouped Operations
- To handle complex operations that do not have atomic support in the backend
- ➡️ Grouped Operation Application
- To reduce round trips with the backend and reduce migration time
- ✅ Minimize Scan Output
- ✅ Arbitrary DML SQL Migrations
- ➡️ Optional Backend Installation
- To minimize dependency bloat
- ➡️ File System Metadata
- ➡️ SQLite Database
- ➡️ SQLite Metadata
- ➡️ Postgres Database
- ➡️ Postgres Metadata
- ➡️ MySQL Database
- ➡️ MySQL Metadata
Usage
Create Migrations
- Create your target manifest
Note: All migration files can use any of these extensions:
.json.yaml.yml
Create a target directory with a manifest file named manifest.yaml.
# target_dir/manifest.yaml
version: 1
operation_files:
- path/to/create_user_table.yaml
- path/to/enrich_user_name.yaml
- Create your target migration operations
Tip: Using a subdirectory for the operations files makes it easy to configure your IDE to apply the correct JSON schema.
Create the files listed in your manifest.
# target_dir/path/to/create_user_table.yaml
version: 1
operations:
- kind: create_table
data:
table:
name:
database: your_project
schema_name: your_dataset
table_name: users
columns:
- name: id
datatype: INT64
- name: name
datatype: STRING
# target_dir/path/to/enrich_user_name.yaml
version: 1
operations:
- kind: rename_column
data:
table_name:
database: your_project
schema_name: your_dataset
table_name: users
from_name: name
to_name: firstname
- kind: add_column
data:
table_name:
database: your_project
schema_name: your_dataset
table_name: users
column:
name: lastname
datatype: STRING
- Check what migrations will run
poetry run liti migrate \
-t target_dir \
--db bigquery \
--meta bigquery \
--meta-table-name your_project.your_dataset._migrations
- Run the migrations
poetry run liti migrate -w \
-t target_dir \
--db bigquery \
--meta bigquery \
--meta-table-name your_project.your_dataset._migrations
Scan Database
You can also scan a schema / table which will print out the operations file that generates that schema / table.
# scan a schema
poetry run liti scan \
--db bigquery \
--scan-database your_project \
--scan-schema your_dataset
# scan a table
poetry run liti scan \
--db bigquery \
--scan-database your_project \
--scan-schema your_dataset \
--scan-table your_table
Learn
Being completely new to this project, you will have no idea where to start. Here. This is where you start. This is a crash course on what Limber Timber is and how its put together.
The Big Picture
Limber Timber uses the Operation to describe changes to a database. These operations are pure data. They can be
serialized to JSON or YAML, and can be deserialized from the same. Developers write JSON or YAML files to describe the
migrations for their application.
The Operation can be enhanced to become an OperationOps. This type brings behavior to the data. It allows you to:
- check if the operation has been applied to the database
- useful for recovery from a failure between applying an operation and writing it to the metadata
- apply the operation, i.e. the "up" migration
- produce the inverse
Operationthat will perform the "down" migration
Down migrations are inferred from the up migrations, so developers only ever have to write the up migrations.
Migration Files
Migration files start with a manifest file. The manifest points to the operation files in the order they should be applied. Each operation file contains a list of operations in the order they should be applied. In this way,
# file1
[op1, op2]
# file2
[op3]
is exactly the same as:
# file1
[op1]
# file2
[op2, op3]
The migrational unit is the Operation, not the file. Grouping operations into files can help for organization, but
having a single file with all operations or many files each with one operation are both valid. There are no checksums
and no need to specially name your files. You can also organize your migrations with sub-directories, just specify the
paths in the manifest.
One major benefit to this system is if parallel developers add operations, one will merge first, and then the other will get a merge conflict. This is much better than having migrations applied out of order (or breaking) after the fact. You learn right away about the conflict, and the developer is prompted to resolve it. This benefit assumes all developers are using the same style for adding new migrations: either adding a new file to the manifest, or adding a new operation to the most recent file.
Python Modules
liti.core.model
This module stores all the data models. The models are versioned, though currently there is only the one version. The hierarchy is roughly:
operation.ops>operation.data>schema>datatype
liti.core.model.v1.operation.data
These are the pure data operations. They are (de)serialized between the operation files and metadata.
liti.core.model.v1.operation.ops
These are the wrappers that enhance operations with behavior. There is a 1:1 relationship.
liti.core.model.v1.datatype
These are descriptions of column types.
liti.core.model.v1.schema
These are descriptions of tables and related constructs.
liti.core.backend
Both the database and the metadata can support different backends. You can even use different backends together. The
backends deal in both the liti model and backend specific types adapting between the two.
liti.core.client
These are clients used by the backends. They solely deal in backend specific types with no dependencies on the liti
model.
liti.core.base
This module has base classes for applying default values and validating the data. They are implemented using the Observer / Observable pattern so different backends can define their own behavior. Also implements the templating engine.
liti.core.runner
This module is for the runners associated with the various ways liti can be run. Main code will instantiate a runner
and run it.
Contribution
If you want to contribute, the roadmap is a good place to start. I will only accept contributions if:
- I agree with the design decisions
- The code style matches the existing code
It is highly recommended but not necessary to:
- Include unit tests
If you have any questions, you can reach out to me on discord.
Design Principles
- The default behavior is safe and automated
- The behavior can be configured to be fast and efficient
- High flexibility to support future and unknown use-cases
- Prefer supporting narrow use cases well rather than broad use cases poorly
- Apply heavy importance to the Single Responsibility Principle
- Put complex logic in easily tested functions
Code Style
- 4-space indentation
- Prefer single quotes
- exceptions
pyproject.toml- docstrings
- nested f-strings
- exceptions
- Use newlines to visually separate blocks and conceptual groups of code
- Include explicit
elseblocks- exceptions
- assertive if-statements
- exceptions
- Naming
- balance brevity and clarity: say exactly what is needed
- do not restate what is already clear from the context
- Comments
- dos
- clarify confusing code
- explain the 'why'
- first try to explain with the code instead of a comment
- do nots
- make assumptions about the reader
- state that which is explained by the surrounding code
- cover up for poor code
- just because
- dos
- Multiline strings use concatenated single line strings
- exceptions
- docstrings
- exceptions
- No
from my.module import *- instead:
from my import module as md
- instead:
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file limber_timber-0.2.0.tar.gz.
File metadata
- Download URL: limber_timber-0.2.0.tar.gz
- Upload date:
- Size: 44.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.2 CPython/3.12.2 Darwin/24.6.0
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
93f85cc66065bf0e848a6af1410bdf52bd6cf0e57774cb04a4d2f64409656d55
|
|
| MD5 |
638d4d228453adec0f1bdca08b8d5cb0
|
|
| BLAKE2b-256 |
d50d5de2be6082b5e48ce5d301c1e22770aefa59904ed1960d2c00e2ebef3d4d
|
File details
Details for the file limber_timber-0.2.0-py3-none-any.whl.
File metadata
- Download URL: limber_timber-0.2.0-py3-none-any.whl
- Upload date:
- Size: 54.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.2 CPython/3.12.2 Darwin/24.6.0
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
7823bb9612b1071ebe63e5a24520db07635329f279a07d61e6b1567d64d56299
|
|
| MD5 |
d85dcc06a382be3977a57a4e9786dcd2
|
|
| BLAKE2b-256 |
3b96cd97df95717af6b0791398062a93a1db17531f26f72d845014d751fe30de
|