Data Developer & Engineer IO Utility Objects
Project description
Data Utility Package: IO
Table of Contents:
This Utility IO Object was created for load
the config data from any file
format types like .yaml
, .json
, or .toml
, and manage retention and version
of this config file lifecycle.
Installation
pip install ddeutil-io
This package need to install ddeutil
for core package namespace.
Features
Files
File Objects that use to read or write data to that format.
For example, I will represent YamlEnvFl object that passing environment variable to reading content before passing to the Yaml loader.
data:
get: HELLO ${HELLO}
import os
from ddeutil.io import YamlEnvFl
os.environ["HELLO"] = "WORLD"
content = YamlEnvFl('./.pre-commit-config.yaml').read(safe=True)
assert content['data']['get'] == "HELLO WORLD"
Config
Config Object is the dir system handler object that manage any files in that
input dir path like load
, save
, load_stage
, save_stage
, or files
.
from pathlib import Path
from ddeutil.io.config import ConfFl
config: ConfFl = ConfFl(path=Path('./conf'), compress="gzip")
data = config.load('config_file.yaml')
config.save_stage('./stage/file.json', data)
Register
The Register Object is the metadata generator object for the config data. If you passing name and configs to this object, it will find the config name in any stage storage and generate its metadata to you.
from ddeutil.io.register import Register
from ddeutil.io.param import Params
registry: Register = Register(
name='examples:conn_data_local_file',
params=Params.model_validate({
"stages": {
"raw": {"format": "{naming:%s}.{timestamp:%Y%m%d_%H%M%S}"},
},
}),
)
registry.move(stage="raw")
The raw data of this config was written in conn_file.yaml
file.
conf/
examples/
conn_file.yaml
When call move
method, it will transfer data from .yaml
file to json
file
with the data hashing algorithm.
data/
raw/
conn_file_20240101_000000.json
Models
The Model objects was implemented from the Pydantic V2 which is the powerful parsing and serializing data model to the Python object.
[!NOTE] So, I use this project to learn and implement a limit and trick of the Pydantic package.
The model able to handle common logic validations and able to adjust by custom code
for your specific requirements (Yeah, it just inherits Sub-Class from BaseModel
).
Data Types
from ddeutil.io.models.dtype import StringType
dtype = StringType()
assert dtype.type == "string"
assert dtype.max_length == -1
Constraints
from ddeutil.io.models.const import Pk
const = Pk(of="foo", cols=["bar", "baz"])
assert const.name == "foo_bar_baz_pk"
assert const.cols == ["bar", "baz"]
Datasets
from ddeutil.io.models.datasets import Col, Tbl
tbl = Tbl(
name="table_foo",
features=[
Col(name="id", dtype="integer primary key"),
Col(name="foo", dtype="varchar( 10 )"),
],
)
assert tbl.name == "table_foo"
assert tbl.features[0].name == "id"
assert tbl.features[0].dtype.type == "integer"
Examples
If I have some catalog config, it easy to pass this config to model object.
import yaml
from ddeutil.io.models.datasets import Scm
config = yaml.safe_load("""
name: "warehouse"
tables:
- name: "customer_master"
features:
- name: "id"
dtype: "integer"
pk: true
- name: "name"
dtype: "varchar( 256 )"
nullable: false
""")
schema = Scm.model_validate(config)
assert len(schema.tables) == 1
assert schema.tables[0].name == 'customer_master'
License
This project was licensed under the terms of the MIT license.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for ddeutil_io-0.1.9-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | aa6e297c36f4981afb327063c2b621262f0ee5c73490005a129a8ee9cc306895 |
|
MD5 | 28578b636d36c9ec7c43c245760b090a |
|
BLAKE2b-256 | aecc2537d1896901e6788740ed0b7ef50f63a204a49ca3633e573f1343dfa060 |