A package for managing singer.io taps and targets
Alto (WIP)
A lightweight yet intelligent way to manage Singer based ELT.
How is this different than what exists today?
Using Meltano as the baseline of comparison, there are some noteworthy differences.
- A dependency footprint roughly an order of magnitude smaller. Alto has only 4 direct dependencies, with no C or Rust extensions anywhere in the dependency tree. Counting transitive dependencies:
  - Meltano: 151
  - Alto: 7
- Because of its small dependency footprint, Alto can be installed in very small containers, and packaged formats such as PEX are cross-platform compatible. It can also be used in pyodide/wasm.
- We use PEX (Python EXecutable) for all plugins instead of loose venvs, making each plugin a single file that is straightforward to cache.
- We use a (simple) caching algorithm that makes plugins reusable across machines when combined with a remote filesystem.
- We use fsspec to provide a filesystem abstraction layer that gives the exact same experience locally on a single machine as when plugged into a remote blob store such as s3, gcs, or any other fsspec-supported storage.
- An order of magnitude (>85%) less code, which makes iteration, maintenance, or forking easier (in theory).
- We use Dynaconf to manage configuration:
  - This gives us uniform support for json, toml, and yaml out of the box
  - We get environment management
  - We get configuration inheritance / deep merging
  - We get .env support
  - We get unique ways to render vars with @format tokens
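The caching idea in the list above can be sketched as follows. This is a minimal illustration, not Alto's actual implementation: the function names `plugin_cache_key` and `resolve_plugin`, the choice of SHA-1, and hashing only the `pip_url` are all assumptions, and plain dictionaries stand in for the local and remote filesystems.

```python
import hashlib


def plugin_cache_key(pip_url):
    """Derive a stable, machine-independent key from a plugin's pip_url.

    Assumption: only the pip_url feeds the hash. A real scheme might also
    include the Python version or platform.
    """
    return hashlib.sha1(pip_url.encode("utf-8")).hexdigest()


def resolve_plugin(pip_url, local, remote, build):
    """Return the plugin artifact, building it only if no cache has it.

    `local` and `remote` are dict-like stores keyed by cache key
    (hypothetical stand-ins for a local directory and a remote blob store).
    `build` is a callable that produces the artifact bytes from a pip_url.
    """
    key = plugin_cache_key(pip_url)
    if key in local:
        return local[key]
    if key in remote:
        # Another machine already built it: copy it down and reuse it.
        local[key] = remote[key]
        return local[key]
    artifact = build(pip_url)
    local[key] = artifact
    remote[key] = artifact  # publish so other machines can skip the build
    return artifact


if __name__ == "__main__":
    shared_remote = {}
    fake_build = lambda url: ("pex for " + url).encode("utf-8")
    # Two "machines" with separate local caches but one shared remote.
    resolve_plugin("target-jsonl==0.1.4", {}, shared_remote, fake_build)
    resolve_plugin("target-jsonl==0.1.4", {}, shared_remote, fake_build)
    print(sorted(shared_remote))
```

Because the key depends only on the plugin definition, a second machine pointed at the same remote store finds the artifact and skips the build entirely.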
Meltano
───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
Python                     154     26842     2402      4262    20178       1106
Alto
───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
Python                      12      2892      226       164     2502        190
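As a quick check of the ">85% less code" figure, the relative reduction can be computed directly from the counts reported above. The snippet below is only an arithmetic sketch. The dictionary layout and the `reduction` helper are illustrative names, and the numbers are copied from the two tables and the dependency comparison.

```python
# Counts copied from the comparison above.
meltano = {"files": 154, "lines": 26842, "code": 20178, "dependencies": 151}
alto = {"files": 12, "lines": 2892, "code": 2502, "dependencies": 7}


def reduction(before, after):
    """Fraction by which `after` is smaller than `before`."""
    return 1 - after / before


for metric in meltano:
    pct = reduction(meltano[metric], alto[metric])
    print(f"{metric}: {pct:.1%} smaller")
```

For the code column this works out to roughly 87.6%, which is consistent with the ">85%" claim.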
Example
An entire timed end-to-end example can be carried out via the below command.
From start to finish, it will:
- Create a directory
- Initialize an alto project (create the alto.toml file)
- Run an extract -> load of an open API to target jsonl
  - Build PEX plugins for tap-carbon-intensity and target-jsonl
  - Dynamically generate config for the Singer plugin based on the toml file (supports toml/yaml/json)
  - Run discovery and cache catalog to ~/.alto/(project-name)/catalog
  - Apply user configuration to the catalog
  - Run the pipeline
  - Clean up the staging directory
  - Manage and persist the state
# Create a dir, init a project, run an end-2-end pipeline, show some output as proof
mkdir example_project \
&& cd $_; yes | alto init; \
time alto tap-carbon-intensity:target-jsonl; \
cat output/* | head -8; ls -l output; cd -; \
tree example_project
Resulting in the below output:
example_project
├── .alto
│   ├── logs
│   │   └── dev
│   └── plugins
│       ├── 263b729b56cf48f4bc3d08b687045ad3f81713ce
│       └── 60e33af4f316a41812ee404136d7a747011ba811
├── .alto.json
├── alto.secrets.toml
├── alto.toml
└── output
    ├── entry-20230228T205342.jsonl
    ├── generationmix-20230228T205342.jsonl
    └── region-20230228T205342.jsonl
5 directories, 8 files
>>> cat alto.toml
[default]
project_name = "4c167d53"
extensions = []
namespace = "raw"
[default.taps.tap-carbon-intensity]
pip_url = "git+https://gitlab.com/meltano/tap-carbon-intensity.git#egg=tap_carbon_intensity"
namespace = "carbon_intensity"
capabilities = ["state", "catalog"]
select = ["*.*"]
[default.taps.tap-carbon-intensity.config]
[default.targets.target-jsonl]
pip_url = "target-jsonl==0.1.4"
[default.targets.target-jsonl.config]
destination_path = "output"
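To illustrate how a `select` list such as `["*.*"]` might be applied to a discovered catalog, here is a minimal sketch. It assumes Meltano-style `stream.property` glob patterns and the Singer convention of marking entries with `selected` in catalog metadata. The function name `apply_select` and the sample property name are hypothetical, and Alto's real logic may differ.

```python
import copy
from fnmatch import fnmatchcase


def apply_select(catalog, patterns):
    """Return a copy of a Singer catalog with `selected` flags set.

    Assumption: each pattern has the form "<stream glob>.<property glob>".
    A stream is selected when any pattern's stream part matches its name.
    A property is selected when any full pattern matches "stream.property".
    """
    result = copy.deepcopy(catalog)
    for stream in result.get("streams", []):
        name = stream["tap_stream_id"]
        for entry in stream.get("metadata", []):
            breadcrumb = entry.get("breadcrumb", [])
            if breadcrumb:
                # Property-level entry, e.g. ["properties", "shortname"].
                path = f"{name}.{breadcrumb[-1]}"
                selected = any(fnmatchcase(path, p) for p in patterns)
            else:
                # Stream-level entry (empty breadcrumb).
                selected = any(
                    fnmatchcase(name, p.split(".", 1)[0]) for p in patterns
                )
            entry.setdefault("metadata", {})["selected"] = selected
    return result


if __name__ == "__main__":
    # Tiny hand-written catalog; "shortname" is a made-up property name.
    catalog = {
        "streams": [
            {
                "tap_stream_id": "region",
                "metadata": [
                    {"breadcrumb": [], "metadata": {}},
                    {"breadcrumb": ["properties", "shortname"], "metadata": {}},
                ],
            }
        ]
    }
    print(apply_select(catalog, ["*.*"]))
```

With `["*.*"]`, as in the configuration above, every stream and property is marked as selected. A narrower pattern such as `"entry.*"` would leave the `region` stream unselected.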