Declarative Data Orchestration
Project description
awk_plus_plus
A domain-specific language designed for data orchestration.
Features
- Fuzzy modern regex engine
- Semantic search
- Orthogonal Persistence based on DuckDB
- Transparent references with Jsonnet. In the future, we plan to execute them with Dask.
- URL interpreter to manage data sources.
Installation from pip
Install the package with:
pip install awk_plus_plus
CLI Usage
You can output your data as JSON with the cti command.
JSONNET support
Hello world
cti i "Hello world" -p -v 4
Jsonnet support
cti i '{"keys":: ["AWK", "SED", "SHELL"], "languages": [std.asciiLower(x) for x in self.keys]}'
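For readers new to Jsonnet, the expression above can be read as the following Python sketch. It is only an analogy, not how cti evaluates its input: in Jsonnet, a field declared with `::` is hidden and left out of the output, and `std.asciiLower` lowercases ASCII letters. The variable names `keys` and `result` are illustrative.

```python
import json

# Hidden field (`"keys"::` in Jsonnet): usable inside the expression,
# but omitted from the emitted JSON.
keys = ["AWK", "SED", "SHELL"]

# Visible field, built with a comprehension like the Jsonnet one.
# str.lower() matches std.asciiLower for these ASCII-only inputs.
result = {"languages": [x.lower() for x in keys]}

print(json.dumps(result))  # {"languages": ["awk", "sed", "shell"]}
```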
URL interpreter
A further step is the URL interpreter, which lets you manage different data sources with a single, uniform syntax across a set of plugins.
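One way to picture a URL interpreter backed by plugins is a registry keyed by URL scheme. The sketch below is a minimal illustration under that assumption, not the package's implementation; `PLUGINS`, `plugin`, `dispatch`, and the two handlers are hypothetical names, and the handlers only describe what they would do.

```python
from urllib.parse import urlsplit

# Hypothetical registry: URL scheme -> handler function.
PLUGINS = {}


def plugin(scheme):
    """Decorator that registers a handler for one URL scheme."""
    def register(func):
        PLUGINS[scheme] = func
        return func
    return register


@plugin("stream")
def describe_stream(parts):
    return f"would read from stream {parts.netloc!r}"


@plugin("imap")
def describe_imap(parts):
    mailbox = parts.path.lstrip("/")
    return f"would fetch mailbox {mailbox!r} on {parts.hostname}"


def dispatch(url):
    """Pick the handler that matches the URL's scheme."""
    parts = urlsplit(url)
    try:
        handler = PLUGINS[parts.scheme]
    except KeyError:
        raise ValueError(f"no plugin for scheme {parts.scheme!r}") from None
    return handler(parts)


print(dispatch("stream://stdin?strip=true"))
```

The point of the design is that callers only ever write a URL; supporting a new data source means registering one more handler.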
STDIN, STDOUT, STDERR
cti i '{"lines": interpret("stream://stdin?strip=true")}'
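As a rough sketch of what a `stream://` URL with a `strip` option could translate to, the Python below parses the URL and reads lines from the named stream. It assumes that `strip=true` means trimming surrounding whitespace from each line, which the source does not spell out; `read_lines` and its `streams` parameter are hypothetical.

```python
import io
import sys
from urllib.parse import parse_qs, urlsplit


def read_lines(url, streams=None):
    """Read all lines from the stream named in the URL's host part."""
    # Assumption: only stdin is available unless the caller supplies streams.
    streams = streams or {"stdin": sys.stdin}
    parts = urlsplit(url)
    options = parse_qs(parts.query)
    # Assumption: strip=true trims whitespace around each line.
    strip = options.get("strip", ["false"])[0] == "true"
    lines = streams[parts.netloc].read().splitlines()
    return [line.strip() for line in lines] if strip else lines


# Use an in-memory stream in place of real standard input.
fake_stdin = io.StringIO("  a \n b\n")
demo = read_lines("stream://stdin?strip=true", {"stdin": fake_stdin})
print(demo)  # ['a', 'b']
```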
IMAP
cti i '{"emails": interpret("imap://USER:PASSWORD@HOST:993/INBOX")}'
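The IMAP URL packs the connection details into one string. The sketch below shows how such a URL breaks down using Python's standard `urlsplit`; the credentials and host are made-up placeholders, and how the package itself parses the URL is not stated in the source.

```python
from urllib.parse import urlsplit

# Placeholder credentials and host, for illustration only.
parts = urlsplit("imap://alice:s3cret@imap.example.com:993/INBOX")

connection = {
    "user": parts.username,
    "password": parts.password,
    "host": parts.hostname,
    "port": parts.port,  # 993 is the usual port for IMAP over TLS
    "mailbox": parts.path.lstrip("/"),
}

# These values are what a client such as imaplib.IMAP4_SSL(host, port)
# would need before logging in and selecting the mailbox.
print(connection)
```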
Keyring
cti i '{"email":: interpret("keyring://backend/awk_plus_plus/email"), "emails": interpret($.email)}'
Files
cti i 'interpret("**/*.csv")'
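The `**/*.csv` pattern reads like a recursive glob. Assuming it follows the usual glob semantics (the source does not say which matcher the package uses), the Python sketch below shows which files such a pattern selects. `find_files` and the temporary file layout are hypothetical.

```python
import tempfile
from pathlib import Path


def find_files(root, pattern="**/*.csv"):
    """Return matching paths relative to root, in sorted order."""
    root = Path(root)
    return sorted(p.relative_to(root).as_posix() for p in root.glob(pattern))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "data" / "2024").mkdir(parents=True)
    (root / "top.csv").write_text("a,b\n1,2\n")
    (root / "data" / "2024" / "nested.csv").write_text("a,b\n3,4\n")
    (root / "notes.txt").write_text("ignored\n")
    # `**` matches zero or more directories, so top.csv is included too.
    matches = find_files(root)

print(matches)  # ['data/2024/nested.csv', 'top.csv']
```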
SQL
cti i 'interpret("sql:SELECT * FROM email")'
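To illustrate the idea of a `sql:` URL, the sketch below splits off the scheme and runs the remainder as a query. The package's persistence is based on DuckDB; this example swaps in Python's built-in `sqlite3` purely so it runs without extra dependencies. The `run_sql_url` helper and the `email` table schema are hypothetical.

```python
import sqlite3


def run_sql_url(conn, url):
    """Run the query that follows the `sql:` prefix and return rows as dicts."""
    scheme, _, query = url.partition(":")
    if scheme != "sql":
        raise ValueError(f"expected a sql: URL, got {url!r}")
    cursor = conn.execute(query)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# sqlite3 stands in for DuckDB here; the table layout is made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE email (sender TEXT, subject TEXT)")
conn.execute("INSERT INTO email VALUES ('a@example.com', 'hello')")

rows = run_sql_url(conn, "sql:SELECT * FROM email")
print(rows)  # [{'sender': 'a@example.com', 'subject': 'hello'}]
```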
Note
This project has been set up using PyScaffold 4.5 and the dsproject extension 0.0.post167+g4386552.
Download files
Source Distribution
awk_plus_plus-0.11.0.tar.gz (34.7 kB)
Built Distribution
Hashes for awk_plus_plus-0.11.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 0eeac0b0574323ef4366680c08c00ad8e59eddbd9a44728b5ca0d78e8bfbb15a
MD5 | c649206826c2ce7339ff7c54b8b92778
BLAKE2b-256 | 5a61bb79df9df4c7bd9a665123382dbf5f304eedf682aa5b20f02e58eba82a2f