Declarative Data Orchestration
awk_plus_plus
A domain-specific language designed for data orchestration.
Features
- Fuzzy modern regex engine
- Semantic search
- Orthogonal Persistence based on DuckDB
- Transparent reference with Jsonnet. We plan to execute this with Dask in the future.
- URL interpreter to manage data sources.
Installation from pip
Install the package with:
pip install awk_plus_plus
CLI Usage
Output your data as JSON with the cti command.
JSONNET support
Hello world
cti i "Hello world" -p -v 4
Jsonnet support
cti i '{"keys":: ["AWK", "SED", "SHELL"], "languages": [std.asciiLower(x) for x in self.keys]}'
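To make the Jsonnet expression above easier to read, here is a rough Python equivalent of what it evaluates to. In Jsonnet, a field declared with `::` is hidden, so `keys` can be referenced through `self` but is not emitted in the resulting JSON. This is only a sketch of the Jsonnet semantics; it does not call awk_plus_plus, and the variable names are illustrative.

```python
import json

# Equivalent of the hidden Jsonnet field "keys":: [...]
# Hidden fields are usable inside the object but are not part of the output.
keys = ["AWK", "SED", "SHELL"]

# Equivalent of: "languages": [std.asciiLower(x) for x in self.keys]
document = {"languages": [key.lower() for key in keys]}

print(json.dumps(document))
# prints {"languages": ["awk", "sed", "shell"]}
```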
URL interpreter
Going a step further, the URL interpreter lets you manage different data sources with a single, uniform syntax across a set of plugins.
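The idea of one syntax routed to many plugins can be sketched as a dispatch on the URL scheme. The code below is a minimal illustration using only the Python standard library; the names `HANDLERS`, `register`, and `dispatch` are hypothetical and do not reflect how awk_plus_plus is actually implemented. The handlers only describe what they would open, they do not connect to anything.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical registry mapping a URL scheme to a plugin function.
HANDLERS = {}


def register(scheme):
    """Register a handler for one URL scheme (illustrative helper)."""
    def decorator(func):
        HANDLERS[scheme] = func
        return func
    return decorator


@register("stream")
def describe_stream(parts):
    # Keep the last value for each query option, e.g. strip=true.
    options = {name: values[-1] for name, values in parse_qs(parts.query).items()}
    return {"plugin": "stream", "target": parts.netloc, "options": options}


@register("imap")
def describe_imap(parts):
    return {
        "plugin": "imap",
        "host": parts.hostname,
        "port": parts.port,
        "mailbox": parts.path.lstrip("/"),
    }


def dispatch(url):
    """Pick a handler from the URL scheme and hand it the parsed URL."""
    parts = urlsplit(url)
    handler = HANDLERS.get(parts.scheme)
    if handler is None:
        raise ValueError(f"no plugin registered for scheme {parts.scheme!r}")
    return handler(parts)


print(dispatch("stream://stdin?strip=true"))
```

Under this design, supporting a new data source means registering one more handler; the calling syntax stays the same.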
STDIN, STDOUT, STDERR
cti i '{"lines": interpret("stream://stdin?strip=true")}'
IMAP
cti i '{"emails": interpret("imap://USER:PASSWORD@HOST:993/INBOX")}'
Keyring
cti i '{"email":: interpret("keyring://backend/awk_plus_plus/email"), "emails": interpret($.email)}'
Files
cti i 'interpret("**/*.csv")'
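For readers unfamiliar with the `**/*.csv` pattern, the sketch below shows which files a recursive glob of that form selects, assuming the tool follows the usual glob semantics where `**` matches zero or more directory levels. It uses Python's `pathlib` only; `match_files` is a hypothetical helper, and what awk_plus_plus does with the matched files is not shown here.

```python
from pathlib import Path


def match_files(root, pattern="**/*.csv"):
    """Return the files under root that match the glob pattern, as sorted relative paths."""
    base = Path(root)
    return sorted(
        path.relative_to(base).as_posix()
        for path in base.glob(pattern)
        if path.is_file()
    )


if __name__ == "__main__":
    # List every CSV file below the current directory, at any depth.
    for name in match_files("."):
        print(name)
```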
SQL
cti i 'interpret("sql:SELECT * FROM email")'
Note
This project has been set up using PyScaffold 4.5 and the dsproject extension 0.0.post167+g4386552.