Shell pipes for Python.
Mario: Shell pipes in Python
Your favorite plumbing snake 🐍🔧 with your favorite pipes, right in your shell 🐢.
Installation
Mario
Get Mario with pip:
python3.7 -m pip install mario
If you’re not inside a virtualenv, you might get a PermissionError. In that case, try using:
python3.7 -m pip install --user mario
or for more flexibility and safety, use pipx:
pipx install --python python3.7 mario
Mario addons
The mario-addons package provides a number of useful commands not found in the base collection.
Get Mario addons with pip:
python3.7 -m pip install mario-addons
If you’re not inside a virtualenv, you might get a PermissionError. In that case, try using:
python3.7 -m pip install --user mario-addons
or for more flexibility and safety, use pipx:
pipx install --python python3.7 mario-addons
Usage
Basics
Invoke with mario at the command line.
$ mario eval 1+1
2
Use map to act on each item in the input with Python expressions:
$ mario map 'x.upper()' <<<'abc'
ABC
Chain Python functions together with !:
$ mario map 'x.upper() ! len(x)' <<<hello
5
or by adding another command
$ mario map 'x.upper()' map 'len(x)' <<<hello
5
Use x as a placeholder for the input at each stage:
$ mario map ' x.split()[0] ! x.upper() + "!"' <<<'Hello world'
HELLO!
$ mario map 'x.split()[0] ! x.upper() + "!" ! x.replace("H", "J")' <<<'Hello world'
JELLO!
Automatically import modules you need:
$ mario stack 'itertools.repeat(x, 2) ! "".join' <<<hello,world!
hello,world!
hello,world!
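How Mario resolves module names is not described here. As a rough sketch of the idea, and purely an assumption about the mechanism, a namespace can import a module on demand when an unknown name is looked up. AutoImport is a hypothetical class, and this simple version does not cover names used inside lambdas or comprehensions.

```python
import importlib


class AutoImport(dict):
    """Namespace that resolves unknown names by importing a module of that name."""

    def __missing__(self, name):
        try:
            return importlib.import_module(name)
        except ImportError:
            # Not a module: let normal lookup (e.g. builtins) take over.
            raise KeyError(name) from None


namespace = AutoImport(x="hello")
# itertools is never imported explicitly; the lookup triggers the import.
doubled = eval('"".join(itertools.repeat(x, 2))', {}, namespace)
print(doubled)  # hellohello
```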
Autocall
You don’t need to explicitly call the function with some_function(x); just use the function’s name some_function. For example, instead of
$ mario map 'len(x)' <<<'a\nbb'
5
try
$ mario map len <<<'a\nbb'
5
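A minimal sketch of the autocall idea in plain Python: evaluate the expression, and if the result is callable, call it on x. Whether Mario decides this by checking callability is an assumption; evaluate_stage is a hypothetical helper.

```python
def evaluate_stage(expression, x):
    """Evaluate expression with x bound; call the result on x if it is callable."""
    result = eval(expression, {}, {"x": x})
    return result(x) if callable(result) else result


# The raw string mirrors the shell's 'a\nbb': a literal backslash and n, 5 characters.
print(evaluate_stage("len(x)", r"a\nbb"))  # 5
print(evaluate_stage("len", r"a\nbb"))     # 5
```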
Commands
eval
Use eval to evaluate a Python expression.
$ mario eval 'datetime.datetime.utcnow()'
2019-01-01 01:23:45.562736
map
Use map to act on each input item.
$ mario map 'x * 2' <<<$'a\nbb'
aa
bbbb
filter
Use filter to evaluate a condition on each line of input and exclude false values.
$ mario filter 'len(x) > 1' <<<$'a\nbb\nccc'
bb
ccc
apply
Use apply to act on the sequence of items.
$ mario apply 'len(x)' <<<$'a\nbb'
2
stack
Use stack to treat the input as a single string, including newlines.
$ mario stack 'len(x)' <<<$'a\nbb'
5
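The difference between map, filter, apply and stack comes down to what x is bound to. The following plain-Python comparison is an illustration of the semantics described above, not Mario's code.

```python
# What <<<$'a\nbb' delivers on stdin: the shell appends a trailing newline.
text = "a\nbb\n"
lines = text.splitlines()

mapped = [len(x) for x in lines]             # map: x is each line in turn
filtered = [x for x in lines if len(x) > 1]  # filter: keep lines where the test is true
applied = len(lines)                         # apply: x is the whole sequence of lines
stacked = len(text)                          # stack: x is one string, newlines included

print(mapped, filtered, applied, stacked)  # [1, 2] ['bb'] 2 5
```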
reduce
Use reduce to evaluate a function of two arguments successively over a sequence, like functools.reduce.
For example, to multiply all the values together, first convert each value to int with map, then use reduce to successively multiply each item with the product.
$ mario map int reduce operator.mul <<EOF
1
2
3
4
EOF
24
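For comparison, the same computation written directly with functools.reduce in Python:

```python
import functools
import operator

lines = ["1", "2", "3", "4"]

# map int, then fold the values together with multiplication.
product = functools.reduce(operator.mul, map(int, lines))
print(product)  # 24
```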
chain
Use chain to flatten an iterable of iterables of items into an iterable of items, like itertools.chain.from_iterable.
For example, after calculating several rows of items,
$ mario map 'x*2 ! [x[i:i+2] for i in range(len(x))]' <<<$'ab\nce'
['ab', 'ba', 'ab', 'b']
['ce', 'ec', 'ce', 'e']
use chain to put each item on its own row:
$ mario map 'x*2 ! [x[i:i+2] for i in range(len(x))]' chain <<<$'ab\nce'
ab
ba
ab
b
ce
ec
ce
e
Then subsequent commands will act on these new rows, as normal. Here we get the length of each row.
$ mario map 'x*2 ! [x[i:i+2] for i in range(len(x))]' chain map len <<<$'ab\nce'
2
2
2
1
2
2
2
1
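The same pipeline can be sketched in plain Python with itertools.chain.from_iterable. The helper name windows is made up for this illustration.

```python
import itertools


def windows(line):
    """Double the line, then take a two-character slice at every position."""
    doubled = line * 2
    return [doubled[i:i + 2] for i in range(len(doubled))]


rows = [windows(line) for line in ["ab", "ce"]]   # map
flat = list(itertools.chain.from_iterable(rows))  # chain
lengths = [len(item) for item in flat]            # map len

print(flat)     # ['ab', 'ba', 'ab', 'b', 'ce', 'ec', 'ce', 'e']
print(lengths)  # [2, 2, 2, 1, 2, 2, 2, 1]
```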
Async
Making sequential requests is slow. These requests take about 20 seconds to complete.
% time mario map 'requests.get ! x.text ! len' apply max <<EOF
http://httpbin.org/delay/5
http://httpbin.org/delay/1
http://httpbin.org/delay/4
http://httpbin.org/delay/3
http://httpbin.org/delay/4
EOF
302
0.61s user
0.06s system
19.612 total
Concurrent requests can go much faster. The same requests now take only about 6 seconds. Use async-map, async-filter, or reduce with await some_async_function to get concurrency out of the box.
% time mario async-map 'await asks.get ! x.text ! len' apply max <<EOF
http://httpbin.org/delay/5
http://httpbin.org/delay/1
http://httpbin.org/delay/4
http://httpbin.org/delay/3
http://httpbin.org/delay/4
EOF
297
0.57s user
0.08s system
5.897 total
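The speedup comes from waiting on all the requests at once, so the total is close to the slowest request rather than the sum. Here is a small asyncio sketch of that effect. It uses asyncio.sleep as a stand-in for the HTTP calls and scaled-down delays; fake_request and timed are hypothetical helpers, and this is not how Mario schedules its work internally.

```python
import asyncio
import time


async def fake_request(delay):
    """Stand-in for an HTTP call: sleeps, then returns its delay."""
    await asyncio.sleep(delay)
    return delay


async def sequential(delays):
    # Each request starts only after the previous one has finished.
    return [await fake_request(d) for d in delays]


async def concurrent(delays):
    # All requests are in flight together; results keep the input order.
    return await asyncio.gather(*(fake_request(d) for d in delays))


def timed(coro):
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


delays = [0.05, 0.01, 0.04, 0.03, 0.04]  # scaled-down versions of the delays above
seq_result, seq_time = timed(sequential(delays))
con_result, con_time = timed(concurrent(delays))
print(f"sequential {seq_time:.2f}s, concurrent {con_time:.2f}s")
```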
Async streaming
async-map and async-filter values are handled in streaming fashion, while retaining the order of the input items in the output. The order of function calls is not constrained – if you need the function to be called with items in a specific order, use the synchronous version.
When making concurrent requests, each response is printed one at a time, as soon as (1) it is ready and (2) all of the preceding requests have already been handled.
For example, the 3 seconds item is ready before the preceding 4 seconds item, but it is held until the 4 seconds item has been printed, because the 4 seconds item comes first in the input. This way the ordering of the input items is maintained in the output.
% time mario --exec-before 'import datetime; now=datetime.datetime.utcnow; START_TIME=now(); print("Elapsed time | Response size")' async-map 'await asks.get ! f"{(now() - START_TIME).seconds} seconds | {len(x.content)} bytes"' <<EOF
http://httpbin.org/delay/1
http://httpbin.org/delay/2
http://httpbin.org/delay/4
http://httpbin.org/delay/3
EOF
Elapsed time | Response size
1 seconds | 297 bytes
2 seconds | 297 bytes
4 seconds | 297 bytes
3 seconds | 297 bytes
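A minimal asyncio sketch of order-preserving streaming: start every call up front, then await the tasks in input order, so a result that finishes early is held until the ones before it have been emitted. This illustrates the behaviour described above under the assumption of a simple task list; it is not Mario's implementation, and fake_request and ordered_stream are hypothetical names.

```python
import asyncio


async def fake_request(delay):
    """Stand-in for an HTTP call: sleeps, then returns its delay."""
    await asyncio.sleep(delay)
    return delay


async def ordered_stream(delays):
    """Start every call at once, but yield results in input order."""
    tasks = [asyncio.create_task(fake_request(d)) for d in delays]
    for task in tasks:
        # A later task that is already done waits here for the earlier ones.
        yield await task


async def main():
    return [result async for result in ordered_stream([0.01, 0.02, 0.04, 0.03])]


outputs = asyncio.run(main())
print(outputs)  # [0.01, 0.02, 0.04, 0.03]
```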
Configuration basics
The config file location follows the freedesktop.org standard. Check the location on your system by running mario --help:
% mario --help
Usage: mario [OPTIONS] COMMAND1 [ARGS]... [COMMAND2 [ARGS]...]...
Mario: Python pipelines for your shell.
GitHub: https://github.com/python-mario/mario
Configuration:
Declarative config: /home/user/.config/mario/config.toml
Python modules: /home/user/.config/mario/modules/*.py
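Under the freedesktop.org (XDG) convention, user config lives under $XDG_CONFIG_HOME, falling back to ~/.config when that variable is unset or empty. The sketch below shows that lookup in plain Python. It is an assumption about the general convention, not Mario's actual lookup code, and xdg_config_file is a hypothetical helper; mario --help remains the reliable way to find the path on your system.

```python
import os
from pathlib import Path


def xdg_config_file(app, filename, environ=os.environ):
    """Return the XDG-style config path for app, honouring XDG_CONFIG_HOME."""
    base = environ.get("XDG_CONFIG_HOME") or str(Path.home() / ".config")
    return Path(base) / app / filename


print(xdg_config_file("mario", "config.toml"))
```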
For example, on Ubuntu the declarative configuration lives in ~/.config/mario/config.toml. See Configuration Reference for the format specification.
# ~/.config/mario/config.toml
base_exec_before = """
from itertools import *
from collections import Counter
"""
Then you can directly use the imported objects without referencing the module.
% mario map 'Counter ! json.dumps' <<<$'hello\nworld'
{"h": 1, "e": 1, "l": 2, "o": 1}
{"w": 1, "o": 1, "r": 1, "l": 1, "d": 1}
You can set any of the mario options in your config. For example, to set a different default value for the concurrency maximum mario --max-concurrent, add max_concurrent to your config file (note the underscore):
# ~/.config/mario/config.toml
max_concurrent = 10
then just use mario as normal.
Custom commands
Define new commands in your config file by combining existing commands. For example, this config adds a jsonl command for reading jsonlines streams into Python objects, by calling out to the map traversal.
[[command]]
name = "jsonl"
help = "Load jsonlines into python objects."
[[command.stage]]
command = "map"
params = {code="json.loads"}
Now we can use it like a regular command:
% mario jsonl <<< $'{"a":1, "b":2}\n{"a": 5, "b":9}'
{'a': 1, 'b': 2}
{'a': 5, 'b': 9}
The new command jsonl can be used in pipelines as well. To get the maximum value in a sequence of jsonlines objects:
$ mario jsonl map 'x["a"]' apply max <<< $'{"a":1, "b":2}\n{"a": 5, "b":9}'
5
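In plain Python, the pipeline above amounts to roughly the following steps (an illustration of the semantics, not generated by Mario):

```python
import json

stream = '{"a":1, "b":2}\n{"a": 5, "b":9}\n'

objects = [json.loads(line) for line in stream.splitlines()]  # jsonl
values = [x["a"] for x in objects]                            # map 'x["a"]'
largest = max(values)                                         # apply max

print(largest)  # 5
```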
More command examples
Convert yaml to json
Convenient for removing trailing commas.
% mario yml2json <<<'{"x": 1,}'
{"x": 1}
[[command]]
name = "yml2json"
help = "Convert yaml to json"
[[command.stage]]
command = "stack"
params = {code="yaml.safe_load ! json.dumps"}
Search for xpath elements with xpath
Pull text out of xml documents.
% mario xpath '//' map 'x.text' <<EOF
<slide type="all">
<title>Overview</title>
<item>Anything <em>can be</em> in here</item>
<item>Or <em>also</em> in here</item>
</slide>
EOF
Overview
Anything
can be
Or
also
[[command]]
name="xpath"
help = "Find xml elements matching xpath query."
arguments = [{name="query", type="str"}]
inject_values=["query"]
[[command.stage]]
command = "stack"
params = {code="x.encode() ! io.BytesIO ! lxml.etree.parse ! x.findall(query) ! list" }
[[command.stage]]
command="chain"
Generate json objects
% mario jo 'name=Alice age=21 hobbies=["running"]'
{"name": "Alice", "age": 21, "hobbies": ["running"]}
[[command]]
name="jo"
help="Make json objects"
arguments=[{name="pairs", type="str"}]
inject_values=["pairs"]
[[command.stage]]
command = "eval"
params = {code="pairs"}
[[command.stage]]
command = "map"
params = {code="shlex.split(x, posix=False)"}
[[command.stage]]
command = "chain"
[[command.stage]]
command = "map"
params = {code="x.partition('=') ! [x[0], ast.literal_eval(re.sub(r'^(?P<value>[A-Za-z]+)$', r'\"\\g<value>\"', x[2]))]"}
[[command.stage]]
command = "apply"
params = {"code"="dict"}
[[command.stage]]
command = "map"
params = {code="json.dumps"}
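To see what the stages of the jo command do, here is the same logic as one plain-Python function. The function name jo and its structure are just for illustration; the expressions are the ones from the config above.

```python
import ast
import json
import re
import shlex


def jo(pairs):
    """Turn 'key=value' tokens into a JSON object string."""
    # Non-POSIX mode keeps the quotes inside tokens such as ["running"].
    tokens = shlex.split(pairs, posix=False)
    items = []
    for token in tokens:
        key, _, raw = token.partition("=")
        # Bare words like Alice are wrapped in quotes so they parse as strings.
        quoted = re.sub(r'^(?P<value>[A-Za-z]+)$', r'"\g<value>"', raw)
        items.append((key, ast.literal_eval(quoted)))
    return json.dumps(dict(items))


print(jo('name=Alice age=21 hobbies=["running"]'))
# {"name": "Alice", "age": 21, "hobbies": ["running"]}
```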
Read csv file
Read a csv file into Python dicts. Given a csv like this:
% cat names.csv
name,age
Alice,21
Bob,25
try:
% mario csv < names.csv
{'name': 'Alice', 'age': '21'}
{'name': 'Bob', 'age': '25'}
base_exec_before = '''
import csv
import typing as t

def read_csv(
    file, header: bool, **kwargs
) -> t.Iterable[t.Dict[t.Union[str, int], str]]:
    "Read csv rows into an iterable of dicts."
    rows = list(file)
    # Parse the first row with the same options (e.g. delimiter) as the rest.
    first_row = next(csv.reader(rows, **kwargs))
    if header:
        # Use the first row as keys, then drop it from the data.
        fieldnames = first_row
        reader = csv.DictReader(rows, fieldnames=fieldnames, **kwargs)
        return list(reader)[1:]
    # No header: number the columns from 0.
    fieldnames = range(len(first_row))
    return csv.DictReader(rows, fieldnames=fieldnames, **kwargs)
'''
[[command]]
name = "csv"
help = "Load csv rows into python dicts. With --no-header, keys will be numbered from 0."
inject_values=["delimiter", "header"]
[[command.options]]
name = "--delimiter"
default = ","
help = "field delimiter character"
[[command.options]]
name = "--header/--no-header"
default=true
help = "Treat the first row as a header?"
[[command.stage]]
command = "apply"
params = {code="read_csv(x, header=header, delimiter=delimiter)"}
[[command.stage]]
command = "chain"
[[command.stage]]
command = "map"
params = {code="dict(x)"}
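The help text mentions that with --no-header the keys are numbered from 0. A short standalone sketch of that case using only the csv module, with made-up rows that have no header line:

```python
import csv

rows = ["Alice,21", "Bob,25"]  # sample data without a header row

# Number the columns from 0, as read_csv does when header is false.
width = len(next(csv.reader(rows)))
reader = csv.DictReader(rows, fieldnames=list(range(width)))
records = [dict(row) for row in reader]

print(records)  # [{0: 'Alice', 1: '21'}, {0: 'Bob', 1: '25'}]
```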
Plugins
Add new commands like map and reduce by installing Mario plugins. You can try them out without installing by adding them to any .py file in your ~/.config/mario/modules/ directory.
Share popular commands by installing the mario-addons package.
Q & A
What’s the status of this package?
Check the issues page for open tickets.
This package is experimental and is subject to change without notice.
Why another package?
A number of cool projects have pioneered the Python-in-shell space. I wrote Mario because I didn’t know these existed at the time, but now Mario has a bunch of features the others don’t (user configuration, multi-stage pipelines, async, plugins, etc.).