Project Description

Author: Pahaz Blinov

Repo: https://github.com/pahaz/py3line/

Py3line is a UNIX command-line tool for line-based processing in Python, with regex and output-transform features similar to grep, sed, and awk.

This project is inspired by piep, pysed, pyline, pyp and the Jacob+Mark recipe.

Requirements: Python 3

Why did I make it?

I sometimes have to use sed / awk. Not often, so I always forget the necessary options and the sed / awk DSL. But I know Python, I like it, and I want to use it for data processing. With plain python -c it is hard to write a one-liner that works well.

Why not pyline?
  • It doesn't support Python 3
  • It has many options (I want a solution that is as simple as possible)
  • Bad performance
  • It doesn't support command chaining
Why not pysed?

Installation

py3line is on PyPI, so simply run:

pip install py3line

or

easy_install py3line

to have it installed in your environment.

For installing from source, clone the repo and run:

python setup.py install

Tutorial

Let's start with two simple examples:

$ echo -e "Here are\nsome\nwords for you." | ./py3line.py "x.split()" -a "len(x)"
2
1
3

$ echo -e "Here are\nsome\nwords for you." | ./py3line.py "x.split()" -a "len(x)" -a "sum(xx)"
6

How does it work?

Py3line applies a transform to the input data stream. The transform is built from a sequence of Python actions. Each action operates either on a single element of the stream or on the stream as a whole.

First example overview

echo -e "Here are\nsome\nwords for you." | ./py3line.py "x.split()" -a "len(x)"

  • echo -e "Here are\nsome\nwords for you." -- creates an input stream consisting of three lines
  • | -- pipes the input stream to py3line
  • "x.split()" -a "len(x)" -- defines two actions: "x.split()" and "len(x)". Each of them is an element based action

Py3line expects at least one transformation action as a positional argument. You can define additional actions with -a arguments, as shown in the example above.

The example above can be represented as the following python pseudo-code:

import sys

for x in sys.stdin.readlines():

    # 1) action "x.split()"
    x = x.split()

    # 2) action "len(x)"
    x = len(x)

    print(x)

Second example overview

echo -e "Here are\nsome\nwords for you." | ./py3line.py "x.split()" -a "len(x)" -a "sum(xx)"

Here we have a stream based action, "sum(xx)".

It can be represented as Python pseudo-code:

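import sys

xx = sys.stdin.readlines()

# 1) action "x.split()" (element based: applied to every x in xx)
xx = [x.split() for x in xx]

# 2) action "len(x)" (element based: applied to every x in xx)
xx = [len(x) for x in xx]

# 3) action "sum(xx)" (stream based: applied once to the whole stream)
print(sum(xx))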
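
The two overviews can be checked in plain Python on an in-memory copy of the input. This is only a sketch of the semantics described above, not py3line's implementation; the names `lines`, `counts` and `total` are stand-ins introduced for illustration.

```python
# In-memory stand-in for the three lines that echo writes to stdin.
lines = "Here are\nsome\nwords for you.".splitlines()

# Element based actions "x.split()" and then "len(x)", applied to every line.
counts = [len(x.split()) for x in lines]
print(counts)  # [2, 1, 3]

# Stream based action "sum(xx)", applied once to the whole stream.
total = sum(counts)
print(total)  # 6
```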

What is the order of actions?

These commands are equivalent:

./py3line.py "x.split()" -a "len(x)" -a "sum(xx)"
./py3line.py -a "x.split()" "len(x)" -a "sum(xx)"
./py3line.py -a "x.split()" -a "len(x)" "sum(xx)"

But we recommend:

./py3line.py "x.split()" -a "len(x)" -a "sum(xx)"

as the right ordering of actions.

Why? Because you must pass one action as a positional argument.

Actions chaining

Let us define some terminology. Consider the command:

py3line action1 -a action2 -a action3

It has three actions: action1, action2 and action3. Each of them may be element based or stream based.

An element based action can be represented as Python pseudo-code:

xx = ...
new_xx = []

for x in xx:
    # DO ELEMENT BASED ACTION ON `x`
    result = eval(compile(action_x, ..., 'eval'), {'x': x})
    new_xx.append(result)

xx = new_xx

A stream based action can be represented as Python pseudo-code:

xx = ...

# DO STREAM BASED ACTION ON `xx`
xx = eval(compile(action_xx, ..., 'eval'), {'xx': xx})
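
The two pseudo-code fragments above can be combined into one small runnable loop. This is a sketch under an assumption: the text does not say how py3line tells the two kinds of action apart, so here an action is treated as stream based when its expression refers to the name `xx`. The helper `run_actions` is hypothetical and not part of py3line.

```python
def run_actions(lines, actions):
    """Apply each action expression, in order, to the stream `lines`."""
    xx = list(lines)
    for action in actions:
        code = compile(action, "<action>", "eval")
        if "xx" in code.co_names:
            # Stream based (assumed rule): evaluate once with the whole stream as `xx`.
            xx = eval(code, {"xx": xx})
        else:
            # Element based: evaluate once per element, bound to `x`.
            xx = [eval(code, {"x": x}) for x in xx]
    return xx


result = run_actions(
    ["Here are", "some", "words for you."],
    ["x.split()", "len(x)", "sum(xx)"],
)
print(result)  # 6
```

Because every action receives the output of the previous one, a stream based action such as "sum(xx)" sees the word counts rather than the original lines.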

Pre-actions

Sometimes you want to prepare some variables or import some modules.

Use the -m option to import a module:

./py3line.py -m shlex "shlex.split(x)[13]"

Use the -p option to execute statements before processing starts:

./py3line.py -p "rgx = re.compile(r' is ([A-Z]\w*)')" "rgx.search(x).group(1)"

Pseudo-code for ./py3line.py -m module1 -m module2 -p pre-action1 -p pre-action2 …:

import module1
import module2

pre-action1
pre-action2

...
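
A runnable version of this pseudo-code might look like the sketch below. It assumes that imports and pre-actions share one namespace in which later actions are evaluated; the source does not confirm how py3line does this. The helper `build_namespace` is hypothetical, and `re` is imported explicitly here even though the -p example above uses it without a -m option.

```python
import importlib


def build_namespace(modules, pre_actions):
    """Return the globals that later actions would be evaluated in."""
    namespace = {}
    # 1) imports (-m options); this sketch handles top-level module names only
    for name in modules:
        namespace[name] = importlib.import_module(name)
    # 2) pre-actions (-p options), executed as statements in the same namespace
    for statement in pre_actions:
        exec(statement, namespace)
    return namespace


ns = build_namespace(["re"], [r"rgx = re.compile(r' is ([A-Z]\w*)')"])

# 3) an element based action, evaluated with `x` bound to one input line
name = eval("rgx.search(x).group(1)", ns, {"x": " whose name is Betty."})
print(name)  # Betty
```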

Options ordering

The order in which options appear on the command line does not matter. All imports (-m options) run first, then all pre-actions (-p options), and then the actions (the positional argument plus -a options).

# Print every line (null transform)
$ cat ./testsuit/test.txt | ./py3line.py x
This is my cat,
 whose name is Betty.
This is my dog,
 whose name is Frank.
This is my fish,
 whose name is George.
This is my goat,
 whose name is Adam.

# Number every line
$ cat ./testsuit/test.txt | ./py3line.py "i, x"
0 This is my cat,
1  whose name is Betty.
2 This is my dog,
3  whose name is Frank.
4 This is my fish,
5  whose name is George.
6 This is my goat,
7  whose name is Adam.

# Print every first and last word
$ cat ./testsuit/test.txt | ./py3line.py "x.split()[0], x.split()[-1]"
This cat,
whose Betty.
This dog,
whose Frank.
This fish,
whose George.
This goat,
whose Adam.

# Split into words and print (strip all non-word characters such as commas, dots, etc.)
$ cat ./testsuit/test.txt | ./py3line.py "re.findall(r'\w+', x)"
This is my cat
whose name is Betty
This is my dog
whose name is Frank
This is my fish
whose name is George
This is my goat
whose name is Adam

# Regex matching with groups
$ cat ./testsuit/test.txt | ./py3line.py "re.findall(r' is ([A-Z]\w*)', x) or False"
Betty
Frank
George
Adam

# cat ./testsuit/test.txt | ./py3line.py "re.search(r' is ([A-Z]\w*)', x).group(1)"
$ cat ./testsuit/test.txt | ./py3line.py -p "rgx = re.compile(r' is ([A-Z]\w*)')" "rgx.search(x).group(1)"
Betty
Frank
George
Adam
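
For comparison, the same extraction in plain Python is sketched below. The `lines` list copies the test.txt content shown above. Note that rgx.search returns None for lines without a capitalised name, so plain Python has to skip those lines explicitly; the output above suggests py3line drops them, but how it does so is not described here.

```python
import re

# Content of ./testsuit/test.txt as printed by the null transform above.
lines = [
    "This is my cat,",
    " whose name is Betty.",
    "This is my dog,",
    " whose name is Frank.",
    "This is my fish,",
    " whose name is George.",
    "This is my goat,",
    " whose name is Adam.",
]

rgx = re.compile(r" is ([A-Z]\w*)")

# Keep group 1 only for lines where the pattern matched.
names = [m.group(1) for m in map(rgx.search, lines) if m]
print("\n".join(names))
```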

## Original Examples
# Print the first 2 lines
# cat ./testsuit/test.txt | ./py3line.py "i < 2"
$ cat ./testsuit/test.txt | ./py3line.py "list(xx)[:2]"
This is my cat,
 whose name is Betty.

# Print just the URLs in the access log
$ cat ./testsuit/nginx.log | ./py3line.py -m shlex "shlex.split(x)[13]"
HEAD / HTTP/1.0
HEAD / HTTP/1.0
HEAD / HTTP/1.0
HEAD / HTTP/1.0
HEAD / HTTP/1.0
GET /admin/moktoring/session/add/ HTTP/1.1
GET /admin/jsi18n/ HTTP/1.1
GET /static/admin/img/icon-calendar.svg HTTP/1.1
GET /static/admin/img/icon-clock.svg HTTP/1.1
HEAD / HTTP/1.0
HEAD / HTTP/1.0
HEAD / HTTP/1.0
HEAD / HTTP/1.0
HEAD / HTTP/1.0
GET /logout/?reason=startApplication HTTP/1.1
GET / HTTP/1.1
GET /login/?next=/ HTTP/1.1
POST /admin/customauth/user/?q=%D0%9F%D0%B0%D1%81%D0%B5%D1%87%D0%BD%D0%B8%D0%BA HTTP/1.1

# Print the most commonly accessed URLs, keeping only those accessed more than 5 times
$ cat ./testsuit/nginx.log | ./py3line.py -m shlex -m collections  -a "shlex.split(x)[13]" -a "collections.Counter(xx).most_common()" "x[1] > 5 and x[0]"
HEAD / HTTP/1.0
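
The counting step of this command can be sketched in plain Python. The `requests` list is a shortened stand-in built from the request strings in the previous listing (ten HEAD requests plus a few others), not the real nginx.log, and the final filter is written as a list comprehension rather than the "x[1] > 5 and x[0]" action.

```python
from collections import Counter

# Shortened stand-in for the output of "shlex.split(x)[13]" on each log line.
requests = ["HEAD / HTTP/1.0"] * 10 + [
    "GET /admin/jsi18n/ HTTP/1.1",
    "GET / HTTP/1.1",
    "GET /login/?next=/ HTTP/1.1",
]

# Stream based step: count requests, most common first, as (url, count) pairs.
pairs = Counter(requests).most_common()

# Element based step: keep only URLs seen more than 5 times.
frequent = [url for url, count in pairs if count > 5]
print(frequent)  # ['HEAD / HTTP/1.0']
```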

HELP

usage: py3line.py [-h] [-a action] [-p pre_action] [-o OUTPUT] [-i]
                  [--in-place-suffix IS_INPLACE_SUFFIX] [-m MODULES] [-v] [-q]
                  [--version]
                  action [file [file ...]]

Py3line is a UNIX command-line tool for line-based processing in Python with
regex and output transform features similar to grep, sed, and awk.

positional arguments:
  action                <python_expression>
  file                  Input file #default: stdin

optional arguments:
  -h, --help            show this help message and exit
  -a action, --action action
                        <python_expression>
  -p pre_action, --pre-action pre_action
                        <python_expression>
  -o OUTPUT, --out OUTPUT, --output-file OUTPUT
                        Output file #default: '-' for stdout
  -i, --in-place        Output to editable file
  --in-place-suffix IS_INPLACE_SUFFIX
                        Output to editable file and provide a backup suffix
                        for keeping a copy of the original file
  -m MODULES, --modules MODULES
                        for m in modules: import m #default: []
  -v, --verbose
  -q, --quiet
  --version             Print the version string

Release History

0.0.3

0.0.2

0.0.1

Download Files
py3line-0.0.3.tar.gz (9.4 kB), Source, uploaded Sep 25, 2016