Unix Pipe Fittings For Data Science

UNIX pipe fittings for statistics

In the quest for command line data science, this kit contains three command line utilities intended to be used in UNIX pipes.

All three read from STDIN and write to STDOUT, and each prints its docstring if run without parameters.

Python 3 is required.

sd_c (smalldata count)

A regular-expression counter filter, contained in smalldata/. Please see the docstring for further help.
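
To illustrate what a regular-expression counter filter does, here is a minimal Python sketch. It is not the sd_c implementation: the function name `count_matches` and the choice to count each whole match are assumptions made for illustration, and the real tool's options are described in its docstring.

```python
import re
from collections import Counter


def count_matches(lines, pattern):
    """Count how often each distinct match of `pattern` occurs in `lines`.

    Hypothetical helper for illustration only; not part of smalldata.
    """
    regex = re.compile(pattern)
    counts = Counter()
    for line in lines:
        # finditer yields every non-overlapping match on the line.
        for match in regex.finditer(line):
            counts[match.group(0)] += 1
    return counts


# Example: count HTTP status codes in a few made-up log lines.
sample = ["GET /a 200", "GET /b 404", "POST /a 200"]
for value, n in count_matches(sample, r"\b\d{3}\b").most_common():
    print(value, n)
```

In a pipe, the same function could be fed `sys.stdin` in place of the sample list.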

sd_g (smalldata groupby)

Concatenates lines from stdin that match a regular expression. It is contained in smalldata/. Please see the docstring.
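
The description leaves the exact grouping rule to the docstring, so the following Python sketch shows only one possible reading: lines whose regular-expression match is identical are joined into a single record. The name `group_lines`, the separator, and the decision to drop non-matching lines are all assumptions, not documented sd_g behaviour.

```python
import re


def group_lines(lines, pattern, sep=" "):
    """Join lines that share the same match of `pattern`.

    Hypothetical helper for illustration only; not part of smalldata.
    """
    regex = re.compile(pattern)
    groups = {}
    for line in lines:
        match = regex.search(line)
        if match is None:
            continue  # assumption: lines without a match are dropped
        # The matched text acts as the group key.
        groups.setdefault(match.group(0), []).append(line.rstrip("\n"))
    return {key: sep.join(members) for key, members in groups.items()}


# Example: collect all lines belonging to the same made-up request id.
sample = ["id=1 start\n", "id=2 start\n", "noise\n", "id=1 end\n"]
for key, joined in group_lines(sample, r"id=\d+").items():
    print(joined)
```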

sd_e (smalldata extract)

In the spirit of RegExSerDe, this tool uses regular expressions to generate a CSV file from a free-form text file. It is contained in smalldata/ and has a docstring.
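
As a rough illustration of the idea, the Python sketch below turns free-form lines into CSV by mapping the named groups of a regular expression to columns. It is a sketch under stated assumptions rather than the sd_e implementation: the function name `extract_csv`, the use of named groups for the header, and skipping non-matching lines are all choices made here for the example.

```python
import csv
import io
import re


def extract_csv(lines, pattern):
    """Return CSV text with one row per line that matches `pattern`.

    Hypothetical helper for illustration only; not part of smalldata.
    """
    regex = re.compile(pattern)
    # Column names are the named groups, ordered by their position.
    columns = sorted(regex.groupindex, key=regex.groupindex.get)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(columns)
    for line in lines:
        match = regex.search(line)
        if match:  # assumption: lines that do not match are skipped
            writer.writerow([match.group(name) for name in columns])
    return out.getvalue()


# Example with made-up log lines.
sample = ["2020-01-01 ERROR disk full", "garbage", "2020-01-02 INFO ok"]
print(extract_csv(sample, r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<level>[A-Z]+)"), end="")
```

Using the `csv` module rather than joining on commas means that fields containing commas or quotes are escaped correctly.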

Other Useful Tools

If you've got CSV files, you should definitely check out q.

To Do

A cookbook would be nice, showing how to analyze log files and similar tasks.


Used to live in a gist:

Download files

Files for smalldata, version 0.0.3

smalldata-0.0.3-py3-none-any.whl (9.1 kB), wheel, Python version py3
smalldata-0.0.3.tar.gz (6.5 kB), source distribution
