Project Description
pawk is a Python-based replacement for awk.

It uses Python for line-by-line processing of files.


#pawk automatically reads lines as csv rows and stores the result as a list in "r"
#-g ("grep") keeps a subset of lines satisfying a given condition

#Selects lines from input.txt with at least 3 csv fields
> pawk -f input.txt -g 'len(r) > 2'

#Keep a subset of lines where the second csv field is non-empty
> pawk -f input.txt -g 'r[1]'

#The above may crash if some lines have fewer than two csv fields
#Use this instead:
> pawk -f input.txt -g 'len(r) > 1 and r[1]'

#The raw line is stored in the "l" variable
#Keep a subset of lines where l isn't empty and the first character is "a"
> pawk -f input.txt -g 'l != "" and l[0] == "a"'
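
#A rough Python sketch of what -g does with "r", "l" and "i", going only by the descriptions in this README.
#grep_lines is a made-up helper, not pawk's actual code, and it yields the raw line for simplicity.

```python
import csv


def grep_lines(lines, condition, delimiter=","):
    """Yield the raw lines for which `condition` evaluates to a truthy value.

    Hypothetical helper: it mimics the described behaviour of -g only.
    """
    for i, l in enumerate(lines):
        l = l.rstrip("\n")
        # Parse this single line as one csv row; an empty line gives an empty list.
        r = next(csv.reader([l], delimiter=delimiter), [])
        # The condition sees the same three names the README describes.
        if eval(condition, {}, {"r": r, "l": l, "i": i}):
            yield l


sample = ["a,b,c", "x,y", "", "apple"]
print(list(grep_lines(sample, "len(r) > 2")))
print(list(grep_lines(sample, "len(r) > 1 and r[1]")))
print(list(grep_lines(sample, 'l != "" and l[0] == "a"')))
```

#With this sketch, the condition 'r[1]' alone raises IndexError on the "apple" line, which is why the len(r) guard is used.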

#Run certain code for each input line using -p
#Using -p prevents the default printing of the line

#For each line of the input, print that line with whitespace stripped
> pawk -f input.txt -p 'print l.strip()'

#default value of -f is /dev/stdin
> less input.txt | pawk -p 'print len(r)'

#-d sets the input delimiter
#the output delimiter is ",", so this command converts a tsv to a csv
> pawk -f input.txt -d '\t'
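
#For comparison, a minimal sketch of the same tsv-to-csv conversion with Python's csv module.
#convert_delimiter is a hypothetical name; whether pawk quotes fields the way csv.writer does is an assumption.

```python
import csv
import io


def convert_delimiter(text, in_delim="\t", out_delim=","):
    """Re-write delimited text with a different delimiter (illustrative only)."""
    rows = csv.reader(io.StringIO(text), delimiter=in_delim)
    out = io.StringIO()
    # csv.writer quotes any field that itself contains the output delimiter.
    writer = csv.writer(out, delimiter=out_delim, lineterminator="\n")
    writer.writerows(rows)
    return out.getvalue()


print(convert_delimiter("a\tb\tc\n1\t2,5\t3\n"), end="")
```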

#pawk stores the line number (zero-indexed) in the "i" variable
#only keep lines starting with the 1133rd
> pawk -f input.txt -g 'i>=1132'

#replace matches of a regular expression in each line (python re module imported by default)
> pawk -f input.txt -p 'print re.sub("U_C_Rate","firearm_rate",l)'

#-b runs code before any lines are processed
#-e runs code after all lines are processed
#To add up a list of floats
> pawk -f input.txt -b "cnt=0" -p "cnt += float(l)" -e "print cnt"
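
#A sketch of how -b, -p and -e fit together: one shared namespace, with code run before, per line, and after.
#run_hooks is a made-up name and this is not pawk's implementation; it uses Python 3 print(), unlike the commands above.

```python
def run_hooks(lines, before="", per_line="", end=""):
    """Run `before` once, `per_line` for every line, then `end` once.

    Hypothetical helper: all three snippets share the namespace `env`.
    """
    env = {}
    exec(before, env)
    for i, l in enumerate(lines):
        env["i"] = i                # zero-indexed line number
        env["l"] = l.rstrip("\n")   # raw line without the trailing newline
        exec(per_line, env)
    exec(end, env)
    return env


env = run_hooks(
    ["1.5\n", "2.5\n", "4\n"],
    before="cnt=0",
    per_line="cnt += float(l)",
    end="print(cnt)",
)
```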

Writing multi-line Python in pawk:
Heavily inspired by a source I can't find right now, pawk can process strings representing multi-line Python.

#(semi-colon) or (colon+whitespace) causes a line break
'import random; print(random.random())'
import random;
print(random.random())

#after lines with (colon+whitespace) successive lines are automatically indented:
'if i>3: print("hello world!"); a += 1; b = 0'
if i>3:
    print("hello world!");
    a += 1;
    b = 0

#use the 'end;' keyword to force indent level to decrease (compare this example with the above)
'if i>3: print("hello world!"); end; a += 1; b = 0'
if i>3:
    print("hello world!");
a += 1;
b = 0

#lines starting with "elif", "else" or "except" automatically cause indenting to decrease
'if i>3: print("a"); elif i>1: print("b"); else: print("c")'
if i>3:
    print("a");
elif i>1:
    print("b");
else:
    print("c")

#you can define functions!
'def test123(): print("hello world!"); end; test123(); test123(); test123();'
def test123():
    print("hello world!");
test123();
test123();
test123();
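
#The expansion rules described in this section can be sketched in a few lines of Python.
#expand is a hypothetical function, not pawk's real parser; it is naive and will also split on ";" or ": " inside string literals and dicts.

```python
import re


def expand(code, indent="    "):
    """Expand a pawk-style one-liner into multi-line Python (naive sketch)."""
    # Break after every ';' and after every ':' that is followed by whitespace.
    text = re.sub(r";\s*", ";\n", code)
    text = re.sub(r":\s+", ":\n", text)
    level = 0
    out = []
    for piece in text.split("\n"):
        piece = piece.strip()
        if not piece:
            continue
        if piece == "end;":
            # 'end;' emits nothing; it only drops one indent level.
            level = max(level - 1, 0)
            continue
        if re.match(r"(elif|else|except)\b", piece):
            # These keywords sit one level out from the block they follow.
            level = max(level - 1, 0)
        out.append(indent * level + piece)
        if piece.endswith(":"):
            # A line ending in ':' opens a new block.
            level += 1
    return "\n".join(out)


print(expand('if i>3: print("a"); elif i>1: print("b"); else: print("c")'))
print(expand('def test123(): print("hello world!"); end; test123();'))
```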

Download Files

File Name: python-awk-0.0.4.tar.gz (5.0 kB)
File Type: Source
Upload Date: Jun 25, 2017
