
Run cmd for every line of input


Features

A shell utility (and package) which runs a command for every line of input.

It can spawn an arbitrary number of concurrent threads, and it gives you control over where each command’s output is kept.

In my daily work, I have to run all manner of commands on huge batches of items. These things are usually not CPU bound, so it makes sense to multithread these tasks.

Thus, I find myself writing bash one-liners such as the following, which takes an input file of items, splits it into equal(ish) parts, and spawns a worker for each part, all the while keeping granular logs and return codes:

lines=`wc -l domains.txt | awk '{print $1}'`; threads=10; split=$(((lines/threads)+1)); mkdir -p in out; split -d -l ${split} domains.txt in/part. ; ls in/ | while read -r f; do cat in/${f} | while read -r d; do host -t a "${d}" > out/${d} 2>&1; echo -e "${d}\t$?"; done > log.${f} & echo ${!}; done > pids

That gets pretty tiring to type all the time. Why not use xargs -P, you say? Well, that works perfectly fine when I don’t need very complicated commands and don’t need to log every return code. Maybe I could do all of that with xargs, but I wanted to make this anyway as a learning experience.

How-To

The program can take input from STDIN or from a file passed with the -i option.

All arguments that aren’t options are treated as the subcommand to run. Every {} wildcard in the subcommand is replaced with the corresponding positional field from the input line.
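A minimal Python sketch of this kind of substitution, not cmdfor's actual code. The names fill_placeholders and split_line are hypothetical, and raising an error when a line has fewer fields than there are wildcards is an assumption about behaviour the text above does not specify.

```python
def split_line(line, delimiter=None):
    """Split one input line into fields (whitespace-separated when delimiter is None)."""
    return line.rstrip("\n").split(delimiter)


def fill_placeholders(template, fields, placeholder="{}"):
    """Replace each placeholder, left to right, with the next field from the line."""
    remaining = iter(fields)
    filled = []
    for arg in template:
        # Splitting on the placeholder avoids re-scanning text that came from a field.
        parts = arg.split(placeholder)
        out = parts[0]
        for part in parts[1:]:
            try:
                out += next(remaining) + part
            except StopIteration:
                # Assumption: too few fields is treated as an error.
                raise ValueError("more placeholders than input fields") from None
        filled.append(out)
    return filled


line = "alice@example.com,hunter2\n"
print(fill_placeholders(["imaplogin", "-u", "{}", "-p", "{}"], split_line(line, ",")))
# → ['imaplogin', '-u', 'alice@example.com', '-p', 'hunter2']
```

Building an argument list rather than a single shell string means fields containing spaces or quotes need no escaping when the list is handed to a process launcher.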

To delete a list of files, basically the same behaviour as xargs:

cat files.txt | cmdfor rm {}

To run the fictional command imaplogin for every line of a CSV that contains <email>,<password> fields, logging each individual command’s output to a file in the directory ./out:

cat email_users.csv | cmdfor -d, -o ./out -- imaplogin -u {} -p {}

To look up the IP addresses of a huge number of hostnames, using 10 concurrent threads, and storing each individual command’s stdout and stderr in separate files in the directory ./results, with each file being named after the hostname on which the query was performed:

cat hostnames.txt | cmdfor -t 10 -Eo ./results -l 1 -- host -t a {}
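As a rough illustration of what the last example does conceptually, here is a Python sketch built on the standard library's ThreadPoolExecutor and subprocess.run. It is not cmdfor's implementation: the function names run_one and run_all and the .out/.err file suffixes are assumptions made for the example, and it assumes every item is safe to use as a file name.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_one(item, make_argv, out_dir):
    """Run one command and save its stdout and stderr in files named after the item."""
    result = subprocess.run(make_argv(item), capture_output=True)
    # Assumption: ".out" and ".err" suffixes; cmdfor's real naming may differ.
    (out_dir / (item + ".out")).write_bytes(result.stdout)
    (out_dir / (item + ".err")).write_bytes(result.stderr)
    return item, result.returncode


def run_all(items, make_argv, out_dir, threads=10):
    """Run make_argv(item) for every item on a pool of worker threads."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() returns results in input order even though the commands overlap in time.
        return list(pool.map(lambda item: run_one(item, make_argv, out_dir), items))
```

Calling run_all(hostnames, lambda h: ["host", "-t", "a", h], "./results") would return a list of (hostname, return code) pairs. Threads suit this workload because each worker spends nearly all of its time waiting on a child process rather than running Python code.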

To-Do

1. Come up with a real test case. Since this is a shell utility and really only deals with shell subcommands, I don’t know what will work and what won’t on Travis CI (can I run a shell command there?)
2. By default, it suppresses all output from subprocesses and writes a message to STDOUT for each process spawn and reap. This output is too verbose for the default behaviour, so it should be toggled with -v. The default should be quieter and simpler, perhaps just the return code of each task.
3. Refactor some stuff to be a little less messy. The function signatures are huge, and messages are generated in odd places. I think it would be better to pass a context object.
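One possible shape for the context object mentioned in the last item, sketched with a dataclass. Everything here is hypothetical: RunContext, its fields and defaults, and the describe helper are illustrations of the idea, not names or behaviour taken from cmdfor.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RunContext:
    """Hypothetical bundle of settings that would otherwise be passed one by one."""
    threads: int = 1                  # arbitrary default chosen for the sketch
    delimiter: Optional[str] = None   # None means split on whitespace
    out_dir: Optional[str] = None     # None means do not store output
    verbose: bool = False


def describe(item, returncode, ctx):
    """Format a per-task message; verbosity comes from the context, not an extra argument."""
    if ctx.verbose:
        return "reaped {!r} with return code {}".format(item, returncode)
    # Quiet form: just the item and its return code, tab-separated.
    return "{}\t{}".format(item, returncode)
```

With something like this, adding a new option means adding one field rather than widening every function signature along the call chain.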

History

2018-04-09 v0.1.0 initial release, still need to do tests and docs
2018-04-10 v0.1.5 fixing import issues and removing <2.7 support
