aggnf: Aggregate Nth Field. A small console utility to count/group text data.
Free software: MIT license
Documentation: (COMING SOON!) https://aggnf.readthedocs.org.
Features
Generates aggregate counts of text data, using a specified field as a key.
Fields can be delimited by any string; the default is consecutive whitespace.
The key field can be any integer, with negative integers counting backwards from the end of the line. The default is the last field.
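To make the features above concrete, here is a minimal Python sketch of grouping lines by one field. It is not aggnf's actual source: the function name `count_by_field` is hypothetical, and treating positive field numbers as 1-based (as awk does) is an assumption, since the description does not say how positive numbers are indexed.

```python
from collections import Counter


def count_by_field(lines, sep=None, fieldnum=-1):
    """Count lines grouped by the value of one field.

    sep=None splits on runs of whitespace. Positive fieldnum values are
    treated as 1-based (an assumption); negative values count from the end.
    """
    if fieldnum == 0:
        raise ValueError("fieldnum must be non-zero")
    # Convert a 1-based positive position into a Python list index.
    index = fieldnum - 1 if fieldnum > 0 else fieldnum
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split(sep)
        if not fields:
            continue  # blank line: nothing to key on
        counts[fields[index]] += 1  # raises IndexError if out of range
    return counts


if __name__ == "__main__":
    sample = ["a x", "b  y", "c x"]
    print(count_by_field(sample))              # key on the last field
    print(count_by_field(sample, fieldnum=1))  # key on the first field
```

Passing `sep=","` would split on a literal comma instead of whitespace.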
How-To
The --help option is descriptive:
    ~$ aggnf --help
    Usage: aggnf [OPTIONS] [IN_DATA]

      Group text data based on a Nth field, and print the aggregate result.

      Works like SQL:
        `select field, count(*) from tbl group by field`

      Or shell:
        `cat file | awk '{print $NF}' | sort | uniq -c`

      Arguments:
        IN_DATA   Input file, if blank, STDIN will be used.

    Options:
      -d, --sep TEXT          Field delimiter. Defaults to whitespace.
      -n, --fieldnum INTEGER  The field to use as the key, default: last field.
      -o, --sort              Sort result.
      -i, --ignore-err        Don't exit if field is specified and out of range.
      --help                  Show this message and exit.
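The help text says the program exits when the requested field is out of range unless `-i` is given. A rough Python sketch of that behaviour follows. The helper name `pick_field` is hypothetical, and silently skipping the offending line when errors are ignored is an assumption; the help text does not say what aggnf does with such lines.

```python
import sys


def pick_field(fields, index, ignore_err=False):
    """Return fields[index], or None if it is out of range and ignore_err is set."""
    try:
        return fields[index]
    except IndexError:
        if ignore_err:
            return None  # assumed behaviour: the caller skips this line
        # Default behaviour per the help text: stop with an error.
        sys.exit(f"field index {index} is out of range for a line with {len(fields)} fields")


if __name__ == "__main__":
    print(pick_field(["a", "b"], -1))
    print(pick_field(["a"], 3, ignore_err=True))
```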
Here we generate an example file of 1000 lines, each ending in a random digit (the first digit of bash's $RANDOM, which is why 1, 2, and 3 dominate), and ask aggnf to group it for us, ordering the result by the most common occurrences:
    ~$ seq 1 1000 | while read -r l; do echo -e "line:${l}\t${RANDOM:0:1}"; done > rand.txt
    ~$ aggnf -o rand.txt
    1: 340
    2: 336
    3: 120
    8: 42
    6: 37
    5: 35
    7: 35
    4: 33
    9: 22
This might look familiar, as it’s the same result one might get from something like select field,count(*) as count from table group by field order by count desc, or even by the following bash one-liner:
    ~$ cat rand.txt | awk '{print $NF}' | sort | uniq -c | sort -nr
    340 1
    336 2
    120 3
    42 8
    37 6
    35 7
    35 5
    33 4
    22 9
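The same descending order can be sketched in Python with `collections.Counter.most_common()`. This is an illustration of the ordering only, not aggnf's own code. The function name `format_sorted` is hypothetical, and the `key: count` layout simply mirrors the aggnf output shown earlier.

```python
from collections import Counter


def format_sorted(counts):
    """Return 'key: count' lines, highest count first."""
    # most_common() yields (key, count) pairs in descending count order.
    return [f"{key}: {count}" for key, count in counts.most_common()]


if __name__ == "__main__":
    # A few of the counts from the example run above.
    counts = Counter({"8": 42, "1": 340, "3": 120, "2": 336})
    for line in format_sorted(counts):
        print(line)
```

Keys with equal counts come out in the order they were first seen, so tied rows may not match the order `sort -nr` produces.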
To-Do
Output is mangled when a non-default delimiter is used; this will be fixed.
Add a --sum option, which will key on one field, and sum the contents of another.
Speed optimizations.
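As a rough idea of what the planned --sum option could compute, the Python sketch below keys on one field and sums another. Nothing here reflects an existing aggnf interface: the function name `sum_by_field`, its parameters, and the use of 0-based Python indexes are all assumptions made for illustration.

```python
from collections import defaultdict


def sum_by_field(lines, key_index=0, value_index=-1, sep=None):
    """Sum the numeric value field for each distinct key field.

    Indexes are plain Python list indexes (0-based, negatives from the end).
    """
    totals = defaultdict(float)
    for line in lines:
        fields = line.rstrip("\n").split(sep)
        if not fields:
            continue  # skip blank lines
        # float() raises ValueError if the value field is not numeric.
        totals[fields[key_index]] += float(fields[value_index])
    return dict(totals)


if __name__ == "__main__":
    print(sum_by_field(["a 1", "b 2", "a 3.5"]))
```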
Notes
The usefulness of this program is questionable. Its functionality is already covered by existing console commands that are much faster.
This project is merely a quick example to learn the basics of packages which are unfamiliar to me, namely: cookiecutter, tox, and click.
History
April 4th: Released
Download files
Source Distribution
Built Distribution
File details
Details for the file aggnf-0.2.3.tar.gz.
File metadata
- Download URL: aggnf-0.2.3.tar.gz
- Upload date:
- Size: 31.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | c33ed7245cc664c484622eab8c4b6bc9b73d93484d30f236595effe1f9a1a053
MD5 | e70ad8a1d11a06fd9c79343b2d35b145
BLAKE2b-256 | c468668f53718995c42da0bee788edbf4c544bc5a2b788a743607f720e92d96a
File details
Details for the file aggnf-0.2.3-py2.py3-none-any.whl.
File metadata
- Download URL: aggnf-0.2.3-py2.py3-none-any.whl
- Upload date:
- Size: 4.7 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 76318183bac18419d33ad04a3dcc54dd10be569d1041ebbb80443dd6a0d47d1c
MD5 | 6ec655037ad146976fcf490c4aca7e19
BLAKE2b-256 | fcc22c93046b7936f57a41bd056466bb8760b3da52866bf6e344175e78664764