Some CLI tools for text file processing

sodatools

A CLI toolkit for columnar text processing. Pipe-friendly, composable tools that operate on whitespace-delimited data from stdin.

pip install sodatools-core

Every tool is available both as a subcommand (soda <tool>) and as a direct sd<tool> entry point (e.g. sdalign, sdkut, sduniq). The sd prefix keeps the direct names out of the way of coreutils.

Tools

Tool         Description
-----------  -----------
align        Buffer and align columns to fixed widths (auto right-justify numbers)
coltest      Filter lines by column test expressions (2gt50, 1seqOK, 3m^abc)
cutw         Truncate lines to terminal or explicit width
delta        Compute row-to-row differences for numeric columns
events       Convert per-row state labels into bounded events with start/stop times
filter       Filter noisy or invalid rows by column behavior (MAD, run-length, rate, range, clock, monotonicity)
kut          Select, reorder, and format columns (ranges, exclusions, literals)
nf           Enforce a specific number of fields per line (drop, pad, or truncate)
plot         Interactive scatter plots from columnar data (matplotlib)
radix        Convert integer columns between bases (dec, hex, bin, oct)
sample       Sample rows (every N, random, percent) or resample by time interval
sample-data  Generate sample datasets for testing (weather, small, noisy, etc.)
smooth       Sliding window mean/median smoothing for numeric columns
stats        Column statistics with formatted table output (mean, std, trend, outliers)
tag          Assign named states to rows based on column conditions
uniq         Consecutive deduplication with optional column keys and fuzzy matching

Examples

Tag temperature readings into states and convert to events:

soda sample-data weather | soda tag -H1 hot: 2gt25 cold: 2lt15 normal: 2ge15 2le25 \
  | soda events -H1 --gap 5m --print-header
state  start                stop                 duration  count
hot    2024-01-01T04:35:00  2024-01-01T10:55:00  6.3h      77
normal 2024-01-01T11:00:00  2024-01-01T18:00:00  7.0h      85
cold   2024-01-01T18:05:00  2024-01-02T04:30:00  10.4h     126
...
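
To make the tag-to-events step more concrete, here is a minimal Python sketch of the idea: consecutive rows with the same label are merged into one event, and a new event starts when the label changes or the time gap is too large. This is not sodatools' implementation; `rows_to_events`, its parameters, and the exact gap rule are assumptions made for illustration, and the input rows are made up.

```python
from datetime import datetime, timedelta

def rows_to_events(rows, gap=timedelta(minutes=5)):
    """Collapse (timestamp, state) rows into events with start, stop, and count.

    Assumption: a new event starts when the state changes or when the time
    since the previous row of the current event exceeds `gap`.
    """
    events = []
    for ts, state in rows:
        if events:
            last = events[-1]
            if last["state"] == state and ts - last["stop"] <= gap:
                # Same state and close enough in time: extend the open event.
                last["stop"] = ts
                last["count"] += 1
                continue
        events.append({"state": state, "start": ts, "stop": ts, "count": 1})
    return events

# Made-up rows at 5-minute spacing, for illustration only.
t0 = datetime(2024, 1, 1, 0, 0)
labels = ["cold", "cold", "hot", "hot", "hot"]
rows = [(t0 + timedelta(minutes=5 * i), s) for i, s in enumerate(labels)]

events = rows_to_events(rows)
for e in events:
    print(e["state"], e["start"].isoformat(), e["stop"].isoformat(), e["count"])
```

With these five rows the sketch yields two events: a `cold` event covering two rows, then a `hot` event covering three.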

Select columns, compute deltas, and align:

soda sample-data weather | soda kut 1 2 3 -H1 | soda delta 2 3 -H1 | soda align -H1
timestamp            temp  humidity
-------------------  ----  --------
2024-01-01T00:05:00  0.03     -0.29
2024-01-01T00:10:00  0.12      0.51
2024-01-01T00:15:00  0.18     -0.87
...
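
The delta step can be pictured with a short Python sketch. It assumes, as the output above suggests, that the first row is dropped because it has no predecessor. `row_deltas` is a hypothetical helper, not part of sodatools, it uses 0-based column indexes where soda uses 1-based column numbers, and the input values are made up.

```python
def row_deltas(rows, cols):
    """Return rows[1:] with the chosen columns replaced by row-to-row differences.

    Assumption: the first row has no predecessor, so it is not emitted.
    """
    out = []
    for prev, cur in zip(rows, rows[1:]):
        new = list(cur)
        for c in cols:
            # Difference between this row and the previous one, rounded for display.
            new[c] = round(float(cur[c]) - float(prev[c]), 2)
        out.append(new)
    return out

# Made-up rows: timestamp, temp, humidity.
rows = [
    ["2024-01-01T00:00:00", "10.0", "60.0"],
    ["2024-01-01T00:05:00", "10.5", "59.0"],
    ["2024-01-01T00:10:00", "10.25", "61.5"],
]
deltas = row_deltas(rows, cols=[1, 2])
for row in deltas:
    print(*row)
```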

Clean noisy sensor data by removing MAD outliers and short mode glitches while respecting time gaps:

soda sample-data noisy | soda filter 2:mad 3:run --gap 30s --drop | soda align
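
For readers unfamiliar with MAD (median absolute deviation) filtering, the Python sketch below shows one common formulation, the modified z-score. Whether sodatools uses this exact score or threshold is not stated here; `mad_outliers`, the 0.6745 constant, and the 3.5 cutoff are conventional choices assumed for illustration, and the readings are made up.

```python
from statistics import median

def mad_outliers(values, threshold=3.5):
    """Flag values whose modified z-score exceeds `threshold`.

    Score: 0.6745 * |x - median| / MAD. The 3.5 cutoff is a conventional
    default, not a documented sodatools setting.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All deviations are (nearly) identical; nothing can be ranked as an outlier.
        return [False] * len(values)
    return [0.6745 * abs(v - med) / mad > threshold for v in values]

# Made-up sensor readings with one obvious spike.
readings = [20.1, 20.3, 19.9, 20.2, 55.0, 20.0, 20.4]
flags = mad_outliers(readings)
cleaned = [v for v, bad in zip(readings, flags) if not bad]
print(cleaned)
```

Because the median and MAD are barely affected by the spike itself, the single value 55.0 is flagged while the ordinary readings are kept.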

Sample every 10th row, compute statistics:

soda sample-data weather | soda sample every 10 -H1 | soda stats -H1

Convert hex register dumps to binary with aligned output:

printf '0xff\n0x0a\n0x80\n' | soda radix --from hex --to bin -b 8 --align
11111111
00001010
10000000
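
The same conversion can be checked by hand in Python. This sketch assumes, based on the output above, that the result is zero-padded to 8 bits; `hex_to_bin` is a hypothetical helper, not a sodatools function.

```python
def hex_to_bin(token, bits=8):
    """Convert one hex token (with or without a 0x prefix) to a zero-padded binary string."""
    return format(int(token, 16), f"0{bits}b")

for token in ["0xff", "0x0a", "0x80"]:
    print(hex_to_bin(token))
```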

Keep only the lines where column 2 is greater than 50:

soda sample-data small | soda coltest 2gt50 -H1

Resample time series to 1-hour intervals, then plot temperature and humidity:

soda sample-data weather | soda sample interval 1h -H1 | soda plot -H1 2 c=red / 3 c=blue

Enforce 3 fields per line, pad short rows, smooth, and align:

soda sample-data small | soda nf 3 -H1 --pad '?' | soda smooth 3 -p 1 | soda align

Common flags

Most tools share these flags:

Flag         Alias  Description
-----------  -----  -----------
--delimiter  -d     Input field delimiter (default: whitespace)
--header     -H     First N lines are headers
--align             Adaptively align output columns
--no-fail           Skip problematic rows instead of erroring
--example           Show usage examples and exit

Plugins

Third-party packages can add new subcommands to soda by registering them under the sodatools.commands entry-point group. Core discovers plugins at startup; any failures are reported to stderr without crashing the CLI.

# your plugin's pyproject.toml
[project.entry-points."sodatools.commands"]
mytool = "sodatools_myplugin.cli:mytool"   # exposes `soda mytool`

[project.scripts]
sdmytool = "sodatools_myplugin.entrypoints:sdmytool"  # optional direct entry

The registered object must be a typer-compatible command function with the same shape as the built-ins (plain function with typer.Option / typer.Argument defaults). Plugin names that collide with built-ins are skipped with a warning.
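
The discovery behavior described above (load each entry point, skip name collisions with a warning, report failures without crashing) can be sketched with the standard library alone. This is an illustration under stated assumptions, not sodatools' loader: `load_plugins`, `discover`, and the `BUILTINS` subset are hypothetical, and the hand-built entry points stand in for installed plugins.

```python
import sys
from importlib.metadata import EntryPoint, entry_points

BUILTINS = {"align", "kut", "uniq"}  # illustrative subset of built-in names

def load_plugins(eps, builtins=BUILTINS):
    """Load entry points into a {name: callable} dict, skipping collisions and failures."""
    commands = {}
    for ep in eps:
        if ep.name in builtins:
            print(f"warning: plugin {ep.name!r} collides with a built-in; skipped",
                  file=sys.stderr)
            continue
        try:
            commands[ep.name] = ep.load()
        except Exception as exc:  # a broken plugin must not crash the CLI
            print(f"warning: failed to load plugin {ep.name!r}: {exc}", file=sys.stderr)
    return commands

def discover():
    """Real discovery over installed packages (group selection needs Python 3.10+)."""
    return load_plugins(entry_points(group="sodatools.commands"))

# Hand-built entry points standing in for installed plugins.
fake = [
    EntryPoint(name="dumps", value="json:dumps", group="sodatools.commands"),
    EntryPoint(name="align", value="json:loads", group="sodatools.commands"),
    EntryPoint(name="broken", value="no_such_module_xyz:main", group="sodatools.commands"),
]
plugins = load_plugins(fake)
print(sorted(plugins))
```

Of the three stand-ins, only `dumps` is loaded: `align` collides with a built-in name and `broken` points at a module that cannot be imported, so both produce a warning on stderr instead of an exception.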

Changelog

See CHANGELOG.md for release notes.

License

MIT
